Search results for: Density based clustering
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 12085

Search results for: Density based clustering

11995 Cr Induced Magnetization in Zinc-Blende ZnO Based Diluted Magnetic Semiconductors

Authors: Bakhtiar Ul Haq, R. Ahmed, A. Shaari, Mazmira Binti Mohamed, Nisar Ali

Abstract:

The capability of exploiting the electronic charge and spin properties simultaneously in a single material has made diluted magnetic semiconductors (DMS) remarkable in the field of spintronics. We report the designing of DMS based on zinc-blend ZnO doped with Cr impurity. The full potential linearized augmented plane wave plus local orbital FP-L(APW+lo) method in density functional theory (DFT) has been adapted to carry out these investigations. For treatment of exchange and correlation energy, generalized gradient approximations have been used. Introducing Cr atoms in the matrix of ZnO has induced strong magnetic moment with ferromagnetic ordering at stable ground state. Cr:ZnO was found to favor the short range magnetic interaction that reflect tendency of Cr clustering. The electronic structure of ZnO is strongly influenced in the presence of Cr impurity atoms where impurity bands appear in the band gap.

Keywords: ZnO, Density functional theory, Diluted magnetic semiconductors, Ferromagnetic materials, FP-L(APW+lo).

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1835
11994 Pattern Recognition Using Feature Based Die-Map Clusteringin the Semiconductor Manufacturing Process

Authors: Seung Hwan Park, Cheng-Sool Park, Jun Seok Kim, Youngji Yoo, Daewoong An, Jun-Geol Baek

Abstract:

Depending on the big data analysis becomes important, yield prediction using data from the semiconductor process is essential. In general, yield prediction and analysis of the causes of the failure are closely related. The purpose of this study is to analyze pattern affects the final test results using a die map based clustering. Many researches have been conducted using die data from the semiconductor test process. However, analysis has limitation as the test data is less directly related to the final test results. Therefore, this study proposes a framework for analysis through clustering using more detailed data than existing die data. This study consists of three phases. In the first phase, die map is created through fail bit data in each sub-area of die. In the second phase, clustering using map data is performed. And the third stage is to find patterns that affect final test result. Finally, the proposed three steps are applied to actual industrial data and experimental results showed the potential field application.

Keywords: Die-Map Clustering, Feature Extraction, Pattern Recognition, Semiconductor Manufacturing Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3096
11993 Automatic Clustering of Gene Ontology by Genetic Algorithm

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad

Abstract:

Nowadays, Gene Ontology has been used widely by many researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating knowledge in the Gene Ontology for gene clustering. However, the increase in size of the Gene Ontology has caused problems in maintaining and processing them. One way to obtain their accessibility is by clustering them into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not easily perceived and is a hard algorithmic problem. Therefore, an approach for solving the automatic clustering of the Gene Ontology is proposed by incorporating cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.

Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology; genetic algorithm, split-and-merge algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916
11992 Development of a Clustered Network based on Unique Hop ID

Authors: Hemanth Kumar, A. R., Sudhakar G, Satyanarayana B. S.

Abstract:

In this paper, Land Marks for Unique Addressing( LMUA) algorithm is develped to generate unique ID for each and every node which leads to the formation of overlapping/Non overlapping clusters based on unique ID. To overcome the draw back of the developed LMUA algorithm, the concept of clustering is introduced. Based on the clustering concept a Land Marks for Unique Addressing and Clustering(LMUAC) Algorithm is developed to construct strictly non-overlapping clusters and classify those nodes in to Cluster Heads, Member Nodes, Gate way nodes and generating the Hierarchical code for the cluster heads to operate in the level one hierarchy for wireless communication switching. The expansion of the existing network can be performed or not without modifying the cost of adding the clusterhead is shown. The developed algorithm shows one way of efficiently constructing the

Keywords: Cluster Dimension, Cluster Basis, Metric Dimension, Metric Basis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1263
11991 Efficient Mean Shift Clustering Using Exponential Integral Kernels

Authors: S. Sutor, R. Röhr, G. Pujolle, R. Reda

Abstract:

This paper presents a highly efficient algorithm for detecting and tracking humans and objects in video surveillance sequences. Mean shift clustering is applied on backgrounddifferenced image sequences. For efficiency, all calculations are performed on integral images. Novel corresponding exponential integral kernels are introduced to allow the application of nonuniform kernels for clustering, which dramatically increases robustness without giving up the efficiency of the integral data structures. Experimental results demonstrating the power of this approach are presented.

Keywords: Clustering, Integral Images, Kernels, Person Detection, Person Tracking, Intelligent Video Surveillance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1487
11990 A Review: Comparative Analysis of Different Categorical Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, M. Sharmila

Abstract:

Over the past epoch a rampant amount of work has been done in the data clustering research under the unsupervised learning technique in Data mining. Furthermore several algorithms and methods have been proposed focusing on clustering different data types, representation of cluster models, and accuracy rates of the clusters. However no single clustering algorithm proves to be the most efficient in providing best results. Accordingly in order to find the solution to this issue a new technique, called Cluster ensemble method was bloomed. This cluster ensemble is a good alternative approach for facing the cluster analysis problem. The main hope of the cluster ensemble is to merge different clustering solutions in such a way to achieve accuracy and to improve the quality of individual data clustering. Due to the substantial and unremitting development of new methods in the sphere of data mining and also the incessant interest in inventing new algorithms, makes obligatory to scrutinize a critical analysis of the existing techniques and the future novelty. This paper exposes the comparative study of different cluster ensemble methods along with their features, systematic working process and the average accuracy and error rates of each ensemble methods. Consequently this speculative and comprehensive analysis will be very useful for the community of clustering practitioners and also helps in deciding the most suitable one to rectify the problem in hand.

Keywords: Clustering, Cluster Ensemble methods, Co-association matrix, Consensus function, Median partition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2555
11989 Design and Implementation a New Energy Efficient Clustering Algorithm using Genetic Algorithm for Wireless Sensor Networks

Authors: Moslem Afrashteh Mehr

Abstract:

Wireless Sensor Networks consist of small battery powered devices with limited energy resources. once deployed, the small sensor nodes are usually inaccessible to the user, and thus replacement of the energy source is not feasible. Hence, One of the most important issues that needs to be enhanced in order to improve the life span of the network is energy efficiency. to overcome this demerit many research have been done. The clustering is the one of the representative approaches. in the clustering, the cluster heads gather data from nodes and sending them to the base station. In this paper, we introduce a dynamic clustering algorithm using genetic algorithm. This algorithm takes different parameters into consideration to increase the network lifetime. To prove efficiency of proposed algorithm, we simulated the proposed algorithm compared with LEACH algorithm using the matlab

Keywords: Wireless Sensor Networks, Clustering, Geneticalgorithm, Energy Consumption

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2834
11988 Analysis of Cooperative Learning Behavior Based on the Data of Students' Movement

Authors: Wang Lin, Li Zhiqiang

Abstract:

The purpose of this paper is to analyze the cooperative learning behavior pattern based on the data of students' movement. The study firstly reviewed the cooperative learning theory and its research status, and briefly introduced the k-means clustering algorithm. Then, it used clustering algorithm and mathematical statistics theory to analyze the activity rhythm of individual student and groups in different functional areas, according to the movement data provided by 10 first-year graduate students. It also focused on the analysis of students' behavior in the learning area and explored the law of cooperative learning behavior. The research result showed that the cooperative learning behavior analysis method based on movement data proposed in this paper is feasible. From the results of data analysis, the characteristics of behavior of students and their cooperative learning behavior patterns could be found.

Keywords: Behavior pattern, cooperative learning, data analyze, K-means clustering algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 760
11987 Graph-Based Text Similarity Measurement by Exploiting Wikipedia as Background Knowledge

Authors: Lu Zhang, Chunping Li, Jun Liu, Hui Wang

Abstract:

Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering. However, prevailing approaches based on Vector Space Model (VSM) more or less suffer from the limitation of Bag of Words (BOW), which ignores the semantic relationship among words. Enriching document representation with background knowledge from Wikipedia is proven to be an effective way to solve this problem, but most existing methods still cannot avoid similar flaws of BOW in a new vector space. In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents. Specifically, it is a unified graph model that exploits Wikipedia as background knowledge and synthesizes both document representation and similarity computation. The experimental results on two different datasets show that our approach significantly improves VSM-based methods in both text clustering and classification.

Keywords: Text classification, Text clustering, Text similarity, Wikipedia

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2061
11986 A Novel Microarray Biclustering Algorithm

Authors: Chieh-Yuan Tsai, Chuang-Cheng Chiu

Abstract:

Biclustering aims at identifying several biclusters that reveal potential local patterns from a microarray matrix. A bicluster is a sub-matrix of the microarray consisting of only a subset of genes co-regulates in a subset of conditions. In this study, we extend the motif of subspace clustering to present a K-biclusters clustering (KBC) algorithm for the microarray biclustering issue. Besides minimizing the dissimilarities between genes and bicluster centers within all biclusters, the objective function of the KBC algorithm additionally takes into account how to minimize the residues within all biclusters based on the mean square residue model. In addition, the objective function also maximizes the entropy of conditions to stimulate more conditions to contribute the identification of biclusters. The KBC algorithm adopts the K-means type clustering process to efficiently make the partition of K biclusters be optimized. A set of experiments on a practical microarray dataset are demonstrated to show the performance of the proposed KBC algorithm.

Keywords: Microarray, Biclustering, Subspace clustering, Meansquare residue model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1566
11985 Energy and Distance Based Clustering: An Energy Efficient Clustering Method for Wireless Sensor Networks

Authors: Mehdi Saeidmanesh, Mojtaba Hajimohammadi, Ali Movaghar

Abstract:

In this paper, we propose an energy efficient cluster based communication protocol for wireless sensor network. Our protocol considers both the residual energy of sensor nodes and the distance of each node from the BS when selecting cluster-head. This protocol can successfully prolong the network-s lifetime by 1) reducing the total energy dissipation on the network and 2) evenly distributing energy consumption over all sensor nodes. In this protocol, the nodes with more energy and less distance from the BS are probable to be selected as cluster-head. Simulation results with MATLAB show that proposed protocol could increase the lifetime of network more than 94% for first node die (FND), and more than 6% for the half of the nodes alive (HNA) factor as compared with conventional protocols.

Keywords: Clustering methods, energy efficiency, routing protocol, wireless sensor networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2099
11984 Innovation Development of Food Market of Kazakhstan

Authors: G.B. Nurlikhina, G.N. Azhimetova, R. M. Ashimova

Abstract:

Currently, one of the main directions is developing of development based on the clustering of economic operations of Kazakhstan, providing for the organization and concentration of production capacity in one region or the most optimal system. In the modern economic literature clustering is regarded as one of the most effective tools to ensure competitive businesses, and improve their business itself.

Keywords: A cluster, food market, innovation cluster.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2204
11983 A Balanced Cost Cluster-Heads Selection Algorithm for Wireless Sensor Networks

Authors: Ouadoudi Zytoune, Youssef Fakhri, Driss Aboutajdine

Abstract:

This paper focuses on reducing the power consumption of wireless sensor networks. Therefore, a communication protocol named LEACH (Low-Energy Adaptive Clustering Hierarchy) is modified. We extend LEACHs stochastic cluster-head selection algorithm by a modifying the probability of each node to become cluster-head based on its required energy to transmit to the sink. We present an efficient energy aware routing algorithm for the wireless sensor networks. Our contribution consists in rotation selection of clusterheads considering the remoteness of the nodes to the sink, and then, the network nodes residual energy. This choice allows a best distribution of the transmission energy in the network. The cluster-heads selection algorithm is completely decentralized. Simulation results show that the energy is significantly reduced compared with the previous clustering based routing algorithm for the sensor networks.

Keywords: Wireless Sensor Networks, Energy efficiency, WirelessCommunications, Clustering-based algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2600
11982 Agglomerative Hierarchical Clustering Using the Tθ Family of Similarity Measures

Authors: Salima Kouici, Abdelkader Khelladi

Abstract:

In this work, we begin with the presentation of the Tθ family of usual similarity measures concerning multidimensional binary data. Subsequently, some properties of these measures are proposed. Finally the impact of the use of different inter-elements measures on the results of the Agglomerative Hierarchical Clustering Methods is studied.

Keywords: Binary data, similarity measure, Tθ measures, Agglomerative Hierarchical Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3408
11981 Face Recognition Based On Vector Quantization Using Fuzzy Neuro Clustering

Authors: Elizabeth B. Varghese, M. Wilscy

Abstract:

A face recognition system is a computer application for automatically identifying or verifying a person from a digital image or a video frame. A lot of algorithms have been proposed for face recognition. Vector Quantization (VQ) based face recognition is a novel approach for face recognition. Here a new codebook generation for VQ based face recognition using Integrated Adaptive Fuzzy Clustering (IAFC) is proposed. IAFC is a fuzzy neural network which incorporates a fuzzy learning rule into a competitive neural network. The performance of proposed algorithm is demonstrated by using publicly available AT&T database, Yale database, Indian Face database and a small face database, DCSKU database created in our lab. In all the databases the proposed approach got a higher recognition rate than most of the existing methods. In terms of Equal Error Rate (ERR) also the proposed codebook is better than the existing methods.

Keywords: Face Recognition, Vector Quantization, Integrated Adaptive Fuzzy Clustering, Self Organization Map.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2194
11980 Granulation using Clustering and Rough Set Theory and its Tree Representation

Authors: Girish Kumar Singh, Sonajharia Minz

Abstract:

Granular computing deals with representation of information in the form of some aggregates and related methods for transformation and analysis for problem solving. A granulation scheme based on clustering and Rough Set Theory is presented with focus on structured conceptualization of information has been presented in this paper. Experiments for the proposed method on four labeled data exhibit good result with reference to classification problem. The proposed granulation technique is semi-supervised imbibing global as well as local information granulation. To represent the results of the attribute oriented granulation a tree structure is proposed in this paper.

Keywords: Granular computing, clustering, Rough sets, datamining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1673
11979 Modeling Peer-to-Peer Networks with Interest-Based Clusters

Authors: Bertalan Forstner, Dr. Hassan Charaf

Abstract:

In the world of Peer-to-Peer (P2P) networking different protocols have been developed to make the resource sharing or information retrieval more efficient. The SemPeer protocol is a new layer on Gnutella that transforms the connections of the nodes based on semantic information to make information retrieval more efficient. However, this transformation causes high clustering in the network that decreases the number of nodes reached, therefore the probability of finding a document is also decreased. In this paper we describe a mathematical model for the Gnutella and SemPeer protocols that captures clustering-related issues, followed by a proposition to modify the SemPeer protocol to achieve moderate clustering. This modification is a sort of link management for the individual nodes that allows the SemPeer protocol to be more efficient, because the probability of a successful query in the P2P network is reasonably increased. For the validation of the models, we evaluated a series of simulations that supported our results.

Keywords: Peer-to-Peer, model, performance, networkmanagement.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1262
11978 Categorical Clustering By Converting Associated Information

Authors: Dongmin Cai, Stephen S-T Yau

Abstract:

Lacking an inherent “natural" dissimilarity measure between objects in categorical dataset presents special difficulties in clustering analysis. However, each categorical attributes from a given dataset provides natural probability and information in the sense of Shannon. In this paper, we proposed a novel method which heuristically converts categorical attributes to numerical values by exploiting such associated information. We conduct an experimental study with real-life categorical dataset. The experiment demonstrates the effectiveness of our approach.

Keywords: Categorical, Clustering, Converting, Information

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1320
11977 Double Clustering as an Unsupervised Approach for Order Picking of Distributed Warehouses

Authors: Hsin-Yi Huang, Ming-Sheng Liu, Jiun-Yan Shiau

Abstract:

Planning the order picking lists for warehouses to achieve some operational performances is a significant challenge when the costs associated with logistics are relatively high, and it is especially important in e-commerce era. Nowadays, many order planning techniques employ supervised machine learning algorithms. However, to define features for supervised machine learning algorithms is not a simple task. Against this background, we consider whether unsupervised algorithms can enhance the planning of order-picking lists. A double zone picking approach, which is based on using clustering algorithms twice, is developed. A simplified example is given to demonstrate the merit of our approach.

Keywords: order picking, warehouse, clustering, unsupervised learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 448
11976 A Cuckoo Search with Differential Evolution for Clustering Microarray Gene Expression Data

Authors: M. Pandi, K. Premalatha

Abstract:

A DNA microarray technology is a collection of microscopic DNA spots attached to a solid surface. Scientists use DNA microarrays to measure the expression levels of large numbers of genes simultaneously or to genotype multiple regions of a genome. Elucidating the patterns hidden in gene expression data offers a tremendous opportunity for an enhanced understanding of functional genomics. However, the large number of genes and the complexity of biological networks greatly increase the challenges of comprehending and interpreting the resulting mass of data, which often consists of millions of measurements. It is handled by clustering which reveals the natural structures and identifying the interesting patterns in the underlying data. In this paper, gene based clustering in gene expression data is proposed using Cuckoo Search with Differential Evolution (CS-DE). The experiment results are analyzed with gene expression benchmark datasets. The results show that CS-DE outperforms CS in benchmark datasets. To find the validation of the clustering results, this work is tested with one internal and one external cluster validation indexes.

Keywords: DNA, Microarray, genomics, Cuckoo Search, Differential Evolution, Gene expression data, Clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1438
11975 Predictive Clustering Hybrid Regression(pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production

Authors: Nikhil, Ari Visa, Chin-Chao Chen, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja

Abstract:

A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using dataset from H2- producing sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate using information available about envirome and metabolome of the bioprocess. Selforganizing maps (SOM) and Sammon map were used to visualize the dataset and to identify main metabolic patterns and clusters in bioprocess data. Three metabolic clusters: acetate coupled with other metabolites, butyrate only, and transition phases were detected. The developed pCHR model combines principles of k-means clustering, kNN classification and regression techniques. The model performed well in modeling and predicting the H2-production rate with mean square error values of 0.0014 and 0.0032, respectively.

Keywords: Biohydrogen, bioprocess modeling, clusteringhybrid regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1729
11974 Efficient Web Usage Mining Based on K-Medoids Clustering Technique

Authors: P. Sengottuvelan, T. Gopalakrishnan

Abstract:

Web Usage Mining is the application of data mining techniques to find usage patterns from web log data, so as to grasp required patterns and serve the requirements of Web-based applications. User’s expertise on the internet may be improved by minimizing user’s web access latency. This may be done by predicting the future search page earlier and the same may be prefetched and cached. Therefore, to enhance the standard of web services, it is needed topic to research the user web navigation behavior. Analysis of user’s web navigation behavior is achieved through modeling web navigation history. We propose this technique which cluster’s the user sessions, based on the K-medoids technique.

Keywords: Clustering, K-medoids, Recommendation, User Session, Web Usage Mining.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1358
11973 Using Data Mining for Learning and Clustering FCM

Authors: Somayeh Alizadeh, Mehdi Ghazanfari, Mohammad Fathian

Abstract:

Fuzzy Cognitive Maps (FCMs) have successfully been applied in numerous domains to show relations between essential components. In some FCM, there are more nodes, which related to each other and more nodes means more complex in system behaviors and analysis. In this paper, a novel learning method used to construct FCMs based on historical data and by using data mining and DEMATEL method, a new method defined to reduce nodes number. This method cluster nodes in FCM based on their cause and effect behaviors.

Keywords: Clustering, Data Mining, Fuzzy Cognitive Map(FCM), Learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1970
11972 IMDC: An Image-Mapped Data Clustering Technique for Large Datasets

Authors: Faruq A. Al-Omari, Nabeel I. Al-Fayoumi

Abstract:

In this paper, we present a new algorithm for clustering data in large datasets using image processing approaches. First the dataset is mapped into a binary image plane. The synthesized image is then processed utilizing efficient image processing techniques to cluster the data in the dataset. Henceforth, the algorithm avoids exhaustive search to identify clusters. The algorithm considers only a small set of the data that contains critical boundary information sufficient to identify contained clusters. Compared to available data clustering techniques, the proposed algorithm produces similar quality results and outperforms them in execution time and storage requirements.

Keywords: Data clustering, Data mining, Image-mapping, Pattern discovery, Predictive analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1449
11971 An Energy-Efficient Distributed Unequal Clustering Protocol for Wireless Sensor Networks

Authors: Sungju Lee, Jangsoo Lee , Hongjoong Sin, Seunghwan Yoo, Sanghyuck Lee, Jaesik Lee, Yongjun Lee, Sungchun Kim

Abstract:

The wireless sensor networks have been extensively deployed and researched. One of the major issues in wireless sensor networks is a developing energy-efficient clustering protocol. Clustering algorithm provides an effective way to prolong the lifetime of a wireless sensor networks. In the paper, we compare several clustering protocols which significantly affect a balancing of energy consumption. And we propose an Energy-Efficient Distributed Unequal Clustering (EEDUC) algorithm which provides a new way of creating distributed clusters. In EEDUC, each sensor node sets the waiting time. This waiting time is considered as a function of residual energy, number of neighborhood nodes. EEDUC uses waiting time to distribute cluster heads. We also propose an unequal clustering mechanism to solve the hot-spot problem. Simulation results show that EEDUC distributes the cluster heads, balances the energy consumption well among the cluster heads and increases the network lifetime.

Keywords: Wireless Sensor Network, Distributed UnequalClustering, Multi-hop, Lifetime.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2438
11970 Effective Keyword and Similarity Thresholds for the Discovery of Themes from the User Web Access Patterns

Authors: Haider A Ramadhan, Khalil Shihab

Abstract:

Clustering techniques have been used by many intelligent software agents to group similar access patterns of the Web users into high level themes which express users intentions and interests. However, such techniques have been mostly focusing on one salient feature of the Web document visited by the user, namely the extracted keywords. The major aim of these techniques is to come up with an optimal threshold for the number of keywords needed to produce more focused themes. In this paper we focus on both keyword and similarity thresholds to generate themes with concentrated themes, and hence build a more sound model of the user behavior. The purpose of this paper is two fold: use distance based clustering methods to recognize overall themes from the Proxy log file, and suggest an efficient cut off levels for the keyword and similarity thresholds which tend to produce more optimal clusters with better focus and efficient size.

Keywords: Data mining, knowledge discovery, clustering, dataanalysis, Web log analysis, theme based searching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1409
11969 K-Means for Spherical Clusters with Large Variance in Sizes

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical shaped with large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster, and recomputing the membership of small cluster points, the experimental results reveal that the proposed algorithm produces satisfactory results.

Keywords: K-Means, Data Clustering, Cluster Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3241
11968 Clustering Protein Sequences with Tailored General Regression Model Technique

Authors: G. Lavanya Devi, Allam Appa Rao, A. Damodaram, GR Sridhar, G. Jaya Suma

Abstract:

Cluster analysis divides data into groups that are meaningful, useful, or both. Analysis of biological data is creating a new generation of epidemiologic, prognostic, diagnostic and treatment modalities. Clustering of protein sequences is one of the current research topics in the field of computer science. Linear relation is valuable in rule discovery for a given data, such as if value X goes up 1, value Y will go down 3", etc. The classical linear regression models the linear relation of two sequences perfectly. However, if we need to cluster a large repository of protein sequences into groups where sequences have strong linear relationship with each other, it is prohibitively expensive to compare sequences one by one. In this paper, we propose a new technique named General Regression Model Technique Clustering Algorithm (GRMTCA) to benignly handle the problem of linear sequences clustering. GRMT gives a measure, GR*, to tell the degree of linearity of multiple sequences without having to compare each pair of them.

Keywords: Clustering, General Regression Model, Protein Sequences, Similarity Measure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1517
11967 Improving RBF Networks Classification Performance by using K-Harmonic Means

Authors: Z. Zainuddin, W. K. Lye

Abstract:

In this paper, a clustering algorithm named KHarmonic means (KHM) was employed in the training of Radial Basis Function Networks (RBFNs). KHM organized the data in clusters and determined the centres of the basis function. The popular clustering algorithms, namely K-means (KM) and Fuzzy c-means (FCM), are highly dependent on the initial identification of elements that represent the cluster well. In KHM, the problem can be avoided. This leads to improvement in the classification performance when compared to other clustering algorithms. A comparison of the classification accuracy was performed between KM, FCM and KHM. The classification performance is based on the benchmark data sets: Iris Plant, Diabetes and Breast Cancer. RBFN training with the KHM algorithm shows better accuracy in classification problem.

Keywords: Neural networks, Radial basis functions, Clusteringmethod, K-harmonic means.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803
11966 Density Estimation using Generalized Linear Model and a Linear Combination of Gaussians

Authors: Aly Farag, Ayman El-Baz, Refaat Mohamed

Abstract:

In this paper we present a novel approach for density estimation. The proposed approach is based on using the logistic regression model to get initial density estimation for the given empirical density. The empirical data does not exactly follow the logistic regression model, so, there will be a deviation between the empirical density and the density estimated using logistic regression model. This deviation may be positive and/or negative. In this paper we use a linear combination of Gaussian (LCG) with positive and negative components as a model for this deviation. Also, we will use the expectation maximization (EM) algorithm to estimate the parameters of LCG. Experiments on real images demonstrate the accuracy of our approach.

Keywords: Logistic regression model, Expectationmaximization, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1687