Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2

partitional clustering Related Publications

2 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Mohamed Benmohammed, Billel Kenidra

Abstract:

Data mining techniques for clustering are a subject of active research; they assist in biological pattern recognition and in the extraction of new knowledge from raw data. Clustering is the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar to one another and dissimilar to objects in other groups. Several clustering methods are based on partitional clustering. This category attempts to decompose the dataset directly into a set of disjoint clusters, yielding an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and it is optimized by an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Because K-Means is extremely sensitive to the initial choice of centers, and a poor choice may lead to a local optimum that is far inferior to the global optimum, we propose a strategy for initializing the K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results show that efficiency is significantly improved.
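
The sensitivity to initial centers described in the abstract can be illustrated with a minimal sketch of the standard K-Means iterative relocation loop. This is not the authors' improved initialization strategy, which the abstract does not specify; the function names (`sq_dist`, `kmeans`), the toy data, and the two starting configurations are assumptions chosen only for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def kmeans(points, init_centers, max_iter=100):
    """Plain K-Means from the given initial centers (illustrative helper).

    Returns (centers, labels, sse), where sse is the sum of squared
    distances from each point to its assigned center.
    """
    centers = [tuple(c) for c in init_centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # Relocate the center to the mean of its members.
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(centers[j])
        if new_centers == centers:
            break  # no center moved: a (possibly local) optimum
        centers = new_centers
    labels = assign(points, centers)
    sse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels))
    return centers, labels, sse


# Toy data: two tight vertical pairs, far apart horizontally.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# One start per true group versus a start that splits the data the wrong way.
_, _, good_sse = kmeans(points, [(0, 0), (4, 0)])
_, _, poor_sse = kmeans(points, [(2, 0), (2, 1)])
print(good_sse, poor_sse)
```

On this toy data both runs converge, but the second start settles in a local optimum with a criterion value of 16.0 against 1.0 for the first, which is the kind of gap a better initialization aims to avoid.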

Keywords: K-means algorithm, microarray data mining, biological pattern recognition, partitional clustering, centroid initialization

PDF Downloads: 732

1 A Computational Cost-Effective Clustering Algorithm in Multidimensional Space Using the Manhattan Metric: Application to the Global Terrorism Database

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

The increasing amount of collected data has limited the performance of current analysis algorithms. Thus, developing new algorithms that are cost-effective in terms of complexity, scalability, and accuracy has attracted significant interest. In this paper, an effective modified k-means-based algorithm is developed and tested. The new algorithm aims to reduce the computational load without significantly affecting the quality of the clusterings. The algorithm uses the City Block (Manhattan) distance and a new stopping criterion to guarantee convergence. Experiments conducted on a real data set show its high performance compared with the original k-means version.
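
As a rough illustration of clustering under the City Block metric, the sketch below swaps the Manhattan distance into a k-means-style loop. The abstract does not describe the authors' update rule or their new stopping criterion, so two assumptions are made here: centers are relocated to the component-wise median (the usual companion to the Manhattan distance, often called k-medians), and iteration stops when assignments no longer change. All names (`manhattan`, `assign_l1`, `k_medians`) and the toy data are hypothetical.

```python
from statistics import median


def manhattan(p, q):
    """City Block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign_l1(points, centers):
    """Index of the nearest center under the Manhattan distance."""
    return [min(range(len(centers)), key=lambda j: manhattan(p, centers[j]))
            for p in points]


def k_medians(points, init_centers, max_iter=100):
    """k-means-style loop with L1 distance and median updates (assumed)."""
    centers = [tuple(c) for c in init_centers]
    labels = assign_l1(points, centers)
    for _ in range(max_iter):
        for j in range(len(centers)):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                # The component-wise median minimizes total L1 distance.
                centers[j] = tuple(median(x) for x in zip(*members))
        new_labels = assign_l1(points, centers)
        if new_labels == labels:
            break  # assumed stopping rule: assignments are stable
        labels = new_labels
    return centers, labels


# Toy data: two well-separated groups of three points each.
data = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
l1_centers, l1_labels = k_medians(data, [(0, 0), (10, 10)])
print(l1_centers, l1_labels)
```

The Manhattan distance needs only additions and absolute values, with no squaring or square roots, which is one plausible source of the reduced computational load the abstract mentions.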

Keywords: Pattern Recognition, K-Means Clustering, Manhattan distance, terrorism data analysis, partitional clustering

PDF Downloads: 762