Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 8

K-means algorithm Related Publications

8 An Improved K-Means Algorithm for Gene Expression Data Clustering

Authors: Mohamed Benmohammed, Billel Kenidra

Abstract:

Data mining technique used in the field of clustering is a subject of active research and assists in biological pattern recognition and extraction of new knowledge from raw data. Clustering means the act of partitioning an unlabeled dataset into groups of similar objects. Each group, called a cluster, consists of objects that are similar between themselves and dissimilar to objects of other groups. Several clustering methods are based on partitional clustering. This category attempts to directly decompose the dataset into a set of disjoint clusters leading to an integer number of clusters that optimizes a given criterion function. The criterion function may emphasize a local or a global structure of the data, and its optimization is an iterative relocation procedure. The K-Means algorithm is one of the most widely used partitional clustering techniques. Since K-Means is extremely sensitive to the initial choice of centers and a poor choice of centers may lead to a local optimum that is quite inferior to the global optimum, we propose a strategy to initiate K-Means centers. The improved K-Means algorithm is compared with the original K-Means, and the results prove how the efficiency has been significantly improved.

Keywords: K-means algorithm, microarray data mining, biological pattern recognition, partitional clustering, centroid initialization

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 732
7 Lane Detection Using Labeling Based RANSAC Algorithm

Authors: Ho-Youl Jung, Ju H. Park, Yeongyu Choi

Abstract:

In this paper, we propose labeling based RANSAC algorithm for lane detection. Advanced driver assistance systems (ADAS) have been widely researched to avoid unexpected accidents. Lane detection is a necessary system to assist keeping lane and lane departure prevention. The proposed vision based lane detection method applies Canny edge detection, inverse perspective mapping (IPM), K-means algorithm, mathematical morphology operations and 8 connected-component labeling. Next, random samples are selected from each labeling region for RANSAC. The sampling method selects the points of lane with a high probability. Finally, lane parameters of straight line or curve equations are estimated. Through the simulations tested on video recorded at daytime and nighttime, we show that the proposed method has better performance than the existing RANSAC algorithm in various environments.

Keywords: canny edge detection, K-means algorithm, RANSAC, inverse perspective mapping

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 573
6 A Parallel Implementation of k-Means in MATLAB

Authors: Dimitris Varsamis, Christos Talagkozis, Alkiviadis Tsimpiris, Paris Mastorocostas

Abstract:

The aim of this work is the parallel implementation of k-means in MATLAB, in order to reduce the execution time. Specifically, a new function in MATLAB for serial k-means algorithm is developed, which meets all the requirements for the conversion to a function in MATLAB with parallel computations. Additionally, two different variants for the definition of initial values are presented. In the sequel, the parallel approach is presented. Finally, the performance tests for the computation times respect to the numbers of features and classes are illustrated.

Keywords: Clustering, MATLAB, K-means algorithm, parallel computations

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 595
5 Customer Segmentation Model in E-commerce Using Clustering Techniques and LRFM Model: The Case of Online Stores in Morocco

Authors: Rachid Ait Daoud, Belaid Bouikhalene, Abdellah Amine, Rachid Lbibb

Abstract:

Given the increase in the number of e-commerce sites, the number of competitors has become very important. This means that companies have to take appropriate decisions in order to meet the expectations of their customers and satisfy their needs. In this paper, we present a case study of applying LRFM (length, recency, frequency and monetary) model and clustering techniques in the sector of electronic commerce with a view to evaluating customers’ values of the Moroccan e-commerce websites and then developing effective marketing strategies. To achieve these objectives, we adopt LRFM model by applying a two-stage clustering method. In the first stage, the self-organizing maps method is used to determine the best number of clusters and the initial centroid. In the second stage, kmeans method is applied to segment 730 customers into nine clusters according to their L, R, F and M values. The results show that the cluster 6 is the most important cluster because the average values of L, R, F and M are higher than the overall average value. In addition, this study has considered another variable that describes the mode of payment used by customers to improve and strengthen clusters’ analysis. The clusters’ analysis demonstrates that the payment method is one of the key indicators of a new index which allows to assess the level of customers’ confidence in the company's Website.

Keywords: Cluster Analysis, Customer Value, loyalty, K-means algorithm, LRFM model, Self-Organizing Maps method (SOM)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5127
4 Geographic Profiling Based on Multi-point Centrography with K-means Clustering

Authors: Jiaji Zhou, Le Liang, Long Chen

Abstract:

Geographic Profiling has successfully assisted investigations for serial crimes. Considering the multi-cluster feature of serial criminal spots, we propose a Multi-point Centrography model as a natural extension of Single-point Centrography for geographic profiling. K-means clustering is first performed on the data samples and then Single-point Centrography is adopted to derive a probability distribution on each cluster. Finally, a weighted combinations of each distribution is formed to make next-crime spot prediction. Experimental study on real cases demonstrates the effectiveness of our proposed model.

Keywords: K-means algorithm, geographic profiling, Centrography model

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1636
3 A New Evolutionary Algorithm for Cluster Analysis

Authors: B.Bahmani Firouzi, T. Niknam, M. Nayeripour

Abstract:

Clustering is a very well known technique in data mining. One of the most widely used clustering techniques is the kmeans algorithm. Solutions obtained from this technique depend on the initialization of cluster centers and the final solution converges to local minima. In order to overcome K-means algorithm shortcomings, this paper proposes a hybrid evolutionary algorithm based on the combination of PSO, SA and K-means algorithms, called PSO-SA-K, which can find better cluster partition. The performance is evaluated through several benchmark data sets. The simulation results show that the proposed algorithm outperforms previous approaches, such as PSO, SA and K-means for partitional clustering problem.

Keywords: Data Clustering, Particle Swarm Optimization (PSO), K-means algorithm, Hybrid evolutionary optimization algorithm, Simulated Annealing (SA)

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1880
2 Color Image Segmentation Using Competitive and Cooperative Learning Approach

Authors: Yinggan Tang, Xinping Guan

Abstract:

Color image segmentation can be considered as a cluster procedure in feature space. k-means and its adaptive version, i.e. competitive learning approach are powerful tools for data clustering. But k-means and competitive learning suffer from several drawbacks such as dead-unit problem and need to pre-specify number of cluster. In this paper, we will explore to use competitive and cooperative learning approach to perform color image segmentation. In competitive and cooperative learning approach, seed points not only compete each other, but also the winner will dynamically select several nearest competitors to form a cooperative team to adapt to the input together, finally it can automatically select the correct number of cluster and avoid the dead-units problem. Experimental results show that CCL can obtain better segmentation result.

Keywords: Cluster, competitive learning, K-means algorithm, Color image segmentation, competitive and cooperative learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1292
1 Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with the Highest Variance

Authors: S. Deelers, S. Auwatanamongkol

Abstract:

In this paper, we propose an algorithm to compute initial cluster centers for K-means clustering. Data in a cell is partitioned using a cutting plane that divides cell in two smaller cells. The plane is perpendicular to the data axis with the highest variance and is designed to reduce the sum squared errors of the two cells as much as possible, while at the same time keep the two cells far apart as possible. Cells are partitioned one at a time until the number of cells equals to the predefined number of clusters, K. The centers of the K cells become the initial cluster centers for K-means. The experimental results suggest that the proposed algorithm is effective, converge to better clustering results than those of the random initialization method. The research also indicated the proposed algorithm would greatly improve the likelihood of every cluster containing some data in it.

Keywords: clustering algorithm, K-means algorithm, Datapartitioning, Initial cluster centers

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2507