MCOKE: Multi-Cluster Overlapping K-Means Extension Algorithm
Authors: Said Baadel, Fadi Thabtah, Joan Lu
Abstract:
Clustering partitions n objects into k clusters. Many clustering algorithms use hard partitioning, in which each object is assigned to exactly one cluster. In this paper we propose MCOKE, an overlapping clustering algorithm that allows an object to belong to one or more clusters. MCOKE differs from fuzzy clustering techniques because an overlapping object is assigned a membership value of 1 (one) in each of its clusters rather than a fuzzy membership degree. It also differs from other overlapping algorithms that require a similarity threshold to be defined a priori, a value that novice users can find difficult to determine.
Keywords: Data mining, k-means, MCOKE, overlapping.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1099282
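
The idea in the abstract, binary membership in one or more clusters with no user-supplied threshold, can be illustrated with a minimal sketch. The abstract does not say how overlap is decided, so the rule used here is an assumption: after a plain k-means pass, the largest distance between any object and its own centroid (called `max_dist` here) serves as a data-derived threshold, and an object also joins every other cluster whose centroid lies within that distance. The function names, the seeding with the first k points, and the toy data are hypothetical and chosen only for illustration.

```python
import math


def kmeans(points, k, iters=100):
    """Plain k-means. Seeding with the first k points is an illustrative choice."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels


def overlapping_membership(points, centroids, labels):
    """Binary membership matrix: 1 if the point belongs to the cluster, else 0.

    Assumption (not stated in the abstract): the threshold is the largest
    distance from any point to its assigned centroid, so it comes from the
    data instead of being set by the user.
    """
    max_dist = max(math.dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return [
        [
            1 if c == lab or math.dist(p, centroids[c]) <= max_dist else 0
            for c in range(len(centroids))
        ]
        for p, lab in zip(points, labels)
    ]


# Two elongated groups of points that sit close together (toy data).
points = [(0, 0), (4, 0), (0, -5), (4, -5), (0, 5), (4, 5)]
centroids, labels = kmeans(points, k=2)
membership = overlapping_membership(points, centroids, labels)

for p, row in zip(points, membership):
    print(p, row)
```

On this toy data the two middle points, (0, 0) and (4, 0), receive a membership of 1 in both clusters, while the four end points stay in a single cluster. Every entry is 0 or 1, in contrast to the fractional degrees a fuzzy method would produce.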