A Comprehensive Review on Different Mixed Data Clustering Ensemble Methods

Authors: S. Sarumathi, N. Shanthi, S. Vidhya, M. Sharmila


Data clustering, an unsupervised learning technique in data mining, has been studied extensively over the past two decades. Many approaches and methods have emerged that focus on clustering diverse data types, on the features of cluster models, and on the similarity rates of clusters. However, no single clustering algorithm performs best at extracting efficient clusters in every case. To address this issue, a new and challenging technique called the cluster ensemble method emerged as an alternative approach to the cluster analysis problem. The main objective of a cluster ensemble is to aggregate diverse clustering solutions so as to attain accuracy and to improve on the quality of the individual clustering algorithms. Given the massive and rapid development of new methods in the field of data mining, a critical analysis of existing techniques and of future directions is needed. This paper presents a comparative analysis of different cluster ensemble methods along with their methodologies and salient features. This analysis should be useful to the community of clustering experts and should help in choosing the most appropriate method for the problem at hand.

Keywords: Clustering, Consensus Function, Cluster Ensemble methods, Median partition, Coassociation matrix
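
As an illustration of the aggregation idea described in the abstract, the following minimal sketch builds a co-association matrix from several base partitions and derives a consensus partition from it. It is not a method taken from the paper: the function names `coassociation` and `consensus`, the 0.5 threshold, and the toy partitions are all assumptions chosen for illustration, and the consensus function used here (grouping objects connected by co-association values above the threshold) is only one simple choice among many.

```python
def coassociation(partitions):
    """Return an n x n matrix whose (i, j) entry is the fraction of
    base partitions that place objects i and j in the same cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus(partitions, threshold=0.5):
    """Hypothetical consensus function: link two objects when their
    co-association exceeds `threshold`, then label connected components."""
    co = coassociation(partitions)
    n = len(co)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue  # already assigned to an earlier component
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels


# Toy example (assumed data): three base clusterings of six objects.
# Cluster label values are arbitrary; only co-membership matters.
base_partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [2, 2, 2, 0, 0, 1],
]
consensus_labels = consensus(base_partitions)
```

In this toy case the three base partitions disagree about objects 2 and 5, but a majority of them still groups object 2 with objects 0 and 1 and object 5 with objects 3 and 4, so the consensus recovers two clusters of three objects each.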


