**Commenced**in January 2007

**Frequency:**Monthly

**Edition:**International

**Paper Count:**30184

##### K-Means for Spherical Clusters with Large Variance in Sizes

**Authors:**
A. M. Fahim,
G. Saake,
A. M. Salem,
F. A. Torkey,
M. A. Ramadan

**Abstract:**

**Keywords:**
K-Means,
Data Clustering,
Cluster Analysis.

**Digital Object Identifier (DOI):**
doi.org/10.5281/zenodo.1079848

**References:**

[1] R. Agrawal , J. Gehrke , D. Gunopulos, and P. Raghavan, "Automatic SubspaceClustering of High Dimensional Data for Data Mining Applications," Proc.ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998, pp. 94-105.

[2] M R. Anderberg Cluster Analysis for Applications, Academic Press, 1973.

[3] M. Ankerst , M. Breunig , H.P. Kriegel , and J. Sander, "OPTICS: Ordering Points to Identify the Clustering Structure," Proc. ACM SIGMOD Int. Con. Management of Data Mining, 1999, pp. 49-60.

[4] L. Bottou, and Y. Bengio, "Convergence properties of the k-mean algorithm," The MIT press, Cambridge, MA, 1995, pp. 585-592.

[5] P. S. Bradley , and U. M. Fayyad, "Refining Initial Points for K-Means Clustering," Proc. of the 15th International Conference on Machine Learning (ICML98), J. Shavlik (ed.). Morgan Kaufmann, San Francisco, 1998, pp. 91-99.

[6] S. Deelers, and S. Auwatanamongkol, "Enhancing K-Means Algorithm with Initial Cluster Centers Derived from Data Partitioning along the Data Axis with the Highest Variance," PWASET vol 26, 2007, pp. 323- 328.

[7] R.O. Duda , and P.E. Hart, Pattern Classification and Scene Analysis. John Wiley Sons, New York, 1973.

[8] M. Ester, H.P. Kriegel, J. Sander, and X. Xu, "A Density-based Algorithm for Discovering Clusters in Large Spatial Databases with Noise," Proc. 2nd Int. Conf. on Knowledge Discovery and Data Mining, AAAI Press, Portland, 1996, pp.226-231.

[9] A. M. Fahim, A. M. Salem ,F. A. Torkey, and M. Ramadan, "An efficient enhanced k-means clustering algorithm," Journal of Zhejiang University SCIENCE A, vol 7(10), 2006, pp. 1626-1633.

[10] U.M. Fayyad , G. Piatetsky-Shapiro , P. Smyth , and R. Uthurusamy Advances in Knowledge Discovery and Data Mining, AAAI/MIT Press, 1996.

[11] A. Gersho, and R.M. Gray Vector Quantization and Signal Compression, Kluwer Academic, Boston, 1992.

[12] S. Guha , R. Rastogi, and K. Shim, "CURE: An Efficient Clustering Algorithms for Large Databases," Proc. ACM SIGMOD Int. Conf. on Management of Data, Seattle, WA, 1998, pp. 73-84.

[13] V. Hautamaeki , S. Cherednichenko , I. Kaerkkaeinen , T. Kinnunen, and P. Fraenti, "Improving K-Means by Outlier Removal," SCIA 2005, LNCS 3540, 2005, pp. 978-987.

[14] V. Hautamaeki , I. Kaerkkaeinen, and P. Fraenti, "Outlier detection using k-nearest neighbourgraph," In: 17th International Conference on Pattern Recognition (ICPR 2004), Cambridge, United Kingdom, 2004, pp. 430-433.

[15] A. Hinneburg, and D. Keim, "An Efficient Approach to Clustering in Large Multimedia Databases with Noise," Proc. 4th Int. Conf. on Knowledge Discovery and Data Mining, New York City, NY. 1998.

[16] Z. Huang, "A fast clustering algorithm to cluster very large categorical data sets in data mining," Proceedings of the SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, Dept. of Computer Science, The University of British Columbia, Canada, 1997, pp. 1-8.

[17] A. K. Jain, and R. C. Dubes, Algorithms for Clustering Data, Prentice Hall, 1988.

[18] L. Kaufman, and P. Rousseeuw, "Finding Groups in Data: An Introduction to Cluster Analysis," Wiley, 1990.

[19] J.B. MacQueen, "Some methods for classification and analysis of multivariate observations. Proc. 5th Symp. Mathematical Statistics and Probability, Berkelely, CA, Vol(1), 1967, pp. 281297.

[20] R.T. Ng, and J. Han, "Efficient and Effective Clustering Methods for Spatial Data Mining," Proc. 20th Int. Conf. on Very Large Data Bases. Morgan Kaufmann Publishers, San Francisco, CA, 1994, pp. 144-155.

[21] D. T. Pham , S. S. Dimov, and C. D. Nguyen, "Selection of k in K-means clustering," Mechanical Engineering Science, vol(219), 2004, pp. 103- 119.

[22] S. Ray, and R. H. Turi, "Determination of number of clusters in k-means clustering and application in colour image segmentation," In Proceedings of the 4th International Conference on Advances in Pattern Recognition and Digital Techniques, 1999, pp.137-143.

[23] G. Sheikholeslami, S. Chatterjee, and A. Zhang, "Wave-Cluster: A Multi-Resolution Clustering Approach for Very Large Spatial Databases," Proc. 24th Int. Conf. on Very Large Data Bases. New York, 1998, pp. 428-439.

[24] R. Sibson, "SLINK: an optimally efficient algorithm for the single-link cluster method," The Comp. Journal, 16(1), 1973, pp. 30-34.

[25] T. Zhang, R. Ramakrishnan, and M. Linvy, "BIRCH: An Efficient Data Clustering Method for Very Large Databases," Proc. ACM SIGMOD Int. Conf. on Management of Data. ACM Press, New York, 1996, pp. 103-114.