A Parallel Implementation of k-Means in MATLAB

Authors: Dimitris Varsamis, Christos Talagkozis, Alkiviadis Tsimpiris, Paris Mastorocostas

Abstract:

The aim of this work is a parallel implementation of k-means in MATLAB that reduces the execution time. Specifically, a new MATLAB function for the serial k-means algorithm is developed, which meets all the requirements for conversion to a MATLAB function with parallel computations. Additionally, two different variants for defining the initial values are presented. The parallel approach is then presented. Finally, performance tests of the computation times with respect to the numbers of features and classes are illustrated.
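
As a rough illustration of the idea, and not the authors' function, the following minimal sketch shows how the assignment step of a serial k-means loop might be moved into a parfor loop. The data set X, the hand-picked initial centroids C, and all variable names are assumptions made for this example; the abstract does not specify the two initialization variants, so neither is reproduced here. Without the Parallel Computing Toolbox, parfor simply runs serially.

```matlab
% Illustrative sketch only: k-means with a parallel assignment step.
% All names and data below are hypothetical, not taken from the paper.

% Small deterministic data set: two well-separated groups in 2-D.
X = [0 0; 0 1; 1 0; 1 1; 10 10; 10 11; 11 10; 11 11];
k = 2;                     % number of classes
C = X([1 5], :);           % assumed initial centroids, picked by hand

n = size(X, 1);
labels = zeros(n, 1);
maxIter = 100;

for iter = 1:maxIter
    oldLabels = labels;

    % Assignment step: each sample is handled independently, so the
    % iterations can be distributed across workers with parfor.
    parfor i = 1:n
        d = sum(bsxfun(@minus, C, X(i, :)).^2, 2);  % squared distances to all centroids
        [~, idx] = min(d);
        labels(i) = idx;
    end

    % Update step: recompute each centroid as the mean of its members.
    for j = 1:k
        members = (labels == j);
        if any(members)
            C(j, :) = mean(X(members, :), 1);
        end
    end

    % Stop once the assignment no longer changes.
    if isequal(labels, oldLabels)
        break;
    end
end
```

Only the assignment step is parallelized in this sketch because its iterations do not depend on one another; the centroid update is kept serial for simplicity.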

Keywords: K-means algorithm, clustering, parallel computations, MATLAB.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1132615

