Journey on Image Clustering Based on Color Composition

Authors: Achmad Nizar Hidayanto, Elisabeth Martha Koeanan


Image clustering is the process of grouping images by similarity. It typically relies on color, texture, edge, or shape features, or on a combination of them. This research explores image clustering based on color composition. Three main components must be considered: the color space, the image representation (feature extraction), and the clustering method itself. By combining techniques from each of the three components, we aim to determine which combination produces the best clustering results. The color spaces considered are RGB, HSV, and L*a*b*. The image representations are the histogram and the Gaussian Mixture Model (GMM), and the clustering methods are K-Means and Agglomerative Hierarchical Clustering (AHC). The experimental results show that the GMM representation works better with the RGB and L*a*b* color spaces, whereas the histogram works better with HSV. The experiments also show that K-Means outperforms Agglomerative Hierarchical Clustering for image clustering.
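As an illustration of one combination named in the abstract (HSV color space, histogram representation, K-Means clustering), the following is a minimal sketch and not the authors' implementation. The bin counts (8, 4, 4), the function names hsv_histogram, sq_dist, and kmeans, and the representation of an image as a plain list of (r, g, b) tuples are all assumptions made for this example; the paper's actual parameters are not given here.

```python
import colorsys
import random


def hsv_histogram(pixels, bins=(8, 4, 4)):
    """Normalized joint HSV histogram of a list of (r, g, b) pixels in 0..255.

    The bin counts are an assumption for illustration only.
    """
    hb, sb, vb = bins
    hist = [0.0] * (hb * sb * vb)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a value of exactly 1.0 falls into the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain K-Means (Lloyd's algorithm); returns one cluster label per vector."""
    rng = random.Random(seed)
    # Initialize centers from k randomly chosen input vectors.
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centers[c]))
            for v in vectors
        ]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy "images": two reddish and two bluish, 16 pixels each (made-up data).
images = [
    [(250, 10, 10)] * 16,
    [(240, 20, 20)] * 16,
    [(10, 10, 250)] * 16,
    [(20, 20, 240)] * 16,
]
features = [hsv_histogram(img) for img in images]
labels = kmeans(features, k=2)
print(labels)
```

In this toy run the two reddish images should share one label and the two bluish images the other. Swapping the feature function for an RGB or L*a*b* histogram, or the clustering step for an agglomerative method, gives the other combinations the abstract compares.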

Keywords: Image clustering, feature extraction, RGB, HSV, L*a*b*, Gaussian Mixture Model (GMM), histogram, Agglomerative Hierarchical Clustering (AHC), K-Means, Expectation-Maximization (EM).



