Gökhan Silahtaroğlu
Clustering Categorical Data Using Hierarchies (CLUCDUH)
2006 - 2011
2009
3
8
International Journal of Computer and Information Engineering
https://publications.waset.org/pdf/922
https://publications.waset.org/vol/32
World Academy of Science, Engineering and Technology
Clustering large populations is an important problem,
especially when the data contain noise and clusters of different shapes. A good clustering
algorithm or approach should be efficient enough to detect clusters
sensitively. Besides space complexity, time complexity also gains
importance as the data size grows. Using hierarchies, we developed a new
algorithm that splits the data on attributes according to the values they
take, choosing the dimension for each split so as to divide the database
into parts that are as nearly equal as possible. At each node we
calculate certain descriptive statistical features of the data
that reside there, and by pruning we generate the natural clusters with a
complexity of O(n).
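
The balanced-split idea described in the abstract can be illustrated with a minimal sketch. This is not the published CLUCDUH procedure: the function name `most_balanced_split`, the binary "value versus everything else" split, and the imbalance score are all assumptions made here for illustration, and the per-node statistics and pruning step are not shown.

```python
from collections import Counter


def most_balanced_split(records, attributes):
    """Return (attribute, value, imbalance) for the split closest to half/half.

    Assumption (not stated in the abstract): a split is binary, sending
    records whose `attribute` equals `value` to one side and all other
    records to the other side.
    """
    n = len(records)
    best = None
    for attr in attributes:
        counts = Counter(rec[attr] for rec in records)
        if len(counts) < 2:
            continue  # attribute is constant at this node; it cannot split it
        for value, count in sorted(counts.items()):
            # Size difference between the two sides: |count - (n - count)|.
            imbalance = abs(2 * count - n)
            if best is None or imbalance < best[2]:
                best = (attr, value, imbalance)
    return best  # None when no attribute can split the records


# Hypothetical toy data with two categorical attributes.
records = [
    {"color": "red", "size": "S"},
    {"color": "red", "size": "M"},
    {"color": "blue", "size": "S"},
    {"color": "green", "size": "S"},
]
print(most_balanced_split(records, ["color", "size"]))
```

On this toy data, splitting on `color == "red"` puts two records on each side, so it is preferred over any split on `size`. Applying the same choice recursively to each side would build a hierarchy of the kind the abstract refers to.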
Open Science Index 32, 2009