Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 4

unsupervised learning Related Publications

4 Clustering Categorical Data Using the K-Means Algorithm and the Attribute’s Relative Frequency

Authors: Semeh Ben Salem, Sami Naouali, Moetez Sallami

Abstract:

Clustering is a well known data mining technique used in pattern recognition and information retrieval. The initial dataset to be clustered can either contain categorical or numeric data. Each type of data has its own specific clustering algorithm. In this context, two algorithms are proposed: the k-means for clustering numeric datasets and the k-modes for categorical datasets. The main encountered problem in data mining applications is clustering categorical dataset so relevant in the datasets. One main issue to achieve the clustering process on categorical values is to transform the categorical attributes into numeric measures and directly apply the k-means algorithm instead the k-modes. In this paper, it is proposed to experiment an approach based on the previous issue by transforming the categorical values into numeric ones using the relative frequency of each modality in the attributes. The proposed approach is compared with a previously method based on transforming the categorical datasets into binary values. The scalability and accuracy of the two methods are experimented. The obtained results show that our proposed method outperforms the binary method in all cases.

Keywords: Pattern Recognition, Knowledge Discovery, Clustering, unsupervised learning, k-means, categorical datasets

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2840
3 Evolutionary Computing Approach for the Solution of Initial value Problems in Ordinary Differential Equations

Authors: I. M. Qureshi, A. Junaid, M. A. Z. Raja

Abstract:

An evolutionary computing technique for solving initial value problems in Ordinary Differential Equations is proposed in this paper. Neural network is used as a universal approximator while the adaptive parameters of neural networks are optimized by genetic algorithm. The solution is achieved on the continuous grid of time instead of discrete as in other numerical techniques. The comparison is carried out with classical numerical techniques and the solution is found with a uniform accuracy of MSE ≈ 10-9 .

Keywords: Neural Networks, Numerical Methods, Evolutionary computing, unsupervised learning, Fitness evaluation function

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1363
2 Fuzzy Relatives of the CLARANS Algorithm With Application to Text Clustering

Authors: Mohamed A. Mahfouz, M. A. Ismail

Abstract:

This paper introduces new algorithms (Fuzzy relative of the CLARANS algorithm FCLARANS and Fuzzy c Medoids based on randomized search FCMRANS) for fuzzy clustering of relational data. Unlike existing fuzzy c-medoids algorithm (FCMdd) in which the within cluster dissimilarity of each cluster is minimized in each iteration by recomputing new medoids given current memberships, FCLARANS minimizes the same objective function minimized by FCMdd by changing current medoids in such away that that the sum of the within cluster dissimilarities is minimized. Computing new medoids may be effected by noise because outliers may join the computation of medoids while the choice of medoids in FCLARANS is dictated by the location of a predominant fraction of points inside a cluster and, therefore, it is less sensitive to the presence of outliers. In FCMRANS the step of computing new medoids in FCMdd is modified to be based on randomized search. Furthermore, a new initialization procedure is developed that add randomness to the initialization procedure used with FCMdd. Both FCLARANS and FCMRANS are compared with the robust and linearized version of fuzzy c-medoids (RFCMdd). Experimental results with different samples of the Reuter-21578, Newsgroups (20NG) and generated datasets with noise show that FCLARANS is more robust than both RFCMdd and FCMRANS. Finally, both FCMRANS and FCLARANS are more efficient and their outputs are almost the same as that of RFCMdd in terms of classification rate.

Keywords: Data Mining, Cluster Analysis, unsupervised learning, Fuzzy Clustering, Relational Clustering, Medoid-Based Clustering

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1978
1 Improving Convergence of Parameter Tuning Process of the Additive Fuzzy System by New Learning Strategy

Authors: Thi Nguyen, Lee Gordon-Brown, Jim Peterson, Peter Wheeler

Abstract:

An additive fuzzy system comprising m rules with n inputs and p outputs in each rule has at least t m(2n + 2 p + 1) parameters needing to be tuned. The system consists of a large number of if-then fuzzy rules and takes a long time to tune its parameters especially in the case of a large amount of training data samples. In this paper, a new learning strategy is investigated to cope with this obstacle. Parameters that tend toward constant values at the learning process are initially fixed and they are not tuned till the end of the learning time. Experiments based on applications of the additive fuzzy system in function approximation demonstrate that the proposed approach reduces the learning time and hence improves convergence speed considerably.

Keywords: unsupervised learning, Additive fuzzy system, improving convergence, parameter learning process

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1216