Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting
Authors: Kemal Polat
In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) is used for the classification of a diabetes dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the accuracy of the classifier by transforming a non-linearly separable dataset into a linearly separable one. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetic subjects (268 instances). Fuzzy C-means clustering is an extension of the K-means clustering method that assigns each sample a degree of membership in every cluster, and it is one of the most widely used clustering methods in data mining and machine learning applications. In the first stage, fuzzy C-means clustering is used to find the centers of the attributes in the Pima Indians Diabetes dataset, and the dataset is then weighted according to the ratios of the attribute means to their centers. In the second stage, the weighted dataset is classified with support vector machine (SVM) and k-nearest neighbor (k-NN) classifiers. Experimental results show that the proposed attribute weighting method (FCMAW) obtains very promising results in the classification of the Pima Indians Diabetes dataset.
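The weighting stage described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the author's implementation: the abstract does not specify the number of clusters, the fuzzifier, or how a sample is matched to a center, so the choices below (two clusters, fuzzifier `m = 2`, and weighting each sample by the ratios of its highest-membership cluster) are assumptions. The function names `fuzzy_c_means` and `fcm_attribute_weighting` are hypothetical, and the data is synthetic rather than the Pima Indians Diabetes dataset.

```python
import numpy as np


def fuzzy_c_means(X, c=2, m=2.0, n_iter=200, tol=1e-6, seed=0):
    """Plain fuzzy C-means. Returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial memberships; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # Recompute centers so they match the final memberships.
    Um = U ** m
    centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
    return centers, U


def fcm_attribute_weighting(X, c=2, m=2.0):
    """Scale each attribute by (attribute mean) / (cluster center).

    Assumption: each sample uses the ratios of the cluster in which it
    has the highest membership. The abstract does not state this rule.
    """
    centers, U = fuzzy_c_means(X, c=c, m=m)
    means = X.mean(axis=0)          # shape (d,)
    ratios = means / centers        # shape (c, d); assumes nonzero centers
    nearest = U.argmax(axis=1)      # highest-membership cluster per sample
    return X * ratios[nearest]


# Synthetic two-attribute data standing in for two subject groups.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=[100.0, 25.0], scale=[10.0, 3.0], size=(50, 2)),
    rng.normal(loc=[160.0, 35.0], scale=[10.0, 3.0], size=(50, 2)),
])
X_weighted = fcm_attribute_weighting(X, c=2)
print("per-attribute variance before:", X.var(axis=0))
print("per-attribute variance after: ", X_weighted.var(axis=0))
```

Under these assumptions, samples in the same cluster are rescaled toward the overall attribute mean, which is one way to read the stated goal of reducing the variance within attributes. The weighted matrix would then be passed to an SVM or k-NN classifier in place of the raw attributes.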
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1124023