Dataset Analysis Using Membership-Deviation Graph

Authors: Itgel Bayarsaikhan, Jimin Lee, Sejong Oh

Abstract:

Classification is one of the primary themes in computational biology. Classification accuracy depends strongly on the quality of the dataset, so a method for evaluating that quality is needed. In this paper, we propose a new graphical analysis method, the Membership-Deviation Graph (MDG), for analyzing the quality of a dataset. An MDG represents the degree of membership and the deviations of the instances of a class in the dataset. The results of MDG analysis can be used to understand a specific feature and to select the best feature for classification.

Keywords: feature, classification, machine learning algorithm.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1059453

