Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 30174
Correlation-based Feature Selection using Ant Colony Optimization

Authors: M. Sadeghzadeh, M. Teshnehlab

Abstract:

Feature selection has recently been the subject of intensive research in data mining, specially for datasets with a large number of attributes. Recent work has shown that feature selection can have a positive effect on the performance of machine learning algorithms. The success of many learning algorithms in their attempts to construct models of data, hinges on the reliable identification of a small set of highly predictive attributes. The inclusion of irrelevant, redundant and noisy attributes in the model building process phase can result in poor predictive performance and increased computation. In this paper, a novel feature search procedure that utilizes the Ant Colony Optimization (ACO) is presented. The ACO is a metaheuristic inspired by the behavior of real ants in their search for the shortest paths to food sources. It looks for optimal solutions by considering both local heuristics and previous knowledge. When applied to two different classification problems, the proposed algorithm achieved very promising results.

Keywords: Ant colony optimization, Classification, Datamining, Feature selection.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1082665

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2076

References:


[1] Hall, M.A. Correlation-based Feature selection for Machine Learning, Ph.D. Thesis, Department of Computer Science. Hamilton, New Zeland: The University of Waikato, 1999.
[2] Kohavi, R., and John, G. H, : The Wrapper Approach, Feature Selection for Knowledge Discovery and Data Mining, H. Liu and H. Motoda Eds., Kluwer Academic Publishers, 1998, 33-50.
[3] Almuallim, H. and Dietterich, T. G., Learning with many irrelevant features, Proceedings of the Ninth National Conference on Artificial Intelligence, AAAI Press, 1991, 547-552.
[4] Koller, D. and Sahami, M. , Toward optimal feature selection, Proceedings of the Thirteenth International Conference on Machine Learning, Morgan Kaufmann, 1996, 248-292.
[5] Kira, K. and Rendell, L. A practical approach to feature selection, Proceedings of the Ninth International Conferene on Machine Learning, Morgan Kaufmann, 1992, 249-256.
[6] Dash, M., Liu, H. , Consistency-based search in feature selection, Artificial Intelligence 151, Elsevier Pub., 2003, 155-176.
[7] Deng, K. and Moore, A., On the Greediness of Feature Selection Algorithms, International Conference of Machine Learning (ICML'98), Van den Herik H. J. and Weijters Eds T. , University Maastricht, The Netherlands, 1998.
[8] Skalak, D., Prototype and Feature Selection by Sampling and Random Mutation Hill Climbing Algorithms, Eleventh International MachineLearning Conference, Morgan Kaufmann, New Brunswick, NJ, 1994, 293-301.
[9] J. Klitter. Feature set search algorithms. In C. H. Chen, editor, Pattern Recognition and signal Processing. Sijhoff and Noordhoff, the Netherlands, 1978.
[10] P. Paudil, J. Novovicova, and J. Klittler. Floating search methods in feature selection. Pattern Recognition Letters, 15: 1994, 1119-1125.
[11] P.M. Narendra and K. Fukunaga, A branch and bound algorithm for feature subset selection. IEEE Transaction on Computers, C-26, 1977, 917-922.
[12] J. Yang and V. Honavar, Feature subset selection using a genetic algorithm, IEEE Transactions on Intelligent Systems, 13, 1998, 44-49.
[13] Leardi, R., Boggia, R. and Terrile, M., Genetic Algorithms as a Strategy for Feature Selection, Journal of Chemometrics, vol. 6, 1992, 267-281
[14] M. Sadeghzadeh and M.H. Shenasa, Correlation Based Feature Selection Using Genetic Algorithms, Proceedings of the Third International Conference on Modeling, Simulation and Applied Optimization, Sharjah, U.A.E. , 2009.
[15] M. Gletsos, S.G. Mougiakakou, G.K. Matsopoulos, K.S. Nikita, A. S. Nikita, and D. Kelekis. A Computer-Aided Diagnostic System to Characterized CT Focal Liver Lesions: Design and Optimization of a Neural Network Classifier, IEEE Transactions on Information Technology in Biomedicine, 7, 2003, 153-162.
[16] I.-S. Oh, J.-S. Lee, and B.-R. Moon, Hybrid Genetic Algorithms for Feature Selection, IEEE Transactions on Pattern Analysis and Machine Inteligence, 26, 2004, 1424-1437.
[17] M. Sadeghzadeh and M.H. Shenasa, Correlation Based Feature Selection Using Evolutionary Programming, Proceedings of the Twelfth IASTED International Conference on Artificial Intelligence and Soft Computing, Palma de Mallorca, Spain , 2008, 185-190.
[18] M. Dorigo, V. Maniezzo, and A. Colorni. Ant Systems: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 26: 1996, 29-41.
[19] T. Stutzle and M. Dorigo, The Ant Colony Optimization Metaheuristic: Algorithm, Applications, and Advances. In F. Glover and G. Kochenberger, editors, Handbook of Metaheuristics, Kluwer Academic Publishers, Norwell, MA, 2002.
[20] G. Di Caro and M. Dorigo, AntNet:Distributed stigmergetic control for communications networks. Journal of Artificial Intelligence Research, 9: 1998, 317-365.
[21] R. S. Parpinelli: H. S. Lopes, A. A. Freitas, Data mining with an ant colony optimization algorithm, IEEE Transactions on Evolutionary Computation,6, 2002, 321-332.
[22] G. Di Caro and M. Dorigo. AntNet: Distributed stigmergetic control for communications networks. Journal of Artificial Inteligence Research, 9, 1998, 317-365.
[23] Blake, C., Keogh, E., and Merz C. J., UCI Repository of Machine Learning Data Bases, University of California, Department of Information and Computer Science,Irvine,CA.,1998. (http://www.ics.uci.edu/~mlean/MLRepository.html)
[24] Langky, P. and Sage, S., Scaling to domains with irrelevant features, R. Greiner (ed) Computational Learning Theory and Natural Learning Systems,MIT Press, 1994.