Genetic Algorithm for Feature Subset Selection with Exploitation of Feature Correlations from Continuous Wavelet Transform: a real-case Application

Authors: G. Van Dijck, M. M. Van Hulle, M. Wevers

Abstract:

A genetic algorithm (GA) for feature subset selection is proposed that exploits the correlation structure of the features. Candidate feature subsets are evaluated by their classification performance. Features derived from the continuous wavelet transform can be strongly correlated, and GAs that ignore this correlation structure search inefficiently. The proposed algorithm first forms clusters of correlated features and searches for a good candidate set of clusters; it then performs a search within the clusters. Simulations on a real-case data set with strongly correlated features show improved classification performance compared with a standard GA that does not use the correlation structure.
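
The two-level idea can be illustrated with a minimal sketch. It is not the authors' implementation: the distance 1 - |r| between features, the average-linkage merge rule, the `max_dist` stopping threshold, the toy data, and the chromosome layout (`cluster_bits` plus `picks`) are all assumptions made for illustration. The sketch only groups correlated features and decodes a cluster-level chromosome into a feature subset; the GA operators and the classifier-based fitness are left out.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant feature: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)

def cluster_features(columns, max_dist=0.2):
    """Average-linkage agglomerative clustering with distance 1 - |r|.

    Merging stops once the closest pair of clusters is farther apart
    than max_dist (an assumed, user-chosen threshold).
    """
    n = len(columns)
    dist = [[1.0 - abs(pearson(columns[i], columns[j])) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        d, p, q = min((linkage(clusters[p], clusters[q]), p, q)
                      for p in range(len(clusters))
                      for q in range(p + 1, len(clusters)))
        if d > max_dist:
            break
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]

def decode(cluster_bits, clusters, picks):
    """Map a two-level chromosome to feature indices.

    cluster_bits[k] == 1 selects cluster k (search over clusters);
    picks[k] chooses one member of cluster k (search within clusters).
    """
    return [clusters[k][picks[k] % len(clusters[k])]
            for k, bit in enumerate(cluster_bits) if bit]

# Toy data (hypothetical): features 0 and 1 are nearly proportional,
# features 2 and 3 are nearly mirror images of each other.
columns = [
    [1, 2, 3, 4, 5, 6],
    [2.1, 3.9, 6.2, 8.0, 9.9, 12.1],
    [5, 1, 4, 2, 6, 3],
    [-5.1, -0.9, -4.2, -1.8, -6.1, -3.0],
]
clusters = cluster_features(columns)
subset = decode([1, 1], clusters, [1, 0])
print(clusters, subset)
```

With this encoding a GA would mutate and recombine `cluster_bits` to choose which clusters contribute a feature and `picks` to choose the representative inside each cluster, so strongly correlated features do not compete for separate positions in the chromosome.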

Keywords: Classification, genetic algorithm, hierarchical agglomerative clustering, wavelet transform.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1330915

