A Hybrid Gene Selection Technique Using Improved Mutual Information and Fisher Score for Cancer Classification Using Microarrays
Authors: M. Anidha, K. Premalatha
Abstract:
Feature selection is essential for effective classification in cancer diagnosis. In microarray gene expression datasets, however, the number of features is large compared to the number of samples, which makes classification computationally hard and error-prone. In this paper, we present a method for selecting highly informative gene subsets from gene expression data so that samples can be classified effectively as tumorous or non-tumorous. The hybrid gene selection technique combines Mutual Information and Fisher score to select informative genes. The gene selection is validated by classification with a Support Vector Machine (SVM), a supervised learning algorithm capable of solving complex classification problems. Improved Mutual Information and F-Score, with SVM as the classifier, produced efficient results.
Keywords: Gene selection, mutual information, Fisher score, classification, SVM.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1124645
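The abstract does not spell out how the two criteria are combined, so the following is only a minimal sketch of one plausible filter-style ranking, not the authors' procedure. It assumes that each gene is scored with the standard Fisher score and with mutual information computed after binarizing expression at the median, that both scores are min-max scaled, and that they are averaged with equal weight. The function names (`fisher_score`, `mutual_information`, `rank_genes`), the binarization rule, the equal weighting, and the toy data are all illustrative assumptions.

```python
import math
from collections import Counter
from statistics import mean, median


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one gene."""
    overall = mean(values)
    between = within = 0.0
    for c in set(labels):
        group = [v for v, y in zip(values, labels) if y == c]
        mu = mean(group)
        between += len(group) * (mu - overall) ** 2
        within += sum((v - mu) ** 2 for v in group)
    # Small constant avoids division by zero for constant-within-class genes.
    return between / (within + 1e-12)


def mutual_information(values, labels):
    """MI in bits between a median-binarized gene and the class label.

    Median binarization is an assumption made for this sketch.
    """
    cut = median(values)
    bits = [int(v > cut) for v in values]
    n = len(values)
    joint = Counter(zip(bits, labels))
    px, py = Counter(bits), Counter(labels)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def min_max(scores):
    """Scale scores to [0, 1]; all-equal scores map to 0."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def rank_genes(X, y, top_k):
    """Return indices of the top_k genes by the averaged, scaled scores.

    X is a list of samples, each a list of expression values (one per gene).
    Equal weighting of the two criteria is an assumption of this sketch.
    """
    genes = list(zip(*X))  # transpose: one tuple of values per gene
    f = min_max([fisher_score(g, y) for g in genes])
    m = min_max([mutual_information(g, y) for g in genes])
    combined = [(a + b) / 2 for a, b in zip(f, m)]
    order = sorted(range(len(genes)), key=lambda j: combined[j], reverse=True)
    return order[:top_k]


# Toy data (made up): gene 0 separates the classes, genes 1 and 2 do not.
X = [
    [1.0, 5.0, 3.0],
    [1.2, 2.0, 3.1],
    [0.9, 4.0, 2.9],
    [3.0, 5.1, 3.0],
    [3.2, 2.1, 3.1],
    [2.9, 3.9, 3.0],
]
y = [0, 0, 0, 1, 1, 1]
selected = rank_genes(X, y, top_k=1)
```

In a full pipeline, the columns listed in `selected` would then be passed to an SVM classifier and the selection judged by classification accuracy, which is the validation step the abstract describes; that step is omitted here.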