Feature Subset Selection approach based on Maximizing Margin of Support Vector Classifier

Authors: Khin May Win, Nan Sai Moon Kham

Abstract:

Identifying cancer genes that can anticipate clinical behavior across different types of cancer is challenging because of the huge number of genes and the small number of patient samples. A new method is proposed based on supervised classification with support vector machines (SVMs). The solution introduces the Maximized Margin (MM) into the subset selection criterion, which makes it possible to approach the lowest generalization error rate. In class prediction problems, gene selection is essential both to improve accuracy and to identify genes relevant to the cancer disease. The performance of the new method was evaluated in experiments on real-world data, and it yields better classification accuracy.
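The margin-based subset selection described above is closely related to SVM recursive feature elimination: train a linear SVM, rank features by the magnitude of their weights, and discard the least influential one. The sketch below is illustrative only, not the authors' exact algorithm; it trains a linear SVM with Pegasos-style subgradient descent (an assumed stand-in for the paper's SVM training) and all function names and parameters are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent for a linear SVM
    (hinge loss with L2 regularization). Labels y must be +1/-1."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):   # shuffle each epoch
            t += 1
            eta = 1.0 / (lam * t)           # standard Pegasos step size
            margin = y[i] * sum(w[j] * X[i][j] for j in range(d))
            if margin < 1:
                # hinge loss is active: shrink w and step toward y_i * x_i
                w = [(1 - eta * lam) * wj + eta * y[i] * xij
                     for wj, xij in zip(w, X[i])]
            else:
                # only the regularizer contributes: shrink w
                w = [(1 - eta * lam) * wj for wj in w]
    return w

def svm_rfe(X, y, n_keep):
    """Recursive feature elimination: repeatedly retrain the SVM and
    drop the active feature with the smallest |w_j| until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        Xa = [[row[j] for j in active] for row in X]
        w = train_linear_svm(Xa, y)
        drop = min(range(len(active)), key=lambda k: abs(w[k]))
        del active[drop]
    return active  # surviving feature indices, e.g. candidate genes
```

On a toy set where only the first feature separates the classes, `svm_rfe(X, y, 1)` keeps that feature's index; in the gene-expression setting each feature would be one gene's expression level.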

Keywords: Support Vector Machines, Feature selection, microarray data, recursive feature elimination

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1085565

