An SVM based Classification Method for Cancer Data using Minimum Microarray Gene Expressions

Authors: R. Mallika, V. Saravanan


This paper presents a novel method for improving cancer classification performance using very few microarray gene expression data. The method combines classification with individual gene ranking and gene subset ranking, and it uses the same classifier for both gene selection and classification. The method is applied to three publicly available cancer gene expression datasets: Lymphoma, Liver and Leukaemia. Three classifiers were tested, namely Support vector machines-one against all (SVM-OAA), K nearest neighbour (KNN) and Linear Discriminant analysis (LDA). The results indicate that the SVM-OAA classifier performs better than the other two classifiers, with satisfactory results on all three datasets.

Keywords: Support vector machines-one against all, cancer classification, Linear Discriminant analysis, K nearest neighbour, microarray gene expression, gene pair ranking.
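
The two-stage idea in the abstract (rank individual genes, then rank small gene subsets, scoring both stages with the same classifier) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: a plain k-nearest-neighbour classifier stands in for SVM-OAA to keep the example dependency-free, leave-one-out accuracy is used as the ranking score, the data are synthetic, and the function names (`knn_predict`, `loo_accuracy`, `rank_genes`, `rank_pairs`, `make_toy_data`) are hypothetical.

```python
import random
from itertools import combinations


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training samples (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = {}
    for _, label in dists[:k]:
        votes[label] = votes.get(label, 0) + 1
    # Ties are broken in favour of the smallest label.
    return max(sorted(votes), key=votes.get)


def loo_accuracy(X, y, genes, k=3):
    """Leave-one-out accuracy using only the gene columns listed in `genes`."""
    correct = 0
    for i in range(len(X)):
        train_X = [[row[g] for g in genes] for j, row in enumerate(X) if j != i]
        train_y = [label for j, label in enumerate(y) if j != i]
        x = [X[i][g] for g in genes]
        correct += knn_predict(train_X, train_y, x, k) == y[i]
    return correct / len(X)


def rank_genes(X, y, top=5):
    """Stage 1: score each gene on its own and keep the `top` best."""
    n_genes = len(X[0])
    ranked = sorted(range(n_genes), key=lambda g: -loo_accuracy(X, y, [g]))
    return ranked[:top]


def rank_pairs(X, y, candidates):
    """Stage 2: score every pair of candidate genes with the same classifier."""
    scored = [(loo_accuracy(X, y, list(p)), p) for p in combinations(candidates, 2)]
    return sorted(scored, reverse=True)


def make_toy_data(n_per_class=10, n_genes=20, seed=0):
    """Synthetic three-class data: gene 0 marks class 1, gene 1 marks class 2."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_genes)]
            row[0] += 6 * (label == 1)
            row[1] += 6 * (label == 2)
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
top_genes = rank_genes(X, y, top=5)
pair_scores = rank_pairs(X, y, top_genes)
best_acc, best_pair = pair_scores[0]
print("top single genes:", top_genes)
print("best pair:", best_pair, "LOO accuracy:", round(best_acc, 3))
```

Restricting the pair search to the top-ranked single genes keeps the number of classifier evaluations small, which is one plausible reading of why an individual ranking step would precede subset ranking. Swapping in a one-against-all SVM would only require replacing `knn_predict`.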


