Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 30526
Efficient Tuning Parameter Selection by Cross-Validated Score in High Dimensional Models

Authors: Yoonsuh Jung


Because DNA microarray data typically have a small sample size relative to the number of genes, high dimensional models are often employed. In such models, selecting the tuning parameter (or penalty parameter) is often one of the crucial parts of the modeling. Cross-validation is one of the most common methods for tuning parameter selection; it selects the parameter value with the smallest cross-validated score. However, selecting a single value as the ‘optimal’ value can be very unstable because of sampling variation, since the sample sizes of microarray data are often small. Our approach is to first choose multiple candidate values of the tuning parameter, then average the candidates with different weights depending on their performance. The additional step of estimating the weights and averaging the candidates adds little computational cost, while it can considerably improve on traditional cross-validation. Using real and simulated data sets, we show that the value selected by the suggested method often leads to more stable parameter selection, as well as improved detection of significant genetic variables, compared to traditional cross-validation.
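The contrast between picking a single best value and averaging several candidates can be illustrated with a minimal Python sketch. The abstract does not state how the weights are estimated, so the scheme below is only an assumption for illustration: the function names, the choice of the k best candidates, and weights proportional to exp(-(score - best score)) are all hypothetical, not the paper's method.

```python
import math


def single_best(lambdas, cv_scores):
    """Traditional rule: return the candidate with the smallest CV score."""
    return min(zip(cv_scores, lambdas))[1]


def averaged_tuning_parameter(lambdas, cv_scores, k=3):
    """Weighted average of the k candidates with the smallest CV scores.

    Assumption (not from the paper): weights are proportional to
    exp(-(score - best_score)), so better candidates weigh more.
    """
    if not lambdas or len(lambdas) != len(cv_scores):
        raise ValueError("lambdas and cv_scores must be non-empty, equal length")
    # Rank candidates by CV score (smaller is better) and keep the k best.
    ranked = sorted(zip(cv_scores, lambdas))[:k]
    best_score = ranked[0][0]
    # Hypothetical weighting; subtracting best_score keeps exp() well scaled.
    raw = [math.exp(-(score - best_score)) for score, _ in ranked]
    total = sum(raw)
    weights = [w / total for w in raw]
    return sum(w * lam for w, (_, lam) in zip(weights, ranked))


# Made-up candidate values and cross-validated scores, for illustration only.
lambdas = [0.01, 0.1, 1.0, 10.0]
cv_scores = [0.30, 0.20, 0.21, 0.50]

print(single_best(lambdas, cv_scores))
print(averaged_tuning_parameter(lambdas, cv_scores, k=2))
```

In this toy input the two best candidates have nearly equal scores, so a small change in the data could swap which one the traditional rule picks, while the averaged value moves only slightly. Averaging on the original scale rather than the log scale is also an assumption of this sketch.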

Keywords: cross-validation, parameter averaging, regularization parameter search, parameter selection



