A Comparative Study of Additive and Nonparametric Regression Estimators and Variable Selection Procedures

Authors: Adriano Z. Zambom, Preethi Ravikumar

Abstract:

One of the biggest challenges in nonparametric regression is the curse of dimensionality. Additive models are known to overcome this problem by estimating only the individual additive effects of each covariate. However, if the additive model is misspecified, the accuracy of its estimator relative to the fully nonparametric one is unknown. In this work, the efficiency of fully nonparametric regression estimators such as Loess is compared with that of estimators that assume additivity, in several situations including both additive and non-additive regression scenarios. The comparison is made by computing the oracle mean square error of the estimators with respect to the true nonparametric regression function. Then, a backward elimination selection procedure based on the Akaike Information Criterion is proposed, where the criterion is computed from either the additive or the nonparametric model. Simulations show that if the additive model is misspecified, the percentage of times it fails to select important variables can be higher than that of the fully nonparametric approach. A dimension reduction step is included for cases in which the nonparametric estimator cannot be computed due to the curse of dimensionality. Finally, the Boston housing dataset is analyzed using the proposed backward elimination procedure and the selected variables are identified.
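
The two steps described in the abstract, the oracle mean square error comparison and the backward elimination, can be illustrated with a minimal simulation sketch. It is not the authors' implementation and rests on several assumptions: a Nadaraya-Watson smoother with a Gaussian kernel stands in for Loess, the additive model is fitted by plain backfitting, the criterion is the improved AIC of Hurvich, Simonoff and Tsai (one possible choice of AIC for smoothers), and the bandwidth, sample size, noise level and test function are arbitrary illustrative values. All function names are hypothetical.

```python
import numpy as np

def smoother_matrix(X, h):
    """Nadaraya-Watson smoother matrix with a Gaussian kernel (stand-in for Loess)."""
    diff = (X[:, None, :] - X[None, :, :]) / h
    K = np.exp(-0.5 * np.sum(diff ** 2, axis=2))
    return K / K.sum(axis=1, keepdims=True)

def fit_full(X, y, h):
    """Fully nonparametric fit: smooth y over all covariates jointly."""
    return smoother_matrix(X, h) @ y

def fit_additive(X, y, h, n_iter=20):
    """Additive fit obtained by backfitting one-dimensional smoothers."""
    n, d = X.shape
    S = [smoother_matrix(X[:, [j]], h) for j in range(d)]
    alpha, f = y.mean(), np.zeros((n, d))
    for _ in range(n_iter):
        for j in range(d):
            partial = y - alpha - f.sum(axis=1) + f[:, j]  # remove the other effects
            f[:, j] = S[j] @ partial
            f[:, j] -= f[:, j].mean()  # centre each component for identifiability
    return alpha + f.sum(axis=1)

def oracle_mse(fitted, truth):
    """Mean square error against the true regression function (known in simulation)."""
    return float(np.mean((fitted - truth) ** 2))

def aicc(X, y, h):
    """Improved AIC for a linear smoother: log(sigma^2) + 1 + 2(tr(H)+1)/(n-tr(H)-2)."""
    n = len(y)
    H = smoother_matrix(X, h)
    sigma2 = np.mean((y - H @ y) ** 2)
    tr = np.trace(H)
    return float(np.log(sigma2) + 1.0 + 2.0 * (tr + 1.0) / (n - tr - 2.0))

def backward_elimination(X, y, h):
    """Drop one covariate at a time for as long as doing so lowers the criterion."""
    active = list(range(X.shape[1]))
    best = aicc(X[:, active], y, h)
    while len(active) > 1:
        scores = {j: aicc(X[:, [k for k in active if k != j]], y, h) for j in active}
        j_drop = min(scores, key=scores.get)
        if scores[j_drop] >= best:
            break
        best = scores[j_drop]
        active.remove(j_drop)
    return active

# Assumed scenario: a non-additive truth in covariates 0 and 1; covariate 2 is irrelevant.
rng = np.random.default_rng(0)
n, h = 300, 0.2
X = rng.uniform(-1.0, 1.0, size=(n, 3))
truth = 2.0 * X[:, 0] * X[:, 1]
y = truth + rng.normal(0.0, 0.2, size=n)

mse_full = oracle_mse(fit_full(X, y, h), truth)
mse_add = oracle_mse(fit_additive(X, y, h), truth)
selected = backward_elimination(X, y, h)
print("oracle MSE, full nonparametric:", mse_full)
print("oracle MSE, additive:", mse_add)
print("covariates kept by backward elimination:", selected)
```

In this scenario the true function is a pure interaction, so the additive fit cannot capture it and its oracle mean square error is expected to be much larger than that of the joint smoother. Here the criterion is computed from the fully nonparametric fit only; computing it from the additive fit instead would require the smoother matrix of the backfitting estimator.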

Keywords: Additive models, local polynomial regression, residuals, mean square error, variable selection.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1129205

