{"title":"Neural Network Imputation in Complex Survey Design","authors":"Safaa R. Amer","volume":12,"journal":"International Journal of Computer and Information Engineering","pagesStart":4037,"pagesEnd":4043,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/14932","abstract":"
Missing data pose many analysis challenges. Under a complex survey design, in addition to dealing with missing data, researchers need to account for the sampling design to achieve useful inferences. Methods for incorporating sampling weights in neural network imputation were investigated to account for complex survey designs. A variance estimate that accounts for the imputation uncertainty as well as the sampling design using neural networks is provided. A simulation study was conducted to compare estimation results based on complete case analysis, multiple imputation using Markov chain Monte Carlo, and neural network imputation. Furthermore, a public-use dataset was used as an example to illustrate neural network imputation under a complex survey design.","references":"[1] Paul D. Allison (1999). \"Multiple imputation for missing data: A\r\ncautionary tale\". Available:\r\nhttp:\/\/www.ssc.upenn.edu\/~allison\/MultInt99.pdf\r\n[2] S. Amer, V. Lesser, and R. Burton, \"Neural network imputation, a new\r\nfashion or a good tool: Linear neural network imputation,\" Proceedings\r\nof the Survey Research Section, American Statistical Association\r\nMeetings, 2003.\r\n[3] D.A. Binder and W. Sun, \"Frequency valid multiple imputation for surveys\r\nwith a complex design,\" Proceedings of the Section on Survey Research\r\nMethods, American Statistical Association, pp. 281-286, 1996.\r\n[4] C.M. Bishop, Neural networks for pattern recognition. Oxford:\r\nClarendon Press, 1995.\r\n[5] K.R.W. Brewer, and R.W. Mellor, \"The effect of sample structure on\r\nanalytical surveys,\" Australian Journal of Statistics, 15, pp. 145-152,\r\n1973.\r\n[6] E.M. Burns, \"Multiple imputation in a complex sample survey,\"\r\nProceedings of the Survey Research Methods Section of the American\r\nStatistical Association, pp. 233-238, 1989.\r\n[7] G. Casella, and R.L. Berger, Statistical inference. California: Duxbury\r\nPress, 1990.\r\n[8] R.L. Chambers, and C.J. 
Skinner (eds.) Analysis of survey data. Chichester:\r\nWiley, 2003.\r\n[9] W.G. Cochran, Sampling techniques, (3rd Edition). New York: Wiley,\r\n1977.\r\n[10] L.M. Collins, J.L. Schafer, and C.-M. Kam, \"A comparison of inclusive\r\nand restrictive strategies in modern missing data procedures,\"\r\nPsychological Methods, 6 (4), pp. 330-351, 2001.\r\n[11] I.P. Fellegi, and D. Holt. \"A systematic approach to automatic edit and\r\nimputation,\" Journal of the American Statistical Association, 71, pp. 17-35,\r\n1976.\r\n[12] A.E. Gelman, J.B. Carlin, H.S. Stern, and D.B. Rubin. Bayesian data\r\nanalysis, London: Chapman & Hall, 1995.\r\n[13] A.E. Gelman and D.B. Rubin. \"Inference from iterative simulation using\r\nmultiple sequences,\" Statistical Science, 7, pp. 457-472, 1992.\r\n[14] S. Geman, and D. Geman. \"Stochastic relaxation, Gibbs distributions,\r\nand the Bayesian restoration of images,\" IEEE Transactions on Pattern\r\nAnalysis and Machine Intelligence, 6, pp. 721-741, 1984.\r\n[15] C.J. Geyer. \"Practical Markov Chain Monte Carlo,\" Statistical Science,\r\n7(4), 1992.\r\n[16] M.H. Hansen, W.N. Hurwitz, and W.G. Madow. Sample survey\r\nmethods and theory, Vols. I and II. New York: Wiley, 1953.\r\n[17] T. Hastie, R. Tibshirani, and J. Friedman. The elements of statistical\r\nlearning: Data mining, inference, and prediction. Springer, New York,\r\n2001.\r\n[18] N.J. Horton and S.R. Lipsitz. \"Multiple imputation in practice:\r\nComparisons of software packages for regression models with missing\r\nvariables,\" The American Statistician, 55(3), 2001.\r\n[19] R.A. Jacobs, M.I. Jordan, S.J. Nowlan, and G.E. Hinton. \"Adaptive\r\nmixtures of local experts,\" Neural Computation, 3, pp. 79-87, 1991.\r\n[20] L. Kish. Survey sampling, New York: Wiley, 1965.\r\n[21] L. Kish. \"The hundred years' wars of survey sampling,\" Statistics in\r\nTransition, 2, pp. 813-830, 1995.\r\n[22] H. Lee, E. Rancourt, and C.-E. S\u00e4rndal. 
\"Variance estimation from\r\nsurvey data under single imputation,\" Survey Nonresponse, R.M. Groves,\r\nD.A. Dillman, J.L. Eltinge, and R.J.A. Little, (Eds). New York: John\r\nWiley and Sons, 2002.\r\n[23] R.J.A. Little and D.B. Rubin. Statistical analysis with\r\nmissing data, New Jersey: John Wiley & Sons, 2002.\r\n[24] S.L. Lohr. Sampling: Design and analysis, Duxbury Press, 1999.\r\n[25] P.C. Mahalanobis. \"Recent experiments in statistical sampling in the\r\nIndian Statistical Institute,\" Journal of the Royal Statistical Society, 109,\r\npp. 325-370, 1946.\r\n[26] D.A. Marker, D.R. Judkins, and M. Winglee. \"Large-scale imputation\r\nfor complex surveys,\" R.M. Groves, D.A. Dillman, J.L. Eltinge, and\r\nR.J.A. Little, (Eds.) Survey Nonresponse, New York: John Wiley and\r\nSons, 2002.\r\n[27] National Center for Health Statistics. Data file documentation, National\r\nHealth Interview Survey, 2001 (machine readable file and\r\ndocumentation). National Center for Health Statistics, Centers for\r\nDisease Control and Prevention, Hyattsville, Maryland, 2002.\r\n[28] E. Rancourt, C.-E. S\u00e4rndal, and H. Lee. \"Estimation of the variance in\r\npresence of nearest neighbor imputation,\" Proceedings of the Section on\r\nSurvey Research Methods, American Statistical Association, pp. 888-893,\r\n1994.\r\n[29] I. Rivals and L. Personnaz. \"Construction of confidence intervals for\r\nneural networks based on least squares estimation,\" Neural Networks,\r\n13, pp. 463-484, 2000.\r\n[30] D.B. Rubin. \"Formalizing subjective notions about the effect of nonrespondents\r\nin sample surveys,\" Journal of the American Statistical\r\nAssociation, 72, pp. 538-543, 1977.\r\n[31] C.-E. S\u00e4rndal, B. Swensson, and J. Wretman. Model assisted survey\r\nsampling, Springer-Verlag, 1991.\r\n[32] C.-E. S\u00e4rndal. \"Methods for estimating the precision of survey estimates\r\nwhen imputation has been used,\" Survey Methodology, 18, pp. 
241-265,\r\n1992.\r\n[33] J.L. Schafer. Analysis of incomplete multivariate data. London:\r\nChapman and Hall, 1997.\r\n[34] J. Schimert, J.L. Schafer, T.M. Hesterberg, C. Fraley, and D.B.\r\nClarkson. Analyzing data with missing values in S-Plus. Seattle:\r\nInsightful Corp, 2000.\r\n[35] A.F.M. Smith and G.O. Roberts. \"Bayesian computation via the Gibbs\r\nsampler and related Markov Chain Monte Carlo methods,\" Journal of\r\nthe Royal Statistical Society, Series B, 55(1), 1993.\r\n[36] S.L. Vartivarian and R.J. Little. \"Weighting adjustments for unit\r\nnonresponse with multiple outcome variables,\" The University of\r\nMichigan Department of Biostatistics (Working Paper Series: Working\r\nPaper 21), 2003. Available: http:\/\/www.bepress.com\/umichbiostat\/paper21\r\n[37] R.S. Woodruff. \"A simple method for approximating the variance of a\r\ncomplicated estimate,\" Journal of the American Statistical Association,\r\n66, pp. 411-414, 1971.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 12, 2007"}