TY - JFULL
AU - Geeta Sikka and Arvinder Kaur Takkar and Moin Uddin
PY - 2010/3/
TI - Comparison of Imputation Techniques for Efficient Prediction of Software Fault Proneness in Classes
T2 - International Journal of Computer and Information Engineering
SP - 270
EP - 274
VL - 4
SN - 1307-6892
UR - https://publications.waset.org/pdf/9138
PU - World Academy of Science, Engineering and Technology
NX - Open Science Index 38, 2010
N2 - Missing data is a persistent problem in almost all areas of empirical research. Missing data must be treated very carefully, as data plays a fundamental role in every analysis. Improper treatment can distort the analysis or generate biased results. In this paper, we compare and contrast various imputation techniques on missing data sets and make an empirical evaluation of these methods so as to construct quality software models. Our empirical study is based on NASA's two public datasets, KC4 and KC1. The actual data sets, of 125 and 2107 cases respectively and without any missing values, were considered. The data sets are used to create Missing at Random (MAR) data. Listwise Deletion (LD), Mean Substitution (MS), Interpolation, Regression with an error term, and Expectation-Maximization (EM) approaches were used to compare the effects of the various techniques.
ER -