{"title":"Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs","authors":"Pilar Rey-del-Castillo, Jes\u00fas Carde\u00f1osa","volume":31,"journal":"International Journal of Computer and Information Engineering","pagesStart":1843,"pagesEnd":1851,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/7285","abstract":"
There are many situations where input feature vectors are incomplete, and methods to tackle the problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputation. This paper presents a method for imputing missing categorical data from numerical and categorical variables. The imputations are based on Simpson's fuzzy min-max neural networks, in which the input variables for learning and classification are exclusively numerical. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. The procedure is tested and compared with other methods using opinion poll data.","references":"[1] J. L. Schafer, Analysis of Incomplete Data, Chapman & Hall, London, 1997.\r\n[2] P. Allison, Missing Data, Sage Publications, Inc, 2002.\r\n[3] R. J. Little, and D. B. Rubin, Statistical Analysis with Missing Data, 2nd ed., John Wiley and Sons, New York, 2002.\r\n[4] A. P. Dempster, and D. B. Rubin, \"Incomplete data in sample surveys\" in W. G. Madow, I. Olkin, and D. B. Rubin, Eds., Sample Surveys, Vol. II: Theory and Annotated Bibliography, New York, Academic Press, 1983.\r\n[5] S. Mitra, S. K. Pal, and P. Mitra, \"Data mining in soft computing framework: a survey\", IEEE Transactions on Neural Networks, vol. 13, issue 1, pp. 3-14, Jan. 2002.\r\n[6] P. K. Simpson, \"Fuzzy min-max neural networks - Part 1: classification\", IEEE Transactions on Neural Networks, vol. 3, pp. 776-786, Sep. 1992.\r\n[7] P. K. Simpson, \"Fuzzy min-max neural networks - Part 2: clustering\", IEEE Transactions on Fuzzy Systems, vol. 1, pp. 32-45, Feb. 1993.\r\n[8] D. R. Cox, Principles of Statistical Inference, Cambridge University Press, 2006.\r\n[9] J. Carde\u00f1osa, and P. Rey-del-Castillo, \"A fuzzy control approach for vote estimation\", Proceedings of the Fifth International Conference on Information Technologies and Applications, vol. 1. 
Varna, Bulgaria, June 2007.\r\n[10] M. Abdella, and T. Marwala, \"The Use of Genetic Algorithms and Neural Networks to Approximate Missing Data in Database\", ICCC 2005, IEEE 3rd International Conference on Computational Cybernetics, pp. 207-212, 2005.\r\n[11] F. V. Nelwamondo, S. Mohamed, and T. Marwala, \"Missing Data: A Comparison of Neural Network and Expectation Maximization Techniques\", Current Science, vol. 93, no. 11, pp. 1514-1521, Dec. 2007.\r\n[12] P. Lingras, M. Zhong, and S. Sharma, \"Evolutionary Regression and Neural Imputations of Missing Values\", Soft Computing Applications in Industry, Studies in Fuzziness and Soft Computing Series, vol. 226, Springer, Berlin\/Heidelberg, pp. 151-163, 2008.\r\n[13] B. Gabrys, and A. Bargiela, \"General Fuzzy Min-Max Neural Network for Clustering and Classification\", IEEE Transactions on Neural Networks, vol. 11, pp. 769-783, May 2000.\r\n[14] B. Gabrys, \"Neuro-Fuzzy Approach to Processing Inputs with Missing Values in Pattern Recognition Problems\", International Journal of Approximate Reasoning, vol. 30, pp. 149-179, Sep. 2002.\r\n[15] M. J. Greenacre, Theory and Applications of Correspondence Analysis, Academic Press, London, 1984.\r\n[16] T. J. Santner, and D. E. Duffy, \"A Note on A. Albert and J. A. Anderson's Conditions for the Existence of Maximum Likelihood Estimates in Logistic Regression Models\", Biometrika, vol. 73, pp. 755-758, 1986.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 31, 2009"}