Categorical Missing Data Imputation Using Fuzzy Neural Networks with Numerical and Categorical Inputs

Authors: Pilar Rey-del-Castillo, Jesús Cardeñosa

Abstract:

Input feature vectors are often incomplete, and methods to tackle this problem have been studied for a long time. A commonly used procedure is to replace each missing value with an imputed value. This paper presents a method for imputing missing categorical data from numerical and categorical variables. The imputations are based on Simpson's fuzzy min-max neural networks, in which the input variables for learning and classification are numerical only. The proposed method extends the input to categorical variables by introducing new fuzzy sets, a new operation, and a new architecture. The procedure is tested and compared with other methods using opinion poll data.
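As a rough illustration of the baseline the abstract builds on, the sketch below computes the hyperbox membership function from Simpson's original fuzzy min-max classifier (numerical inputs scaled to [0, 1]) and imputes a missing category as the label of the best-matching hyperbox. It is only a minimal sketch under stated assumptions: the function names, the sensitivity value gamma = 4.0, the hand-written hyperboxes and the labels are all hypothetical, the hyperboxes are not learned, and the paper's extension to categorical inputs is not reproduced here.

```python
import numpy as np

def membership(a, v, w, gamma=4.0):
    """Simpson-style membership of point a in the hyperbox [v, w].

    a, v, w are length-n sequences with values in [0, 1]; v is the
    min point and w the max point. gamma (assumed value) controls how
    fast membership decays outside the box.
    """
    a, v, w = (np.asarray(x, dtype=float) for x in (a, v, w))
    # Penalty for exceeding the max point in each dimension.
    above = np.maximum(0.0, 1.0 - np.maximum(0.0, gamma * np.minimum(1.0, a - w)))
    # Penalty for falling below the min point in each dimension.
    below = np.maximum(0.0, 1.0 - np.maximum(0.0, gamma * np.minimum(1.0, v - a)))
    return float((above + below).sum() / (2 * a.size))

def impute_category(a, hyperboxes, gamma=4.0):
    """Return the label of the hyperbox with the highest membership for a."""
    best = max(hyperboxes, key=lambda box: membership(a, box[0], box[1], gamma))
    return best[2]

# Hypothetical hyperboxes: (min point, max point, category label).
boxes = [
    ([0.1, 0.1], [0.4, 0.4], "option_A"),
    ([0.6, 0.6], [0.9, 0.9], "option_B"),
]

# A record whose categorical answer is missing but whose two
# numerical features are observed.
record = [0.2, 0.3]
imputed = impute_category(record, boxes)
```

Here the record lies inside the first hyperbox, so its membership there is 1 and the imputed category is the first box's label. A full classifier would also need the expansion and contraction steps that build the hyperboxes from training data.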

Keywords: Classifier, imputation techniques, fuzzy systems, fuzzy min-max neural networks.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1333518


References:


[1] J. L. Schafer, Analysis of Incomplete Data, Chapman & Hall, London, 1997.
[2] P. Allison, Missing Data, Sage Publications, Inc, 2002.
[3] R. J. Little, and D. B. Rubin, Statistical Analysis with Missing Data, 2nd ed., John Wiley and Sons, New York, 2002.
[4] A. P. Dempster, and D. B. Rubin, "Incomplete data in sample surveys" in W. G. Madow, I. Olkin, and D. B. Rubin, Eds., Sample Surveys, Vol. II: Theory and Annotated Bibliography, New York, Academic Press, 1983.
[5] S. Mitra, S. K. Pal, and P. Mitra, "Data mining in soft computing framework: a survey", IEEE Transactions on Neural Networks, vol. 13, issue 1, pp. 3-14, Jan. 2002.
[6] P. K. Simpson, "Fuzzy min-max neural networks - Part 1: classification", IEEE Transactions on Neural Networks, vol. 3, pp. 776-786, Sep. 1992.
[7] P. K. Simpson, "Fuzzy min-max neural networks - Part 2: clustering", IEEE Transactions on Fuzzy Systems, vol. 1, pp. 32-45, Feb. 1993.
[8] D. R. Cox, Principles of Statistical Inference, Cambridge University Press, 2006.
[9] J. Cardeñosa, and P. Rey-del-Castillo, "A fuzzy control approach for vote estimation", Proceedings of the Fifth International Conference on Information Technologies and Applications, vol. 1, Varna, Bulgaria, June 2007.
[10] M. Abdella, and T. Marwala, "The Use of Genetic Algorithms and Neural Networks to Approximate Missing Data in Database", ICCC 2005, IEEE 3rd International Conference on Computational Cybernetics, pp. 207-212, 2005.
[11] F. V. Nelwamondo, S. Mohamed, and T. Marwala, "Missing Data: A Comparison of Neural Network and Expectation Maximization Techniques", Current Science, vol. 93, no. 11, pp. 1514-1521, Dec. 2007.
[12] P. Lingras, M. Zhong, and S. Sharma, "Evolutionary Regression and Neural Imputations of Missing Values", Soft Computing Applications in Industry, Studies in Fuzziness and Soft Computing Series, vol. 226, Springer, Berlin/Heidelberg, pp. 151-163, 2008.
[13] B. Gabrys, and A. Bargiela, "General Fuzzy Min-Max Neural Network for Clustering and Classification", IEEE Transactions on Neural Networks, vol. 11, pp. 769-783, May 2000.
[14] B. Gabrys, "Neuro-Fuzzy Approach to Processing Inputs with Missing Values in Pattern Recognition Problems", International Journal of Approximate Reasoning, vol. 30, pp. 149-179, September 2002.
[15] M. J. Greenacre, Theory and Applications of Correspondence Analysis, Academic Press, London, 1984.
[16] T. J. Santner, and D. E. Duffy, "A Note on A. Albert and J. A. Anderson's Conditions for the Existence of Maximum Likelihood Estimates in Logistic Regression Models", Biometrika, vol. 73, pp. 755-758, 1986.