Case-Based Reasoning: A Hybrid Classification Model Improved with an Expert's Knowledge for High-Dimensional Problems
Authors: Bruno Trstenjak, Dzenana Donko
Abstract:
Data mining and object classification are data analysis processes that rely on various machine learning techniques and are now used in many fields of research. This paper presents the concept of a hybrid classification model improved with expert knowledge. The model's algorithm integrates several machine learning techniques (Information Gain, K-means, and Case-Based Reasoning) with an expert's knowledge, which is used to determine the importance of features. The paper presents the model's algorithm and the results of a case study in which the emphasis was on achieving maximum classification accuracy without reducing the number of features.
Keywords: Case-based reasoning, classification, expert's knowledge, hybrid model.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1125399
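The abstract names the building blocks of the hybrid model but not how they are combined. The following is a minimal illustrative sketch, not the authors' algorithm: it computes Information Gain per feature, scales it by expert-supplied importance values, and uses the resulting weights in a nearest-case retrieval step of Case-Based Reasoning. The function names, the multiply-and-normalise combination rule, the toy data, and the matching-based similarity are all assumptions made for illustration. The K-means step is omitted because the abstract does not say how it is used. All features are kept, in line with the stated goal of not reducing their number.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one discrete feature with respect to the labels."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def combine_weights(gains, expert_weights):
    """Hypothetical rule: multiply gain by expert importance, then normalise."""
    raw = [g * e for g, e in zip(gains, expert_weights)]
    total = sum(raw)
    if total == 0:
        # Fall back to uniform weights if every product is zero.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


def retrieve(case_base, query, weights):
    """Return the label of the stored case most similar to the query.

    Similarity is the summed weight of the features whose values match.
    """
    def similarity(case):
        features, _ = case
        return sum(w for w, a, b in zip(weights, features, query) if a == b)

    _, best_label = max(case_base, key=similarity)
    return best_label


# Toy case base (made up for illustration): (features, class label).
case_base = [
    (("sunny", "high"), "no"),
    (("sunny", "low"), "yes"),
    (("rainy", "high"), "no"),
    (("rainy", "low"), "yes"),
]
labels = [label for _, label in case_base]
gains = [information_gain([f[i] for f, _ in case_base], labels)
         for i in range(2)]
expert_weights = [0.5, 0.5]  # assumed expert importance per feature
weights = combine_weights(gains, expert_weights)
print(retrieve(case_base, ("sunny", "low"), weights))
```

In this sketch a feature with zero information gain receives zero weight but stays in the case representation, so the expert's weighting adjusts feature influence without removing any feature.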