Integration of Educational Data Mining Models to a Web-Based Support System for Predicting High School Student Performance

Authors: Sokkhey Phauk, Takeo Okazaki


A central challenge for educational institutions is to maximize student performance and minimize the failure rate of poor-performing students. An effective way to address this challenge is to identify student learning patterns and the factors that most influence them, and to obtain an early prediction of student learning outcomes in time to set up policies for improvement. Educational data mining (EDM) is an emerging interdisciplinary field that draws on data mining, statistics, and machine learning to extract useful knowledge and information for improving the educational environment. The aim of this work is to propose EDM techniques and integrate them into a web-based system for predicting poor-performing students. A comparative study of prediction models is conducted. Subsequently, the best-performing models are developed further to improve prediction performance. The hybrid random forest (Hybrid RF) produces the most successful classification. To support intervention and improve learning outcomes, a feature selection method, MICHI, is introduced; it combines the mutual information (MI) and chi-square (CHI) algorithms based on their ranked feature scores to select a dominant feature set that improves prediction performance, and the obtained dominant set is used as information for intervention. Using the proposed EDM techniques, an academic performance prediction system (APPS) is then developed so that educational stakeholders can obtain an early prediction of student learning outcomes for timely intervention. Experimental outcomes and evaluation surveys report the effectiveness and usefulness of the developed system. The system is intended to help educational stakeholders and related individuals intervene and improve student performance.
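As a rough illustration of rank-based feature selection with MI and CHI, the sketch below scores each discrete feature with both measures and keeps the features with the best combined rank. The abstract does not specify how MICHI merges the two rankings, so averaging the two ranks is an assumption made here for illustration only. The function names, feature names, and toy data are also hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def chi_square(xs, ys):
    """Pearson chi-square statistic of the xs-by-ys contingency table."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    stat = 0.0
    for x in px:
        for y in py:
            expected = px[x] * py[y] / n
            stat += (pxy[(x, y)] - expected) ** 2 / expected
    return stat


def ranks(scores):
    """Map each feature name to its rank (1 = highest score)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: i + 1 for i, name in enumerate(ordered)}


def combined_ranking(features, labels, top_k):
    """Select top_k features by mean of MI rank and chi-square rank.

    ASSUMPTION: averaging the two ranks is an illustrative choice,
    not necessarily the combination rule used by MICHI.
    """
    mi = {f: mutual_information(v, labels) for f, v in features.items()}
    chi = {f: chi_square(v, labels) for f, v in features.items()}
    r_mi, r_chi = ranks(mi), ranks(chi)
    mean_rank = {f: (r_mi[f] + r_chi[f]) / 2 for f in features}
    return sorted(features, key=mean_rank.get)[:top_k]


# Hypothetical toy data: 1 = pass, 0 = fail; features are binary indicators.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "study_hours": [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with labels
    "attendance": [1, 1, 1, 0, 0, 0, 0, 1],   # partly aligned
    "shoe_size": [1, 0, 1, 0, 1, 0, 1, 0],    # independent of labels
}

dominant = combined_ranking(features, labels, top_k=2)
print(dominant)
```

In this toy setting, the selected subset would then be passed to a classifier, and the same subset could serve as the list of factors to target for intervention.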

Keywords: Academic performance prediction system, prediction model, educational data mining, dominant factors, feature selection methods, student performance.


