Educational Data Mining: The Case of Department of Mathematics and Computing in the Period 2009-2018
Authors: M. Sitoe, O. Zacarias
Abstract:
University education is influenced by several factors, ranging from the adoption of strategies to strengthen the whole process to the improvement of the students' own academic performance. This work uses data mining techniques to develop a predictive model that identifies students with a tendency toward evasion (dropout) or retention. To this end, a database of real student data from the Department of University Admission (DAU) and the Department of Mathematics and Informatics (DMI) was used. The data comprised 388 undergraduate students admitted in the years 2009 to 2014. The models were built with the Weka tool using three techniques: K-nearest neighbor, random forest, and logistic regression. To allow for training on multiple train-test splits, a cross-validation approach was employed with a varying number of folds. To reduce bias and variance and improve the performance of the models, the ensemble methods Bagging and Stacking were used. After comparing the results obtained by the three classifiers, logistic regression using Bagging with seven folds obtained the best performance, with results above 90% in all evaluated metrics: accuracy, true positive rate, and precision. Retention was the most common tendency.
Keywords: Evasion and retention, cross-validation, bagging, stacking.
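The evaluation protocol described in the abstract (k-fold cross-validation scored by accuracy, true positive rate, and precision) can be illustrated with a minimal sketch. The study itself used Weka; the plain-Python code below is not the authors' implementation, and the function names `k_fold_indices`, `train_test_splits`, and `classification_metrics`, as well as the label encoding (1 for the positive class), are assumptions made only for illustration.

```python
import random


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into n_folds folds.

    Fold sizes differ by at most one sample.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(n_samples, n_folds, seed=0):
    """Yield (train_indices, test_indices), using each fold once as the test set."""
    folds = k_fold_indices(n_samples, n_folds, seed)
    for held_out, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, true positive rate, and precision for one class.

    Assumption: labels are hashable values and `positive` marks the class
    of interest (hypothetically, 1 = retention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "true_positive_rate": tp / (tp + fn) if tp + fn else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
    }


if __name__ == "__main__":
    # 388 students and seven folds, matching the figures in the abstract.
    for train, test in train_test_splits(388, 7):
        print(len(train), len(test))
```

With 388 samples and seven folds, each test fold holds 55 or 56 students and the remaining students form the training set. A classifier trained on each training set would be scored on its test fold with `classification_metrics`, and the per-fold scores averaged.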