Machine Learning Techniques in Bank Credit Analysis
Authors: Fernanda M. Assef, Maria Teresinha A. Steiner
Abstract:
The aim of this paper is to compare classifier algorithms for credit risk assessment by applying different Machine Learning techniques. The study uses records from a Brazilian financial institution: a database of 5,432 companies that are clients of the bank, of which 2,600 are classified as non-defaulters, 1,551 as defaulters and 1,281 as temporary defaulters, meaning that they are overdue on their payments for up to 180 days. For each case, a total of 15 attributes were considered in a one-against-all assessment using four techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and Support Vector Machines (SVM). The data were first coded using thermometer coding (for numerical attributes) or dummy coding (for nominal attributes). Each method was then run under different parameter settings, and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method in terms of accuracy was ANN-RBF (79.20% for the non-defaulter classification, 97.74% for defaulters and 75.37% for temporary defaulters). However, the best accuracy does not always indicate the best technique. For instance, in the classification of temporary defaulters, ANN-RBF was surpassed in terms of false positives by SVM, which had the lowest rate of false positive classifications (0.07%). These details are discussed in light of the results, and the conclusion gives an overview of the findings.
Keywords: Artificial Neural Networks, ANNs, classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines.
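The coding and evaluation steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The thresholds, category names, class labels and toy predictions are all hypothetical, and the abstract does not state which binning or which dummy-coding variant was used; the sketch assumes thermometer bits of the form "value >= threshold" and dummy coding with the first category as the reference level.

```python
from typing import Dict, List, Sequence


def thermometer_encode(value: float, thresholds: Sequence[float]) -> List[int]:
    """Thermometer code: one bit per ascending threshold, set while value >= threshold.

    The thresholds are an assumption; the abstract does not give the bins used.
    """
    return [1 if value >= t else 0 for t in sorted(thresholds)]


def dummy_encode(value: str, categories: Sequence[str]) -> List[int]:
    """Dummy coding with categories[0] as the reference level (all zeros)."""
    return [1 if value == c else 0 for c in categories[1:]]


def one_vs_all_counts(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Dict[str, int]:
    """Confusion counts when `positive` is tested against all other classes."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def accuracy(counts: Dict[str, int]) -> float:
    """Share of cases classified correctly in the one-against-all view."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


# Toy inputs, invented for illustration only.
revenue_bits = thermometer_encode(25.0, [10.0, 20.0, 30.0])
sector_bits = dummy_encode("services", ["industry", "commerce", "services"])

y_true = ["non", "def", "temp", "def", "non"]
y_pred = ["non", "def", "non", "temp", "non"]
defaulter_counts = one_vs_all_counts(y_true, y_pred, positive="def")
defaulter_accuracy = accuracy(defaulter_counts)
```

Running `one_vs_all_counts` once per class (non-defaulter, defaulter, temporary defaulter) gives the three per-class views in which the abstract reports accuracy and false positives. It also shows why the two can disagree: a classifier can score well on accuracy while another has fewer false positives for the same class.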