Customer Churn Prediction Using Four Machine Learning Algorithms Integrating Feature Selection and Normalization in the Telecom Sector
Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh
Abstract:
Understanding the reasons and factors that lead to customer churn is a crucial part of maintaining a customer-oriented business in the telecommunications industry. Competition between telecom companies has intensified in recent years, making it more important than ever to understand customers’ needs, particularly the needs of customers who are considering switching service providers. Churn prediction has therefore become a mandatory requirement for customer retention in the telecommunications industry, and machine learning offers a way to accomplish it; churn prediction is now an important classification problem in this domain. Understanding the factors behind customer churn, and how churners behave, is essential to building an effective churn prediction model. This paper aims to predict churn and identify the factors behind it based on customers’ past service usage history. To this end, the study applies feature selection, normalization, and feature engineering, and then compares the performance of four machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Performance was evaluated using the F1-score and ROC-AUC. Compared with existing models, the approach in this study produced better results: Gradient Boosting combined with feature selection performed best, achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.
Keywords: Machine Learning, Gradient Boosting, Logistic Regression, Churn, Random Forest, Decision Tree, ROC, AUC, F1-score.
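The experimental setup described in the abstract (normalization, feature selection, and a comparison of four classifiers scored by F1 and ROC-AUC) can be sketched with scikit-learn. This is a minimal illustration, not the authors' code: the synthetic imbalanced dataset stands in for the Orange dataset, and the choice of `StandardScaler`, `SelectKBest`, and `k=10` are assumptions, since the paper's exact normalization and feature-selection settings are not given here.

```python
# Hedged sketch of the paper's comparison: normalization + feature
# selection feeding four classifiers, evaluated by F1-score and ROC-AUC.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, churn-like imbalanced data standing in for the Orange dataset.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           weights=[0.85, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

results = {}
for name, clf in models.items():
    # Normalization and univariate feature selection precede each classifier;
    # k=10 is an illustrative choice, not the paper's reported setting.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("select", SelectKBest(f_classif, k=10)),
        ("clf", clf),
    ])
    pipe.fit(X_tr, y_tr)
    pred = pipe.predict(X_te)
    proba = pipe.predict_proba(X_te)[:, 1]
    results[name] = (f1_score(y_te, pred), roc_auc_score(y_te, proba))

for name, (f1, auc) in results.items():
    print(f"{name}: F1={f1:.3f}, ROC-AUC={auc:.3f}")
```

Fitting the scaler and selector inside a `Pipeline` keeps preprocessing out of the test fold, so the F1 and AUC comparison across the four models is not biased by information leakage.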