Optimizing Data Evaluation Metrics for Fraud Detection Using Machine Learning

Authors: Jennifer Leach, Umashanger Thayasivam

Abstract:

Technology has benefited society in more ways than anyone thought possible. Unfortunately, as society’s knowledge of technology has advanced, so has its knowledge of ways to use technology to manipulate others, and fraud has advanced in step. Machine learning techniques offer a possible way to curb this growth. This research explores how various machine learning techniques can aid in detecting fraudulent activity across two different types of fraud datasets; accuracy, precision, recall, and F1 score were recorded for each method. Each machine learning model was also tested across five different training and testing splits to discover which split and technique would produce the best results.

Keywords: Data science, fraud detection, machine learning, supervised learning.
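
The evaluation described in the abstract (four metrics recorded over five training and testing splits) can be illustrated with a minimal sketch. The abstract does not name the split ratios, the models, or the datasets, so everything below is an assumption made for illustration: the train fractions (0.5 to 0.9), the synthetic single-feature transactions, and the toy threshold classifier are hypothetical stand-ins, not the authors' method.

```python
import random


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = fraud)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def make_synthetic_transactions(n, fraud_rate, seed):
    """Hypothetical data: (amount, label) pairs; fraud skews to larger amounts."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_fraud = 1 if rng.random() < fraud_rate else 0
        amount = rng.gauss(300.0 if is_fraud else 100.0, 60.0)
        rows.append((amount, is_fraud))
    return rows


def fit_threshold(train_rows):
    """Toy stand-in for a model: midpoint between the two class means."""
    fraud = [a for a, y in train_rows if y == 1]
    legit = [a for a, y in train_rows if y == 0]
    if not fraud or not legit:
        return float("inf")  # no usable boundary: predict "not fraud" always
    return (sum(fraud) / len(fraud) + sum(legit) / len(legit)) / 2


def evaluate_splits(rows, train_fractions, seed=0):
    """Fit on the first part of a shuffled copy, score on the remainder."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    results = {}
    for frac in train_fractions:
        cut = int(len(shuffled) * frac)
        train, test = shuffled[:cut], shuffled[cut:]
        threshold = fit_threshold(train)
        y_true = [y for _, y in test]
        y_pred = [1 if a > threshold else 0 for a, _ in test]
        results[frac] = classification_metrics(y_true, y_pred)
    return results


if __name__ == "__main__":
    data = make_synthetic_transactions(2000, fraud_rate=0.05, seed=42)
    # Assumed split ratios; the abstract only says that five were tested.
    for frac, m in evaluate_splits(data, [0.5, 0.6, 0.7, 0.8, 0.9]).items():
        print(f"train={frac:.0%}", {k: round(v, 3) for k, v in m.items()})
```

Because fraud is typically rare, accuracy alone can look high even when few fraudulent cases are caught, which is one reason to record precision, recall, and F1 alongside it.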

