credit cards Recently Published Documents

Bank Revolving Credit as a Channel of Monetary Policy Targeting Interest Rates

This paper investigates the implication of bank revolving credit in the form of credit card loans as a channel of monetary policy targeting the federal funds rate since 1980. Credit cards have become increasingly popular and a necessity for many transactions and purchases in the United States. The revolving credit nature of credit card loans makes them an instant tool for consumer loans that can facilitate consumption. Using instrumental variable and two-stage least squares (2SLS) methodology, we analyze the implication of credit card loans to modern monetary policy that targets interest rates.

Debt, Financial Vulnerability, and Repayment Behaviour in Older Canadian Households

Earlier research has documented that debt at older ages has increased significantly in Canada over the period from 1999 to 2016. In this article, we explore the consequences of a growing proportion of older Canadian households experiencing financial vulnerability. After controlling for household characteristics, we find among older households that a high debt-to-asset ratio and very low liquid wealth are significantly and positively associated with skipping or delaying a mortgage or non-mortgage debt payment and with usually paying the minimum amount or less on credit cards in the previous year. The debt-to-income ratio, however, is not an important indicator of financial vulnerability for older households.

The Impact of the Ongoing Pandemic on Digital Finance Transactions: An Empirical Analysis

The ongoing pandemic has disrupted the lives of all citizens and affected every sphere of activity, the financial system in particular, because the pandemic and its aftermath have shut down all economic activity except that which government directives deem most essential. This has deeply affected private consumption, external trade, and investment in the economy. Accordingly, for both retail store purchases and e-commerce orders, a common strand is that many consumers now pay bills via digital payment mechanisms and take contactless delivery of goods wherever possible. “Digital financial transaction systems, e-wallets and apps, online transactions using e-banking, usage of Plastic money (Debit and Credit Cards), etc. have recorded a substantial increase in demand during the crisis”. The objective of the present paper is to examine and analyze digital finance transactions in selected cities during the ongoing pandemic.

Trend and dynamics of card payment system in India

The payment system in India has undergone a dramatic change in recent years. Payment through cards, both debit and credit, is one of the early innovations in the country's modern payment system. Several intermediaries are involved in the effective functioning of the card payment mechanism. As a result, the card payment infrastructure has grown remarkably well across India. Both the volume and the value of card payments have increased rapidly in the last two decades. Among the commercial banks, the State Bank of India dominates in the maintenance of ATM infrastructure, the issue of cards, and the volume and value of card transactions. The private sector banks dominate in the installation of POS terminals, and HDFC Bank leads in POS credit card transactions. However, the recent trend shows that card transactions as a percentage of total retail electronic payments have been declining in India, as other retail payment platforms have become popular.

DEVELOPMENT OF ELECTRONIC TRADE IN AZERBAIJAN AND SOLUTIONS TO THE PROBLEMS IN THIS FIELD

The article provides information on the establishment and development of e-commerce in Azerbaijan, emphasizing that the scale of this field will expand in our country in a short time. It reports the number of payment cards in Azerbaijan in 2016-2020, the volume of non-cash payments, transactions with debit and credit cards, and transactions per ATM and per POS terminal. The article also notes the volume of bank card transactions carried out by foreigners visiting Azerbaijan in January-October 2021 and the volume of e-commerce in Azerbaijan in January-October 2019-2021.

DOES ECONOMIC POLICY UNCERTAINTY REDUCE FINANCIAL INCLUSION?

This study investigates whether the level of economic policy uncertainty (EPU) would reduce the level of financial inclusion. It was predicted that a high level of EPU could have a negative effect on the level of financial inclusion. It was argued that a high level of EPU would discourage financial institutions from providing basic financial services to low end customers and unbanked adults, and this would lead to a decrease in the level of financial inclusion. Using a sample of 22 countries, the study found that the level of EPU did not have a significant impact on financial inclusion. None of the nine indicators of financial inclusion were found to have a significant direct relationship with EPU. However, there was some evidence that the combined effect of a high level of EPU and high nonperforming loans could reduce financial inclusion, particularly through bank branch contraction and a reduction in the use of electronic payments. Furthermore, the use of formal accounts and credit cards would increase in times of high credit supply and when there was a high level of EPU.

Secure and Fraud Proof Online Payment System for Credit Cards

Financial Literacy and the Use of Credit Cards in Mexico

RFID-Based Automated Supermarket Self-Billing System

Supermarkets are large retail stores operated on a self-service basis. They sell a range of goods, from agricultural produce to electronics, with tagged prices. They offer numerous advantages, such as support for advanced means of payment (cheques, credit cards, smart store electronic cards and mobile money), transportation incentives and discounts. The study aimed to develop an automated RFID-based billing system. The methods and materials used included document reviews, observational experiments, system design, implementation and testing based on current situations in the supermarket business. Findings showed that the existing systems have several weaknesses and that the new system could save time and improve the efficiency of both workers and customers, and that it is secure, cost-effective and time-saving, especially with respect to queues. If widely implemented, the system can improve the revenue gap and possibly rejuvenate the national or international economy to a large extent.

CONSUMER PREFERENCES AND REGULATIONS IN CREDIT CARD MARKETS: EVIDENCE FROM TURKEY

In this paper, we analyze the demand side of the credit card market. Using unique survey data and a discrete choice model, we uncover consumer preferences for all price and nonprice features of credit cards. Our results provide evidence for an alternative explanation for the credit card pricing puzzles. We show that consumers view credit cards as highly differentiated products with both bank-level and card-level nonprice features. When selecting their credit cards, they predominantly prioritize these nonprice features over prices. Although private banks charge higher prices for their credit card services than other banks, the majority of consumers choose them as issuers due to their bank-level and card-level nonprice features. Consumers who prioritize prices tend to choose the credit cards of participation or public banks. Widespread branch/automated teller machine networks as bank-level features and installments, bonuses/rewards/miles and the prestige of the card as card-level features are particularly effective in consumers’ decisions to choose private banks as issuers. Such strong preferences for nonprice features seem to furnish private banks with market power. Hence, we argue that underlying issuers’ market power is also this differentiated nature of credit cards, for which regulatory measures are not self-evident.

Consumers and credit cards: A review of the empirical literature

Cliff A Robb

Research in the area of consumer credit card attitude and behavior has provided an abundance of literature in the business, psychology, and public policy fields. Beginning in the 1960s, the work revolved around descriptive characteristics and evolved as scholars probed deeper by investigating relationships between credit cards and psychological constructs, and the need for consumer policy. While the scope of credit card research has broadened, there is a need to pause and reflect on what we actually know about the phenomenon, given its proclivity in society. This paper identifies the empirical research conducted over the past four decades in order to provide insights and recommendations for additional research. A total of 537 refereed journal articles from 8 databases were reviewed and evaluated within specific parameters related to credit cards, with a final working sample of 103 journal articles published between 1969 and 2012. Emerging trends are identified and suggestions for future research are provided.

Related Papers


Credit cards have become an important part of everyday life, and many people cannot imagine their lives without them. The aim of this paper is to reveal the demographic, socio-economic and behavioral differences in credit card attitudes in Macedonia. First, attitudes toward payment cards were examined by employing factor analysis. The reliability of the scale was examined using Cronbach's alpha. The respondents were administered the 12-item version of the credit card attitude scale and asked questions regarding their demographic, socio-economic and behavioral characteristics. An ANOVA test was used to reveal the gender and age (demographic) differences, income and household type (socio-economic) differences and behavioral (number of credit cards owned, period of ownership, payment of balance and usage frequency) differences in components of credit card attitudes. The results of factor analysis identified three subscales of short credit card attitude scales while ANOVA showed sig...

Unhealthy credit card practices have been a worldwide challenge in the global business environment for years. The effect of default hits not only the victim, but also the banks, credit card companies and merchants. The objective of this paper is to examine the relationship amongst practices, attitudes, problems and risks related to credit card usage. A literature review of prior studies has indicated that there is a methodological gap to be filled in this area. Novelty is achieved by using a partial least squares (PLS) model to test the hypotheses. Multilevel analysis using PLS allows for efficiency, convergence and power when investigating the causal effects in the two-level data, ensuring that the support for the hypotheses is much more acceptable. Out of the 150 total survey questionnaires distributed, 114 were returned and used. A face-to-face data collection method was employed to enhance the response rate. Prior to collecting the data, the content of the survey ques...


Credit Card Fraud Detection: A Systematic Review

  • Conference paper
  • First Online: 18 January 2020

C. Victoria Priscilla & D. Padma Prabha

Part of the book series: Learning and Analytics in Intelligent Systems (LAIS, volume 9)

Included in the following conference series:

  • International Conference on Information, Communication and Computing Technology

Due to the tremendous growth of technology, digitalization has become a key aspect of the banking sector. As online transactions increase, the fraud rate grows with them. Even though many techniques are available to identify fraudulent transactions, fraudsters keep adapting their own approaches. This review presents the research studies conducted on Credit Card Fraud Detection (CCFD), highlighting the challenge of class imbalance and the various Machine Learning techniques applied; it also covers evaluation metrics that are effective specifically for CCFD. Because the datasets are sensitive and rarely available, we outline the web sources of available datasets and the trending software tools used in the deployment of CCFD.
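To illustrate why class imbalance calls for metrics other than plain accuracy, the following is a minimal sketch in Python. The transaction counts and the two "detectors" are hypothetical and chosen only to make the effect visible; the function names `confusion_counts` and `metrics` are illustrative and do not come from the reviewed paper.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = fraud, 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1, returning 0.0 when undefined."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical data: 1,000 transactions, of which 10 are fraudulent.
y_true = [1] * 10 + [0] * 990

# Detector A never flags anything.
always_legit = [0] * 1000

# Detector B catches 8 of the 10 frauds and raises 20 false alarms.
detector_b = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 970

print(metrics(y_true, always_legit))
print(metrics(y_true, detector_b))
```

In this made-up example the detector that flags nothing still scores higher on accuracy than the one that catches most of the fraud, while recall and F1 rank them the other way round.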



Author information

Authors and affiliations.

Department of Computer Science, SDNB Vaishnav College for Women, University of Madras, Chennai, India

C. Victoria Priscilla

Department of Computer Applications, Madras Christian College, University of Madras, Chennai, India

D. Padma Prabha


Corresponding author

Correspondence to C. Victoria Priscilla .

Editor information

Editors and affiliations.

University of Technology Sydney, Sydney, Australia

Lakhmi C. Jain

CSIE Department, National Dong Hwa University, New Taipei City, Taiwan

Sheng-Lung Peng

Al-Balqa’ Applied University, Salt, Jordan

Basim Alhadidi

Department of Computer Science, Brainware University, Kolkata, West Bengal, India


Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Priscilla, C.V., Prabha, D.P. (2020). Credit Card Fraud Detection: A Systematic Review. In: Jain, L., Peng, SL., Alhadidi, B., Pal, S. (eds) Intelligent Computing Paradigm and Cutting-edge Technologies. ICICCT 2019. Learning and Analytics in Intelligent Systems, vol 9. Springer, Cham. https://doi.org/10.1007/978-3-030-38501-9_29

DOI: https://doi.org/10.1007/978-3-030-38501-9_29

Published: 18 January 2020

Publisher Name: Springer, Cham

Print ISBN: 978-3-030-38500-2

Online ISBN: 978-3-030-38501-9

eBook Packages: Intelligent Technologies and Robotics (R0)

  • Open access
  • Published: 25 February 2022

A machine learning based credit card fraud detection using the GA algorithm for feature selection

  • Emmanuel Ileberi 1 ,
  • Yanxia Sun 1 &
  • Zenghui Wang 2  

Journal of Big Data, volume 9, Article number: 24 (2022)

The recent advances in e-commerce and e-payment systems have sparked an increase in financial fraud cases such as credit card fraud. It is therefore crucial to implement mechanisms that can detect credit card fraud. The features of credit card fraud play an important role when machine learning is used for detection, and they must be chosen properly. This paper proposes a machine learning (ML) based credit card fraud detection engine using the genetic algorithm (GA) for feature selection. After the optimized features are chosen, the proposed detection engine uses the following ML classifiers: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), Artificial Neural Network (ANN), and Naive Bayes (NB). To validate its performance, the proposed credit card fraud detection engine is evaluated using a dataset generated from European cardholders. The results demonstrate that the proposed approach outperforms existing systems.

Introduction

In the last decade, there has been an exponential growth of the Internet. This has sparked a proliferation of services such as e-commerce, tap-and-pay systems and online bill payment systems. As a consequence, fraudsters have also increased their attacks on transactions made using credit cards. A number of mechanisms exist to protect credit card transactions, including credit card data encryption and tokenization [1]. Although such methods are effective in most cases, they do not fully protect credit card transactions against fraud.

Machine Learning (ML) is a sub-field of Artificial Intelligence (AI) that allows computers to learn from previous experience (data) and to improve their predictive abilities without being explicitly programmed to do so [2]. In this work we implement ML methods for credit card fraud detection. Credit card fraud is defined as a fraudulent transaction (payment) that is made using a credit or debit card by an unauthorised user [3]. According to the Federal Trade Commission (FTC), there were about 1579 data breaches amounting to 179 million data points whereby credit card fraud activities were the most prevalent [4]. Therefore, it is crucial to implement an effective credit card fraud detection method that is able to protect users from financial loss. One of the key issues with applying ML approaches to the credit card fraud detection problem is that most of the published work is impossible to reproduce. This is because credit card transactions are highly confidential. Therefore, the datasets that are used to develop ML models for credit card fraud detection contain anonymized attributes. Furthermore, credit card fraud detection is a challenging task because of the constantly changing nature and patterns of fraudulent transactions [5]. Additionally, existing ML models for credit card fraud detection suffer from low detection accuracy and are not able to cope with the highly skewed nature of credit card fraud datasets. Therefore, it is essential to develop ML models that can perform optimally and that can detect credit card fraud with a high accuracy score.

This research focuses on the application of the following supervised ML algorithms for credit card fraud detection: Decision Tree (DT) [ 7 ], Random Forest (RF) [ 8 ], Artificial Neural Network (ANN) [ 12 ], Naive Bayes (NB) [ 11 ] and Logistic Regression (LR) [ 6 ]. ML systems are trained and tested using large datasets. In this work, a credit card fraud dataset generated from European credit cardholders is utilized. Oftentimes, these datasets may have many attributes that could have a negative impact on the performance of the classifiers during the training process. To solve the issue of a high feature dimension space, we implement a feature selection algorithm that is based on the Genetic Algorithm (GA) [ 25 ] using the RF method in its fitness function. The RF method is used in the GA fitness function because it can handle a large number of input variables, it can automatically handle missing values, and because it is not affected by noisy data [ 9 ].

The remainder of this paper is structured as follows. Section II provides an overview of the classifiers that are used in this research. Section III provides a literature review of similar work. Section IV provides the details of the dataset used in this research. Section V outlines the GA. Section VI explains the architecture of the proposed system. We conduct the experiments in Section VII. The conclusion is presented in Section VIII.

Classifiers

Logistic regression.

The Logistic Regression (LR) classifier, sometimes referred to as the Logit classifier, is a supervised ML method that is generally used for binary classification tasks [ 6 ]. LR is a special type of linear regression whereby the output of a linear function is passed through the logistic (sigmoid) function, which is the inverse of the logit.

The output, q , lies between 0 and 1 and is the probability that determines the prediction of a given class. The closer q is to 1, the more confident the prediction of the positive class.
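As a minimal sketch of the mapping described above, the standard logistic (sigmoid) function can be applied to a linear combination of the inputs. The names `weights`, `bias` and `predict_proba`, as well as the numeric values, are illustrative assumptions and are not taken from the paper.

```python
import math

def sigmoid(z: float) -> float:
    """Map any real number to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    """Probability q of the positive class for one sample x."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias  # linear function
    return sigmoid(z)

# Hypothetical weights for a two-feature model
q = predict_proba([0.5, -1.2], weights=[2.0, 0.3], bias=-0.1)
label = 1 if q >= 0.5 else 0  # a 0.5 threshold is a common default, assumed here
```

The 0.5 threshold is only a convention; on a highly imbalanced dataset a different threshold may be preferable.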

Decision trees and random forest

Decision Tree (DT) is a supervised ML based approach that is utilized to solve regression and classification tasks. A DT contains the following types of nodes: root node, decision node and leaf node. The root node is the starting point of the algorithm. The decision node is a point whereby a choice is made in order to split the tree. A leaf node represents a final decision [ 7 ]. The RF method conducts its predictions by using an ensemble of DTs [ 8 ]. In the RF, a decision is reached by majority vote. The following is a mathematical definition of the RF [ 10 ]:

Given a number of trees k , a RF is defined as RF = \({ \{g(X, \theta _k) \} }\) , where \(\{\theta _k \}\) represents independent identically distributed random vectors (one per tree) and each tree casts a vote on the input vector X . The label with the most votes is the prediction.
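The voting rule in this definition can be sketched in a few lines. The helper name `majority_vote` and the vote values are hypothetical, chosen only to illustrate the rule.

```python
from collections import Counter

def majority_vote(tree_predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]

# Hypothetical votes from k = 5 trees for one transaction (1 = fraud)
votes = [0, 1, 1, 0, 1]
prediction = majority_vote(votes)
```

Note that library implementations may combine trees differently; Scikit-Learn's random forest, for example, averages the class probabilities of the trees rather than counting hard votes.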

Naive Bayes

Naive Bayes (NB) is a supervised ML technique that is based on Bayes’ theorem. The NB method assumes the conditional independence of each pair of attributes given the dependent variable (the class). In this research, the Gaussian NB (GNB) classifier was used. With the GNB, we assume that the likelihood of the attributes is Gaussian, as explained in Equation ( 3 ).

where \(\beta _y\) and \(\alpha _y\) are estimated using maximum likelihood.
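A minimal sketch of the Gaussian assumption follows: for one feature and one class, the mean and variance are estimated by maximum likelihood and then plugged into the Gaussian density. The names `mean` and `var` are assumptions made for this example and are not mapped onto the paper's own symbols.

```python
import math

def fit_gaussian(values):
    """Maximum-likelihood mean and variance of one feature within one class."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # MLE divides by n, not n - 1
    return mean, var

def gaussian_likelihood(x, mean, var):
    """Gaussian density of feature value x given the class parameters."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

# Hypothetical values of one feature for the transactions of a single class
mean, var = fit_gaussian([1.0, 2.0, 3.0])
p = gaussian_likelihood(2.0, mean, var)
```

Under the naive independence assumption, the per-feature likelihoods of a sample are multiplied together with the class prior to score each class.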

Artificial Neural Network

An Artificial Neural Network (ANN) is a supervised ML method that is inspired by the inner workings of the human brain. The simplest ANN has the following basic structure: an input layer, one hidden layer and an output layer. The input layer size is based on the number of features in a given dataset. The hidden layer size can be varied based on the complexity of a task, and the output layer size depends on the type of problem to be solved. The most basic component of an ANN is a node or neuron. In this research, we consider feed forward ANNs. Therefore, the information flows in one direction (from its input to its output) through a neuron [ 12 ]. Figure 1 depicts a graphical representation of a simple ANN with 3 nodes in the input layer, a hidden layer with 4 nodes and an output layer with 1 node.

figure 1
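The 3-4-1 network described above can be sketched as a single forward pass. The sigmoid activation, the random untrained weights and the helper name `dense` are assumptions for illustration; the paper does not specify them here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer with a sigmoid activation.

    weights[j][i] connects input i to node j of this layer.
    """
    return [
        sigmoid(sum(w * x for w, x in zip(node_weights, inputs)) + b)
        for node_weights, b in zip(weights, biases)
    ]

random.seed(0)  # fixed seed so the untrained weights are reproducible
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(4)]  # 3 -> 4
b_hidden = [0.0] * 4
w_out = [[random.uniform(-1, 1) for _ in range(4)]]  # 4 -> 1
b_out = [0.0]

x = [0.2, 0.7, 0.1]                    # one sample with 3 features
hidden = dense(x, w_hidden, b_hidden)  # 4 hidden activations
output = dense(hidden, w_out, b_out)   # single output in (0, 1)
```

Information flows only from `x` to `hidden` to `output`, which is what makes the network feed forward.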

Related work

In ref. [ 13 ], the authors implemented a credit card fraud detection system using several ML algorithms including logistic regression (LR), decision tree (DT), support vector machine (SVM) and random forest (RF). These classifiers were evaluated using a credit card fraud detection dataset generated from European cardholders in 2013. In this dataset, the ratio between non-fraudulent and fraudulent transactions is highly skewed; therefore, this is a highly imbalanced dataset. The authors used the classification accuracy to assess the performance of each ML approach. The experimental outcomes showed that the LR, DT, SVM and RF obtained the following accuracy scores: 97.70%, 95.50%, 97.50% and 98.60%, respectively. Although these outcomes are good, the authors suggested that the implementation of advanced pre-processing techniques could have a positive impact on the performance of the classifiers.

Varmedja et al. [ 14 ] proposed a credit card fraud detection method using ML. The authors used a credit card fraud dataset sourced from Kaggle [ 19 ]. This dataset contains transactions made within 2 days by European credit card holders. To deal with the class imbalance problem present in the dataset, the authors implemented the Synthetic Minority Oversampling Technique (SMOTE). The following ML methods were implemented to assess the efficacy of the proposed method: RF, NB, and multilayer perceptron (MLP). The experimental results demonstrated that the RF algorithm performed best, with a fraud detection accuracy of 99.96%. The NB and the MLP methods obtained accuracy scores of 99.23% and 99.93%, respectively. The authors concede that more research should be conducted to implement a feature selection method that could improve the accuracy of other ML methods.

Khatri et al. [ 15 ] conducted a performance analysis of ML techniques for credit card fraud detection. In this research, the authors considered the following ML approaches: DT, k-Nearest Neighbor (KNN), LR, RF and NB. To assess the performance of each ML method, the authors used a highly imbalanced dataset that was generated from European cardholders. One of the main performance metrics used in the experiments is the precision obtained by each classifier. The experimental outcomes showed that the DT, KNN, LR, RF and NB obtained precisions of 85.11%, 91.11%, 87.5%, 89.77% and 6.52%, respectively.

Awoyemi et al. [ 16 ] presented a comparative analysis of different ML methods on the European cardholders credit card fraud dataset. In this research, the authors used a hybrid sampling technique to deal with the imbalanced nature of the dataset. The following ML methods were considered: NB, KNN, and LR. The experiments were carried out using a Python based ML framework. The accuracy was the main performance metric that was utilized to assess the effectiveness of each ML approach. The experimental results demonstrated that the NB, LR, and KNN achieved the following accuracies, respectively: 97.92%, 54.86%, and 97.69%. Although the NB and KNN performed relatively well, the authors did not explore the possibility of implementing a feature selection method.

In ref. [ 4 ], the authors utilized several ML based methods to solve the issue of credit card fraud. In this work, the researchers used the European credit cardholder fraud dataset. To deal with the highly imbalanced nature of this dataset, the authors employed the SMOTE sampling technique. The following ML methods were considered: DT, LR, and Isolation Forest (IF). The accuracy was one of the main performance metrics that was considered. The results showed that the DT, LR, and IF obtained accuracy scores of 97.08%, 97.18%, and 58.83%, respectively.

Seera et al. [ 17 ] implemented an intelligent payment card fraud detection system using the GA for feature selection and aggregation. The authors implemented several machine learning algorithms to validate the effectiveness of their proposed method. The results demonstrated that the GA-RF obtained an accuracy of 77.95%, the GA-ANN achieved an accuracy of 81.82%, and the GA-DT attained an accuracy of 81.97%.

Research methodology

In this research, we use a dataset that includes credit card transactions that were made by European cardholders over 2 days in September 2013. This dataset contains 284807 transactions in total, of which 0.172% are fraudulent. The dataset has the following 30 features: V1 , ..., V28 , Time and Amount . All the attributes within the dataset are numerical. The last column represents the class (type of transaction), whereby the value of 1 denotes a fraudulent transaction and the value of 0 a legitimate one. The features V1 to V28 are not named for data security and integrity reasons [ 19 ]. This dataset has been used in refs. [ 4 , 13 , 14 , 16 ], and one of the key issues that we discovered is the low detection accuracy score that was obtained by those models because of the highly imbalanced nature of the dataset. In order to solve the issue of class imbalance, we applied the Synthetic Minority Oversampling Technique (SMOTE) in the Data-Preprocessing phase of the proposed framework in Fig. 3 [ 18 ]. SMOTE works by picking minority-class samples that are close to each other within the feature space, drawing a line between them and creating a new instance of the minority class at a point along that line.
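The interpolation step of SMOTE can be sketched as follows. This is a simplified illustration under stated assumptions: the helper `smote_sample`, the neighbour count `k=2` and the toy points are hypothetical, and the code is not the implementation used in the paper.

```python
import math
import random

def smote_sample(minority, k=2, rng=random):
    """Create one synthetic minority sample by interpolating between neighbours."""
    base = rng.choice(minority)
    # k nearest minority neighbours of the base point (excluding the point itself)
    others = [p for p in minority if p is not base]
    neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
    neighbour = rng.choice(neighbours)
    gap = rng.random()  # position along the line segment, in [0, 1)
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]

random.seed(42)
fraud = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [3.0, 3.0]]  # toy minority class
synthetic = [smote_sample(fraud) for _ in range(5)]
```

Every synthetic point lies on a segment between two existing minority samples. In practice a maintained implementation such as `SMOTE` from the imbalanced-learn package would normally be used, and it is generally safer to oversample the training split only so that synthetic points do not leak into the test data.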

Feature selection

Feature selection (FS) is a crucial step when implementing machine learning methods. This is partly because the dataset used during the training and testing processes may have a large feature space that may negatively impact the overall performance of the models. The choice of which FS method to use depends on the kind of problem a researcher is trying to solve. The following paragraph provides an overview of instances where using a FS method improved on the performance of ML models.

Kasongo [ 20 ] implemented a GA-based FS in order to increase the performance of ML based models applied to the domain of intrusion detection systems. The results demonstrated that the application of GA improved the performance of the RF classifier with an Area Under the Curve (AUC) of 0.98. Mienye et al. [ 21 ] implemented a particle swarm optimization (PSO) technique to increase the performance of a stacked sparse autoencoder network (SSAE) coupled with the softmax unit for heart disease prediction. The PSO technique was used to improve the feature learning capability of the SSAE by optimally tuning its parameters. The results demonstrated that the PSO-SSAE achieved an accuracy of 97.3% on the Framingham heart disease dataset. Hemavathi et al. [ 22 ] implemented an effective FS method in an integrated environment using enhanced principal component analysis (EPCA). The results demonstrated that using the EPCA yields optimal results in supervised and unsupervised environments. Pouramirarsalani et al. [ 23 ] implemented a FS method using hybrid FS and GA for fraud detection in an e-banking environment. The experimental results demonstrated that using a FS method on a financial fraud dataset has a positive impact on the overall performance of the models that were used. In ref. [ 24 ], the authors implemented the GA-based FS method in conjunction with NB, SVM and RF algorithms for credit card fraud detection. The experimental output demonstrated that the RF yielded a better performance in comparison to the NB and SVM.

Genetic algorithm feature selection

The Genetic Algorithm (GA) is a type of Evolutionary inspired Algorithm (EA) that is often used to solve a number of optimization tasks with a reduced computational overhead. EAs generally possess the following attributes [ 25 , 26 ]:

Population: EA approaches maintain a sample of possible solutions called the population .

Fitness: A solution within the population is called an individual . Each individual is characterized by a gene representation and a fitness measure.

Variation: The individual evolves through mutations that are inspired by biological gene evolution.

In this study, the RF approach is used as the fitness method inside the GA. The RF method is employed because it mitigates the over-fitting that is generally encountered when using regular Decision Trees (DTs). Moreover, RF performs well with both continuous and categorical attributes, and it is known to perform well on datasets that have a class imbalance problem. Additionally, the RF is a rule-based approach; therefore, normalising the data is not required [ 27 ]. Alternatives to the RF include tree-based ML algorithms such as Extra-Trees and Extreme Gradient Boosting [ 28 , 29 ]. The fitness method is defined as a function that receives a candidate solution (a feature vector) and determines whether it is fit or not. The measure of fitness is the accuracy that is yielded by a particular attribute vector in the testing process of the RF method within the GA. Algorithm 1 provides more details about the implementation of RF in the GA.

Algorithm 1 denotes the pseudo code implementation of the fitness function that was used in the GA. This algorithm consists of 6 main steps. In Step 1, the data (20% of the full Credit Card Fraud dataset) is divided into training ( \(F_{train}\) and \(y_{train}\) ) and testing ( \(F_{test}\) and \(y_{test}\) ) subsets. In Step 2, an instance of the RF classifier is created. In Step 3, the RF instance is trained using the training set. In Step 4, the resulting model is applied to the testing data \(F_{test}\) . In Step 5, the predictions are stored in \(y_{pred}\) . In the last step, the evaluation is conducted by comparing \(y_{pred}\) with \(y_{test}\) . During the evaluation procedure, the accuracy is used as the main performance metric. The optimal model is the one that yields the highest accuracy score.
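The six steps can be sketched with Scikit-Learn as follows. This is an illustration under stated assumptions, not the authors' code: the function name `rf_fitness`, the 70/30 split, the stratified split, the number of trees and the toy data from `make_classification` are all choices made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def rf_fitness(F, y, test_size=0.3, seed=0):
    """Accuracy of a random forest trained on the candidate feature matrix F."""
    # Step 1: split the data into training and testing subsets
    F_train, F_test, y_train, y_test = train_test_split(
        F, y, test_size=test_size, random_state=seed, stratify=y
    )
    # Steps 2-3: create and train the RF
    rf = RandomForestClassifier(n_estimators=50, random_state=seed)
    rf.fit(F_train, y_train)
    # Steps 4-5: predict on the test features and store the predictions
    y_pred = rf.predict(F_test)
    # Step 6: the accuracy is returned as the fitness value
    return accuracy_score(y_test, y_pred)

# Toy stand-in data; the paper uses a 20% sample of the credit card dataset
F, y = make_classification(n_samples=300, n_features=8, random_state=0)
fitness = rf_fitness(F, y)
```

To score a candidate feature vector, the caller would pass only the selected columns of the dataset as `F`.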

Algorithm 2 is the pseudo code that represents the computation of a candidate feature vector. In the initialization phase, the clean Credit Card Fraud dataset is loaded. In the second phase, we define all the variables that will be used in the computation of a candidate feature vector. These include the following: a list, A , that stores the names of all the features that are present in the Credit Card Fraud dataset; y , which represents the target variable; B , an empty array that will store the optimal feature names; and k , the total number of iterations required to compute a candidate feature vector. Once the definition phase is completed, in Step 1, we generate the initial population (feature names) and store it in A . In Step 2 and Step 3, Algorithm 1 is computed. The fitness value, q , is generated in Step 4. q determines whether a candidate feature vector is optimal or not. If a candidate feature vector is not optimal, we compute the crossover ( k -point crossover, where \(k=1\) ), the mutation and the fitness (from Step 6 to Step 10). This process is repeated until the algorithm converges. Convergence is declared once the maximum accuracy has been reached over k iterations.
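A compact sketch of such a GA loop over binary feature masks is given below. Several details are assumptions made for illustration because the text does not specify them: truncation selection of the fitter half, bit-flip mutation, a fixed number of generations instead of a convergence test, and a toy fitness function standing in for the RF accuracy. All function names are hypothetical.

```python
import random

def one_point_crossover(a, b, rng):
    """Swap the tails of two parent bit masks at a random cut point."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def mutate(mask, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in mask]

def ga_select(n_features, fitness, pop_size=10, generations=20, rate=0.1, seed=0):
    """Return the best feature mask found and its fitness."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Keep the fitter half of the population as parents (assumed scheme)
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            for child in one_point_crossover(p1, p2, rng):
                children.append(mutate(child, rate, rng))
        population = children[:pop_size]
        best = max(population + [best], key=fitness)
    return best, fitness(best)

# Toy fitness: reward masks that match a hidden pattern of useful features
target = [1, 0, 1, 1, 0, 0, 1, 0]
toy_fitness = lambda mask: sum(m == t for m, t in zip(mask, target)) / len(target)
best_mask, best_score = ga_select(len(target), toy_fitness)
```

With a real classifier as the fitness function, an all-zero mask would select no features, so a guard for that case would be needed before training.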

figure a

The main steps of the GA that was adapted to our case study are depicted in Fig.  2 . This flowchart represents the compact version of the implementation of the pseudo code in Algorithm 1 and Algorithm 2 [ 30 ].

figure 2

GA flowchart

After the implementation of the GA (Algorithm 1 and Algorithm 2) on the credit card fraud dataset, we obtained the 5 optimal feature vectors ( \(v_1\) to \(v_5\) ) that are shown in Table 1 . These vectors contain the names of the optimal attributes that will be used to assess the effectiveness of our proposed method.

Fraud detection framework

The architecture of the proposed methodology is depicted in Fig. 3 . The initial step is computed in the Normalize Inputs block, whereby the training dataset is normalized using the min-max scaling method in Equation ( 4 ) [ 31 ]. The scaling is done to ensure that all the input values are within a predefined range. The GA is implemented in the GA Feature Selection block using the normalized data from the Normalize Inputs block. At each iteration of the GA Feature Selection block, the GA generates a candidate attribute vector \(v_n\) that is used to train the models in the Training block, represented by the Training data and Train the models blocks. The same vector is also used to test the trained models in the Trained Model block using the Test Data . For a given model, the testing process is conducted for each \(v_n\) until the desired results are obtained.

where f is a feature in the dataset.
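As an illustration of min-max scaling in its standard form, where each value of a feature has the feature's minimum subtracted and is then divided by the feature's range, the following sketch rescales one column to [0, 1]. The function name and the sample amounts are hypothetical.

```python
def min_max_scale(values):
    """Rescale a feature column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

amount = [10.0, 250.0, 40.0, 1000.0]  # hypothetical transaction amounts
scaled = min_max_scale(amount)
```

When a library scaler such as Scikit-Learn's `MinMaxScaler` is used, the minimum and maximum are usually learned from the training data and then reused to transform the test data.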

figure 3

Architecture of the proposed framework

Performance metrics

The research presented in this paper is modeled as a ML binary classification task. Therefore, we use the accuracy (AC) that was obtained on the test data as the main performance metric. Additionally, for each model, we compute the recall (RC), the precision (PR) and the F1-Score (F-Measure) [ 32 ]. To assess the classification quality of each model, we further plot the Area Under the Curve (AUC). The AUC is a metric that reveals how effective a classifier is for a given classification task. The value of the AUC varies between 0 and 1 whereby an efficient classifier would have an AUC value close to 1 [ 33 ].

True Positive (TP): fraudulent transactions that are correctly flagged as fraudulent.

True Negative (TN): legitimate transactions that are correctly categorized as legitimate.

False Positive (FP): legitimate transactions that are incorrectly labeled as fraudulent.

False Negative (FN): fraudulent transactions that are incorrectly classified as legitimate.
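Using the standard definitions of these metrics, the accuracy, recall, precision and F1-Score can be computed directly from the four counts. The function name and the counts below are hypothetical and serve only as a worked example.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, recall, precision and F1-Score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"AC": accuracy, "RC": recall, "PR": precision, "F1": f1}

# Hypothetical counts for an imbalanced test set of 10000 transactions
m = classification_metrics(tp=80, tn=9900, fp=5, fn=15)
```

In this example the accuracy is dominated by the large number of true negatives, while the recall shows that 15 of the 95 fraudulent transactions were missed, which is why the recall, precision and F1-Score are reported alongside the accuracy.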

Experiments

Experimental configuration.

The experiments were conducted on Google Colab [ 34 ]. The compute specifications are as follows: Intel(R) Xeon(R), 2.30GHz, 2 Cores. The ML framework used in this research is Scikit-Learn [ 35 ].

Results and discussions

The experiments were carried out in two stages. In the first stage, a classification process was conducted using \(F=\{v_1,v_2,v_3,v_4,v_5 \}\) . For each feature vector in F , the following methods were trained and tested: RF, DT, ANN, NB and LR. The results are presented in Tables 2 , 3 , 4 , 5 , 6 . As shown in Table 2 , both the ANN and the RF algorithms obtained the highest test accuracy (TAC) of 99.94% using \(v_1\) . However, the RF method obtained the best results in terms of precision. In Table 3 , the results that were obtained using \(v_2\) demonstrate that the best model is the RF approach with an accuracy of 99.93%. In Table 4 , the RF method also obtained the best fraud detection accuracy of 99.94% using \(v_3\) . Table 5 presents the results that were achieved by \(v_4\) , whereby the DT obtained an accuracy of 99.1% and a precision of 81.17%. Table 6 depicts the outcomes that were obtained when using \(v_5\) . In this case, the RF attained a fraud detection accuracy of 99.98% and a precision of 95.34%. Compared with \(v_1\) , \(v_2\) , \(v_3\) and \(v_4\) , \(v_5\) yielded the best results. Moreover, looking at the outcomes presented in Tables 2 , 3 , 4 , 5 , 6 , the NB method underperformed in terms of Recall, Precision and F1-Score.

As an initial validation of the proposed method, we ran further experiments using the full feature vector and a feature vector that was generated using a random approach: random_vec = { V2, V3, V4, V5, V6, V7, V8, V9, V11, V12, V13, V16, V17, V18, V19, V20, V21, V22, V23, V25, V26, V28, Amount}. The results are listed in Tables 7 and 8 . In both instances, we observed a severe drop in the performance of the models in comparison to the models that were coupled with the GA (Tables 2 , 3 , 4 , 5 , 6 ).

Furthermore, we computed the AUC of each vector in F . These results are depicted in Figs. 4 , 5 , 6 , 7 , 8 . In Fig. 4 ( \(v_1\) ), the best performing models in terms of the quality of classification are the RF, NB, and LR with AUCs of 0.96, 0.97, and 0.97, respectively. In the instance of \(v_5\) (Fig. 8 ), the RF and NB obtained the highest AUCs of 0.95 and 0.96. Moreover, a comparison analysis is presented in Table 7 . This comparison reveals that the GA feature selection approach presented in this paper, as well as most of the proposed ML methods that were implemented, outperformed the existing techniques that are proposed in [ 4 , 13 , 14 , 16 ]. For instance, the GA-RF proposed in this research obtained an accuracy that is 2.28% higher than the LR in [ 13 ]. The GA-DT proposed in this work yielded a fraud detection accuracy that is 4.42% higher than the DT model presented in [ 13 ]. The GA-LR obtained an accuracy that is 2.41% higher than the SVM model presented in [ 13 ]. The GA-NB proposed in this research achieved an accuracy that is 1.75% higher than the KNN model proposed in [ 16 ]. Additionally, the GA-DT presented in this research achieved an accuracy that is 17.23% greater than the accuracy obtained in [ 17 ]. In terms of classification accuracy, the best classifier is the RF (implemented with \(v_5\) ). This model achieved a noteworthy credit card fraud detection accuracy of 99.98%.

figure 4

AUC results for \(v_1\)

figure 5

AUC results for \(v_2\)

figure 6

AUC results for \(v_3\)

figure 7

AUC results for \(v_4\)

figure 8

AUC results for \(v_5\)

Experiments on synthetic dataset

To validate the efficiency of our proposed method, we conducted more experiments using a publicly available synthetic dataset that contains the following features: V = \(\{\) User, Card, Year, Month, Day, Time, Amount, Use Chip, Merchant Name, Merchant City, Merchant State, Zip, MCC, Errors, Is Fraud \(\}\) , where Is Fraud denotes the target variable. This dataset contains 24357143 legitimate credit card transactions and 29757 fraudulent ones [ 36 ]. In the experiments, we considered the following methods: RF, DT, ANN, NB, and LR. We first processed the dataset through the framework in Fig. 3 . The GA module selected the features represented by \(v_0\) in Table 8 . These were the features that were used during the training and testing processes of the ML models. Table 9 provides the details of the results that were obtained after the experiments converged. The GA-ANN and the GA-DT achieved accuracies of 100%. These results are backed by AUCs of 0.94 and 1, respectively. The other models that performed remarkably well are the GA-RF and the GA-LR with accuracies of 99.95% and 99.96%. However, the GA-LR yielded a low AUC of 0.63 (Table 10 ).
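The contrast between a high accuracy and a low AUC is easier to interpret once the class balance of this dataset is worked out. The short calculation below uses only the two counts reported above; the trivial all-legitimate baseline is a hypothetical reference point, not a model from the paper.

```python
legit = 24_357_143  # legitimate transactions reported for the synthetic dataset
fraud = 29_757      # fraudulent transactions reported for the synthetic dataset

total = legit + fraud
fraud_rate = fraud / total
# Accuracy of a trivial model that labels every transaction as legitimate
baseline_accuracy = legit / total

print(f"fraud rate: {fraud_rate:.4%}")
print(f"all-legitimate baseline accuracy: {baseline_accuracy:.4%}")
```

Because only about 0.12% of the transactions are fraudulent, a model that never flags fraud would already reach roughly 99.88% accuracy, so accuracies in this range need to be read together with the AUC, recall and precision.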

Moreover, Fig. 7 depicts the ROC curves of the ML models that were considered in the experiments. The results demonstrate that the RF and the DT models achieved an AUC of 1. This indicates that these models were perfect at detecting fraudulent activities on this dataset (Table 11 ).

Conclusion

In this research, a GA based feature selection method in conjunction with the RF, DT, ANN, NB, and LR was proposed. The GA was implemented with the RF in its fitness function. The GA was further applied to the European cardholders credit card transactions dataset and 5 optimal feature vectors were generated. The experimental results that were achieved using the GA selected attributes demonstrated that the GA-RF (using \(v_5\) ) achieved the best overall accuracy of 99.98%. Furthermore, other classifiers such as the GA-DT achieved a remarkable accuracy of 99.92% using \(v_1\) . The results obtained in this research were superior to those achieved by existing methods. Moreover, we implemented our proposed framework on a synthetic credit card fraud dataset to validate the results that were obtained on the European credit card fraud dataset. The experimental outcomes showed that the GA-DT obtained an AUC of 1 and an accuracy of 100%, followed by the GA-ANN with an AUC of 0.94 and an accuracy of 100%. In future work, we intend to use more datasets to validate our framework.

Availability of data and materials

The datasets used during the current study are available at Kaggle, https://www.kaggle.com/mlg-ulb/creditcardfraud . Synthetic Credit Card Fraud Dataset, https://ibm.ent.box.com/v/tabformer-data/folder/130747715605 .

References

Iwasokun GB, Omomule TG, Akinyede RO. Encryption and tokenization-based system for credit card information security. Int J Cyber Sec Digital Forensics. 2018;7(3):283–93.

Burkov A. The hundred-page machine learning book. 2019;1:3–5.

Maniraj SP, Saini A, Ahmed S, Sarkar D. Credit card fraud detection using machine learning and data science. Int J Eng Res 2019; 8(09).

Dornadula VN, Geetha S. Credit card fraud detection using machine learning algorithms. Proc Comput Sci. 2019;165:631–41.

Thennakoon A, et al. Real-time credit card fraud detection using machine learning. In: 2019 9th international conference on cloud computing, data science & engineering (Confluence). IEEE; 2019.

Robles-Velasco A, Cortés P, Muñuzuri J, Onieva L. Prediction of pipe failures in water supply networks using logistic regression and support vector classification. Reliab Eng Syst Saf. 2020;196:106754.

Liang J, Qin Z, Xiao S, Ou L, Lin X. Efficient and secure decision tree classification for cloud-assisted online diagnosis services. IEEE Trans Dependable Secure Comput. 2019;18(4):1632–44.

Ghiasi MM, Zendehboudi S. Application of decision tree-based ensemble learning in the classification of breast cancer. Comput in Biology and Medicine. 2021;128:104089.

Lingjun H, Levine RA, Fan J, Beemer J, Stronach J. Random forest as a predictive analytics alternative to regression in institutional research. Pract Assess Res Eval. 2020;23(1):1.

Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.

Ning B, Junwei W, Feng H. Spam message classification based on the Naive Bayes classification algorithm. IAENG Int J Comput Sci. 2019;46(1):46–53.

Katare D, El-Sharkawy M. Embedded system enabled vehicle collision detection: an ANN classifier. In: 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC); 2019. p. 0284–0289.

Campus K. Credit card fraud detection using machine learning models and collating machine learning models. Int J Pure Appl Math. 2018;118(20):825–38.

Varmedja D, Karanovic M, Sladojevic S, Arsenovic M, Anderla A. Credit card fraud detection-machine learning methods. In: 18th international symposium INFOTEH-JAHORINA (INFOTEH); 2019. p. 1-5.

Khatri S, Arora A, Agrawal AP. Supervised machine learning algorithms for credit card fraud detection: a comparison. In: 10th international conference on cloud computing, data science & engineering (Confluence); 2020. p. 680-683.

Awoyemi JO, Adetunmbi AO, Oluwadare SA. Credit card fraud detection using machine learning techniques: a comparative analysis. In: International conference on computer networks and Information (ICCNI); 2017. p. 1-9.

Seera M, Lim CP, Kumar A, Dhamotharan L, Tan KH. An intelligent payment card fraud detection system. Ann Oper Res 2021;1–23.

Guo S, Liu Y, Chen R, Sun X, Wang XX. Improved SMOTE algorithm to deal with imbalanced activity classes in smart homes. Neural Process Lett. 2019;50(2):1503–26.

The Credit card fraud [Online]. https://www.kaggle.com/mlg-ulb/creditcardfraud

Kasongo SM. An advanced intrusion detection system for IIoT based on GA and tree based algorithms. IEEE Access. 2021;9:113199–212.

Mienye ID, Sun Y. Improved heart disease prediction using particle swarm optimization based stacked sparse autoencoder. Electronics. 2021;10(19):2347.

Hemavathi D, Srimathi H. Effective feature selection technique in an integrated environment using enhanced principal component analysis. J Ambient Intell Hum Comput. 2021;12(3):3679–88.

Pouramirarsalani A, Khalilian M, Nikravanshalmani A. Fraud detection in E-banking by using the hybrid feature selection and evolutionary algorithms. Int J Comput Sci Netw Secur. 2017;17(8):271–9.

Saheed YK, Hambali MA, Arowolo MO, Olasupo YA. Application of GA feature selection on Naive Bayes, random forest and SVM for credit card fraud detection. In: 2020 international conference on decision aid sciences and application (DASA); 2020. p. 1091–1097.

Davis L. Handbook of genetic algorithms; 1991.

Li Y, Jia M, Han X, Bai XS. Towards a comprehensive optimization of engine efficiency and emissions by coupling artificial neural network (ANN) with genetic algorithm (GA). Energy. 2021;225:120331.

Khalilia M, Chakraborty S, Popescu M. Predicting disease risks from highly imbalanced data using random forest. BMC Med Inf Decis Mak. 2011;11(1):1–13.

Abhishek L. Optical character recognition using ensemble of SVM, MLP and extra trees classifier. In: International conference for emerging technology (INCET) IEEE; 2020. p. 1–4.

Chen T, He T, Benesty M, Khotilovich V, Tang Y, Cho H. Xgboost: extreme gradient boosting. R package version 04-2. 2015;1(4):1–4.

Harik GR, Lobo FG, Goldberg DE. The compact genetic algorithm. IEEE Trans Evol Comput. 1999;3(4):287–97.

Jain A, Nandakumar K, Ross A. Score normalization in multimodal biometric systems. Pattern Recognit. 2005;38(12):2270–85.

Kasongo SM, Sun Y. A deep long short-term memory based classifier for wireless intrusion detection system. ICT Express. 2020;6(2):98–103.

Norton M, Uryasev S. Maximization of auc and buffered auc in binary classification. Math Program. 2019;174(1):575–612.

Google Colab [Online]. Available: https://colab.research.google.com/

Scikit-learn : machine learning in Python [Online]. https://scikit-learn.org/stable/

Altman ER. Synthesizing credit card transactions. 2019. arXiv preprint arXiv:1910.03033

This research is funded by the University of Johannesburg, South Africa.

Author information

Authors and affiliations.

Department of Electrical & Electronic Engineering Science, University of Johannesburg, Kingsway Ave, 2006, Johannesburg, South Africa

Emmanuel Ileberi & Yanxia Sun

Department of Electrical Engineering, University of South Africa, Florida, 1709, Johannesburg, South Africa

Zenghui Wang

Contributions

Ileberi Emmanuel wrote the algorithms and methods related to this research and he interpreted the results. Y. Sun and Z. Wang provided guidance in terms of validating the obtained results. All authors read and approved the final manuscript.

Authors' information

Yanxia Sun received a joint qualification in 2012: D-Tech in Electrical Engineering, Tshwane University of Technology, South Africa, and PhD in Computer Science, University Paris-EST, France. Yanxia Sun is currently working as a Professor in the Department of Electrical and Electronic Engineering Science, University of Johannesburg, South Africa. She has 15 years of teaching and research experience. She has lectured five courses at the universities. She has supervised or co-supervised five postgraduate projects to completion. Currently she is supervising six PhD students and four master's students. She has published 42 papers including 14 ISI master indexed journal papers. She is the investigator or co-investigator for six research projects. She is a member of the South African Young Academy of Science (SAYAS). Her research interests include Renewable Energy, Evolutionary Optimization, Neural Network, Nonlinear Dynamics and Control Systems.

Zenghui Wang, a Professor in Department of Electrical Engineering, University of South Africa.

Corresponding author

Correspondence to Emmanuel Ileberi .

Ethics declarations

Competing interests.

The authors declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Ileberi, E., Sun, Y. & Wang, Z. A machine learning based credit card fraud detection using the GA algorithm for feature selection. J Big Data 9, 24 (2022). https://doi.org/10.1186/s40537-022-00573-8

Received: 30 July 2021

Accepted: 06 February 2022

Published: 25 February 2022

DOI: https://doi.org/10.1186/s40537-022-00573-8

  • Machine learning
  • Genetic algorithm
  • Fraud detection
  • Cybersecurity

research paper for credit card

2017 Series • 17–14

Research Department Working Papers

Credit Card Utilization and Consumption over the Life Cycle and Business Cycle

Nearly 80 percent of U.S. adults have a credit card, and more than half of them revolve their debt from month to month. Using a large sample of credit bureau data, this paper documents a tight link between available credit (the limit) and credit card debt, and then it offers a model-based interpretation of this linkage. Credit limits change frequently for individuals, increase rapidly on average as people age, and show large changes over the business cycle. Yet credit card debt changes nearly proportionately to credit and at about the same time, so the fraction of credit used is relatively stable over time. The authors build a life-cycle consumption model that includes the joint use of credit cards to pay directly for expenditures, to help smooth consumption against income shocks, and to borrow longer term (revolving indefinitely). The authors estimate the parameters of the model using several data sources, including a large credit bureau database and a new daily diary of consumer payment choices.

Implications

People experience important changes in credit throughout their life, especially between the ages of 20 and 40, when their credit limits soar. These changes in credit are in effect changes in liquidity, and observing how people react to them provides insight into the more general savings and consumption decisions they make.

Although people use credit cards for different purposes, all uses contribute to stable credit utilization. Payment use is proportional to consumption, and when an increase in income leads to an increase in credit limits, a convenience user will increase consumption and payments use. People who use credit cards to borrow because of impatience see a rise in their credit limit as an increase in wealth and increase their consumption (and debt) accordingly. And those who use credit cards for smoothing purposes early in life—when income rises more slowly than credit limits—increase their credit card debt at about the same rate as their credit limits rise.

The revolving credit available to consumers changes substantially over the business cycle, life cycle, and for individuals. We show that debt changes at the same time as credit, so credit utilization is remarkably stable. From ages 20–40, for example, credit card limits grow by more than 700 percent, and yet utilization holds steadily at around 50 percent. We estimate a structural model of life-cycle consumption and credit use in which credit cards can be used for payments, precautionary smoothing, and life-cycle smoothing, uniting their monetary and revolving credit functions. Our estimates predict stable utilization closely matching the individual, life-cycle, and business-cycle relationships between credit and debt. The preference heterogeneity implied by the different uses of credit cards drives our results. The revealed preference that some people with credit cards borrow at high interest, while others do not, suggests that around half the population is living nearly hand to mouth.
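As a worked illustration of the stable-utilization claim above, write utilization at age $a$ as the ratio of credit card debt $D_a$ to the credit limit $L_a$. The symbols are introduced here for illustration and are not the authors' notation.

```latex
u_a = \frac{D_a}{L_a}
\quad\Longrightarrow\quad
\frac{D_{40}}{D_{20}} = \frac{u_{40}}{u_{20}} \cdot \frac{L_{40}}{L_{20}} .
```

If limits grow by 700 percent between ages 20 and 40, then $L_{40}/L_{20} = 8$. With utilization near 0.5 at both ages, $u_{40}/u_{20} \approx 1$, so debt must also grow roughly eightfold. This is the sense in which debt changes nearly proportionately to credit.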

Contributing business areas Research Monetary Policy & Economic Research

  • PeerJ Comput Sci
  • PMC10280638

A systematic review of literature on credit card cyber fraud detection using machine and deep learning

Eyad Abdel Latif Marazqah Btoush

1 School of Business, University of Southern Queensland, Toowoomba, QLD, Australia

Xujuan Zhou

Raj Gururajan

2 School of Computing, SRM Institute of Science and Technology, Chennai, India

Ka Ching Chan

Rohan Genrich, Prema Sankaran

3 School of Management, Presidency University, Bangalore, India

Associated Data

The following information was supplied regarding data availability:

This is a literature review.

The increasing spread of cyberattacks and crimes makes cyber security a top priority in the banking industry. Credit card cyber fraud is a major security risk worldwide. Conventional anomaly detection and rule-based techniques are two of the most commonly used approaches for detecting cyber fraud; however, they are also the most time-consuming, resource-intensive, and inaccurate. Machine learning is one of the techniques gaining popularity and playing a significant role in this field. This study examines and synthesizes previous studies on credit card cyber fraud detection, focusing specifically on machine learning/deep learning approaches. We identified 181 research articles published from 2019 to 2021. For the benefit of researchers, we present a review of machine learning/deep learning techniques and their relevance to credit card cyber fraud detection, and we provide direction for choosing the most suitable techniques. The review also discusses the major problems, gaps, and limits in detecting credit card cyber fraud and recommends research directions for the future. This comprehensive review enables researchers and the banking industry to conduct innovation projects for cyber fraud detection.

Introduction

The banking industry has been profoundly impacted by the evolution of information technology (IT). Credit card and online net banking transactions, which currently make up the majority of banking system transactions, all present additional vulnerabilities (Jiang & Broby, 2021). Hackers have increasingly targeted banks, which hold enormous quantities of client data. Banks have therefore been at the forefront of cyber security for business. In the past thirteen years, the cyber security industry has expanded rapidly. The market is predicted to be valued at $170.4 billion in 2022 (Morgan, 2019). In the next three years, the cost of cybercrime is expected to rise by 15% every year, ultimately exceeding $10.5 trillion USD annually by 2025 (Morgan, 2020).

In the banking industry, cyber fraud using credit cards is a significant concern that costs billions of dollars annually. The banking industry has made strengthening cyber security protection a priority. Multiple systems have been developed for monitoring and identifying credit card cyber fraud. However, because of the constantly evolving nature of threats, the banking industry must be equipped with the most modern and effective cyber fraud management technologies (Btoush et al., 2021).

The acceptance of credit cards and other forms of online payment has exploded in recent years, which has resulted in an increase in credit card cyber fraud. Credit card cyber fraud takes several forms. The first type is the physical theft of a credit card. The second type is the theft of confidential credit card details. Further fraud is committed when credit card information is entered without the cardholder’s permission during an online transaction (Al Smadi & Min, 2020; Trivedi et al., 2020).

The detection of cyber fraud in credit cards is a challenging task that has attracted the interest of academics working in the field of machine learning (ML). Credit card datasets are highly skewed, and many algorithms are unable to discriminate items from minority classes when working with such datasets. To be effective, the systems used to identify cyber fraud also need to react swiftly. Another important concern is the way in which new methods of attack influence the conditional distribution of the data over time (Benchaji, Douzi & El Ouahidi, 2021). According to Al Rubaie (2021), a number of challenges need to be addressed for cyber fraud detection in credit cards. These challenges include massive volumes of data that are unbalanced or incorrectly categorised, frequent changes in the type of transaction, and real-time detection.
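To illustrate why skewed data is a problem, the following minimal sketch builds a synthetic label set and scores a degenerate classifier that never predicts fraud. The 0.2% fraud rate and all variable names are illustrative assumptions, not figures or code from the reviewed studies.

```python
# Synthetic labels: 1 = fraud, 0 = legitimate.
# The class ratio below is an assumption chosen only for illustration.
n_total = 100_000
n_fraud = 200
y_true = [1] * n_fraud + [0] * (n_total - n_fraud)

# A degenerate "classifier" that labels every transaction as legitimate.
y_pred = [0] * n_total

correct = sum(t == p for t, p in zip(y_true, y_pred))
true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))

accuracy = correct / n_total   # share of all predictions that are right
recall = true_pos / n_fraud    # share of fraud cases that are caught

print(f"accuracy = {accuracy:.3f}")
print(f"recall   = {recall:.3f}")
```

Accuracy looks excellent even though no fraud is detected, which is why minority-class metrics such as recall matter on data of this kind.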

As current technology progresses, cyber credit card fraud is also developing rapidly, making cyber fraud detection a crucial area. The conventional techniques for resolving this problem are no longer sufficient. In the conventional approach, domain experts in cyber fraud compose algorithms that are governed by strict rules. In addition, a proactive strategy must be used to combat cyber fraud. Every industry is attempting to employ ML-based solutions due to their popularity, speed, and effectiveness (Priya & Saradha, 2021). ML and DL methods have been shown to be effective in this field. In particular, DL has garnered the most attention and had the most success in combating cyber threats recently. Its ability to minimize overfitting and discover underlying fraud tendencies, as well as its capacity to handle massive datasets, make it particularly useful in this field. In the past few years, DL techniques have been applied to recognize new fraudulent patterns and enable systems to respond flexibly to complex data patterns. In this review, we focus on the latest research from 2019–2021 because DL’s popularity increased during this period and we want to provide the most up-to-date and relevant information on the topic.

While numerous cyber fraud detection techniques are available, no fraud detection system has yet been able to deliver both high efficiency and high accuracy. It is therefore necessary to provide researchers and the banking industry with an overview of the state of the art in cyber fraud detection and an analysis of the most recent studies in this field, so that they can conduct innovation projects for cyber fraud detection. To achieve this goal, this review provides a detailed analysis of ML/DL techniques and their function in credit card cyber fraud detection and offers recommendations for selecting the most suitable techniques for detecting cyber fraud. The study also covers research trends, gaps, future directions, and limitations in detecting credit card cyber fraud.

This review focuses mostly on identifying the ML/DL techniques used to detect credit card cyber fraud. Moreover, we aim to analyse the gaps and trends in this field. Over the past few years, only a few review articles have been published on detecting credit card cyber fraud. This review looks at the detection of card fraud from the standpoint of cybersecurity, with an emphasis on ML/DL techniques, and also approaches the topic from a financial standpoint. Furthermore, unlike other reviews, which also include conference articles, ours includes only recent journal articles.

The aim of this review is to provide researchers with an overview of the state of the art in cyber fraud detection and an analysis of the most recent studies in this field. This review will assist researchers in selecting high-performance ML/DL algorithms and datasets to consider when attempting to detect cyber fraud. To answer the four research questions, we used a search string to search six digital libraries. This resulted in a total of 2,094 articles, all of which are journal articles. In addition, we used the snowballing strategy to integrate relevant articles missed by the automated search. Through careful examination of the retrieved articles, we narrowed down our collection and found the most relevant answers to our four research questions. As a result, 181 articles were chosen for further study.

We describe our search study selection, data extraction procedures, and overall research methodology in “Survey Methodology” of this article. In “Result and Analysis”, we present the findings and answers to our research questions. In “Conclusions”, we conclude the study by discussing its findings.

Survey Methodology

The review investigates the present status of research on detecting cyber fraud in credit cards and addresses our research questions. The methodology begins with a description of the data sources, the search strategy, the inclusion and exclusion criteria, and the number of research articles selected from the different databases.

Research questions

This review attempts to summarise and analyse the ML and DL credit card cyber fraud detection algorithms from 2019 to 2021. The following research questions (RQs) are therefore posed:

RQ1: What ML/DL techniques are utilised in detection of credit card cyber fraud? This question aims to specify the ML/DL techniques that have been applied.

RQ2: What percentage of credit card cyber fraud detection articles discussed supervised, unsupervised, or semi-supervised techniques? This question seeks to determine the proportion of research articles that employ supervised, unsupervised, and semi-supervised credit card cyber fraud detection techniques.

RQ3: What are the estimated overall performance and outcomes of ML/DL models? This question focuses on ML/DL model performance estimation and model results.

RQ4: What are the research trends, gaps, and potential future directions for cyber fraud detection in credit cards? This question aims to uncover research trends, gaps in the existing literature, and future directions of credit card cyber fraud research.

Data sources and research strategy

After determining the research questions, we constructed the search as follows:

  • – The main search terms are determined by the research questions.
  • – Boolean operators (AND and OR) are used to restrict search results.
  • – The search terms utilised for this review relate to detecting cyber fraud in credit cards and to the ML/DL techniques used for fraud detection.

The methodology incorporates the following electronic literature databases in order to obtain a comprehensive and broad coverage of the literature and to maximise the probability of discovering highly relevant articles:

  • – Google Scholar
  • – ACM
  • – IEEE Xplore
  • – SpringerLink
  • – Web of Science
  • – Scopus

To locate the most relevant articles, specific keywords were formulated into a search string. This string was divided into search units, and Boolean operators were used to combine them. All of the mentioned resources have keyword-based search engines. We selected the following search string to retrieve the most relevant studies:

((AI OR “artificial intelligence” OR DL OR “deep learning” OR ML OR “machine learning”) AND (“Credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud”)).

We included “artificial intelligence” OR “deep learning” OR “machine learning” so that we could find studies that utilised any of these techniques. Additionally, we included the “credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud” terms to capture any fraud-related content so that we would not miss any relevant articles.

We searched for the above string in six digital libraries. The search string was adapted into an appropriate query for each library. Table 1 provides the detailed search queries. We limited our review to journal articles, excluding conference articles, books, and other publications. In December 2021, we conducted our search for the years 2019 to 2021. A total of 2,094 items were retrieved from the research libraries. Table 2 shows the distribution of the items across the libraries. We identified 365 duplicate articles. After eliminating the duplicates, we continued the selection process with the remaining 1,729 articles. In addition to the automatic searches of digital libraries, a snowballing mechanism was also used.
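The way the search string combines two groups of terms with Boolean operators can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the helper `or_group` is hypothetical, and straight quotes are used in place of the typographic quotes shown in the text.

```python
def or_group(terms):
    """Join terms with OR, quoting multi-word or hyphenated phrases."""
    quoted = [f'"{t}"' if (" " in t or "-" in t) else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

# Term lists copied from the search string given in the text.
technique_terms = [
    "AI", "artificial intelligence",
    "DL", "deep learning",
    "ML", "machine learning",
]
fraud_terms = [
    "Credit card fraud", "card fraud", "card-fraud", "credit-fraud",
    "card cyber fraud", "transaction fraud", "payment fraud",
    "fraud detec*", "bank* fraud", "financ* fraud",
]

# A study must match at least one technique term AND one fraud term.
query = "(" + or_group(technique_terms) + " AND " + or_group(fraud_terms) + ")"
print(query)
```

Keeping the two term lists separate makes it easy to adapt the same query to each library's own syntax, such as field prefixes or date filters.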

Table 1: Search queries used for each digital library.

| Digital library | Query |
| --- | --- |
| Google Scholar | ((AI OR “artificial intelligence” OR DL OR “deep learning” OR ML OR “machine learning”) AND (“Credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud”)). |
| ACM | ((All: AI) OR (All: “artificial intelligence”) OR (All: DL) OR (All: “deep learning”) OR (All: ML) OR (All: “machine learning”)) AND ((All: “credit card fraud”) OR (All: “card fraud”) OR (All: “card-fraud”) OR (All: “credit-fraud”) OR (All: “card cyber fraud”) OR (All: “transaction fraud”) OR (All: “payment fraud”) OR (All: “fraud detec*”) OR (All: “bank* fraud”) OR (All: “financ* fraud”)) AND (Publication date: (01/01/2019 TO 12/31/2021)) |
| IEEE Xplore | ((AI OR “artificial intelligence” OR DL OR “deep learning” OR ML OR “machine learning”) AND (“Credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud”)). Filters applied: Journals 2019–2021. |
| SpringerLink | 39 Result(s) for ‘((AI OR “artificial intelligence” OR DL OR “deep learning” OR ML OR “machine learning”) AND (“Credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud”))’ within article 2019–2021. |
| Web of Science | ((AI OR “artificial intelligence” OR DL OR “deep learning” OR ML OR “machine learning”) AND (“Credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud”)). Refined by: publication years: 2019 or 2020 or 2021 Document types: Articles languages: English. |
| Scopus | TITLE-ABS-KEY (((AI OR “artificial intelligence” OR DL OR “deep learning” OR ML OR “machine learning”) AND (“Credit card fraud” OR “card fraud” OR “card-fraud” OR “credit-fraud” OR “card cyber fraud” OR “transaction fraud” OR “payment fraud” OR “fraud detec*” OR “bank* fraud” OR “financ* fraud”))) AND (LIMIT-TO (PUBYEAR, 2021) OR LIMIT-TO (PUBYEAR, 2020) OR LIMIT-TO (PUBYEAR, 2019)) AND (LIMIT-TO (DOCTYPE, “AR”)). |
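
Table 2: Distribution of retrieved articles across the digital libraries.

| No. | Database | Retrieved articles |
| --- | --- | --- |
| 1 | Google Scholar | 1,418 |
| 2 | SpringerLink | 39 |
| 3 | Scopus | 292 |
| 4 | IEEE Xplore | 76 |
| 5 | Web of Science | 233 |
| 6 | ACM | 36 |
| | Total retrieved articles | 2,094 |
| | Number of duplicates | 365 |
| | Number of articles after removing duplicates | 1,729 |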

Study selection

We executed the above search strategy during December 2021 and identified 2,094 articles. After removing duplicates (365 articles), the titles and abstracts of 1,729 unique citations were screened for eligibility and relevance. If a study’s relevance could not be verified due to insufficient abstract information or the absence of an abstract, the citation was assigned for full-text review. We thus reviewed the full text of 281 studies. Disagreements on the included studies were resolved through discussion and consensus. The selected articles were filtered to ensure that only relevant studies were included in our review. The articles were exported to EndNote, grouped by database, and then exported to a literature review management tool called Rayyan (Ouzzani et al., 2016) to facilitate the screening and selection process. To initiate the filtering and selection processes, duplicate articles gathered from multiple digital resources were eliminated. Then, using the inclusion and exclusion criteria, we removed the irrelevant articles. Using quality evaluation processes, we included only the qualified articles that offer the most effective answers to our study objectives. Using the references of the collected articles, we searched for further related publications. Figure 1 displays the article selection process. The inclusion and exclusion criteria utilised for this review are detailed in Table 3. After the filtration process was completed, 181 articles were retained for this study.
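As a quick consistency check on the selection funnel described above, the reported counts can be tallied in a few lines of Python. The variable names are ours; the figures are the ones reported in this section.

```python
# Per-library retrieval counts as reported for the December 2021 search.
retrieved = {
    "Google Scholar": 1418,
    "SpringerLink": 39,
    "Scopus": 292,
    "IEEE Xplore": 76,
    "Web of Science": 233,
    "ACM": 36,
}
duplicates = 365          # duplicate records removed
full_text_reviewed = 281  # studies read in full
included = 181            # studies retained for the review

total_retrieved = sum(retrieved.values())
after_dedup = total_retrieved - duplicates

print(f"retrieved:          {total_retrieved}")
print(f"after deduplication: {after_dedup}")
print(f"full-text reviewed:  {full_text_reviewed}")
print(f"included:            {included}")
print(f"share of screened records included: {included / after_dedup:.1%}")
```

The per-library counts sum to the reported total, and subtracting the duplicates reproduces the number of unique citations screened.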

Figure 1: Article selection process.

Table 3: Inclusion and exclusion criteria.

| Inclusion criteria | Exclusion criteria |
| --- | --- |
| Include journal articles only | Exclude conference articles, book chapters, and other publications |
| Include articles about credit card cyber fraud detection | Exclude articles not related to detecting cyber fraud in credit cards |
| Include articles that used ML/DL | Exclude articles that did not use ML/DL |
| Include articles published in 2019, 2020, and 2021 | Exclude articles published before 2019 or after 2021 |
| Include articles in the English language | Exclude publications in languages other than English |
Data extraction

This process analyses the final selection of articles in order to collect the data required to answer the four research questions. Table 4 displays our data extraction form. The final column of Table 4 gives the reason for extracting the corresponding data. We answered RQ1 and RQ2 using information regarding techniques and datasets, which we also used to group studies with comparable datasets and techniques. Extracting each article’s discussion and findings helped us estimate the overall performance of approaches and answer RQ3. By extracting the articles’ objectives and conclusions, we were able to recognise trends, conduct a gap analysis, determine future research, and respond to RQ4. To identify the gaps and define the direction future research should take, we conducted a summary analysis on the basis of the articles’ objectives and conclusions.

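Table 4: Data extraction form.

| Strategy | Category | Description | Purpose |
| --- | --- | --- | --- |
| Automatic extraction | Title of article | The article’s title | Additional information |
| Automatic extraction | Authors of article | The author’s name | Additional information |
| Automatic extraction | Article year | The year of publication | Additional information |
| Automatic extraction | Article type | Journal | Additional information |
| Manual extraction | Objectives | Study objectives | RQ4 |
| Manual extraction | Conclusion | Outcomes of study | RQ4 |
| Manual extraction | Techniques | ML/DL technique utilised to support objectives | RQ1 and RQ2 |
| Manual extraction | Discussion and result | Outcomes | RQ3 |
| Manual extraction | Algorithm type | ML, DL, or mix | RQ1 and RQ2 |
| Manual extraction | Dataset | Dataset used in article | RQ1 |
| Manual extraction | Future work | Gaps, trends, and future work | RQ4 |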
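The extraction form can be represented as a small data structure, which makes the link between each extracted field and the research question it serves explicit. This is a sketch under our own assumptions: the class `ExtractionForm`, the mapping `PURPOSE`, and the helper `fields_for` are hypothetical names, not part of the review's tooling.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ExtractionForm:
    # Automatically extracted metadata
    title: str
    authors: List[str]
    year: int
    article_type: str
    # Manually extracted content
    objectives: str
    conclusion: str
    techniques: List[str]
    discussion_and_result: str
    algorithm_type: str  # "ML", "DL", or "mix"
    dataset: str
    future_work: str


# Purpose of each field, following the final column of the form.
# Assumption: "Additional information" applies to all four metadata fields.
PURPOSE = {
    "title": "Additional information",
    "authors": "Additional information",
    "year": "Additional information",
    "article_type": "Additional information",
    "objectives": "RQ4",
    "conclusion": "RQ4",
    "techniques": "RQ1 and RQ2",
    "discussion_and_result": "RQ3",
    "algorithm_type": "RQ1 and RQ2",
    "dataset": "RQ1",
    "future_work": "RQ4",
}


def fields_for(question: str) -> List[str]:
    """List the form fields that feed a given research question."""
    return [name for name, purpose in PURPOSE.items() if question in purpose]


for rq in ("RQ1", "RQ2", "RQ3", "RQ4"):
    print(rq, fields_for(rq))
```

Keeping the purpose mapping next to the form definition makes it easy to check that every research question is backed by at least one extracted field.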

Result and analysis

Distribution of chosen articles throughout the years.

To explore the most recent techniques described in journals in this field, limits were placed on publication years. Our review selected articles published from 2019 to 2021. Fig. 2 shows the distribution of articles by year of publication. Since our study was completed in December 2021, articles published after December 2021 were not included.

Figure 2: Distribution of articles by year of publication.

Publication type

In this review, we evaluated only journal publications. Table A1 displays the selected research articles published during the observation period.

| Article ID | Article title | Type | Year |
| --- | --- | --- | --- |
| A1 | Comparative analysis of back-propagation neural network and k-means clustering algorithm in fraud detection in online credit card transaction. | Journal | 2019 |
| A2 | Credit card fraud detection using machine learning classification algorithms over highly imbalanced data. | Journal | 2020 |
| A3 | Hybrid CNN-BILSTM-Attention based identification and prevention system for banking transactions. | Journal | 2021 |
| A4 | Identify theft detection using machine learning. | Journal | 2021 |
| A5 | Hidden Markov model application for credit card fraud detection systems. | Journal | 2020 |
| A6 | Enhanced SMOTE & fast random forest techniques for credit card fraud detection. | Journal | 2020 |
| A7 | Fraud identification of credit card using ML techniques. | Journal | 2020 |
| A8 | Improvement in credit card fraud detection using ensemble classification technique and user data. | Journal | 2021 |
| A9 | Credit card fraud detection integrated account and transaction sub modules. | Journal | 2021 |
| A10 | Credit card fraud detection using autoencoder model in unbalanced datasets. | Journal | 2019 |
| A11 | Fraud detection in credit card using logistic regression. | Journal | 2020 |
| A12 | A financial fraud detection model based on LSTM deep learning technique. | Journal | 2020 |
| A13 | Comparative study of machine learning algorithms and correlation between input parameters. | Journal | 2019 |
| A14 | Example-dependent cost-sensitive credit cards fraud detection using SMOTE and Bayes minimum risk. | Journal | 2020 |
| A15 | Credit card fraud detection on skewed data using machine learning techniques. | Journal | 2021 |
| A16 | Facilitating user authorization from imbalanced data logs of credit card using artificial intelligence. | Journal | 2020 |
| A17 | Intelligence feature selection with social spider optimization based artificial neural network model for credit card fraud detection. | Journal | 2020 |
| A18 | Deal-deep ensemble algorithm framework for credit card fraud detection in real-time data stream with Google TensorFlow. | Journal | 2020 |
| A19 | Credit card fraud detection using artificial neural network. | Journal | 2021 |
| A20 | IFDTC4.5: intuitionistic fuzzy logic based decision tree for E-transactional fraud detection. | Journal | 2020 |
| A21 | Credit card fraud detection using hybrid models. | Journal | 2019 |
| A22 | Comparative analysis of different distribution dataset by using data mining techniques on credit card fraud detection. | Journal | 2020 |
| A23 | Improving detection of credit card fraudulent transaction using generative adversarial networks. | Journal | 2019 |
| A24 | Credit card fraud detection using pipeling and ensemble learning. | Journal | 2020 |
| A25 | Emerging approach for detection of financial fraud using machine learning. | Journal | 2021 |
| A26 | Detection of fraud transactions using recurrent neural network during COVID-19: fraud transaction during COVID-19. | Journal | 2020 |
| A27 | Enhancing the credit card fraud detection through ensemble techniques. | Journal | 2019 |
| A28 | Credit card fraud detection using data mining and statistical methods. | Journal | 2020 |
| A29 | Credit card fraud detection model based on LSTM recurrent neural networks. | Journal | 2021 |
| A30 | Credit card fraud detection using machine learning algorithms. | Journal | 2020 |
| A31 | Credit card fraud detection using autoencoders. | Journal | 2020 |
| A32 | Credit card fraud detection using naïve Bayes and robust scaling techniques. | Journal | 2021 |
| A33 | A closer look into the characteristics of fraudulent and transactions. | Journal | 2020 |
| A34 | Evaluation of deep neural networks for reduction of credit card fraud alerts. | Journal | 2020 |
| A35 | Deep convolution neural network model for credit-card fraud detection and alert. | Journal | 2021 |
| A36 | Graph neural network for fraud detection spatial-temporal attention. | Journal | 2020 |
| A37 | Deep learning-based hybrid approach of detecting fraudulent transactions. | Journal | 2021 |
| A38 | Combined technique of supervised classifier for the credit card fraud detection. | Journal | 2020 |
| A39 | Supervised machine learning algorithms for detection credit card fraud. | Journal | 2021 |
| A40 | Using harmony search algorithm in neural networks to improve fraud detection in banking system. | Journal | 2020 |
| A41 | Detecting electronic banking fraud on highly imbalanced data using hidden Markov models. | Journal | 2021 |
| A42 | Machine learning based on resampling approaches and deep reinforcement learning for credit card fraud detection systems. | Journal | 2021 |
| A43 | Credit card fraud detection system using data mining. | Journal | 2020 |
| A44 | A comparative study on credit card fraud detection. | Journal | 2021 |
| A45 | Supervised machine learning algorithms for credit card fraudulent transaction detection. | Journal | 2019 |
| A46 | Credit card fraud detection analysis using robust space invariant artificial neural networks (RSIANN). | Journal | 2019 |
| A47 | Credit card fraud detection system. | Journal | 2020 |
| A48 | Artificial intelligence based credit card fraud identification using fusion method. | Journal | 2019 |
| A49 | Credit card fraud detection using random forest. | Journal | 2019 |
| A50 | Performance evaluation of credit card fraud transaction using boosting algorithms. | Journal | 2019 |
| A51 | Fraud detection in credit card transaction using anomaly detection. | Journal | 2021 |
| A52 | Semi-supervised classification on credit card fraud detection using autoencoders. | Journal | 2021 |
| A53 | Artificial neural network technique for improving predication of credit card default: a stacked sparse autoencoder approach. | Journal | 2021 |
| A54 | Credit card fraud detection based on machine learning. | Journal | 2019 |
| A55 | Comparison of different ensemble methods in credit card default prediction. | Journal | 2021 |
| A56 | A novel method for detection of fraudulent bank transactions using multi-layer neural networks with adaptive learning rate. | Journal | 2020 |
| A57 | Using generative adversarial networks for improving classification effectives in credit card fraud detection. | Journal | 2019 |
| A58 | Ensemble of deep sequential models for credit card fraud detection. | Journal | 2021 |
| A59 | Detection of credit card fraudulent transaction using boosting algorithms. | Journal | 2021 |
| A60 | Predication credit card transaction fraud using machine learning algorithms. | Journal | 2019 |
| A61 | Financial fraud detection using naïve Bayes algorithm in highly imbalance data set. | Journal | 2021 |
| A62 | Anomaly detection in credit card transactions using machine learning. | Journal | 2020 |
| A63 | Uncertainty-aware credit card fraud detection using deep learning. | Journal | 2021 |
| A64 | Credit card fraud detection using ensemble classifier. | Journal | 2019 |
| A65 | An implementation of decision tree algorithm augmented with regression analysis for fraud detection in credit card. | Journal | 2020 |
| A66 | Credit card fraud detection technique using hybrid approach: an amalgamation of self-organizing maps and neural networks. | Journal | 2020 |
| A67 | Machine learning methods for discovering credit card fraud. | Journal | 2020 |
| A68 | Improved deep forest more for detection of fraudulent online transaction. | Journal | 2020 |
| A69 | Using variational auto encoding in credit card fraud detection. | Journal | 2020 |
| A70 | Credit card fraud detection using naïve Bayesian and c4.5 decision tree classifiers. | Journal | 2020 |
| A71 | Credit card fraud detection using fuzzy rough nearest neighbor and sequential minimal optimization with logistic regression. | Journal | 2021 |
| A72 | Fraud classification and detection model using different machine learning algorithm. | Journal | 2021 |
| A73 | An efficient domain-adaptation method using different machine learning GAN for fraud detection. | Journal | 2020 |
| A74 | Service-based credit card fraud detection using oracle SOA suite. | Journal | 2021 |
| A75 | Comparison and analysis of logistic regression, naïve Bayes and KNN machine learning algorithms for credit card fraud detection. | Journal | 2021 |
| A76 | Credit card fraud detection using isolation forest and local factor. | Journal | 2021 |
| A77 | Credit card fraud detection using random forest algorithm. | Journal | 2019 |
| A78 | A multiple classifiers system for anomaly detection in credit card data with unbalanced and overlapped classes. | Journal | 2020 |
| A79 | Supervised machine learning algorithms for credit card fraudulent transaction detection. | Journal | 2019 |
| A80 | Credit card fraud detection using machine learning. | Journal | 2019 |
| A81 | Champion-challenger analysis for credit card fraud detection: hybrid ensemble and deep learning. | Journal | 2019 |
| A82 | A novel framework for credit card fraud detection. | Journal | 2021 |
| A83 | Automatic machine learning algorithms for fraud detection in digital payment systems. | Journal | 2020 |
| A84 | A new hybrid method for credit card fraud detection on financial data. | Journal | 2019 |
| A85 | A study of fraud detection approaches in credit card transactions. | Journal | 2020 |
| A86 | Credit card fraud detection using Bayesian belief network. | Journal | 2020 |
| A87 | An efficient approach for credit card fraud detection. | Journal | 2020 |
| A88 | Comparative analysis for fraud detection using logistic regression, random forest and support vector machine. | Journal | 2020 |
| A89 | Fraud detection and prevention in banking financial transaction with machine learning using R. | Journal | 2020 |
| A90 | Comparative study on credit card fraud detection based on different support vector machines. | Journal | 2021 |
| A91 | Credit card fraud detection with autoencoder and probabilistic random forest. | Journal | 2021 |
| A92 | Towards automated feature engineering for credit card fraud detection using multi-perspective HMMs. | Journal | 2020 |
| A93 | An experimental study with imbalanced classification approaches for credit card fraud detection. | Journal | 2019 |
| A94 | Credit card fraud detection system using machine learning. | Journal | 2021 |
| A95 | Analysis of credit card fraud detection using machine learning models on balanced and imbalanced datasets. | Journal | 2021 |
| A96 | Credit card fraud detection using machine learning and data science. | Journal | 2019 |
| A97 | Novel machine learning approach for analysis anonymous credit card fraud patterns. | Journal | 2019 |
A98Credit card fraud detection using machine learning.Journal2021
A99Detection fraudulent credit card transactions using outlier detection.Journal2019
A100Credit card fraud detection in payment using machine learning classifiers.Journal2020
A101An autoencoder based model for detecting fraudulent credit card transaction.Journal2020
A102A comparative study on classification algorithms for credit card fraud detection.Journal2020
A103Credit card fraud detection using random forest algorithm.Journal2019
A104Credit card fraud detection using supervised learning approach.Journal2021
A105A SOMTE based oversampling data-point approach to solving the credit card data imbalance problem in financial fraud detection.Journal2021
A106Using machine learning to detect credit card fraudulent transactions.Journal2021
A107Credit card fraud detection using autoencoder neural network.Journal2019
A108Credit card fraud detection using ANN.Journal2019
A109An improved hybrid system for the prediction of debit and credit card fraud.Journal2019
A110Deep learning methods for credit card fraud detection.Journal2020
A111A comparison of data sampling techniques for credit card fraud detection.Journal2020
A112Credit card fraud detection using machine learning algorithms.Journal2020
A113A machine learning approach for detecting credit card fraudulent transaction.Journal2021
A114Credit card fraud detection using AdaBoost.Journal2020
A115A comparison study of credit card fraud detection: supervise unsupervised.Journal2019
A116Credit card fraud detection using random forest algorithm.Journal2019
A117A comparative study of machine learning classifiers for credit card fraud detection.Journal2020
A118Spectral-cluster solution for credit-card fraud detection using a genetic algorithm trained modular deep learning neural network.Journal2021
A119Comparative analysis of credit card fraud detection in simulated annealing trained artificial neural network and hierarchical temporal memory.Journal2021
A120Credit card fraud detection using isolation forest.Journal2021
A121Credit card fraud detection using machine learning algorithms.Journal2020
A122Credit card fraud detection framework a machine learning perspective.Journal2020
A123The improving prediction of credit card fraud detection on PSO optimized SVM.Journal2019
A124Credit card fraud detection using boosted stacking.Journal2019
A125Credit card fraud detection technique by applying graph database model.Journal2021
A126Online fraud detection using deep learning techniques.Journal2021
A127A hybrid method for credit card fraud detection using machine learning algorithm.Journal2021
A128Anomaly detection using unsupervised methods: credit card fraud case study.Journal2019
A129Discovering of credit card scheme with enhance and common by vote.Journal2021
A130Enhanced credit card fraud detection based on SVM-recursive feature elimination and hyper-parameters optimization.Journal2020
A131Bidirectional gated recurrent unit for improving classification on credit card fraud detection.Journal2021
A132Credit card fraud detection system using smote technique and whale optimization algorithm.Journal2019
A133Fraud detection in online transaction.Journal2020
A134Credit card fraud detection using machine learning.Journal2021
A135Machine learning approach on apache spark for credit card fraud detection.Journal2020
A136Credit card fraud detection using weighted support vector machine.Journal2020
A137Machine learning methods for analysis fraud credit card transaction.Journal2019
A138A review on credit card fraud detection using machine learning.Journal2019
A139Financial fraud detection using bio-inspired key optimization and machine learning technique.Journal2019
A140Semisupervised algorithms based credit card fraud detection using majority voting.Journal2021
A141Artificial intelligence framework for credit card fraud detection using supervised random forest.Journal2021
A142An intelligent payment card fraud detection system.Journal2021
A143HOBA: a novel feature engineering methodology for credit card fraud detection with a deep learning architecture.Journal2021
A144Dual autoencoders generative adversarial network for imbalanced classification problem.Journal2020
A145Performance analysis of isolation forest algorithm in fraud detection of credit card transactions.Journal2020
A146Credit card fraud detection from imbalanced dataset using machine learning algorithm.Journal2020
A147Credit card fraud forecasting model based on clustering analysis and integrated support vector machine.Journal2019
A148Credit card anomaly detection using improved deep autoencoder algorithm.Journal2020
A149Credit card fraud detection using deep learning techniques.Journal2021
A150Detecting credit card frauds using different machine learning algorithms.Journal2021
A151Isolation forest and local outlier factor for credit card fraud detection system.Journal2020
A152Analysis of machine learning credit card fraud detection models.Journal2021
A153Time varying inertia weight dragonfly algorithm with weighted feature-based support vector machine for credit card fraud detection.Journal2021
A154Predicting credit card fraud on a imbalanced data.Journal2019
A155Master card fraud detection using arbitrary forest.Journal2019
A156Credit card fraud detection using data analytic techniques.Journal2020
A157Optimized stacking ensemble (OSE) for credit card fraud detection using synthetic minority oversampling model.Journal2021
A158Aggrandized random forest to detect the credit card frauds.Journal2019
A159An efficient credit card fraud detection model based on machine learning methods.Journal2020
A160Modified focal loss in imbalanced XGBoost for credit card fraud detection.Journal2021
A161Credit card fraud detection using hidden Markov model.Journal2019
A162Credit card fraud detection using isolation forest.Journal2020
A163Comparing different models for credit card fraud detection.Journal2020
A164Credit card fraud detection under extreme imbalanced data: a comparative study of data-level algorithms.Journal2021
A165Credit card fraud detection: a comparison using random forest, SVM and ANN.Journal2019
A166Credit card fraud detection using machine learning methodology.Journal2019
A167Credit card fraud detection using machine learning.Journal2021
A168An intelligent approach to credit card fraud detection using an optimized light gradient boosting machine.Journal2020
A169Real time credit card fraud detection.Journal2021
A170Credit card fraud detection using federated learning techniques.Journal2020
A171A supervised learning algorithm for credit card fraud detection.Journal2021
A172A comparative study of credit card fraud detection using machine learning for United Kingdom dataset.Journal2019
A173Outlier detection credit card transactions using local outlier factor algorithm (LOF).Journal2019
A174Credit card fraud detection using machine learning approach.Journal2021
A175Real-time deep learning based credit card fraud detection.Journal2020
A176A perceptron based neural network data analysis architecture for the detection of fraud in credit card transactions in financial legacy system.Journal2021
A177Credit card fraud detection techniques.Journal2020
A178Adaptive model for credit card fraud detection.Journal2020
A179Credit card fraud detection by modelling behaviour pattern using hybrid ensemble model.Journal2021
A180Credit card fraud detection using PSO optimized neural network.Journal2020
A181Detection and prediction of credit card fraud transactions using machine learning.Journal2019

Data synthesis results

This section examines the 181 articles that were ultimately selected. The data are synthesised to answer each of our four research questions, beginning with RQ1: What types of ML/DL algorithms and datasets are used in credit card cyber fraud detection?

Cyber fraud detection techniques

This part addresses RQ1, which seeks to identify the ML/DL techniques used to detect credit card cyber fraud from 2019 to 2021.

Machine learning

ML has been identified as a technique relevant to a wide range of problems, especially in sectors requiring data analysis and processing. ML, which is commonly divided into supervised, unsupervised, and reinforcement learning, plays a crucial role in handling unbalanced datasets. ML techniques are highly effective for detecting and preventing fraud because they enable the automated recognition of patterns across vast amounts of data. Adopting the proper ML models makes it easier to differentiate between fraudulent and legitimate behaviour. These systems can adapt over time to new, unseen fraud schemes. For this to be possible, thousands of computations must be executed correctly in milliseconds. Both supervised and unsupervised techniques help detect cyber fraud and must be included in the next generation of fraud safeguards.

Supervised learning is the technique of training ML algorithms on labelled datasets, that is, data with known target variables. Classification, regression, and inference are all instances of supervised learning. Across all fields, supervised models trained on a large number of accurately labelled transactions are the most common ML technique. Each transaction is classified as either fraudulent or legitimate. The models are trained on voluminous labelled transaction data so that they can discover the patterns that best resemble genuine behaviour.

Unsupervised learning is the process of training an ML algorithm on a dataset whose target variables are unknown. The model attempts to discover the most significant patterns in the data. Unsupervised learning techniques include dimensionality reduction and clustering.

Semi-supervised learning combines supervised and unsupervised learning by training a model on both labelled and unlabelled data. In this method, the unsupervised learning component is used to determine the optimal data representation, while the supervised learning component is used to analyse the relationships within that representation and subsequently make predictions.

Many studies used supervised, unsupervised, and semi-supervised ML approaches. Table B1 displays the frequency of use of ML and DL techniques in the reviewed literature, indicating how often each technique type is used. Note that several articles used more than one ML/DL technique.

Learning type | Technique | Usage frequency
Supervised | Logistic regression (LR) | 52
Supervised | Naive Bayes (NB) | 42
Supervised | Decision tree (DT) | 49
Supervised | Random forest (RF) | 74
Supervised | K-nearest neighbor (KNN) | 39
Supervised | Support vector machine (SVM) | 56
Supervised | Bayesian belief networks | 2
Supervised | Genetic algorithm (GA) | 5
Supervised | Artificial immune systems (AIS) | 1
Supervised | Fuzzy logic | 1
Supervised | Logistic model tree (LMT) | 1
Unsupervised | Hidden Markov model (HMM) | 7
Unsupervised | K-means | 7
Unsupervised | Isolation forest | 19
Unsupervised | Self-organizing map (SOM) | 2
Unsupervised | Principal component analysis (PCA) | 3
Unsupervised | Density based spatial clustering of applications with noise (DBSCAN) | 1
Unsupervised | Local outlier factor (LOF) | 13
Unsupervised | One-class SVM | 3
Semi-supervised | Semi-supervised learning | 3
Reinforcement | Reinforcement | 1
Ensemble learning | AdaBoost | 20
Ensemble learning | RUSBoost | 2
Ensemble learning | XGBoost (XG) | 18
Ensemble learning | CatBoost (CB) | 3
Ensemble learning | Gradient boosting | 12
Ensemble learning | Light gradient boosting machine (LightGBM) | 4
Ensemble learning | Bagging | 5
Ensemble learning | Voting | 10
Ensemble learning | Pipelining | 1
Ensemble learning | Stacking | 4
Deep learning | CNN | 7
Deep learning | DNN | 4
Deep learning | DCNN | 4
Deep learning | Long short-term memory (LSTM)/BiLSTM | 8
Deep learning | Auto-encoder (AE) | 18
Deep learning | Dual autoencoders (DAE) | 4
Deep learning | Deep reinforcement learning (DRL) | 1
Deep learning | Generative adversarial networks (GANs) | 7
Deep learning | Recurrent neural network (RNN) | 7
Deep learning | Gated recurrent units (GRU) | 3
Deep learning | Gradient descent algorithms | 1
Deep learning | Variational autoencoder (VAE) | 1
Deep learning | Artificial neural network (ANN) | 36
Deep learning | Multilayer perceptron (MLP) | 14
Deep learning | Restricted Boltzmann machine (RBM) | 3
Deep learning | Deep belief network (DBN) | 1
Sampling technique | Synthetic minority oversampling technique (SMOTE) | 17
Sampling technique | Adaptive synthetic sampling (ADASYN) | 3
Sampling technique | Random oversampling (ROS) | 1
Sampling technique | Tomek | 2

Supervised techniques

Classification techniques

Supervised algorithms are the most common means of detecting credit card cyber fraud, and various supervised models are used in this field. The support vector machine (SVM) classifies data samples into two groups using a maximum-margin hyperplane. It classifies new data points using a labelled dataset for each category. SVM was used in 56 of the reviewed articles. An SVM kernel consists of mathematical functions that map the input data into a high-dimensional space. SVM can therefore classify both linear and, by means of a kernel function, nonlinear data.

Linear, radial, polynomial, and sigmoid are the four types of kernel functions used in Li et al. (2021), who apply SVM to detect credit card fraud. The SVM parameters are optimised using the cuckoo search algorithm (CS), the genetic algorithm (GA), and particle swarm optimisation (PSO). Experiments showed that the linear kernel function is the most effective. The kernel function is optimised using the radial basis function. In terms of overall performance, PSO-SVM outperforms CS-SVM and GA-SVM.
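The four kernel families named above can be written down in a few lines. The following is a minimal, dependency-free sketch rather than the implementation used in any reviewed article; the parameter names (`gamma`, `coef0`, `degree`) and their default values are illustrative assumptions.

```python
import math

def dot(x, z):
    """Inner product of two equal-length feature vectors."""
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    # Radial basis function: depends only on the squared distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def sigmoid_kernel(x, z, gamma=0.1, coef0=0.0):
    return math.tanh(gamma * dot(x, z) + coef0)

# Two hypothetical, already-scaled transaction feature vectors.
a = [1.0, 2.0]
b = [2.0, 0.0]
for name, kernel in [("linear", linear_kernel), ("polynomial", polynomial_kernel),
                     ("rbf", rbf_kernel), ("sigmoid", sigmoid_kernel)]:
    print(name, kernel(a, b))
```

Each function returns a similarity score between two samples; an SVM replaces the plain inner product with one of these scores, which is what allows it to separate data that is not linearly separable in the original feature space.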

Pavithra & Thangadurai (2019) suggested a hybrid architecture involving particle swarm optimisation (PSO). A feature selection algorithm based on SVM was used to improve the prediction of cyber fraud. The results showed that the PSO-SVM method is an optimal preprocessing tool for improving feature selection. In Zhang, Bhandari & Black (2020), a weighted SVM algorithm is used. Experiments revealed that this model significantly enhances performance. A weighted feature based SVM (WFSVM) with a time varying inertia weight based dragonfly algorithm (TVIWDA) was proposed in Arun & Venkatachalapathy (2021). Features optimised by TVIWDA are selected to increase the detection accuracy. Classification is then performed using the WFSVM classifier and the selected features. The results showed that the suggested model outperforms the existing random tree based technique. WFSVM is more efficient with smaller datasets.

The decision tree (DT) approach has attracted remarkable interest from researchers. The DT algorithm appeared in 49 articles. In Bandyopadhyay et al. (2021), the DT classifier was applied to the detection of financial fraud. The DT algorithm performed best, with an accuracy of 0.99, compared with the other classifiers. DT with a boosting technique was applied in Barahim et al. (2019). The results show that applying boosting with DT outperforms other methods; the model obtained the highest accuracy, 98.3%. In Choubey & Gautam (2020), a combination of supervised algorithms such as DT, RF, LR, naive Bayes (NB), and K-nearest neighbor (KNN) was used. The study observed that the hybrid classifier of DT with KNN worked better than any single classifier. Hammed & Soyemi (2020) describe the use of the DT algorithm enhanced with regression analysis. The result indicates enhanced performance. The approach had a misclassification error rate of 18.4%, and the system successfully validated all of the intrusions inserted for testing.

Among ML approaches, the C4.5 algorithm acts as a DT classifier. The decision is based on certain occurrences of data. Four articles used the C4.5 tree (Askari & Hussain, 2020; Beigi & Amin Naseri, 2020; Husejinovic, 2020; Mijwil & Salem, 2020). A new model applying C4.5 was presented in Mijwil & Salem (2020). The study revealed that C4.5 is the best classifier compared with other ML techniques. Credit card fraud detection using the C4.5 DT classifier with a bagging ensemble was applied in Husejinovic (2020). The study revealed that bagging with C4.5 DT is the best algorithm. The logistic model tree (LMT), a DT-based model, has also been used for classification. In Hussein, Abbas & Mahdi (2021), LMT was applied to fraud classification and detection. The result shows that applying the LMT algorithm to fraud classification is better than other techniques. The LMT model obtained 82.08% accuracy. An intuitionistic fuzzy logic based DT (IFDTC4.5) was applied in Askari & Hussain (2020) for transaction fraud detection. The results show that IFDTC4.5 outperforms other techniques and is able to detect fraud proficiently.

One of the most powerful techniques is RF, which is a modern variation of DT. According to the examined literature, RF is the most prevalent credit card fraud detection method (74 articles). Some reviewed articles used RF only for comparison with the methods they developed. In Amusan et al. (2021), RF was applied to fraud detection on skewed data. The results indicated that RF recorded the highest accuracy (95.19%) compared with KNN, LR, and DT. Furthermore, RF was applied alongside other techniques such as SVM, NB, and KNN in Ata & Hazim (2020). The results showed that the RF algorithm performs better than the other techniques. A hybrid model, or combination of supervised classifiers, appeared in Choubey & Gautam (2020). Several techniques such as RF, KNN, and LR were applied. The results show that RF with KNN worked better than either applied as a single classifier.

A new model applying RF was presented in Meenakshi et al. (2019). The study revealed that the RF algorithm performs better with more training data, but that testing and application speeds decrease. Jonnalagadda, Gupta & Sen (2019) applied RF in their study. The highest RF precision reported is 98.6%. The proposed module is suitable for a larger dataset and yields more precise results. With more training data, the RF algorithm will perform better. In Hema & Muttipati (2020), LR, RF, and CatBoost were applied to discover cyber fraud. The result shows that RF with CatBoost gives high accuracy. RF gives the best result, with an accuracy of 99.95. RF with SMOTE was applied in Ahirwar, Sharma & Bano (2020). The results obtained by the RF algorithm showed that this approach would be successful in real time. The model is intended to provide some insight into the identification of fraud.

The Bayesian technique is an additional classification method. We explored 42 articles that used NB, and two articles that used Bayesian belief networks (BBN). Detection of credit card fraud via NB and robust scaling approaches is described in Borse, Patil & Dhotre (2021). The results indicate that the NB classifier with the robust scaler is the most effective at predicting fraudulent activity in the dataset. NB with robust scaling achieved an accuracy of 97.78%. In Divakar & Chitharanjan (2019), the NB classifier and other classifiers were applied. NB did not obtain the best result when compared with the other classifiers. In Gupta, Lohani & Manchanda (2021), among ML algorithms such as LR, RF, and SVM, the performance of the NB algorithm is remarkable. BBN was applied in Kumar, Mubarak & Dhanush (2020) to detect credit card fraud. The result showed that BBN is more accurate than the NB classifier. This is attributed to the Bayesian network's use of conditional dependence between the attributes, although it requires more computation and a longer training process. The model is trained on the transaction data in the dataset, labelled as fraudulent or genuine, and then predicts the class of each individual transaction in the test data.

The K-nearest neighbors (KNN) algorithm was applied in 39 articles. Various studies used the KNN technique to detect credit card fraud. KNN uses neighbouring samples to identify the class label. The KNN technique is best for overlapping sample sets (Yao et al., 2019). In this review, several articles applied KNN as a classifier. Chowdari (2021) reported that KNN is a stronger classifier for detecting credit card fraud than other techniques such as DT, LR, and RF. In DeepaShree et al. (2019) and Kumar, Student & Budihul (2020), the KNN classifier was applied to the detection of fraudulent credit card transactions; KNN showed higher accuracy than both the RF algorithm and NB. In Parmar, Patel & Savsani (2020) and Vengatesan et al. (2020), the KNN technique was compared with many other techniques such as SVM, LR, DT, RF, and XGBoost. The KNN model was the most precise model, with an accuracy score of 99.95%. A new ML approach to detecting anonymous fraud patterns appeared in Manlangit, Azam & Shanmugam (2019), which proposed the synthetic minority oversampling technique (SMOTE) with KNN. The results reveal that the proposed model performed well. The KNN model achieved a precision of 98.32% and 97.44%.
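The way KNN uses neighbouring samples to assign a class label can be illustrated with a small sketch. It assumes Euclidean distance and a plain majority vote among the k closest training samples; the two-dimensional feature vectors and their labels are hypothetical toy data, not taken from any reviewed study.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical scaled features, e.g. (relative amount, relative time of day).
train_X = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25), (0.90, 0.80), (0.85, 0.95)]
train_y = ["legit", "legit", "legit", "fraud", "fraud"]

print(knn_predict(train_X, train_y, (0.80, 0.90)))
print(knn_predict(train_X, train_y, (0.10, 0.10)))
```

Because the prediction depends directly on distances, the features would normally need to be brought onto comparable scales first, and the choice of k trades off sensitivity to noise against blurring of the class boundary.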

Regression techniques

The studies in this review frequently used the logistic regression (LR) technique. A total of 52 studies employed LR for cyber fraud detection. LR models can be used for both multiclass and binary classification. LR is a statistical method that models a binary dependent variable using a logistic function. In Adityasundar et al. (2020), LR was applied to highly imbalanced data. Using unbalanced data, the study developed a classification model that is highly robust. A new system that uses LR to build the classifier was proposed in Alenzi & Aljehane (2020), and the proposed LR-based classifier was compared against the KNN and voting classifiers. The result demonstrates that the LR-based classifier produces the most accurate findings, with a 97.2% success rate. Itoo & Singh (2021) presented a comparison between LR, NB, and KNN for fraud detection. The results show that LR achieved optimal performance and greater accuracy than KNN and NB: LR attained an accuracy of 95%, while NB achieved 91% and KNN achieved 75% (Itoo & Singh, 2021). In Karthik et al. (2019), a newly proposed approach showed that a stacking classifier that applies LR as a meta classifier is the most promising method, followed by SVM, KNN, and LR. A study by Soh & Yusuf (2019) suggested four models to detect fraud on imbalanced data. The result shows that RF and KNN overfit, so only DT and LR were compared. LR with stepwise splitting rules outperformed DT, with an error rate of only 0.6%. Sujatha (2019) used single and hybrid models of undersampling and oversampling. The study revealed that LR is the best among all the algorithms. The result shows that the proposed LR and NN approaches outperform DT.
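To make the role of the logistic function concrete, the sketch below fits a one-feature LR model by plain gradient descent on the mean log-loss. The data, learning rate, and step count are illustrative assumptions; the reviewed studies would typically have used a library implementation with many features.

```python
import math

def sigmoid(z):
    """Logistic function mapping any real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Fit P(y = 1 | x) = sigmoid(w * x + b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical single feature (e.g. a scaled risk score); 1 marks fraud.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 1]
w, b = fit_logistic(xs, ys)

def predict_proba(x):
    return sigmoid(w * x + b)

print(predict_proba(0.0), predict_proba(3.0))
```

A transaction is then flagged when the predicted probability exceeds a chosen threshold (0.5 is a common default). On heavily imbalanced fraud data that threshold is usually tuned rather than left at its default.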

Ensemble techniques

The random forest model is an ensemble approach that appeared in the examined literature. RF often achieves superior performance to a single DT by building a collection of DTs during training. Research conducted in 2021 revealed that RF outperforms K-means and SVM (Al Rubaie, 2021).

Another ensemble method is bagging, which is a collection of different estimators created using a particular learning process in order to improve on a single estimator. Bagging reduces the variance of the DT classifier. The approach creates random subsets from the training sample. Five of the reviewed articles applied bagging methods (Alias, Ibrahim & Zin, 2019; Husejinovic, 2020; Lin & Jiang, 2021; Mijwil & Salem, 2020; Karthik, Mishra & Reddy, 2022). Husejinovic (2020) applied C4.5 DT, NB, and a bagging ensemble to predict fraud. The result shows that the best algorithm is bagging with C4.5 DT.
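The bagging procedure described above (random subsets of the training sample, one estimator per subset, and a combined prediction) can be sketched as follows. To keep the example short, the base estimator is a one-feature decision stump rather than a full C4.5 tree, the subsets are bootstrap samples drawn with replacement, and the estimators are combined by majority vote; all of these choices, along with the toy data, are assumptions made for illustration.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Return (threshold, label_below, label_above) with the fewest training errors."""
    best = None
    for threshold in sorted(set(xs)):
        for below, above in ((0, 1), (1, 0)):
            errors = sum(
                (below if x <= threshold else above) != y for x, y in zip(xs, ys)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, below, above)
    return best[1:]

def predict_stump(stump, x):
    threshold, below, above = stump
    return below if x <= threshold else above

def fit_bagging(xs, ys, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps

def predict_bagging(stumps, x):
    """Combine the individual stump predictions by majority vote."""
    votes = Counter(predict_stump(s, x) for s in stumps)
    return votes.most_common(1)[0][0]

# Hypothetical single feature; 1 marks fraud.
xs = [1, 2, 3, 4, 5, 20, 21, 22, 23, 24]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
stumps = fit_bagging(xs, ys)
print(predict_bagging(stumps, 0.5), predict_bagging(stumps, 30))
```

Each stump sees a slightly different sample and therefore picks a slightly different threshold. Averaging over these differences is the source of the variance reduction mentioned above.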

Boosting includes the adaptive boosting algorithm (AdaBoost), RUSBoost, the gradient boosting algorithm (GBM), LightGBM, and the XGBoost algorithm. A total of 59 of the reviewed articles used boosting techniques. AdaBoost was employed by Barahim et al. (2019). In this study, DT, NB, and SVM were used with AdaBoost. The results show that AdaBoost with DT outperforms the other techniques. A comparison of different ensemble methods for predicting credit card fraud was carried out by Faraj, Mahmud & Rashid (2021). The experiment shows that XGBoost performs better than the other ensemble methods and also better than neural networks.

Stacking is a method of ensemble learning that combines multiple classification or regression systems. In stacking, a single model is used to combine the predictions from the contributing models, whereas in boosting a sequence of models is used to improve on the predictions of earlier models. In contrast to bagging, stacking uses the complete dataset rather than portions of the training dataset. Four articles used stacking to learn a classifier for detecting credit card fraud (Karthik et al., 2019; Muaz, Jayabalan & Thiruchelvam, 2020; Prabhakara et al., 2019; Veigas, Regulagadda & Kokatnoor, 2021). The stacked ensemble approach has demonstrated potential for detecting fraudulent transactions. The stacked ensemble had the best performance, at 0.78, after being trained on sampled datasets (Muaz, Jayabalan & Thiruchelvam, 2020).

Unsupervised techniques

Clustering is the process of grouping similar instances together. Clustering methods were used far less than classification methods in the reviewed articles. The hidden Markov model (HMM) is used to model a probability distribution over sequences of observations. It consists of hidden states and observable outputs. HMM was applied in seven articles. In Das et al. (2020), an HMM model was applied to detect cyber fraud. The results show strong performance of the proposed system and also demonstrate the advantage of learning the cardholder's spending behaviour. Singh et al. (2019) suggested a method that identifies the cardholder's spending profile and then derives the observation symbols, which help to form an initial estimate of the model parameters. HMM can thus detect whether a transaction is genuine or fraudulent. SMOTE was used along with HMM and density based spatial clustering of applications with noise (DBSCAN). This new model (SMOTE+DBSCAN+HMM) performed relatively better for all the various hidden states.
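One way to see how an HMM of spending behaviour can separate genuine from suspicious activity is to compute how likely a sequence of observation symbols is under the cardholder's model. The sketch below uses the standard forward algorithm. The two hidden states, the three spending symbols (0 = low, 1 = medium, 2 = high), and every probability are hypothetical values chosen for illustration, not parameters from the reviewed studies.

```python
def sequence_likelihood(obs, start, trans, emit):
    """Forward algorithm: probability of an observation sequence under an HMM."""
    n_states = len(start)
    # alpha[s] = P(observations so far, current hidden state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)

# Hypothetical cardholder model with hidden states 0 = routine, 1 = splurge.
start = [0.9, 0.1]
trans = [[0.9, 0.1],
         [0.5, 0.5]]
emit = [[0.70, 0.25, 0.05],   # routine: mostly low spending
        [0.10, 0.30, 0.60]]   # splurge: mostly high spending

typical = sequence_likelihood([0, 0, 1, 0], start, trans, emit)
unusual = sequence_likelihood([2, 2, 2, 2], start, trans, emit)
print(typical, unusual)
```

Under this model the run of high-value purchases is far less likely than the typical sequence. A detector built on this idea would flag a new transaction when appending it causes the sequence likelihood to drop by more than a chosen threshold.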

K-means was applied in seven articles. The K-means algorithm is a simple, non-hierarchical method for data clustering that partitions a given dataset into a specified number of clusters, K. In Abdulsalami et al. (2019), K-means was applied alongside a back-propagation neural network (BPNN). The result shows that there is a significant difference between BPNN and K-means in detecting fraudulent credit card transactions. The BPNN model achieved high accuracy with fewer false alarms than the K-means model: the accuracy of BPNN is 93.1%, while the accuracy of K-means is 79.9%.
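The K-means procedure alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The following sketch assumes Euclidean distance, a fixed number of iterations, and explicitly supplied starting centroids so that the result is deterministic; the points are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Partition points into len(centroids) clusters by Lloyd's iterations."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical two-feature transactions forming two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

K-means only groups the data. Using it for fraud detection requires a further rule, for example treating very small clusters or points far from every centroid as suspicious.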

Isolation forest is an unsupervised ensemble method. It performs no point-based distance calculation and no profiling of regular instances. Instead, isolation forest builds an ensemble of DTs. The idea is to split the data so as to isolate anomalies. An ensemble of DTs is generated for a particular dataset, and the data points with the shortest average path length are considered anomalous. Isolation forest was applied in 19 articles. In Meenu et al. (2020), a new isolation forest model is used to detect fraud. The model's fraud detection efficiency was observed to be 98.72%, which indicates a significantly better approach than other fraud detection techniques. Isolation forest with local outlier factor was applied to fraud detection in Vijayakumar et al. (2020). Isolation forest showed an accuracy of 99.72%, while local outlier factor showed an accuracy of 99.62%. Isolation forest performed better on online transactions. A study by Palekar et al. (2020) reported that K-means clustering, isolation forest, and local outlier factor can be developed on a very large scale to detect fraud in credit card transactions.
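The path-length idea can be demonstrated on a single feature. The sketch below is a simplified illustration, not a full isolation forest: it follows one point through freshly drawn random splits instead of building reusable trees, and it omits the subsampling and score normalisation of the published algorithm. The transaction amounts are hypothetical.

```python
import random

def path_length(x, data, rng, depth=0, max_depth=10):
    """Count the random splits needed to isolate x from the rest of data."""
    if len(data) <= 1 or depth >= max_depth:
        return depth
    lo, hi = min(data), max(data)
    if lo == hi:
        return depth
    split = rng.uniform(lo, hi)
    # Keep only the side of the split that contains x.
    side = [v for v in data if (v < split) == (x < split)]
    return path_length(x, side, rng, depth + 1, max_depth)

def average_path_length(x, data, n_trees=100, seed=0):
    """Average the isolation depth of x over many random split sequences."""
    rng = random.Random(seed)
    return sum(path_length(x, data, rng) for _ in range(n_trees)) / n_trees

# Hypothetical transaction amounts with one extreme value.
amounts = [10.0, 11.0, 12.0, 9.5, 10.5, 11.5, 10.2, 9.8, 500.0]
print(average_path_length(500.0, amounts))
print(average_path_length(10.5, amounts))
```

The extreme amount is almost always separated by the very first split, whereas a typical amount needs several splits. This is why a short average path length is read as a sign of anomaly.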

The self-organising map (SOM) is an unsupervised neural network (NN) learning technique. SOM is appropriate for building and analysing customer profiles in order to detect fraud. SOM was applied in two of the reviewed articles. A hybrid approach combining SOM and NN was applied in Harwani et al. (2020). Compared with using SOM or ANN alone, the suggested model achieved better accuracy and cost. In Deb, Ghosal & Bose (2021), three unsupervised algorithms, K-means, K-means clustering using principal component analysis (PCA), T-distributed stochastic neighbor embedding (T-SNE), and SOM, are presented. The model achieved an accuracy of 90% for credit card fraud detection. The results also show that K-means clustering with PCA is much better than simple K-means, and that T-SNE is much better than PCA because PCA is highly affected by outliers.

Semi-supervised techniques

Semi-supervised learning is a hybrid technique that combines supervised and unsupervised learning. The unsupervised learning component is used to determine the optimal representation of the data, whereas the supervised learning component is used to investigate the relationships within that representation before making predictions. Semi-supervised learning is extremely useful when the dataset is unbalanced. Three studies in this review employed semi-supervised techniques to detect credit card fraud (Dzakiyullah, Pramuntadi & Fauziyyah, 2021; Pratap & Vijayaraghavulu, 2021; Shekar & Ramakrisha, 2021). In Dzakiyullah, Pramuntadi & Fauziyyah (2021), a combination of semi-supervised learning and autoencoders is presented for detecting fraudulent transactions. The proposed model uses an autoencoder and then trains a basic linear classifier to assign the data to their classes. T-SNE is also applied to visualise the nature of fraudulent and non-fraudulent transactions. The results are useful because credit card fraud can be classified easily, with a reported score of 0.98%.

Pratap & Vijayaraghavulu (2021) applied semi-supervised algorithms with majority voting, using 12 ML algorithms. First, the standard models are used; second, AdaBoost and majority voting are added. The results indicate that the majority voting technique achieves high accuracy.
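Majority voting itself is simple to state in code. The sketch below is only an illustration of the general idea, not the configuration used in the cited study; the three base classifiers and their predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most of the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from three base classifiers (1 = fraud, 0 = legitimate),
# one inner list per transaction.
votes_per_transaction = [
    [1, 0, 1],
    [0, 0, 1],
    [0, 0, 0],
]
ensemble = [majority_vote(votes) for votes in votes_per_transaction]
print(ensemble)  # [1, 0, 0]
```

Using an odd number of base classifiers avoids ties in the binary case.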

Deep learning

Deep learning (DL) is a subfield of ML that uses data to teach computers how to perform tasks. The fundamental tenet of DL is that as NNs are expanded and trained with new data, their performance continues to improve. The main advantage of DL over traditional ML is its higher performance on large datasets. The most frequently used DL algorithms in cybersecurity are feed-forward neural networks (FNNs), stacked autoencoders (SAE), and convolutional neural networks (CNNs). As shown in Fig. 3, DL techniques have been used in 34 reviewed articles. A total of 39 reviewed articles used a combination of DL and ML techniques to detect credit card fraud.


An artificial neural network (ANN) employs cognitive computing to aid in the development of machines capable of employing self-learning algorithms, including pattern recognition, natural language processing, and data mining. ANN is reported to give more accurate results because it learns from the patterns of authorised behaviour and thus distinguishes between ‘fraud’ and ‘non-fraud’ in credit card transactions. We explored 36 articles that used ANN in our review. In Agarwal (2021), an ANN is implemented for identity theft detection. The proposed model uses the different layers of an NN to identify fraudulent transactions. The results show that the ANN gives an accuracy of nearly 100%, and the author concludes that ANN is best suited for determining whether a transaction is fraudulent. In a more recent study, an ANN was applied to detect fraud and then compared with ML algorithms such as SVM and KNN. The results show that the ANN is more accurate than the other ML algorithms, and the authors regard the suggested model as optimal for detecting credit card fraud (Asha & Suresh Kumar, 2021).

In Abdulsalami et al. (2019), a back-propagation neural network (BPNN) and K-means are applied. The results indicate that BPNN is more accurate than the K-means algorithm: according to the summary table of reviewed articles, BPNN obtained an accuracy of 93.1%, against 79.9% for K-means. The results also indicate that K-means had a shorter prediction time, which gave it an advantage over BPNN. In Daliri (2020), a harmony search algorithm combined with ANN (NNHS) is applied to improve fraud detection in the banking system. The results show acceptable capability in detecting fraud based on customer information. In Oumar & Augustin (2019), ANN with LR is applied for fraud detection. Back-propagation decreased the error function and enabled the model to discriminate between fraudulent and legitimate transactions. The suggested model is reported to be 99.48% accurate in its predictions and highly reliable.

Multilayer perceptron (MLP) is among the most widely used approaches in ML because of its excellent accuracy in approximating nonlinear functions. An MLP comprises three distinct types of layer (input, hidden, and output). We explored 14 articles that used MLP in our review. In Alias, Ibrahim & Zin (2019), MLP and fifteen other supervised ML techniques are examined to determine which has the highest accuracy for detecting fraudulent transactions. The results show that MLP achieved the highest detection accuracy, 98%, compared with the 15 other algorithms. Can et al. (2020) applied MLP and other ML techniques such as DT, RF, and NB. With amount-based profiling, both MLP and RF demonstrated substantial improvements. In Faridpour & Moradi (2020), a novel ML-based model for detecting fraud in banking transactions using customer profile data is presented. In the proposed model, bank transactional data is used to train an MLP with an adjustable learning rate to assess transaction authenticity, thus improving the detection process. The suggested model surpasses SVM and LR, with an accuracy of 0.9990.

A convolutional neural network (CNN, or ConvNet) is composed of multiple layers, the outputs of which are used as inputs to the layers that follow. Its purpose is to reduce the input to a form that is easier to process without sacrificing information that is crucial for making accurate predictions. CNN was used in seven articles in the review. In Agarwal et al. (2021), DL techniques such as CNN and BiLSTM with an attention layer are used to detect and classify illegitimate transactions. The CNN-BiLSTM-attention model detects the fraudulent class with high accuracy, reported as 95%. The results demonstrate that adding an attention layer increases the performance of the model, allowing it to discriminate accurately between fraudulent and legitimate transactions. A hybrid of CNN, NB, DT, and RF is deployed in Aswathy & Samuel (2019): the algorithms are first used as single models and then as hybrid models through a majority voting technique. An adaptive boosting algorithm was used to boost the performance of the classifiers.

Deep neural networks (DNNs), which provide potent tools for automatically producing high-level abstractions of complicated multimodal data, have recently garnered a great deal of interest from industry and academia. DNNs learn features on their own, resulting in an increasingly accurate learning process, and have been reported to be more efficient and accurate. Four studies employed DNN. In Arya & Sastry (2020), the proposed model is flexible to data disparity and resistant to hidden transaction patterns. Adaptive optimisation is recommended to improve fraud prediction. The results demonstrate its superiority over other current methods.

Credit card fraud detection using uncertainty-aware DL was implemented in Habibpour et al. (2021). The authors argue that it is vital to evaluate the uncertainty of DNN predictions. The study considers three uncertainty quantification (UQ) techniques, namely ensemble, Monte Carlo dropout, and ensemble Monte Carlo dropout, which can be used to quantify the uncertainty associated with predictions and to produce a reliable classification. According to the findings, the ensemble method is superior at capturing the uncertainty related to predictions.
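One common way to turn several stochastic predictions into an uncertainty score is the entropy of their averaged probability. The sketch below illustrates that idea only; it is not the procedure of the cited study, and the probability lists are hypothetical. Each list could come from separately trained networks (an ensemble) or from repeated forward passes with dropout left active (Monte Carlo dropout).

```python
import math

def predictive_entropy(member_probs):
    """Entropy (in bits) of the averaged fraud probability across members."""
    p = sum(member_probs) / len(member_probs)
    if p in (0.0, 1.0):
        return 0.0  # a fully certain prediction carries no entropy
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

# Hypothetical fraud probabilities for two transactions.
agreeing_members = [0.97, 0.95, 0.98, 0.96]      # members agree: low uncertainty
disagreeing_members = [0.10, 0.90, 0.35, 0.70]   # members disagree: high uncertainty

print(round(predictive_entropy(agreeing_members), 3))
print(round(predictive_entropy(disagreeing_members), 3))
```

A transaction whose score is close to 1 bit could, for instance, be routed to a human analyst rather than being classified automatically; that routing rule is a design suggestion, not something stated in the source.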

Deep convolutional neural network (DCNN) was applied in four articles. The DCNN technique can improve detection accuracy when a huge volume of data is involved. In Chen & Lai (2021), a DCNN is evaluated alongside existing ML models, including LR, SVM, and RF, as well as auto-encoder and other DL models. The results show that a detection accuracy of 99% was attained over a 45-s duration. Despite the vast quantity of data, the model provides enhanced detection. The DL technique provides high accuracy and rapid detection of complex and unknown patterns. 1DCNN, 2DCNN, and DCNN have also been used to detect credit card cyber fraud in Cheng et al. (2020), Deepika & Senthil (2019), and Nguyen et al. (2020).

A recurrent neural network (RNN) is a structure used to remember previous input sequences. It comprises links between the internal nodes of a directed graph and draws on its internal memory to process input sequences. RNN was applied in seven articles in this review (Bandyopadhyay & Dutta, 2020; Chen & Lai, 2021; Forough & Momtazi, 2021; Hussein et al., 2021; Osegi & Jumbo, 2021; Sadgali, Sael & Benabbou, 2021; Zhang et al., 2021). Bandyopadhyay & Dutta (2020) implemented and applied an RNN on a synthetic dataset. The suggested model detects fraudulent transactions with 99.87% accuracy, and the outcomes indicate that the approach is relevant and appropriate for detecting fraud. In Forough & Momtazi (2021), a deep RNN-based ensemble model with an ANN-based voting approach is proposed. The ensemble model uses a variety of RNNs as base classifiers and combines their outputs using a feed-forward neural network (FFNN) as the voting method. Classification employs a number of GRU or LSTM networks. The outcomes indicate that the suggested model outperforms competing models in this field. A bidirectional gated recurrent unit (BGRU) is applied in Sadgali, Sael & Benabbou (2021), in a model that also uses GRU, LSTM, BRU, and SMOTE. BGRU obtained a high accuracy of 97.16%.

Long short-term memory (LSTM) is a helpful technique for predicting fraud because of the historical knowledge it retains and the link it captures between prediction outputs and historical inputs. The LSTM architecture enables sequence prediction problems to be learned through long-term dependencies. LSTM and BiLSTM were applied in eight articles (Agarwal et al., 2021; Alghofaili, Albattah & Rassam, 2020; Benchaji, Douzi & El Ouahidi, 2021; Cheon et al., 2021; Forough & Momtazi, 2021; Nguyen et al., 2020; Osegi & Jumbo, 2021; Sadgali, Sael & Benabbou, 2021). In Alghofaili, Albattah & Rassam (2020), a new model is developed to improve on present detection techniques and detection accuracy in light of big data. The findings show that LSTM performed very well, achieving 99.95% accuracy. Benchaji, Douzi & El Ouahidi (2021) proposed a model that captures the previous purchasing behaviour of card holders. The results show that the LSTM model obtained a high level of performance and accuracy.
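How an LSTM carries history forward can be seen in a single cell update. The following is a minimal single-unit sketch in plain Python, assuming the standard forget/input/output gate formulation; the weights and the sequence of transaction amounts are hypothetical, and none of the reviewed models is reproduced here.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One step of a single-unit LSTM cell; w maps gate name -> (w_x, w_h, bias)."""
    def gate(name, squash):
        w_x, w_h, b = w[name]
        return squash(w_x * x + w_h * h_prev + b)

    f = gate("forget", sigmoid)       # how much of the old cell state to keep
    i = gate("input", sigmoid)        # how much new information to write
    o = gate("output", sigmoid)       # how much of the cell state to expose
    g = gate("candidate", math.tanh)  # candidate values to write
    c = f * c_prev + i * g            # updated long-term memory
    h = o * math.tanh(c)              # updated hidden state (the cell's output)
    return h, c

# Hypothetical weights and a short sequence of scaled transaction amounts.
weights = {"forget": (0.5, 0.1, 1.0), "input": (0.6, 0.2, 0.0),
           "output": (0.4, 0.3, 0.0), "candidate": (0.9, 0.5, 0.0)}
h, c = 0.0, 0.0
for amount in [0.2, 0.1, 0.3, 2.5]:
    h, c = lstm_step(amount, h, c, weights)
print(h, c)
```

The cell state `c` is what lets earlier transactions influence the output for later ones, which is the property the reviewed studies exploit when modelling cardholder behaviour over time.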

A DL-based hybrid approach to detecting fraudulent transactions is applied in Cheon et al. (2021). The new model combines a Bi-LSTM autoencoder with isolation forest. The model achieved a detection rate of 87% for fraudulent transactions, the highest among the models compared, and has the potential to be employed as an effective method for detecting fraud.

Deep belief network (DBN) was applied in one article (Zhang et al., 2021). The new model uses DBN and advanced feature engineering based on homogeneity-oriented behaviour analysis (HOBA). The results indicate that the suggested model is effective and capable of identifying fraud. The DBN classifier with HOBA achieves performance superior to that of the standard models.

A restricted Boltzmann machine (RBM) comprises visible and hidden layers linked by symmetric weights. The neurones in the visible layer correspond to the inputs X, whilst the responses of the neurones H in the hidden layer reflect the probability distribution of the inputs. RBM appeared in three articles in the review (Niu, Wang & Yang, 2019; Suthan, 2021; Suvarna & Kowshalya, 2020). In Niu, Wang & Yang (2019), supervised and unsupervised techniques are applied. XGB and RF, as supervised techniques, obtain the best performance, with an AUROC of 0.961. RBM provides the best performance among the unsupervised techniques. The results indicate that the supervised models outperform the unsupervised models. Nevertheless, because of the problems of inadequate annotation and data imbalance, unsupervised techniques remain promising for credit card fraud detection.
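For reference, the standard binary restricted Boltzmann machine can be written as follows. The notation is an assumption introduced here and does not come from the reviewed articles: v denotes the visible units (the inputs), h the hidden units, W the symmetric weights, a and b the visible and hidden biases, and σ the logistic sigmoid.

```latex
\begin{align}
E(v, h) &= -\sum_i a_i v_i \;-\; \sum_j b_j h_j \;-\; \sum_{i,j} v_i \, w_{ij} \, h_j \\
p(h_j = 1 \mid v) &= \sigma\Big(b_j + \sum_i w_{ij} v_i\Big) \\
p(v_i = 1 \mid h) &= \sigma\Big(a_i + \sum_j w_{ij} h_j\Big)
\end{align}
```

Because the same weight matrix appears in both conditionals, the link between the two layers is symmetric, and the hidden activations summarise the distribution of the inputs.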

A generative adversarial network (GAN) comprises two feed-forward neural networks, a Generator (G) and a Discriminator (D), competing with each other. G produces new candidates while D evaluates their quality. Each of the two networks is typically a DNN with multiple interconnected layers. GAN appeared in seven articles (Ba, 2019; Fiore et al., 2019; Tingfei, Guangquan & Kuihua, 2020; Hwang & Kim, 2020; Niu, Wang & Yang, 2019; Wu, Cui & Welsch, 2020; Veigas, Regulagadda & Kokatnoor, 2021). In Ba (2019), GANs are employed as an oversampling technique. The findings indicate that Wasserstein-GAN is reliable during training and creates accurate fraudulent transactions compared with other GANs. In Fiore et al. (2019), a GAN is employed to enhance the effectiveness of classification and to address the problem of class imbalance. The GAN is trained to generate minority class instances, which are then combined with the training data to create an augmented training set. The results indicate that a classifier trained on the augmented data outperforms the same classifier trained on the original data.
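The competition between the two networks is usually expressed as a minimax objective. The formulation below is the standard one for the original GAN and is given only as background; the symbols (the data distribution p_data, the noise distribution p_z, and the value function V) are notation assumed here, and Wasserstein-GAN replaces this loss with a different distance.

```latex
\begin{equation}
\min_{G} \max_{D} \; V(D, G) =
\mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
+ \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
\end{equation}
```

In the oversampling setting described above, x would be drawn from the real fraudulent transactions, so that G(z) yields synthetic minority-class records.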

An autoencoder (AE) learns the input-output mapping between its encoding and decoding phases. The encoder maps the input to the hidden layer, and the decoder rebuilds the input from the hidden layer at the output layer. AE appeared in 18 articles in this review. In Misra et al. (2020), an autoencoder model for cyber fraud detection is applied. It is a two-stage model: in the first stage, an autoencoder converts the transaction characteristics into a lower-dimensional feature vector; in the second stage, these feature vectors are fed to a classifier. The results show that the suggested model outperforms other models.
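In symbols, a basic single-hidden-layer autoencoder can be sketched as follows. The notation (weights W and W', biases b and b', activations f and g) is assumed for illustration and is not taken from the reviewed articles.

```latex
\begin{align}
z &= f(W x + b) && \text{(encoder: input } x \text{ to hidden code } z\text{)} \\
\hat{x} &= g(W' z + b') && \text{(decoder: code } z \text{ to reconstruction } \hat{x}\text{)} \\
\mathcal{L}(x, \hat{x}) &= \lVert x - \hat{x} \rVert^{2} && \text{(reconstruction error minimised in training)}
\end{align}
```

In a two-stage design of the kind described above, the code z plays the role of the lower-dimensional feature vector passed to the classifier.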

In Wu, Cui & Welsch (2020), dual autoencoders generative adversarial networks (DAEGAN) are employed for the imbalanced classification problem. The new model trains a GAN to duplicate fraudulent transactions for autoencoder training. Two autoencoders then encode the samples to create two sets of features. The new model outperforms several classification algorithms. Because of their extremely skewed class distributions, credit card datasets present unbalanced classification problems. To address this difficulty, Tingfei, Guangquan & Kuihua (2020) propose a new model that employs an oversampling technique based on a variational autoencoder (VAE) in combination with DL techniques. The results demonstrate that the VAE model outperforms synthetic minority oversampling strategies and conventional DNN methods. In addition, it performs better than previous oversampling techniques based on GAN models.
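For comparison with the generative approaches, the synthetic minority oversampling baseline mentioned above can be sketched in a few lines. This is a simplified SMOTE-style interpolation, not the VAE or GAN methods of the cited studies: it uses only the single nearest minority neighbour (full SMOTE samples among k neighbours), and the data points and the function name `smote_like` are hypothetical.

```python
import random

def smote_like(minority, n_new, rng):
    """Create n_new synthetic points by interpolating between a minority
    sample and its nearest minority neighbour (requires >= 2 samples)."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # Nearest neighbour by squared Euclidean distance.
        neighbour = min(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )
        t = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

rng = random.Random(1)
# Hypothetical minority-class (fraud) samples with two features each.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.6), (1.2, 2.4)]
new_points = smote_like(minority, 6, rng)
print(new_points)
```

Every synthetic point lies on a line segment between two real minority samples, which is why such methods cannot create fraud patterns outside the region already observed; generative models are one attempt to relax that limitation.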

Metaheuristic techniques

In Makolo & Adeboye (2021), a new hybrid model is created by applying a genetic algorithm and a multivariate normal distribution to an unbalanced dataset. After training on the same dataset, its prediction accuracy is compared with that of DT, ANN, and SVM. The model yielded a remarkable F-score of 93.5%, whereas ANN scored 68.5%, DT 80.0%, and SVM 84.2%. An enhanced hybrid system for credit card fraud prediction is presented in Nwogu & Nwachukwu (2019), which employs a genetic algorithm with RF model optimisation (GAORF), utilising real and genetic algorithms. The classification accuracy of this model is enhanced through the optimisation of the RF models. This can help to resolve the problem of a shortage of transaction data, as well as the problem of inadequate optimisation and convergence of RF algorithms. The model significantly reduced the overall number of misclassifications.

The use of the harmony search algorithm (HSA) with NN to improve fraud detection is described in Daliri (2020). The model uses HSA to optimise the parameters of the ANN. The proposed NNHS model provides an HSA-based method that successfully predicts the optimal structure for the ANN and identifies the algorithm hidden inside the data. The comparisons revealed that the highest accuracy achieved is 86%.

Instance-based learning

In Hussein, Abbas & Mahdi (2021), a fraud detection model is implemented using various ML algorithms, including NB, DR, a rules classifier, lazy classifiers (IBK, LWL, and KStar), a meta classifier, and a function classifier. The results indicate that the lazy classifier (LMT) technique is the most accurate, with an accuracy of 82.086%.

Percentage of articles that address supervised, unsupervised, or semi-supervised in credit fraud detection?

This section answers RQ2, which concerns the proportion of the gathered research articles that employ supervised, unsupervised, or semi-supervised techniques. We examined the credit card fraud detection techniques described in the research articles. According to Fig. 4, 74% of the chosen articles used supervised techniques, making them the most commonly employed in the reviewed articles. In contrast, 12% used unsupervised techniques, and 12% used both supervised and unsupervised techniques. A total of 2% of the reviewed articles used semi-supervised learning, and 1% used reinforcement learning. Supervised and unsupervised learning were implemented in 2019, 2020, and 2021, while semi-supervised learning was implemented only three times, all in 2021. Similarly, reinforcement learning was used only in 2021. Compared with supervised and unsupervised learning, semi-supervised learning and reinforcement learning were not embraced by a large number of researchers. The ML/DL technique type of each reviewed article is listed in Table C1. The proportions of supervised, unsupervised, and semi-supervised techniques are shown in Fig. 4.


| Article ID | ML/DL technique | Performance metrics | Results and value | Dataset | Future work |
|---|---|---|---|---|---|
| A1 | Back propagation neural network (BPNN), K-means | Precision, recall, error rate, FPR, accuracy, hit and miss rate | There is a significant difference between K-means and BPNN. BPNN model has higher accuracy compared with K-means. BPNN accuracy = 93.1%. K-means accuracy = 79.9%. | Real credit card data/European cardholders | Comparing the effect of combining these two models so as to optimise the accuracy. |
| A2 | LR | Accuracy, recall, precision | The model reached high performance using imbalanced dataset. L-BFGS is 0.980. Lib-linear is 0.9816. Newton-CG is 0.9812. Sag is 0.997. Saga is 0.996. | Real data/European cardholders | NA |
| A3 | CNN, BILSTM | Confusion matrix, accuracy, precision, recall | The proposed model (CNN-BI-LSTM-ATTENTION) achieved high accuracy in fraud detection. Adding attention layer enhances performance. Accuracy is 95%. | IEEE-CIS fraud detection from | NA |
| A4 | ANN | Accuracy, precision, recall | The proposed ANN model is best suited for detecting fraud. The accuracy is around 100%. | NA | Combining this algorithm with other algorithms. |
| A5 | HMM | NA | Applying HMM model to detect credit card fraud would be successful. | NA | NA |
| A6 | RF, SMOTE | Sensitivity, specificity, precision, F-measure, accuracy, misclassification rate, ROC | The model showed high performance. When using RF the large number of datasets can be processed automatically. Quick RF classifier accuracy with imbalanced dataset is 98%. Quick RF classifier accuracy with balanced dataset is 99%. | Real-world data/UCSD FICO/2009 | NA |
| A7 | ADA boost majority balloting; NB, QDA, LR, DT, RF, NN, KNN, and SVM | Accuracy, Matthews correlation coefficient (MCC) | Results showed that using bulk balloting technique achieves high accuracy in detecting fraud. NB: 0.9458. QDA: 0.9544. LR: 0.9913. DT: 0.9837. RF: 0.9869. NN: 0.971. KNN: 0.9718. SVM: 0.8526. | Genuine world MasterCard data set. | Procedures be stretched out to the internet becoming acquainted with designs. |
| A8 | K-means, RF, J48, SVM | Accuracy | Results showed that RF is better on global dataset with 92.1% accuracy. K-means: 85.6%. RF: 92.1%. J48 DT: 89.3%. SVM: 89.9%. | Two types of data: Global/Bank; User dataset | For this model, the transaction time is required. |
| A9 | FraudMiner, RUSBoost, Bagged, KNN, SVM | Sensitivity, false alarm rate, balanced classification rate, MCC | This model showed great performance with catch rate 85.3% and MCC of 0.83. | Public dataset/Provided by FISCO/UCSD | NA |
| A10 | Autoencoder, LR | Confusion matrix, accuracy, recall, F1-score, precision | Results showed that proposed model can detect fraud transaction between 64%, 79%, and 91%. This model is better than LR (57%) with unbalanced dataset. The model solved data balancing problem. : accuracy is 97.23. Recall is 0.90. Precision is 0.06. The F1-score is 0.12. While results on accuracy is 99.91. Recall is 0.57. Precision is 0.93 and F1-score is 0.71. | Real dataset from ULB | Compare the performance of this model with other classification algorithms. |
| A11 | LR, KNN | Confusion matrix, accuracy, sensitivity, error rate | The LR-based model is the best compared with KNN and voting classifier. Accuracy is 97.2%. Sensitivity is 97%. Error rate is 2.8%. | Real dataset/European cardholders | Proposed model suffers in the response time. |
| A12 | LSTM | Accuracy, loss rate, execution time | Results showed great performance of LSTM compared with Autoencoder. Model accuracy is 99.95%. | Real dataset/European cardholders | Calculate timing and location of fraud |
| A13 | LR, MLP, XGBoost, K-fold cross, RF, Bagging Gradient Boosting, Voting, KNN SVM, GNB. | Accuracy, confusion matrix | MLP achieved highest accuracy compared with 15 algorithms. The accuracy is 98% | Real dataset/European cardholders | Further research of MLP to increase the detection performance. |
| A14 | LR, RF, XG, CatBoost (CB) | F1-score, AUC, savings | Results showed that CatBoost obtained the best savings with 0.7158 alone. When applying SMOTE the savings is 0.971. When applying SMOTE and BMR, the saving is 0.9762. XGBoost achieved the best saving 0.757 when applying BMR without SMOTE. XG + BMR: F1-score is 0.2890, AUC is 0.9699, savings is 0.7570. CB + SMOTE + BMR: F1-score is 0.8250, AUC is 0.9999, savings is 0.9762. | Real dataset/European cardholders | Using another dataset. Also testing XG and CB |
| A15 | LR, RF, KNN, DT | Accuracy, precision, recall | The results show that RF achieved highest performance. RF: accuracy (95.19%), precision (0.9794), recall (0.9226). | Real dataset/Europeans cardholders | Other data balancing techniques be explored. |
| A16 | SVM, RUSBoost, LR, MLP, DT, KNN, AdaBoost, RF | Accuracy, precision, specificity, F1-score, AUPR, ROC | The results showed that CtRUSBoost outperformed other algorithms. Results on three datasets, A, B, and C. Dataset A: sensitivity (96.30), specificity (85.60), precision (94.20), F1-score (88.60). Dataset B: sensitivity (99.60), specificity (98.70), precision (95.70), F1-score (97.60). Dataset C: sensitivity (100), specificity (99.80), precision (99.30), F1-score (99.60). | Three datasets from (A, B, C) | Customized the model and adding new algorithms. |
| A17 | Social spider optimisation (SSO), ant colony optimisation (ACO), ANN | Sensitivity, specificity, accuracy, F-score, kappa | The model SSO-ANN achieved high performance with 93.20% accuracy on German dataset, and 92.82% on Kaggle dataset. | Benchmark dataset. Kaggle dataset | Improving the model by using clustering techniques. |
| A18 | Deep ensemble algorithm (DEAL), CNN, DNN, MLP, autoencoder, SVM, LR | Mean absolute error (MAE), fraud catching rate (FCR), accuracy | DEAL model obtained high performance in detecting fraud. Model accuracy is 99.81% | Real dataset/Europeans cardholders | Using AI and IoT in cloud computing |
| A19 | SVM, KNN, ANN | Confusion matrix, accuracy, precision, recall | ANN provides high accuracy in detecting fraud compared with the unsupervised algorithms. | Real dataset/Europeans cardholders | NA |
| A20 | DT IFDTC4.5, intuitionistic fuzzy logic | Accuracy, sensitivity, false positive rate, specificity | IFDTC4.5 outperforms existing techniques. The model is able to detect fraud efficiently. However, frauds still cannot be eliminated by 100%. | Singaporean bank and one similar synthetic data set. | Add multi factor authentication using biometrics like iris, voice. |
| A21 | NB, DT, RF, CNN | Precision, recall, accuracy | Algorithms like NB, DT, RF and CNN are used. These algorithms are used as single models. Then these are used as hybrid models using majority voting technique. Adaptive boost also used in the model. | Publicly available credit card data set. | This model will extend to online model. |
| A22 | SVM, NB, KNN, RF | Accuracy, sensitivity, specificity, precision | Results showed that RF performs better than other algorithms. Applying sampling approach will improve the performance. NB: 97.80%. SVM: 97%. KNN: 46.98%. RF: 98.23%. | Real dataset/European cardholders/ULB | Using huge dataset instead of sampling techniques |
| A23 | Generative adversarial networks (GANs) | AUC, AUPRC, recall, F1-score, precision | The results show that applying Wasserstein-GAN will improve detecting fraudulent transactions compared with traditional GAN. WCGAN model achieves: AUC is 0.948. AUPRC is 0.717. Recall is 0.6420. Precision is 0.852. F1-score is 0.710. | NA | NA |
| A24 | LR, KNN, RF, NB, MLP, AdaBoost, pipelining | Accuracy, precision, recall, F1-score | The results showed that applying pipelining can improve the model’s performance. Accuracy: 00.99%. Precision: 0.84. Recall: 0.86. F1-score: 0.85. | Real dataset/European cardholders | NA |
| A25 | GNB, LR, DT, RF | Accuracy, recall, precision, F1-score, MSE | The result showed that DT algorithm is the best with an accuracy: 0.999. Recall: 0.782. Precision: 0.766. F1-score: 0.774. MSE: 0.0008 | Real dataset/European cardholders/ULB | NA |
| A26 | RNN | Accuracy, recall, precision, F1-score, MSE | The result showed that RNN model is capable in detecting fraud. The accuracy is 99.87%. MSE is 0.01. F1-score is 0.99. | Synthetic dataset and real dataset | NA |
| A27 | DT, NB, SVM, AdaBoost | Accuracy, sensitivity, specificity, precision, ROC, F1-measure | The results showed that applying Boosting with DT outperforms other methods. The model obtained highest accuracy of 98.3%. F measure is 93.98%. Using boosting techniques improves the performance. | Real dataset/Europeans cardholders | NA |
| A28 | DT, SVM, k-means, optimal resampling strategy, C4.5 DT, AdaBoost | Accuracy, sensitivity, cost sensitive | The suggested model obtained high performance with 96.59% accuracy and 67.52% sensitivity. | Real dataset/CB bank/Brazilian bank | Compare this model with other models |
| A29 | LSTM | MSE, MAE, RMSE | Results showed that the LSTM model achieves perfect performance. AUC: 0.995. MSE: 0.0035. MAE: 0.0065 | From the Kaggle website. | Further study of other types of RNN technique. |
| A30 | SMOTE, LOF, isolation forest, SVM, LR, DT, RF | Accuracy, precision, MCC | LR, DT and RF are the best algorithms. The better parameter to deal with unbalanced data is MCC. Classifiers perform better when using SMOTE. RF: accuracy (0.9998), precision (0.9996), MCC (0.9996). DT: accuracy (0.9708), precision (0.9814), MCC (0.9420). LR: accuracy (0.9718), precision (0.9831), MCC (0.9438). | Real dataset/Europeans cardholders/Kaggle | NA |
| A31 | Autoencoders | NA | The results showed that Autoencoders model is most promising for detecting fraud in credit card. | Real data/European cardholders | Using balanced dataset and unhidden features. |
| A32 | NB using robust scaling | Accuracy, precision, recall, sensitivity, AUC score, F1-score | The result shows that NB using Robust Scaler showed improvements in predicting and detecting fraud in credit card. Accuracy: 97.78%. Precision: 99.79%. Recall: 97.78. F1-score 98.71. AUC: 95.73. | Real dataset/Europeans cardholders/Kaggle | NA |
| A33 | NB, RF, DT, MLP | Precision, recall, F-measure, specificity | The result showed that with the amount-based profiling both MLP and RF obtained high improvement. This model boosts fraud detection. | Dataset from 35 banks in Turkey | The high number of false positives needs further study. |
| A34 | CNN, DAE, MLP | Precision, recall, AUC, confusion matrix, ROC curves | Results showed that DNN is capable in fraud detection. MLP2OH128H918 obtained an alert reduction rate. Threshold/D (0:1) of 35.16% when capturing 91.79% fraud cases. The rate of misclassification is 8.21%. Threshold/D (0:2) of 41.47% when capturing 87.75% fraud cases. Misclassification rate is 12.25%. | Dataset from a Spanish organisation. | NA |
| A35 | DCNN, RNN, SVM, LR, RF. | Accuracy | Proposed model obtained accuracy of 99% in detecting fraud in credit card in time duration of 45 seconds. | Real dataset/Europeans cardholders | Applying the fraud location and timing calculation. |
| A36 | 3DCNN, Spatial-temporal attention-based graph network (STAGN) | AUC, precision, recall | The suggested model showed a high performance in detecting fraud in credit card. The model is effective and accurate. | Real-world data (Commercial bank) | Builds a real-time detection system. |
| A37 | Bi-LSTM-autoencoder and isolation forest | Accuracy, confusion matrix | The suggested hybrid model contains Bi-LSTM Autoencoder and the isolation forest with unbalanced data. This model obtained the highest detection rate with 87% | Real dataset/Europeans cardholders | NA |
| A38 | KNN, DT, RF, LR, NB | Confusion matrix, recall/sensitivity, precision, time | Hybrid classifier/combination of supervised classifiers which worked better than any other single classifier. KNN + DT: Sensitivity: 85.63%. Precision: 86.90%. KNN + LR: Sensitivity: 57%. Precision: 85.55%. KNN + RF: Sensitivity: 82%. Precision: 95.89%. KNN + NB: Sensitivity: 58%. Precision: 80.57% | Real dataset/Europeans cardholders/Kaggle | Use unsupervised combined classifier for better result and use more classifiers. |
| A39 | LR, RF, DT, KNN | Accuracy, specificity, precision, sensitivity | The accuracy of LR is 94.9%, DT accuracy is 91.9%, and RF accuracy is 92.9%. KNN has a 93.9% success rate. Although LR was more accurate, majority of this algorithm under fit. Thus, KNN is the best technique. | Real dataset/Europeans cardholders/Kaggle | NA |
| A40 | ANN, Harmony search algorithm (HSA) | Accuracy, recall, SM calculation, confusion matrix | The suggested model NNHS provides a solution using HSA for ANN. The best accuracy achieved is 86. Recall is 87. | German dataset available at the UCI website | NA |
| A41 | HMM, SMOTE, DBSCAN | Precision, recall, F1-score | Proposed approach (SMOTE + DBSCAN + HMM) performed relatively better for all the various hidden states. | Simulated mobile based transactions | NA |
| A42 | Deep reinforcement. Resampling SMOTE and ADASYN | Accuracy, precision, sensitivity, specificity | The proposed model of ML with two resampling techniques and DRL is reliable. SMOTE and ADASYN are used to resample the dataset. The proposed system obtained high accuracy with 99%. RF and XGBoost are the best techniques. | Real dataset/Europeans cardholders/Kaggle | Extend dataset. Applying new ML and DL algorithms |
| A43 | HMM | Accuracy | The model is very efficient and showed the importance in learning spending behaviour. The accuracy is 80%. | NA | NA |
| A44 | K-means, PCA, T-SNE, SOM | Accuracy | The model obtained an accuracy of 90%. The results vary because the weights of the nodes in the SOM grid are initialised with random records or patterns. | Statlog Australian dataset. | Trying different iterations and store weights of SOM |
| A45 | KNN, RF, NB | Accuracy | KNN showed higher accuracy than the RF algorithm and NB. | Real-world dataset | More ML supervised algorithm can be added. |
| A46 | DCNN, space invariant ANN | Accuracy | The results showed that the proposed robust SIANN (RSIANN) outperformed other techniques. The accuracy is 85%. SVM accuracy is 0.77. RF accuracy is 0.72. NB accuracy is 0.70. DCNN accuracy is 0.82. | NA | Using kernels technique also using pre trained CNN. |
| A47 | SVM, NB, DT | Accuracy | Results showed that the new system will reduce the frauds which are happening while transactions. | NA | NA |
| A48 | SVM, GNB, DT | Execution times | The proposed model using fusion of detection algorithms and AI. Support Vector Classifier takes less time. SVC obtained solution with less time. 0.191343 ms. | Real data/European cardholders | Using other datasets also applying other algorithms |
| A49 | RF | Accuracy | The result showed that RF obtained high performance. However, the speed will suffer. On the other hand, SVM suffers from unbalanced data. The SVM obtained good performance. | NA | NA |
| A50 | NB, DT, RF, LR, AdaBoost, Gradient Boost, XGBoost | Accuracy, recall, precision, confusion matrix | Results showed that XGBoost is the best boosting technique in predicting fraud. The accuracy is 100%. F1-score is 0.88. NB classifier: 95.6%. DT classifier: 90.0%. RF classifier: 97.7%. LR: 98.3%. AdaBoost: 99.9%. Gradient boost: 99.9%. XGBoost: 100%. | Real dataset/Europeans cardholders/Kaggle | NA |
| A51 | RF, DT, LR, LOF, isolation forest | F1-scores, precision, recall | Results showed that isolation forest obtained better efficiency. RF: 95.5%. DT: 94.3%. LR: 90%. Isolation forest: 99.77%. Local outlier factor: 99.69%. | Real dataset/Europeans cardholders | Using NN for training the system, to obtain better accuracy. |
| A52 | Semi-supervised learning, AutoEncoders | Precision, recall, F1-score | The results show that using semi-supervised technique is efficient to detect fraud. Accuracy is 0.98%. | Real dataset/Europeans cardholders | Investigate the intelligent dependent attributes. |
| A53 | Autoencoders | Accuracy, precision, F1-score, sensitivity | The proposed model obtained high performance. SSAE+LDA model showed significant improvement compared with other research on same dataset. Accuracy is 90%, F1-score is 90%, precision is 91%, sensitivity is 90%. | Real dataset/UCI | Study effect of optimizers, stacking diverse autoencoders |
| A54 | Light gradient boosting, RF | Accuracy, AUC | This study only used to identify the fraudulent user. The results show that light gradient boosting obtained great performance with a total recall rate of 99%. | Real dataset/Europeans cardholders | Further study on how to judge fraud ring based on relation map. |
A55
XGBoosting, Neural network
Accuracy, Precision, F1-score, Recall, ROC, AUC
Results indicated that XGBoosting performs better than the other ensemble models. XGB AUC is 0.778.
Consumer’s dataset/Taiwan
NA
A56
MLP, LR, SVM, Gradient descent algorithms
Accuracy
Results showed that the proposed model performs well compared with LR and SVM. MLP accuracy: 0.9990. LR accuracy: 0.9723. SVM accuracy: 0.9345.
NA
A dependent variable with numerous classifications can be used.
A57
GAN
Accuracy, Precision
The model obtained improved sensitivity. The GAN model can be trained on a small dataset. GAN accuracy: 0.99962. Precision: 0.9583.
Real dataset/European cardholders
Develop a strategy to keep the decrease in specificity to a minimum.
A58
Ensemble learning approach, RNN, FFNN, LSTM, GRU
Recall, Precision, F1-score
Results showed that the proposed model, based on LSTM with ensemble GRU, outperforms the other models on two datasets. The new model is efficient in terms of real-time operation.
Real dataset/European cardholders; Brazilian bank
Develop a new model that takes advantage of deep encoders and decoders.
A59
CatBoost, XGBoost, Stochastic gradient boosting
Precision, Recall, Confusion matrix
Results showed that CatBoost is the best compared with XGBoost and SGB. CatBoost accuracy is 0.921 and recall is 1.00. XGBoost accuracy is 0.914 and recall is 0.99. SGB accuracy is 0.907 and recall is 0.97.
NA
New models using supervised and unsupervised techniques.
A60
LR, ANN, SVM, RF, Boosted Tree
Kolmogorov-Smirnov formula, FDR
The new model using a boosted tree shows the best performance in fraud detection. FDR = 49.83%.
Real dataset/government agency/USA
Some data and fields, such as time, day, and point of sale, should be added.
A61
NB, RF, LR, SVM
AUC, Precision, Recall
The NB technique shows high performance compared with the other techniques. Accuracy is 80.4%. Area under the curve is 96.3%.
Real dataset/European cardholders
Develop another model for sampling imbalanced data.
A62
Isolation forest
Precision-recall curve (AUCPR), AUC
The proposed model demonstrates efficiency in fraud detection, observed to be 98.72%, which indicates a significantly better approach than other techniques.
Real dataset/European cardholders/Kaggle
Financial institutions must make datasets available so that outcomes will be more efficient.
A63
UQ techniques: MCD, EMCD
Confusion matrix, UAcc, USen, USpe, UPre
The suggested model using UQ provides high performance in predicting fraud. The ensemble technique is efficient in fraud prediction. MCD: UAcc (0.82). Ensemble: UAcc (0.85). EMCD: UAcc (0.84).
Publicly available dataset/Vesta corporation
The quality of the final uncertainty estimates should be improved.
A64
RF, LR, DT, GNB combination with ensemble
Matthews correlation coefficient (MCC)
The accuracy of all five models is 100%, and the MCC score is +1 for the models evaluated.
Real dataset/European cardholders
NA
A65
DT augmented with regression analysis
Accuracy, Confusion matrix
The results showed that the new model successfully verified the injected intrusions. Accuracy is 81.6% with an 18.4% misclassification error.
Dataset from the UCI repository
NA
A66
SOM, ANN
Accuracy
A hybrid model of SOM and ANN achieved higher performance than using ANN or SOM alone.
Dataset from the UCI repository
Creating an NN with an optimization technique.
A67
LR, RF, and CatBoost
Accuracy, Precision, Recall
The results showed that a model of RF with CatBoost provides efficient accuracy. The RF technique has a higher value than the LR and CatBoost algorithms. RF: Accuracy (99.95). CatBoost: Accuracy (99.93). LR: Accuracy (99.88).
Real dataset/European cardholders/Kaggle
NA
A68
Deep forest, XGBoost, AE, gcForest
Accuracy, Precision, Recall, Confusion matrix
The proposed model showed high performance in detecting card fraud.
Dataset from China’s bank.
NA
A69
GAN, variational autoencoder (VAE)
Accuracy, F-measure, Precision
The model showed that VAE-based oversampling performs better than the normal DNN and the synthetic minority oversampling technique, as it can address the class imbalance problem.
Real dataset/European cardholders
Improving the model recall rate.
A70
C4.5 DT, NB, Bagging ensemble
Accuracy, Precision, Recall
The model shows that bagging with C4.5 DT is the best algorithm with rate of 1,000 for class 0.0825 for class 1.
Real dataset/European cardholders
NA
A71
Fuzzy rough nearest neighbor (FRNN), SMO, LR, MLP, NB, IBK, RF
Positive predictive value (PPV), F-measure, Specificity
The results showed that the suggested model provided significant results. On the Australian dataset, the detection rate is 84.90 with an AUC of 0.8555; on the German dataset, the detection rate is 76.30% with an AUC of 0.679.
Australian dataset/German dataset
Other ensemble techniques should be considered.
A72
NB, DT (LMT, J48), Rules classifier, Lazy classifier, Meta classifier
Accuracy, Recall, Precision, F1-score
The results showed that applying the LMT algorithm to classify fraud is better than the other techniques. The LMT model obtained 82.08% accuracy.
Client’s data in Taiwan. Data available on:
Further study to find new algorithms with higher voting.
A73
Feature maps and GANs, SVM, CNN
AUC score, ROC, Confusion matrix
Results showed that the suggested model is applicable to test datasets and requires less time for learning. SVM obtained better detection; however, its learning time exceeds that of the other models as the dataset grows. The CNN-based model needs a long time. SMOTE performance is effective.
Machine learning group ULB. Kaggle.
Change the oversampling techniques in the suggested model.
A74
CNN, SVM, RF, isolation forest, Autoencoder
Accuracy, Precision
ML models were implemented for classification. The CNN model achieved competitive accuracy. CNN: Accuracy (99.51).
Real dataset/European cardholders
Predict fraud in real time. Apply the service on a cloud platform.
A75
LR, NB, KNN
Accuracy, Recall, Specificity, Sensitivity, F-measure, Precision
Results showed that LR gave the best performance, with a high accuracy of 95%. NB accuracy is 91%. KNN accuracy is 75%. LR also showed better sensitivity, precision, specificity, and F-measure.
Real dataset/European cardholders/Kaggle
NA
A76
Isolation forest and local outlier factor (LOF) algorithms
Accuracy, Precision, Recall, F-measure
The results showed that local outlier factor achieved a high accuracy of 97%. Isolation forest accuracy is 76%.
Real dataset/European cardholders/Kaggle
NA
A77
RF, DT
Accuracy, Sensitivity, Specificity, Precision
The results showed that this model is accurate on a large dataset, with 98.6% accuracy. RF provides high performance; however, it needs a lot of training data.
Dataset from product reviews on credit card transactions.
Develop AI/ML/DL techniques.
A78
Multiple classifiers system (MCS): NB, C4.5, KNN, ANN, SVM
TPR, TNR, Accuracy
Results showed that the suggested model can tackle the unbalanced class distribution and overlapping class samples. The proposed model obtained a high TPR of 0.840 and an accuracy of 0.930. TNR is 0.955.
Dataset 1: ULB. Dataset 2: credit cardholders/Taiwan bank.
Consider combining DL algorithms for promising detection results.
A79
KNN, SVM, LR, hybrid NB-RF, XGB
Accuracy, recall, precision, TPR, FPR
Results showed that all proposed models are superior in performance. The stacking classifier using LR as the meta classifier is more promising than SVM, LR, KNN, and HNB-RF. Stacking classifier accuracy is 0.95. RF accuracy is 0.94.
Real dataset/European cardholders/Kaggle
Applying a voting classifier.
A80
Hybrid models using AdaBoost and majority voting, NB, SVM
MCC
Results showed that majority voting obtained high accuracy. The best MCC score is 0.823.
A publicly available dataset/Turkish bank.
Applying online learning models to enable efficient fraud detection.
A81
DT, LR, Shallow NN. Challenger model: DL model with ensemble.
AUROC, K–S statistics, alert rate, recall, precision
Results showed that, after off-line and post-line testing, the FDS was operated with the DL model, which showed a +3.8% improvement in recall. The hybrid ensemble model performs well in detecting fraud.
Dataset from company/South Korea
NA
A82
LR, NB, AdaBoost, and voting classifier
Accuracy, recall, precision, sensitivity, F1-score
Results showed good accuracy for all classifiers. NB: 91.41%. LR: 94.51%. AdaBoost: 95.67%. Voting: 94.69%.
Real dataset/European cardholders/Kaggle
A hybrid classification method will be designed.
A83
Ensembles of classifiers based on DT, XGBoost and LightGBM
Accuracy, precision, recall, AUC, Confusion matrix
The results showed that the ensemble of models detected a maximum of 85.7% of fraud. Accuracy is 79‒85%.
Real dataset/European cardholders/Kaggle
NA
A84
AdaBoost voting, KNN, majority voting techniques
MCC
The results showed that a perfect MCC score was achieved when using AdaBoost and majority voting. Noise from 10% to 30% was added to the data. The model yielded a best MCC of 0.942.
Dataset from a Turkish bank
NA
A85
RF, KNN, NB, SVM
Accuracy
The results show that RF has the highest fraud detection accuracy. RF accuracy is 0.9996.
NA
Seeking information from advanced technologies.
A86
ANN, BBN
Confusion matrix
Results showed that a Bayesian network is more accurate than the NB classifier. This is attributed to the use of conditional dependence between the attributes in the Bayesian network, although it requires more difficult calculation and training.
Real dataset/European cardholders/Kaggle
NA
A87
KNN, NB, LR
Accuracy, sensitivity, specificity
The results showed that KNN performed well on all metrics except accuracy.
Real data/European cardholders
NA
A88
LR, RF, SVM
Accuracy, precision, F1-score, recall
A comparison between LR, RF, and SVM was performed: the accuracy of LR is 77.97%, RF is 81.79%, and SVM is 65.16. RF is therefore better than SVM and LR.
Real dataset/UCI
NA
A89
LR, RF, XGBoost, ANN, isolation forest, PCA with SVM
Accuracy, sensitivity, specificity, MCC, precision, BCR
Results show that RF and XGBoost provided better results than the other models. The accuracy of XGBoost is 0.9951. RF accuracy is 0.9955.
Mobile money transactions published on Kaggle.
Combine ANN with a genetic algorithm to enhance accuracy.
A90
SVM, GA, Cuckoo search, Particle swarm
Accuracy, Precision, Recall
The results showed that the linear kernel function is the best, with an accuracy of 91.56%. A radial basis function was used to enhance kernel accuracy, which improved from 42.86 to 98.05%. Overall, PSO-SVM is better than CS-SVM and GA-SVM.
Data from law enforcement department in China
Look for new algorithms to optimize SVM.
A91
AE-PRF, Bootstrap aggregating, Bagging
Accuracy, TPR, TNR, FPR, ROC curve, AUC, MCC
The results show that AE-PRF is efficient when the dataset is unbalanced. AE-PRF obtained high accuracy.
Real dataset/European cardholders/Kaggle
Improve the AE-PRF model by fine-tuning the hyperparameters of the AE and RF models.
A92
Multi-perspective HMMs
PR-AUC
The results showed that the HMM model is powerful in detecting fraud.
Real dataset/Belgian
Combine LSTM with HMM-based features.
A93
C5.0, SVM, ANN, NB, BBN, LR, KNN, artificial immune systems (AIS)
Accuracy, Recall, Precision
The results showed that C5.0, SVM, and ANN perform well on the imbalanced classification problem. Even though these techniques improve classifier performance on fraud, a high number of fraud cases remain undetected.
Two datasets available at
Develop a new model with a big-data-driven ecosystem.
A94
Hybrid model: DT, SVM, ANN, genetic algorithm (GA)
F-score, Accuracy, Recall
The results showed that the suggested hybrid model obtained a high accuracy of 93.5% compared with ANN, SVM, and DT. The hybrid model with GA outperforms the other techniques.
Real-world dataset from financial institution
Real-life test of the suggested model.
A95
DT, RF, KNN, LR, K-means, DBSCAN, MLP, NB, XGBoost, Gradient boost
Accuracy, Precision, Recall, F1-score
The results showed that RF yielded the best performance, with an accuracy of 99.995. RF is suitable for large datasets.
Real dataset/European cardholders/Kaggle
NA
A96
Local outlier factor, Isolation forest
Precision, Accuracy
The results showed that the model reached more than 99.6% accuracy, with precision at 28%. When more data was fed into the model, precision rose to 33%.
Dataset from German bank in 2006.
Adding more algorithms. Using more datasets.
A97
KNN, PCA, SMOTE
Recall, Precision, F1-score
The results showed that the suggested model performed well. For KNN: precision is 98.32 and F-score is 97.44%. For the Time subset, when using the misclassified instance, precision is 100% and F-score is 98.24%.
Real dataset/European cardholders/Kaggle
Understand how PCA can affect performance on a dataset.
A98
KNN, DT, LR, RF, XGBoost
Accuracy, F1-score, Precision, Recall, AUC-ROC
The results show that XGBoost and DT outperform all the other algorithms in detecting fraud.
Real dataset/European cardholders/Kaggle
Study other ML algorithms and various forms of stacked classifiers.
A99
Outlier detection, DT, RF and NN
Precision, Recall, ROC, Confusion matrix
The results showed that RF is the most precise and accurate technique; however, it takes a long time to train. NN is the next best algorithm. DT is the least accurate. In terms of time efficiency and computational resource utilization, NN is the best technique.
Real dataset/European cardholders/Kaggle
NA
A100
NB, C4.5 DT, and bagging ensemble learner
Precision, Recall, PRC
The results showed that performance is between 99.9% and 100%. The best classifier is C4.5 DT, with 94.1% precision and 78.9% recall. The bagging ensemble gives acceptable performance, with 91.6% precision and 80.7% recall. The worst performer is the NB classifier, with a precision of 65.6% and a recall of 81%.
Real dataset/European cardholders/Kaggle
Other classifiers will be used and applied to a set of local data that will be collected from banks in Iraq.
A101
Autoencoders, MLP, KNN and LR
Accuracy, Precision, Recall, F1-score
Results showed that the suggested model maintains good performance. It outperforms systems based on either different classifiers or variants of autoencoders, which establishes the efficiency of the proposed two-stage model. The proposed method's accuracy is 0.9994, precision is 0.8534, and F1-score is 0.8265.
Dataset from ULB machine learning group on Kaggle.
The proposed two-stage model can be tuned to handle stream data. The model can be trained on a batch of transactions.
A102
LOF, AdaBoost, RF, isolation forest, DT, KNN, HMM, GA, ANN, NB, LR
Accuracy, Confusion matrix
Results showed that local outlier factor accuracy is greater than that of the other algorithms. Local outlier factor accuracy is 0.898.
Real data/European cardholders
NA
A103
RF
Accuracy
The results showed that RF performs better with a large dataset. The accuracy is 99.9%. The SVM algorithm can be used instead of RF; however, SVM still suffers from the imbalanced dataset.
NA
Privacy-preserving techniques can be applied in a distributed environment.
A104
RF
Accuracy, F1-score, Precision, Recall
The results showed that RF performed better than DT and NB. The suggested model showed better accuracy on a huge dataset.
Real dataset/100,000 cardholders
Applying a semi-supervised technique.
A105
Oversampling with SMOTE, SVM, LR, DT, RF
Accuracy, Precision, Recall, F1-score
Results showed that the model predicts fraud better when the SMOTE technique is used. RF and DT provided the best performance.
Real dataset/European cardholders
Building a real-time solution to detect fraud.
A106
RF, AdaBoost, oversampling (ADASYN)
Accuracy, Recall, Precision, F1-score
This research examines various existing credit card fraud systems using ML approaches. Although RF produces outstanding results on small datasets, certain problems remain, such as data imbalance. RF accuracy is 0.999.
Real dataset/European cardholders/Kaggle
Using a large amount of data. More pre-processing procedures.
A107
Autoencoder neural network (DAE)
Recall, Accuracy
The results showed that the DAE improves classification accuracy for the minority class of imbalanced datasets. When the threshold equals 0.6, the model achieves its best performance, with 97.93%.
Real dataset/European cardholders/Kaggle
Dimensionality reduction of high-dimensional data needs further research.
A108
ANN with LR
Accuracy, Precision and Recall
The results show that the model achieved an accuracy of 0.9948, a recall of 0.8639, and a precision of 0.2134.
Real data/European cardholders
NA
A109
GAORF
Accuracy, Confusion matrix
The results showed that the genetic algorithm optimised RF model improves performance and brings down misclassifications.
Commercial bank in Nigeria
NA
A110
2DCNN, 1DCNN, LSTM, NLP, SMOTE
Accuracy, F-score, Precision, Recall
The results showed that using CNN and LSTM yielded better performance. LSTM (50 blocks) was the highest, with an F1-score of 84.85%. Sampling techniques were applied to address the imbalanced dataset and improve model performance.
Real dataset/European cardholders/Kaggle
Tuning hyperparameters of DL techniques to improve performance.
A111
ANN, RF, GBM, RUS, SMOTE, DBSMOTE, SMOTEENN
F1-score, Recall, Precision, Accuracy
The results showed that using sampling techniques enhanced credit card fraud detection. The recall obtained with SMOTE by the DRF classifier is 0.81, which is the best. Precision is 0.86. The stacked ensemble showed promise in detecting fraud.
Real dataset/European cardholders/Kaggle
Using other sampling techniques. Applying unsupervised and semi-supervised techniques.
A112
Local outlier factor, LR, RF, DT, isolation forest
Accuracy, MCC
The results showed that LR and SVM obtained higher accuracy. SVM accuracy is 0.9987. LR accuracy is 0.9990. A one-class SVM was applied in this study.
Real dataset/European cardholders
NA
A113
ANN, LR, DT, RF and XGBoost
Accuracy, Precision, Recall, F1-score
Results showed that ANN and XGBoost performed well. ANN achieved 99% accuracy.
Real dataset/European cardholders; Synthetic dataset
Use more real-world datasets.
A114
NB, SVM, AdaBoost
MCC, Accuracy
The results showed that the boosting technique achieved good accuracy. The best MCC score is 0.823.
Real-world dataset.
Extend the model to an online learning model.
A115
KNN, LR, SVM, RF, DT, XGB, OCSVM, AE, RBM, GAN
AUROC, FPR, TPR
The results showed that supervised approaches such as RF and XGB achieved better performance. XGB obtained 0.989 AUROC. RF obtained 0.988 AUROC. Among the unsupervised techniques, RBM achieved the best performance, with 0.961 AUROC.
Real dataset/European cardholders/Kaggle
Focus on a new GAN model.
A116
RF
Accuracy, Sensitivity, Specificity, Precision
The results showed that building multiple DTs achieved good performance, with 98.6% accuracy.
Real dataset/European cardholders/Kaggle
NA
A117
IBk, IB1, KStar, RandomCommittee, RandomTree, AdaBoost
Accuracy, Precision, Recall
The results showed that the best accuracy was achieved by Bagging, Rotation Forest, Random SubSpace, Random Committee, LMT, and REPTree. IBk, IB1, RandomCommittee, KStar, and RandomTree obtained good accuracy and can detect 348 (35.27%), 354 (40.97%), 396 (45.83%), 397 (45.94%), and 399 (46.18%) fraud cases, respectively.
UCSD—FICO dataset.
NA
A118
Spectral-clustering hybrid of GA-trained modular NN
Sensitivity, Specificity, Accuracy
Results showed that the hybrid model is efficient in detecting fraud. The model obtained a sensitivity of 90%, a specificity of 19%, and a prediction accuracy of 74%, with an improvement rate of 12% for data inclusion.
Dataset from banks/Africa and Nigeria.
NA
A119
ANN, SA-ANN, HTM-CLA, DRNN, LSTM
Accuracy
Results showed that HTM-CLA offered realistic features. HTM-CLA with SA-ANN achieved good performance. The maximum accuracy was obtained from SA-ANN.
Real dataset/Australia; Real dataset/German
Reduce the computational burden of the HTM-CLA technique.
A120
Isolation forest, KNN, DT, LR, RF
Sensitivity, time, and precision
The results showed that KNN sensitivity is better than that of DT. However, DT needs less time to detect fraud. DT is the best model.
Real data/European cardholders
NA
A121
LR, DT, RF, NB, ANN
Accuracy, Recall, Precision
Results showed that accuracy is 94.84% when using LR, 91.62% when using NB, and 92.88% when using DT. ANN obtained the best accuracy, 98.69%.
Real dataset/European cardholders
NA
A122
KNN, DT, SVM, LR, RF, XGBoost
Accuracy, F1-score, Confusion matrix
The results showed that the KNN model is the best compared with the other techniques. Accuracy is 99.95%. F1-score is 85.71%.
Real dataset/European cardholders
Using other resampling techniques and applying DL techniques.
A123
Hybrid architecture involving particle swarm optimization (PSO), SVM
Accuracy, F1-score, Confusion matrix
The PSO algorithm is used to select features and the SVM is used for the iterative development of the feature selection. Results showed that the suggested PSO-SVM extracts a minimal set of features. The PSO-SVM algorithm is an optimal preparatory tool for enhancing feature selection optimisation. Accuracy for the German dataset: SVM 78.69, PSO-SVM 89.42. Accuracy for the Australian dataset: SVM 78.84, PSO-SVM 89.27.
German credit card datasets; Australian credit cards
NA
A124
Stacking, AdaBoost, majority voting, LR, DT, RF
Accuracy
The results showed that the suggested model provided better fraud detection. Boosted stacking performs better than the others. Boosted stacking accuracy is 94.5%.
Real dataset/European cardholders/Kaggle
NA
A125
Neo4j, PageRank, RF, DT, KNN, SVM, MLP, LOF, isolation forest
Accuracy, MCC, F1-score, Recall, Precision, ROC, AUC, AUPR
The results showed a significant improvement in the performance metrics of DT. LOF yielded a better result, with 99.54% accuracy and 83.39% recall, when using the PageRank graph feature. RF accuracy is 99.47%.
Synthetic dataset/BankSim
Other graph algorithms for feature extraction, and DL, should be studied further.
A126
Autoencoder, RBM
Recall, Precision, AUC
The results showed that AE and RBM can make AUC more accurate. An AE based on Keras and H2O was applied.
Real dataset/European cardholders
NA
A127
AdaBoost, Majority vote, MLP, SVM, LOR, HS
MCC metrics
The results showed that the hybrid model with majority voting provided good accuracy. The model achieved a high detection rate of 98% with 0.1%. A perfect MCC score was obtained when using AdaBoost and majority voting.
Real dataset/European cardholders/Kaggle
Examine other online learning models.
A128
AE, one-class SVM and robust Mahalanobis outlier detection
Precision, Error rate, MSE
Results showed that the advantage of robust Mahalanobis is that it does not need labels for training. The performance of the three models varied. To gain insight into model performance, the available labels were used for evaluation.
Real dataset from international corporation
Global and local outliers and cardholder behaviour need to be considered.
A129
AdaBoost, NB, RT, Majority voting, DT, GBM, NN, SVM, Spark ML
MCC
The results showed that the hybrid model of NB, SVM, and DL techniques obtained the best MCC score of 0.823.
Public real dataset/bank
Extend to online learning.
A130
SVM-RFE, Hyper-parameter optimization, SMOTE
Accuracy, Precision, Recall, Specificity, F-score
Results showed that the proposed model is highly effective and obtained the best accuracy, 99%.
Real dataset/European cardholders/Kaggle
Using more complex datasets.
A131
RNN, SMOTE Tomek, LSTM, BLSTM, GRU, BGRU
Accuracy, Recall, Precision, AUC
Results showed that BGRU achieved the best accuracy, 97.16%, followed by BLSTM with 96.04%.
Real dataset/European cardholders/Kaggle
Focus on customer behavior.
A132
WOA, SMOTE, BPNN
Accuracy
The results showed that WOA with SMOTE is more efficient than BPNN.
Real dataset/European cardholders
NA
A133
NB, SVM, RF
Accuracy
The results showed that RF is the best technique, with an accuracy of 100%.
Real data/European cardholders
NA
A134
RF, SVM, LOF, isolation forest
Accuracy, Precision, Recall
The results showed that RF obtained 99.92 accuracy and performed better than the other techniques.
Real dataset/European cardholders
Improve the dataset and add other algorithms to the suggested model.
A135
K-means, C5.0 DT, Hadoop and Spark
Accuracy, ROC, AUC
The results showed that the Spark-based IHA hybrid model obtained 94% accuracy. It is suitable for detecting fraud.
Public domain
Applying this model to other fields.
A136
SVM, Undersampling techniques
Accuracy, Precision, Recall
Results showed that the new model improves performance. Accuracy is 99.9%. SVM obtained the best precision, 89.5%.
Real dataset/European cardholders
NA
A137
Isolation forest
Accuracy
The results showed that isolation forest obtained an accuracy of 99.87.
Professional survey organizations.
Using hybrid techniques and AI.
A138
RF
Accuracy
The results showed that RF using feedback and delayed supervised samples is better than the other techniques. RF accuracy is 0.962.
NA
Applying semi-supervised techniques.
A139
SVM, KNN, AdaBoost, PSOS, RIG
Accuracy, Precision, Recall, F-measure
The results indicate that PSOS is the best feature optimisation technique. It improved accuracy from 82.90% to 85.51%.
Australian financial dataset.
Extend the model by using hybrid techniques.
A140
AdaBoost, majority voting, NB, SVM, DL
MCC
The results showed that majority voting obtained high accuracy and the best MCC score, 0.823.
Public real-world dataset
Extend to an online learning model.
A141
RF, NN
Accuracy, Precision, Recall, F-measure
The results showed that RF obtained an accuracy of 90%. RF is a suitable technique.
Real-life B2C dataset
The RF itself needs improvement.
A142
NB, RF, DT, GBT, DS, ANN, RT, MLP, LIR, LOR, SVM
Accuracy, ACC, MCC
The results showed that the best AUC obtained is 0.937, from GBT using aggregated features. Aggregated features improve the models' performance.
Public datasets. Benchmark databases.
Further evaluation of these models using different datasets.
A143
HOBA, DBN, RNN, CNN, BPNN, SVM, RF
Accuracy, Precision, Recall, F1-measure
The results showed that DBN with HOBA variables obtained better performance. Using DL techniques and HOBA feature engineering improves performance.
Real-world dataset/bank in China
Build a real-time model. Build a combined model of ML and DL.
A144
AE, GAN
Precision, Recall, F1-measure
The results show that the DAEGAN model achieved the best performance. AUC is 0.958. Recall is 0.815. AUPRC is 0.805. DAEGAN improves accuracy.
Real dataset/European cardholders
Improve the model.
A145
Isolation forest
AUCPR, F1-score, Precision, Recall, ROC-AUC
The results showed that the model achieved good performance. AUCPR is better than ROC-AUC at describing performance. Precision is 0.807. Recall is 0.763. F1-score is 0.784. ROC-AUC is 0.973. AUCPR is 0.759.
Real-life dataset from ULB. Kaggle.
NA
A146
SVM, LOF, isolation forest
Accuracy, Precision, F1-score, Recall
The results indicate that the isolation forest with LOF model is very fast and accurate. Its accuracy is 99.74%; SVM obtained 45.84% and LOF achieved 99.66%.
NA
NA
A147
SVM, K-means, AdaBoost
Recall, Accuracy
The results showed that SVM and AdaBoost obtained high performance.
Dataset from a bank.
NA
A148
Deep auto-encoder
Accuracy, Precision, Recall, AUC-ROC curve
The results showed that the algorithm performed well, with a 98.8% acceptance rate. The proposed algorithm can be used for any binary classification task.
Real dataset/European cardholders
NA
A149
NN
Accuracy
The results showed that the suggested model can be integrated with mobile apps to detect fraud. The model obtained an excellent accuracy of 99.75%.
Real dataset/European cardholders
NA
A150
RF, DT, SVM, GNB, LR
Accuracy
The results showed that DT provided better performance. However, speed still suffers.
NA
Using other ML and DL techniques.
A151
Isolation forest, LOF
Recall, Precision, F1-score
Isolation forest obtained an accuracy of 99.72% with 71 errors. LOF accuracy is 99.62% with 107 errors. Isolation forest is better at detecting fraud.
Real dataset/European cardholders
Using an NN technique.
A152
DT, RF, HMM, NN
Accuracy, false alarm rate, MCC
The results indicate that RF obtained high performance, with 0.999% accuracy in fraud detection.
Real dataset/UCI
NA
A153
TVIWDA, SVM, WFSVM
Accuracy, Precision, Recall, F1-score
The results showed that using TVIWDA with WFSVM improved detection accuracy. The suggested system obtained 97.82% accuracy. Precision is 92.62%.
German credit card dataset.
Solving the imbalanced data problem.
A154
Oversampling pre-processing technique, SAS, RF, KNN, DT, LR
Accuracy
The study proposed four models to detect credit card fraud. The results showed that RF and KNN overfit, so only DT and LR were compared. The best-performing model is LR: with stepwise splitting rules, it outperformed DT with only a 0.6% error rate.
Real dataset/European cardholders/Kaggle
Use different sampling techniques, such as undersampling, SMOTE, or roughly balancing, to compare the results.
A155
RF, DNN
Accuracy
The results showed that RF performs very well with a large amount of data. RF accuracy is 0.999.
NA
NA
A156
KNN, LR
Accuracy, Precision, Recall, F-measure
The results show that the KNN technique achieved the best result. Precision is 0.95. Recall is 0.72. F1-score is 0.82.
Real dataset/European cardholders
NA
A157
SMOTE, MLP, KNN, SVM, OSE, NN, GAN
Accuracy, F1-score
The model uses a stacking classifier that combines a GAN-improved MLP with SVM and KNN. OSE is preferred because it harnesses the abilities of MLP, which works better at finding hidden patterns. The accuracy of OSE is 99.8%.
Real dataset/European cardholders/Kaggle
Apply weighted voting and boosting algorithms.
A158
Aggrandized RF
Accuracy, Precision, Recall, F-measure, Sensitivity, Specificity
The results showed that the aggrandized random forest obtained a high accuracy of 0.9972% for balanced data and 0.9995% for imbalanced data. RF is the best technique for detecting fraud.
NA
NA
A159
RF, ANN, SVM, LR, tree classifier, gradient boosting
Accuracy, Precision, Recall, F1-score
Results showed that the RF algorithm achieved an accuracy of 95.988%. SVM accuracy is 93.228%. LR accuracy is 92.89%. NB accuracy is 91.2%. DT accuracy is 90.9%. GBM accuracy is 93.99%.
ULB dataset from Kaggle
Apply other ML techniques.
A160
SVM, NB, KNN, focal loss, XGBoost, W-CEL, LR
Accuracy, Precision, Recall, MCC
The results showed that, on the extremely imbalanced dataset, the suggested model achieved an accuracy of 100%, a precision of 0.97, a recall of 0.56, and an MCC of 0.72. On the mildly imbalanced dataset, accuracy is 99%, precision 0.88, recall 0.87, and MCC 0.89. The suggested model does not work well on the extremely imbalanced dataset. XGBoost improves model performance.
ULB dataset from Kaggle
Solve the imbalanced dataset problem.
A161
HMM
NA
The study provided a method to determine the spending behaviour of the cardholder and then derive the observation symbols, which helps in estimating model performance.
NA
NA
A162
LOF, K-means, isolation forest
Precision, Recall, F1-score
The results show that the proposed model, which uses K-means clustering, isolation forest, and LOF, provided an accuracy of 98%.
Real dataset/European cardholders
NA
A163
KNN, LR, RF, XGBoost (extreme gradient boosting)
Precision, ROC-AUC
XGBoost shows higher accuracy than the other models. Of these algorithms, the XGBoost model is preferable to the RF and LR models.
Real data/European cardholders
The RF model would be improved.
A164
AdaSyn, ROS, RUS, Tomek links, AllKNN, SMOTE+Tomek, SMOTE+ENN, AdaBoost, KNN, RF, SVM, XGBoost (extreme gradient boosting), LR
Accuracy, Precision, Recall, K-fold, AUC-ROC, Execution time
The results showed that oversampling followed by undersampling performs well for ensemble classification models. AllKNN, SMTN, and RUS perform well. SVM and KNN achieved perfect results. The best precision was provided by oversampling followed by undersampling in conjunction with RF. The NB classifier performed worst.
Machine learning Group ULB. Kaggle.
NA
A165
RF, SVM, ANN
Accuracy
The results showed that ANN produced the highest accuracy, followed by RF and then SVM.
NA
Using more techniques.
A166KNN, isolation forest, local outlier factorAccuracy
Recall score
Results showed that all algorithms achieved 95.0% accuracy. Isolation forest had high accuracy and K-means produced the low accuracy. LR and vanilla LR gave great accuracy.Real dataset/Europeans cardholdersImplement an autoencoder or SVM.
A167LIGHTGBM
AdaBoost, RF
Accuracy, precision and recallThe results showed that AdaBoost provided the highest result with 0.9613. In term of precision, Light BGM produces the highest result with 0.986. AdaBoost provided the highest recall with 0.889.Real dataset/Europeans cardholders/KaggleAdding more parameters.
A168OLightGBM
RFLR
SVM, DT, KNN
NB
Accuracy
Recall
Precision
F1-measure
The results highlight the importance of adopting an efficient parameter optimization strategy for enhancing the predictive performance. The proposed model outperformed other techniques with accuracy 98.40%. AUC is 92.88%. Precision is 97.34%. F1-score is 56.95%.Real dataset/Europeans cardholders/KaggleNA
A169RF, Apache KafkaTrue positive rate (TPR), TNR, recall, precision accuracyUsing Apache Kafka to consume the transactions from the transaction record and publish them in real time. This project is using Cassandra as the storage layer. This proposed system offers the user maximum security and precision.Data from the file system to the Cassandra database.NA
A170Autoencoder
RBM
Federate learning
Accuracy
ROC, Recall
Precision
The results showed that the average accuracy of Autoencoder is 94% and RBM is 88%. AUC achieved a result of 0.94.Real dataset/Europeans cardholdersNA
A171RFAccuracy
Recall, Precision
F1-score
The result showed that RF obtains good performance on small dataset. Some problems with imbalanced dataset. RF accuracy is 0.9632. Precision is 0.894. Recall is 0.85. F1-score is 0.871.Real dataset/Europeans cardholdersImprove RF itself
A172RF, LR, DT, KNN NB,
Undersampling and oversampling techniques.
Accuracy
Sensitivity Specificity Precision
Matthews’s co-relation
Results showed that LR is the best algorithm. The proposed classifier NN and LR outperform DT. LR accuracy is 0.9699.Real dataset/Europeans cardholders/KaggleNA
A173Local outlier factor, LOF, INFLO, and AVFAccuracy
Recall
Precision
The results showed that using LOF, INFLO, and AFV resulted in the highest level of LOF. 96% accuracy, 98% recall, and 93% precision.World websiteTrying other algorithms.
A174LR, DT, SVM, NB, RF, KNNAccuracy
Precision Recall
The result showed that using RF obtained best accuracy of 99.947%, precision is 76%, and recall is 92.68%.Real dataset/Europeans cardholdersANN can be used to construct new classification techniques.
A175Deep learning based fraud detection model (DLFD)Accuracy
Precision
Recall
DL model is constructed for the prediction process using Keras. Comparison with existing models indicate high performance in detecting fraud. Detection rate is 8.7%. DLFD accuracy/0.997. Precision/0.929. Recall/0.795.BankSim dataset was used for analysis of performance.Improving the TPR levels and also on handling the concept drift.
A176ANNAccuracyThe result showed that ANN is successful in fraud detection. Accuracy is 98%. However, ANN faced problems when training on huge datasets.Dataset from company/South AfricaNA
A177ANN, GA, LR,
SMOTE
Accuracy
Precision
Recall, F1-score
The results showed that the ANN with genetic algorithm obtained accurate results. The accuracy is 99.83%. Precision is 50.70%. Recall is 97.27%. F1-score is 66.66%.Real dataset/Europeans cardholdersNA
A178SVM, fuzzy association rules (FAR). Gradient recurrent unitNAThe results showed that the proposed framework provided significant contribution. The framework allow to detect abnormal transaction.NAImplementation and evaluation the framework.
A179Hybrid ensemble-based. Boosting and bagging, RF, LRMCC, Precision
Recall
Detection rate
Accuracy
Results showed that the model is efficient in detecting fraud. MCC is 1.00. The false positive rate is 0.00235. False negative rate is 0.0003048. The detection rate is 0.9918. Accuracy is 0.9996. MCC is 0.9959.Brazilian bank data and UCSD-FICO dataNA
A180Particle swarm optimization (PSO). NNAccuracy
Precision
Recall
Results showed that performance of PSO is very high with 99.9% accuracy.Real dataset/European cardholdersFocus on solving imbalanced.
A181LR, RF
Under sampling and oversampling
Confusion matrix,
precision, F1-score,
Roc-AUC
RF precision is 0.93. F1-score is 0.85. The oversampling, under sampling of data for accuracy of classifiers is promising. Oversampling technique gave better fraud prediction results as compared to random under sampling.Real dataset/European cardholdersNN and using combination of HMM or KNN to achieve better in fraud detection.

Overall performance estimation of ML/DL model in credit fraud detection

This section addresses RQ3, which concerns the estimation of ML/DL model performance. Accuracy is the primary performance indicator for ML/DL models. This question focuses on three aspects of performance estimation: the performance metric, the accuracy value, and the dataset. Because the construction of ML/DL models depends on the dataset, we examined the data sources of the ML/DL models in the reviewed articles. The reviewed articles employ two kinds of datasets: real-world datasets and synthetic datasets. Real-world datasets are used most frequently: 154 articles employed real-world datasets, eight used synthetic datasets, and 19 did not specify the dataset source.

Evaluation metrics were used to calculate ML/DL model performance. The confusion matrix provides an output matrix that characterises the model’s overall effectiveness. Model performance is compared using measures derived from the confusion matrix, such as sensitivity, specificity, F-score, and precision, together with the receiver operating characteristic (ROC) curve and the area under the precision-recall curve (AUPR).

In this review, a number of performance indicators have been used in addition to accuracy. As shown in Table C1, 177 articles clearly presented the performance metrics of their proposed models, whereas four articles did not mention them. Accuracy, recall, precision, and F-score were the most frequently employed indicators. Accuracy is the proportion of test-set transactions that were correctly classified as fraudulent or non-fraudulent. Precision is the ratio of true positives to all predicted positives, that is, the share of transactions flagged as fraudulent that are actually fraudulent. Recall is the proportion of fraudulent transactions that were correctly detected as fraudulent, relative to the total number of fraudulent transactions. The F1-score provides a single score that balances recall and precision. AUC measures the entire two-dimensional area under the ROC curve, and the ROC curve is one of the best indicators for analysing the effectiveness of credit card fraud detection. MCC measures the quality of the classification; because it covers true positives, true negatives, false positives, and false negatives, it is a balanced metric. MCC was utilised in 13 reviewed articles.
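As a minimal sketch of how these indicators relate to the confusion matrix, the following Python function computes accuracy, precision, recall, F1-score, and MCC from the four confusion-matrix counts, using the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN)). The function name and the example counts are hypothetical and are not taken from any reviewed article.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute common metrics from confusion-matrix counts.

    Fraud is treated as the positive class.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Precision: of the transactions flagged as fraud, how many were fraud.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the actual fraud cases, how many were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC uses all four cells of the confusion matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Hypothetical counts: 10,000 transactions, 100 of them fraudulent.
metrics = classification_metrics(tp=80, fp=20, tn=9880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With these illustrative counts, accuracy is far higher than precision, recall, or MCC, which is typical when fraud is a small share of all transactions.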

In addition, 30 of the 181 studies employed only a single performance metric: accuracy (24 articles), MCC (five articles), and execution time (one article). Using a single performance metric is insufficient for determining the quality of an ML/DL model. By contrast, articles such as 43 and 74 utilised more than five performance indicators to represent the performance of their ML/DL models. A number of reviewed articles also report computational performance measurements alongside predictive performance metrics. Execution time is the length of time the model takes to complete the assigned task; it is calculated to ascertain how long the model takes to detect fraud and thus whether the model can achieve its goal in practice. Execution time was employed in Alghofaili, Albattah & Rassam (2020), Devi, Thangavel & Anbhazhagan (2019), and Singh, Ranjan & Tiwari (2021). The loss function compares actual and expected training output to speed up learning; loss rate was employed in Alghofaili, Albattah & Rassam (2020). Cost-sensitive wrapping with Bayes minimal risk (BMR) was tested in Almhaithawi, Jafar & Aljnidi (2020) as a cost-saving measure. Balanced accuracy (BCR) combines the metrics of sensitivity and specificity to produce a balanced outcome; BCR was presented in Layek (2020). In Arun & Venkatachalapathy (2020), Kappa was used to assess the prediction performance of the classifier model. A few articles (Arya & Sastry, 2020; Bandyopadhyay et al., 2021; Bandyopadhyay & Dutta, 2020; Benchaji, Douzi & El Ouahidi, 2021; Rezapour, 2019) introduced mean square error (MSE), mean absolute error (MAE), and root mean square error (RMSE) as assessment metrics. Table C1 shows the proposed ML/DL models along with their performance and datasets.
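To illustrate why accuracy alone can mislead on skewed fraud data, the sketch below scores a trivial classifier that labels every transaction as legitimate. It assumes balanced accuracy is the arithmetic mean of sensitivity and specificity; the label vectors are hypothetical and chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Share of all transactions classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def sensitivity(y_true, y_pred):
    """Recall on the fraud class (label 1)."""
    fraud_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for p in fraud_preds) / len(fraud_preds)


def specificity(y_true, y_pred):
    """Recall on the legitimate class (label 0)."""
    legit_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    return sum(p == 0 for p in legit_preds) / len(legit_preds)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity."""
    return (sensitivity(y_true, y_pred) + specificity(y_true, y_pred)) / 2


# Hypothetical data: 5 fraudulent transactions among 1,000.
y_true = [1] * 5 + [0] * 995
# A trivial "classifier" that never flags fraud.
y_pred = [0] * 1000

print("accuracy:", accuracy(y_true, y_pred))
print("sensitivity:", sensitivity(y_true, y_pred))
print("balanced accuracy:", balanced_accuracy(y_true, y_pred))
```

The trivial classifier scores very high accuracy while detecting no fraud at all, which is why reporting recall, MCC, or balanced accuracy alongside accuracy gives a fuller picture.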

Trend of research

To answer RQ4, we examine the trend of the reviewed articles. We compare the models created over the three years to determine which techniques have recently garnered more attention. This also helps to identify gaps so that future research can address them. First, we examined the distribution of the chosen articles by publication year: 47 articles in 2019, 70 in 2020, and 64 in 2021. The number of published articles on credit card fraud detection increased notably between 2019 and 2020 (by 23 articles), whereas there was no notable difference between 2020 and 2021 (a decrease of six articles). Fig. 2 illustrates this comparison.

In response to RQ1, we demonstrated that 110 distinct ML models, 34 distinct DL models, and 39 models that combine ML and DL have been utilised by researchers. RF, LR, and SVM are the most commonly employed ML approaches. ANN, AUE, and LSTM are the most utilised DL approaches. In addition, we observed increased interest in combining ML and DL models.

To answer RQ2, we counted the learning-based credit card cyber fraud detection techniques applied in the reviewed articles. The most common approach is supervised learning, applied in 74% of the reviewed articles. A total of 12% utilised unsupervised techniques, 12% used both supervised and unsupervised techniques, 2% applied semi-supervised techniques, and 1% used reinforcement learning. For RQ3, we listed the performance metrics that each article applied. We found that 24 of the 181 reviewed articles used accuracy as their only performance metric. We also identified the datasets used in the reviewed articles. The majority used real-world datasets: 154 articles applied real-world data, eight used synthetic data, and 19 did not mention the source.

In RQ4, we identified research gaps by investigating unexplored or infrequently studied algorithms. In addition, we found supervised learning as the most prevalent learning technique and SMOTE as the most prevalent oversampling technique. The majority of researchers focused on supervised techniques such as LR, RF, SVM, and NN.

Combination techniques that employ multiple algorithms are becoming increasingly prevalent in the detection of cyber fraud, and detecting credit card cyber fraud increasingly involves the use of DL. DL techniques were utilised 34 times in the reviewed articles, whereas 39 of the reviewed articles applied a combination of DL and ML techniques for credit card cyber fraud detection. DL is advantageous for fraud detection because it addresses the difficulty of recognising unexpected and sophisticated fraud patterns. Moreover, as the number of fraud cases to be recognised is relatively limited, DL may be effective. DL has recently garnered the most attention and had the most success in combating cyberthreats, owing to its ability to minimise overfitting and discover underlying fraud tendencies, as well as its capacity to handle massive datasets.

For supervised learning algorithms to predict future credit card transactions, each observation must have a label. This can be a problem when identifying fraudulent transactions, because labels are often unavailable for these observations. Additionally, since fraudsters constantly alter their behaviour, it is challenging to develop a supervised learning model for a given transaction. Unsupervised algorithms often need examples of the normal class only, and they can flag future observations based on deviations from the normal data. Future research should give more attention to unsupervised and semi-supervised techniques, which can yield new insights and improve ML model performance. Additional research on DL techniques such as CNN, RNN, and LSTM is also needed. Because balanced datasets are unavailable and datasets in general are scarce, financial institutions are encouraged to make the essential datasets available, so that research outputs will be more effective and of higher quality.
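The idea of flagging observations by their deviation from normal data can be sketched with a deliberately simple z-score rule on transaction amounts. This is an illustrative stand-in, not one of the algorithms used in the reviewed articles (such as isolation forest or LOF); the function names, the threshold of 3.0, and the sample amounts are all assumptions.

```python
import statistics


def fit_normal_profile(amounts):
    """Summarise normal (legitimate) transaction amounts by mean and std."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def is_anomalous(amount, profile, threshold=3.0):
    """Flag an amount that deviates strongly from the normal profile."""
    mean, std = profile
    if std == 0:
        # Degenerate profile: any different amount counts as a deviation.
        return amount != mean
    return abs(amount - mean) / std > threshold


# Hypothetical amounts from transactions known to be legitimate.
normal_amounts = [12.5, 40.0, 23.9, 18.0, 35.5, 27.3, 30.1, 22.0]
profile = fit_normal_profile(normal_amounts)

for amount in (25.0, 950.0):
    print(amount, "anomalous" if is_anomalous(amount, profile) else "normal")
```

Only legitimate examples are needed to fit the profile, which mirrors the point that such methods do not require labelled fraud cases; practical detectors use many features and more robust models.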

To detect credit card cyber fraud, supervised, unsupervised, and semi-supervised ML/DL techniques were applied in the reviewed articles. Figure 4 shows that 74% of the reviewed articles utilised supervised techniques, making this the most common approach. According to the reviewed articles, classification and regression techniques have always been of interest. By comparison, 12% of the selected articles applied unsupervised techniques, 12% applied both supervised and unsupervised techniques, 2% applied semi-supervised techniques, and 1% applied reinforcement learning. A growing trend in this field is the use of ensemble techniques that capitalise on the benefits of several classification methods; their use increased in 2020 and 2021 compared with 2019. Another interesting finding is that DL approaches attracted considerable interest from 2019 to 2021. The number of articles that used DL techniques, alone or combined with other ML techniques, was 15 in 2019, 30 in 2020, and 28 in 2021. The popularity of DL algorithms therefore appears to have increased.

The number of countries publishing research on ML/DL techniques for detecting credit card cyber fraud is growing over time. In 2021, Ghana, Romania, Taiwan, and Vietnam were among the new countries contributing to this area. India leads in the publication of ML/DL studies. Figure 5 depicts the number of articles published by country and year (2019, 2020, and 2021).

Gap analysis and the future direction

The most effective way to determine the approaches that are most appropriate for this research problem is to categorise the ML/DL algorithms used in detecting credit card cyber fraud. It is also beneficial to determine why particular techniques were chosen. Supervised algorithms have always been of interest: 74% of the reviewed articles used supervised algorithms, most commonly RF, followed by LR and SVM. Unsupervised learning algorithms were applied in only 12% of the 181 reviewed articles, most commonly isolation forest, and the semi-supervised approach was employed in 2%. Semi-supervised and unsupervised learning techniques therefore appear to merit further research. According to the reviewed articles (Choubey & Gautam, 2020; More et al., 2021; Muaz, Jayabalan & Thiruchelvam, 2020; Shirgave et al., 2019), unsupervised or semi-supervised learning techniques such as one-class SVM, isolation forest, and K-means clustering should be utilised more in credit card fraud detection.

Over the three years, DL techniques have been examined increasingly frequently, with the aim of achieving greater accuracy and more efficient performance. By applying DL techniques, new fraudulent patterns can be recognised and the system can respond flexibly to complex data patterns. Thus, for efficient credit card fraud detection, researchers are encouraged to conduct additional studies on DL techniques. Several studies (Benchaji, Douzi & El Ouahidi, 2021; Jonnalagadda, Gupta & Sen, 2019; Kalid et al., 2020) suggested further study of DL techniques for credit card fraud detection. Moreover, as each ML/DL technique has its own limitations, it is worth considering combinations of ML and DL algorithms for promising detection results. Several articles (Agarwal, 2021; Dang et al., 2021; Gamini et al., 2021; Kalid et al., 2020; Singh & Jain, 2019) suggested combining DL methods with traditional ML methods to detect cyber fraud in unbalanced data and enhance accuracy.

Several reviewed articles cited the lack of datasets as a limitation of their work. According to Meenu et al. (2020), research outcomes will be more effective and of higher quality if financial institutions make crucial datasets of various fraudulent actions available. The lack of data is therefore one of the key problems in many studies, and it could be overcome if a substantial dataset of diverse fraudulent activities across nations were available. Maniraj et al. (2019) noted that when dataset size increases, algorithm precision also increases. Adding more data is therefore likely to increase the model’s ability to detect fraud and decrease the number of false positives, but the banks themselves must formally support this. Seera et al. (2021) proposed conducting further evaluation of their model with real data from diverse regions.

Additionally, the datasets are significantly skewed, which is a problem. Numerous studies attempted to develop a model that could perform properly with highly skewed data. In several articles (Balne, Singh & Yada, 2020; Ojugo & Nwankwo, 2021; Shekar & Ramakrisha, 2021; Voican, 2021; Vengatesan et al., 2020), unbalanced data was used as is, and balancing the dataset with sampling techniques such as oversampling or undersampling was left as future work. Several articles (Ahirwar, Sharma & Bano, 2020; Almhaithawi, Jafar & Aljnidi, 2020; Manlangit, Azam & Shanmugam, 2019) applied oversampling techniques.

Undersampling techniques have been applied in several articles (Amusan et al., 2021; Ata & Hazim, 2020; Muaz, Jayabalan & Thiruchelvam, 2020; Rezapour, 2019; Zhang, Bhandari & Black, 2020). In Amusan et al. (2021), a random undersampling technique was used, and the study recommended that other data-balancing techniques be explored. Ata & Hazim (2020) also applied an undersampling technique, but recommended applying the suggested model to a massive dataset instead of using a sampling technique. In addition, some articles, such as Trisanto et al. (2021) and Singh, Ranjan & Tiwari (2021), applied both undersampling and oversampling techniques.

Oversampling techniques such as SMOTE, ADASYN, DBSMOTE, and SMOTEENN have been used, and undersampling techniques such as random undersampling (RUS) have been applied. In light of this, future studies should consider applying alternative oversampling techniques, such as borderline-SMOTE and borderline oversampling with SVM, as well as other undersampling techniques. In addition to fraud location, an algorithm to determine the timing of the fraud is required (Alghofaili, Albattah & Rassam, 2020; Chen & Lai, 2021). Furthermore, an algorithm could be developed to predict fraudulent transactions in real time and deployed as a service on various cloud platforms to make it easily accessible and reliable (Ingole et al., 2021).
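A minimal sketch of the two simplest balancing strategies, random undersampling (RUS) and random oversampling (ROS), is given below. It is not an implementation of SMOTE or its variants, which synthesise new minority samples rather than duplicating existing ones. The function names are hypothetical, and the code assumes binary labels in which fraud (1) is the minority class with at least one example.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Drop random majority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    kept = rng.sample(majority, len(minority))
    idx = minority + kept
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes are equal in size."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    extra = rng.choices(minority, k=len(majority) - len(minority))
    idx = majority + minority + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]


# Hypothetical dataset: 95 legitimate and 5 fraudulent transactions.
rows = list(range(100))
labels = [0] * 95 + [1] * 5

under_rows, under_labels = random_undersample(rows, labels)
over_rows, over_labels = random_oversample(rows, labels)
print(len(under_labels), sum(under_labels))
print(len(over_labels), sum(over_labels))
```

Undersampling discards most of the legitimate transactions, while oversampling repeats the few fraud cases many times. In either case, resampling is normally applied to the training split only, so that evaluation still reflects the original class distribution.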

Limitation of the review

Our review is restricted to journal articles published in 2019, 2020, and 2021 that apply ML/DL techniques. By applying our methodology in the early stages, we eliminated several irrelevant articles, which ensured that the selected articles met the requirements for our review. Although we searched the most prominent digital libraries, there may be other digital libraries containing relevant research articles that were not included in this study. To address this limitation, the snowballing method was used to include relevant articles that were missed during automatic searching. It is also probable that we missed some synonyms while searching for the keywords; hence, we also analysed the search terms and keywords against a recognised collection of research works. Finally, we restricted our search to English-language articles. This creates a language bias, as there may be articles in this field written in other languages.

Conclusions

This review studied credit card cyber fraud detection using ML/DL techniques. We examined ML/DL models from the perspectives of ML/DL technique type, ML/DL performance estimation, and learning-based fraud detection. The study focused on relevant studies published in 2019, 2020, and 2021. To address the four research questions posed in this study, we reviewed 181 research articles. We provided a detailed analysis of ML/DL techniques and their function in credit card cyber fraud detection, and offered recommendations for selecting the most suitable techniques for detecting cyber fraud. The study also covers research trends, gaps, future directions, and limitations in detecting cyber fraud in credit cards. We believe that this comprehensive review will help researchers and the banking industry to develop innovative systems for cyber fraud detection.

On the basis of this analysis, we suggest that more research be conducted on semi-supervised and unsupervised learning techniques. We also recommend that DL techniques be researched further for credit card cyber fraud detection. Researchers are encouraged to conduct further research on integrating ML and DL algorithms for effective detection outcomes. In addition, researchers are advised to use both oversampling and undersampling techniques because the datasets are extremely skewed. Furthermore, we recommend that researchers report the dataset sources and the performance metrics used to present their outcomes. Banks are also encouraged to make datasets of different fraudulent activities across nations available for further research.

Funding Statement

The authors received no funding for this work.

Additional Information and Declarations

The authors declare that they have no competing interests.

Eyad Abdel Latif Marazqah Btoush conceived and designed the experiments, performed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Xujuan Zhou conceived and designed the experiments, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Raj Gururajan conceived and designed the experiments, performed the experiments, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Ka Ching Chan conceived and designed the experiments, analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Rohan Genrich conceived and designed the experiments, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

Prema Sankaran conceived and designed the experiments, prepared figures and/or tables, authored or reviewed drafts of the article, and approved the final draft.

How to find the best credit card debt forgiveness programs

By Angelica Leicht

Edited By Matt Richardson

July 25, 2024 / 4:23 PM EDT / CBS News

If you aren't paying off your credit card balances in full each month, you may be putting yourself in a precarious financial position. After all, the average credit card rate has increased significantly over the last couple of years, climbing from 16.65% in 2022 to nearly 23% currently. That's an uptick of about 36% in just two years. 

So, if you're carrying a balance from month to month right now, chances are that you're dealing with a hefty amount of interest on what you owe. That alone can make it tough to balance your credit card payments with your other financial obligations. And, if your credit card balances continue to grow over time, there could be a point when it's tough to make the minimum payments on your cards.

Should that happen, the good news is that there are debt relief solutions, like credit card debt forgiveness, that can help. When you enroll in a credit card debt forgiveness program, a debt relief company negotiates with your creditors to try and reduce what you owe in return for a lump-sum payment. That can be a good path to eliminating card debt, but it's important to find the right program first.

These steps can help you determine what the best credit card debt forgiveness programs are for you:

Research reputable debt relief companies

Start by researching which well-established debt relief companies offer credit card debt forgiveness programs in your area. It can help to look for organizations accredited by the American Fair Credit Council (AFCC) or the International Association of Professional Debt Arbitrators (IAPDA). These accreditations indicate that the companies adhere to industry best practices and ethical standards.

It's also important to make sure that any debt relief company you're considering is properly licensed to operate in your state. You can check with your state's attorney general's office or consumer protection agency for this information.

Check reviews and ratings

Read reviews from previous clients on independent review sites like the Better Business Bureau (BBB), Trustpilot or Consumer Affairs. And, during the process, be sure to pay attention to both positive and negative feedback. That way, you can get a balanced view of each company's performance and reputation.

Take advantage of free consultations

Most reputable debt relief companies offer free initial consultations, so be sure to take advantage of the opportunity to speak with experts from multiple companies. That way, you can ensure you're getting the best possible service and terms.

During the consultation process, ask each company about their programs, fees and approach to debt settlement. By comparing these factors you can better determine which companies make the most sense for you. Be sure to also ask each company about their success rates in negotiating with creditors. While past performance doesn't guarantee future results, it can give you an idea of the company's effectiveness. 

Consider other factors, like creditor relationships and support

It can also help to look for companies that offer personalized debt forgiveness plans tailored to your specific financial situation. The ability to adjust your plan as your circumstances change can be crucial for long-term success.

You should also weigh whether the debt forgiveness programs have established relationships with creditors, which could potentially lead to more favorable negotiations. While this shouldn't be the sole deciding factor, it could be beneficial if a company has a good track record with your specific creditors.

Evaluate the level of support and communication each debt forgiveness program offers as well. Do they provide a dedicated account manager? How often will you receive updates on your progress? And, what channels are available for you to reach out with questions or concerns?

Be on the lookout for red flags

You may want to rule out any companies that make unrealistic promises or use high-pressure sales tactics. Legitimate debt relief firms will be transparent about the risks and limitations of their programs, so be cautious of any company that:

  • Guarantees to settle all your debts for a specific percentage
  • Promises to settle your debts extremely quickly (e.g., in a few months)
  • Asks for upfront fees before any debts are settled
  • Claims their program has no negative impact on your credit score

The bottom line

If you want to find the best credit card debt forgiveness program, you'll need to do some research and take numerous factors into consideration. After all, while these programs can offer relief for those struggling with overwhelming debt, they're not without risks and drawbacks. But by thoroughly vetting potential debt relief companies, understanding the process and considering all your options, you can make an informed decision about whether debt forgiveness is the right choice for your financial situation — and which program makes the most sense for you.

Angelica Leicht is senior editor for Managing Your Money, where she writes and edits articles on a range of personal finance topics. Angelica previously held editing roles at The Simple Dollar, Interest, HousingWire and other financial publications.

How Many Americans Have a Perfect Credit Score?

Published on July 25, 2024

By: Christy Bieber

  • Just 1.31% of Americans have a perfect credit score.
  • The average credit score is 714.
  • Getting your score up to the very good range is sufficient to qualify for the best borrowing rates.

There are different formulas for assigning a credit score. However, FICO is the most commonly used credit scoring formula. FICO® Scores fall on a scale of 300 to 850, with higher scores awarded to those with a better credit history.

A score of 850 is considered to be a perfect score on this scale, but very few people have earned an 850 score or even come close to it.

Having perfect credit is very rare

According to credit score research from The Motley Fool Ascent, just 1.31% of all Americans who have a FICO® Score have earned a perfect 850. However, the rate of people with perfect credit does vary significantly by age.

The table below shows which demographic groups are most likely to have achieved perfection.

Generation | Percent of 850 FICO® Scores
Gen Z | 0.1%
Millennials | 4.1%
Gen X | 22.4%
Baby boomers | 59.4%
Silent Generation | 14.0%

As you can see, older Americans make up a higher percentage of people with perfect credit than their younger counterparts. That's because it's really hard to get a perfect score. You need:

  • A good mix of different kinds of credit. That's why people with 850 scores have more credit cards than the average person and higher personal loan balances, as well.
  • No delinquencies. Delayed payments can send your score plummeting. While the average American has 1.8 delinquent payments, those with perfect scores have none.
  • A lower credit utilization ratio. Using a small percentage of your credit is important, which is why people with perfect scores have lower credit card balances.

A long average age of credit and limited inquiries are also important if you want that coveted 850, as short credit histories and too many inquiries count against you.
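To make the utilization point concrete, here is a minimal sketch in Python. It assumes the common definition of credit utilization (total revolving balances divided by total credit limits), which the article itself does not spell out. The function name and the card figures are hypothetical, chosen only for illustration.

```python
def credit_utilization(balances, limits):
    """Return overall utilization as a fraction of total available credit.

    Assumes the common definition: sum of balances / sum of credit limits.
    """
    total_limit = sum(limits)
    if total_limit <= 0:
        raise ValueError("total credit limit must be positive")
    return sum(balances) / total_limit


# Hypothetical example: two cards with made-up balances and limits.
ratio = credit_utilization(balances=[300, 200], limits=[4000, 6000])
print(f"{ratio:.1%}")  # prints 5.0%
```

Paying down balances lowers this number, and so does a higher total limit, which is one way to read the article's point that people with perfect scores carry lower card balances.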

Do you need a perfect 850 credit score?

With so few people having perfect credit, you may feel as if achieving a score of 850 is hopeless. The good news is that you don't need one to get the best rates and terms from creditors or to be approved for the most competitive credit cards.

A credit score between 670 and 739 is considered to be good, while a score of 740 to 799 is classified as very good, and a score of 800 to 850 is considered to be exceptional. Since the average score among all Americans is 714, most people don't have a lot to worry about. They're well within the good credit range, so they shouldn't have too hard a time borrowing.
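The ranges above can be sketched as a small lookup in Python. The three named tiers and their cutoffs come from the paragraph above. The function name and the "below good" label for scores under 670 are assumptions made for this illustration, since the article does not name the lower tiers.

```python
def fico_tier(score: int) -> str:
    """Map a FICO score (300 to 850) to the tiers named in the article."""
    if not 300 <= score <= 850:
        raise ValueError("FICO scores fall on a scale of 300 to 850")
    if score >= 800:
        return "exceptional"
    if score >= 740:
        return "very good"
    if score >= 670:
        return "good"
    # The article does not name tiers under 670; this label is a placeholder.
    return "below good"


print(fico_tier(714))  # prints good
print(fico_tier(850))  # prints exceptional
```

The average score of 714 lands in the "good" tier, which matches the article's reading that the typical borrower is already within the good credit range.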

Of course, it doesn't hurt to try for a perfect 850 if you want. After all, maintaining good credit habits can help you out, as can a high credit score, so you can only benefit from making responsible choices to try to earn a perfect score. No one is ever going to regret paying their bills on time or not carrying large balances on their credit cards, even if their score only ends up in the 750 range.

Just don't be discouraged if you don't achieve 850 -- especially if you're young, since some factors like average credit age can be out of your control. Keep working on improving your personal score, check your credit often to be sure there are no problems, and rest assured if you're above 670, you're probably doing fine. And if you aren't at that level yet, just keep making those on-time payments and whittling away debt and you'll get there.

Our Research Expert

Christy Bieber

Christy Bieber is a full-time personal finance and legal writer with 15 years of experience. She has a JD from UCLA and is a former college instructor.

We're firm believers in the Golden Rule, which is why editorial opinions are ours alone and have not been previously reviewed, approved, or endorsed by included advertisers. The Ascent, a Motley Fool service, does not cover all offers on the market. The Ascent has a dedicated team of editors and analysts focused on personal finance, and they follow the same set of publishing standards and editorial integrity while maintaining professional separation from the analysts and editors on other Motley Fool brands.

Why Paper Checks Refuse to Die

It’s hard to avoid hassle — or fraud — when you’re required to pay with paper and ink. Here’s why checks persist and why some people don’t mind.

By Ron Lieber

Ron Lieber read several hundred tips from Your Money newsletter subscribers while reporting this column.

Target stopped accepting personal checks as a form of payment this month, which might inspire the following question: What took so long?

Check fraud has more than doubled in recent years, and it costs at least a dollar for businesses to process each check they receive. Plenty of young adults have never even written one.

But if you haven’t used a check in years and consider it a badge of honor, that may say a lot about where you live and what you pay for. In many industries, checks continue to be a popular form of payment, and sometimes they are required.

According to consumer survey data from the Federal Reserve Bank of Atlanta, which tracks the percentage of payments that consumers make by check, the following industries receive the most check payments: Contractors, like electricians and plumbers, get 25 percent of their payments by check. Charitable and religious organizations are next at 22 percent. Landlords, government taxing authorities and professional-service firms also receive double-digit percentages of their payments by check.

There are many pockets of commerce in which checks are required. Readers of Your Money newsletter wrote in to complain about having to write checks for a number of things, like homeowners’ association dues and haircuts, along with dog shows and the occasional long-term care insurance policy.

Some people actually like it this way, though they cite many different reasons. Feelings play a role, as do fees. Fear does, too.

COMMENTS

  1. Credit Cards, Credit Utilization, and Consumption

    Figure 1 shows how the average U.S. consumer's credit card limit and debt varied significantly from 2000-2014. From 2000-2008, the average credit card limit increased by approximately 40 percent, from around $10,000 to a peak of $14,000. During 2009, overall limits collapsed rapidly before recovering slightly in 2012.

  2. (PDF) Credit Cards: A Sectoral Analysis

    Objective: This paper aims at sectoral analysis of the credit card industry in India by considering top three credit card issuers i.e., HDFC bank, SBI Cards, and ICICI Bank. Methodology: In order ...

  3. credit cards Latest Research Papers

    Revolving Credit. This paper investigates the implication of bank revolving credit in the form of credit card loans as a channel of monetary policy targeting the federal funds rate since 1980. Credit cards have become increasingly popular and a necessity for many transactions and purchases in the United States.

  4. Credit Card Fraud Detection using Machine Learning Algorithms

    Abstract. Credit card frauds are easy and friendly targets. E-commerce and many other online sites have increased the online payment modes, increasing the risk for online frauds. As fraud rates have increased, researchers have started using different machine learning methods to detect and analyse frauds in online transactions.

  5. Credit card fraud detection in the era of disruptive technologies: A

    The work in Al-Hashedi and Magalingam (2021) covers research papers on financial fraud in general from 2009 to 2019 inclusive. It mainly discusses works based on data mining techniques and classifies the literature based on range of factors, including publication year, publisher, method used, and research area (credit fraud, cryptocurrency ...

  6. Modelling customers credit card behaviour using bidirectional LSTM

    The model was trained on a real credit card dataset and the customer behavioural scores are analysed using classical measures such as accuracy, Area Under the Curve, Brier score, Kolmogorov-Smirnov test, and H-measure. ... Therefore, the research of this paper is motivated by the necessity of automatically scoring the customer's behaviour ...

  7. (PDF) Consumers and credit cards: A review of the

    Research in the area of consumer credit card abundance of literature in the business, psychology, and public policy fields. 1960s, the work revolved around descriptive characteristics and evolved as scholars probed deeper by investigating ... Since the first paper on consumer credit cards was published in 1969, researchers have attempted to ...

  8. Determinants of consumers' intention to use credit card: a perspective

    Consumers prefer credit cards due to uncertainty when carrying cash (Khare et al., 2012) or special discounts from famous brands (Dali et al., 2015). They use credit cards as a source of revolving credit with long grace period (Chahal et al., 2014; Khare et al., 2012). They can even withdraw cash by credit cards as required (Chahal et al., 2014).

  9. PDF 2021 Consumer Credit Card Market Report

    Credit cards are central to the financial lives of over 175 million American consumers. Over the last few years and through 2019, the credit card market, the largest U.S. consumer lending market measured by number of users, continued to grow in almost all measures until suddenly reversing course in March 2020.

  10. Examining the dynamics leading towards credit card usage ...

    Many researchers have investigated the consumer's attitude towards using credit cards. However, how the different attributes contribute to credit card usage attitude is not evident. Thus, the main theoretical contribution of this study is to examine the importance and performance of a set of variables that explain the attitude towards using credit cards. It provides essential inputs to ...

  11. (PDF) Credit Card Fraud Detection

    1.3 "A Research Paper on Credit Card Fraud Detection" The proposed model involves pre-processing the credit card transaction data and then applying various

  12. The Impact of Credit Cards on Spending: A Field Experiment

    1 Introduction. In this paper, we report results from the first field experiment to examine the impact of credit cards on spending, a question of great interest for economics, law and public ...

  13. Credit Card Fraud Detection: A Systematic Review

    When the research is based on big data analytics, there will be a huge volume of data which can be implemented in Apache Hadoop, Spark, etc. Tensorflow, H2O, Pytorch, Keras, etc. are the libraries imported in the application of deep learning. ... Artikis, A., et al.: A prototype for credit card fraud management: industry paper. In: Proceedings ...

  14. Research on Default Prediction for Credit Card Users Based on XGBoost

    1. Introduction. Both the issuance of credit cards and the scale of credit have increased steadily in recent years. According to data from the People's Bank of China, at the end of 2020, the number of credit cards issued totaled 778 million, and the credit balance of credit cards was 7.91 trillion yuan.

  15. Enhanced credit card fraud detection based on attention mechanism and

    As credit card becomes the most popular payment mode particularly in the online sector, the fraudulent activities using credit card payment technologies are rapidly increasing as a result. For this end, it is obligatory for financial institutions to continuously improve their fraud detection systems to reduce huge losses. The purpose of this paper is to develop a novel system for credit card ...

  16. Research article Investigating the associations of consumer financial

    According to the U.S. Credit Card Statistics in 2021, 70.2% of consumers have at least one credit card, and 14% have at least ten. Moreover, the number of credit card accounts increased by 2.5% year-over-year, implying that credit cards have become a primary and vital payment method in modern societies.

  17. Compulsive Buying Behaviour of Credit Card Users and Affecting Factors

    Compulsive buying behaviour and credit card could have a powerful effect on consumers' financial stability. Further, in place of comprehending credit card usage and compulsive buying, this study correlates them with wealth attitudes such as power-prestige, financial knowledge and retention time.

  18. Credit Card Fraud Detection Using Machine Learning

    card statistics 2021) the number of people using credit cards around the world was 2.8 billion in 2019, in addition 70% of those users own a single card at least. Reports of Credit card fraud in the US rose by 44.7% from 271,927 in 2019 to 393,207 reports in 2020. There are two kinds of credit card fraud, the first one is by having a credit

  19. A machine learning based credit card fraud detection using the GA

    The recent advances of e-commerce and e-payment systems have sparked an increase in financial fraud cases such as credit card fraud. It is therefore crucial to implement mechanisms that can detect the credit card fraud. Features of credit card frauds play important role when machine learning is used for credit card fraud detection, and they must be chosen properly. This paper proposes a ...

  20. Credit Card Utilization and Consumption over the Life Cycle and

    Using a large sample of credit bureau data, this paper documents a tight link between available credit (the limit) and credit card debt, and then it offers a model-based interpretation of this linkage. Credit limits change frequently for individuals, increase rapidly on average as people age, and show large changes over the business cycle.

  21. PDF Consumers and credit cards: A review of the empirical literature

    Since the first paper on consumer credit cards was published in 1969, researchers have attempted to develop a demographic profile of credit card consumers. The demographic characteristics most often used were age (22 studies) and gender (20 studies), followed by income (11 studies) and education (5 studies).

  22. A systematic review of literature on credit card cyber fraud detection

    The review investigates the present status of research on detecting cyber fraud in credit card and addresses our research questions. The methodology begins with a description of the data sources, the search strategy, the inclusion and exclusion criteria, as well as the quantity of research article selected from the different databases. ...

  23. Determinants of consumers' intention to use credit card: a

    Credit cards, a combination of payment card and personal consumption credit, are widely used around the world. Starting with a relationship between vendors and consumers, as well as a need to buy first and pay later, Franklin National Bank in New York, the USA, issued the first-ever credit cards to market in 1951.

  24. PDF Paper Title: Gender-Based Sorting in the Credit Card Market

    Using unique credit bureau data, which tracks consumers from initial entry to subsequent participation in the credit card market, we document significant gender-based sorting across credit card products: women are 35% more likely than men to open a retail store credit card as opposed to a general-purpose credit card when they first

  25. You Don't Need a Perfect Credit Score. Here's a Better Target

    In particular, make sure to pay credit cards and loans on time, because these payments get reported on your credit history. Have at least one credit card that you use regularly. To build credit ...

  26. How to find the best credit card debt forgiveness programs

    After all, the average credit card rate has increased significantly over the last couple of years, climbing from 16.65% in 2022 to nearly 23% currently. That's an uptick of about 36% in just two ...

  27. Why I'm Thinking About Joining Costco Before September

    Review our list of the best credit cards for Costco to find out more.

  28. Avoid These 4 Costly Credit Card Rewards Mistakes

    When you use your credit cards to pay for everyday purchases, you can earn points, miles, or cash back that can save you money.

  29. How Many Americans Have a Perfect Credit Score?

    According to credit score research from The Motley Fool Ascent, just 1.31% of all Americans who have a FICO® Score have earned a perfect 850. However, the rate of people with perfect credit does ...

  30. Why Paper Checks Refuse to Die

    In Queen Anne's County in Maryland, the service charge for processing a credit-card payment is a 2.95 percent flat fee. If you want to use what the county calls E-checks to pay directly from ...