{"paper_id":"258ef659-aa45-4031-8725-0c0fa9d3cbe2","body_text":"Analysis of Different Machine Learning Models for Credit Card Fraud Detection
Research Article (Research Square preprint)
Harsh Mehta
This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-5314340/v1 This work is licensed under a CC BY 4.0 License. Status: Posted (Version 1).
Abstract
The growth in online transactions over the past decade has been accompanied by a significant rise in credit card fraud. Unauthorized use of credit card information, obtained for example through the dark web or scam calls, poses a major risk to both customers and businesses, particularly in e-commerce settings. This paper presents a comparative analysis of multiple machine learning models for credit card fraud detection: logistic regression, isolation forest, k-means clustering, and convolutional neural networks. 
Using a highly imbalanced dataset, we evaluate how well these models differentiate between genuine and fraudulent transactions based on features such as transaction history, user details, and merchant information. Our experimental results provide insight into how effectively each model finds patterns that separate genuine from fraudulent transactions and that can be applied to real-world data. This research contributes to the field of financial security by offering guidance on model selection for credit card fraud detection and related applications.
Keywords: credit card fraud, machine learning, logistic regression, isolation forest, k-means clustering, convolutional neural network, financial security
Figures: Figures 1 to 27.
I. INTRODUCTION
Online payment methods have grown rapidly in recent years and have been widely adopted because they are easier, more reliable, and in many respects faster than traditional payment methods. Alongside this growth, online credit card fraud has become a serious concern that challenges the security and integrity of information circulated over the internet. This paper aims to help future researchers understand and choose models according to their requirements.
A. Background on credit card fraud
Credit card fraud has become a significant threat in the digital age, posing an enormous financial risk to individuals, businesses, and the global financial system. As e-commerce and digital transactions grow, so does fraudulent activity. Credit card fraud generally occurs when unauthorized individuals gain access to card information through means such as data breaches, skimming devices, or phishing attacks. 
These scammers then use the stolen information to make unauthorized purchases or cash withdrawals, often causing financial losses for cardholders and merchants. The problem goes beyond the loss of money: it erodes trust in digital payment systems and could lead to long-term economic instability if left unchecked.
B. Current challenges in detection
Detecting and preventing credit card fraud presents several challenges for developers and organizations. One of the primary obstacles is the highly dynamic nature of fraudulent activity: scammers constantly change their methods to evade detection systems, so detection methods must keep evolving to stay ahead of emerging threats and stop fraud before it takes place. In addition, genuine transactions vastly outnumber fraudulent ones, so fraudulent transactions make up only a tiny fraction of any dataset. This imbalance biases models toward the majority class, which can cause them to miss critical fraudulent transactions. Finally, the sensitive nature of financial data often limits access to real-world datasets, making it difficult for researchers and developers to build and test models.
C. Our approach and its significance
Our approach is a performance analysis of multiple machine learning models applied to credit card fraud detection. We use the Kaggle dataset “Credit Card Transactions Fraud Detection Dataset” (Brandon, 2022), which mimics real-world transaction patterns while preserving user privacy, and evaluate the effectiveness of four kinds of model: a regression model (logistic regression), a tree-based model (isolation forest), a clustering model (k-means), and a convolutional neural network (CNN). 
We compare these models across multiple metrics, including classification reports, confusion matrices, AUC-ROC scores, and feature importance analysis, in order to identify their relative strengths and weaknesses in the context of credit card fraud detection. This evaluation supports ongoing efforts to improve fraud detection systems and offers guidance to future researchers in selecting and implementing an appropriate model for similar security applications.
II. LITERATURE REVIEW
A. Credit Card Fraud Detection using Machine Learning and Data Science, ISSN: 2278-0181 (S P Maniraj, 2019)
Fraud detection in credit card transactions has been a subject of extensive research due to its significant financial implications. Previous studies have explored various data mining applications and machine learning techniques for automated fraud detection. Both supervised and unsupervised learning methods have been applied to this domain, with varying degrees of success. Some researchers have used outlier mining and distance sum algorithms to predict fraudulent transactions in emulated credit card transaction datasets. While these methods have shown promise in certain areas, they have not provided a consistent and permanent solution to the fraud detection problem. More recent approaches have incorporated advanced techniques such as hybrid data mining/complex network classification algorithms. These methods have proved effective in detecting illegal instances in real card transaction datasets, particularly for medium-sized online transactions. Efforts have also been made to improve the alert feedback interaction in fraudulent transaction detection systems. Artificial Genetic Algorithms have been explored as a novel approach, accurately identifying fraudulent transactions while minimizing false alerts. 
However, these methods often face challenges related to classification problems with variable misclassification costs. Ongoing research in this field continues to seek more robust and adaptable solutions to address the evolving nature of credit card fraud.
B. A Research Paper on Credit Card Fraud Detection (BORA MEHAR SRI SATYA TEJA, 2022)
The paper explores various techniques used in credit card fraud detection, including outlier detection, unsupervised outlier detection, peer group analysis, and breakpoint analysis. Outlier detection identifies abnormal transactions that deviate from a user's typical behaviour, but it may misclassify legitimate unusual transactions. Unsupervised outlier detection focuses on understanding customer transaction patterns without predicting specific outcomes. Peer group analysis compares entities with similar characteristics to identify anomalies. Breakpoint analysis examines structural changes in data to detect anomalies. The authors note that while supervised learning methods are commonly used in fraud detection, they may fail in certain cases. The paper highlights the challenge of class imbalance in fraud detection datasets, where genuine transactions significantly outnumber fraudulent ones. This imbalance can make it difficult to identify fraudulent activity accurately. The researchers also discuss \"concept drift,\" where transaction patterns change over time, further complicating the fraud detection process. To address these challenges, the paper proposes using machine learning algorithms such as Decision Trees and Random Forests, along with techniques like oversampling to mitigate class imbalance.
C. A machine learning based credit card fraud detection using the GA algorithm for feature selection, DOI: 10.1186/s40537-022-00573-8 (Emmanuel Ileberi, 2022)
The literature survey on credit card fraud detection reveals a growing interest in machine learning techniques to address this critical issue in financial security. Researchers have explored various approaches, including supervised and unsupervised learning methods, to improve the accuracy and efficiency of fraud detection systems. Several studies have focused on the application of traditional machine learning algorithms such as Support Vector Machines (SVM), Decision Trees, and Neural Networks. These methods have shown promising results in identifying fraudulent transactions, although they often face challenges related to imbalanced datasets and the dynamic nature of fraud patterns. Recent research has increasingly turned towards ensemble methods and hybrid approaches to enhance fraud detection capabilities. Random Forest and Gradient Boosting algorithms have gained popularity due to their ability to handle complex, high-dimensional data and their robustness against overfitting. Additionally, some studies have explored the potential of deep learning techniques, including Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, to capture intricate patterns in transaction data. These advanced methods have demonstrated improved performance in detecting subtle fraud patterns that may be missed by traditional approaches. A significant trend in the literature is the focus on feature engineering and selection techniques to improve model performance. Researchers have employed various methods, including Principal Component Analysis (PCA), Genetic Algorithms, and domain-specific feature extraction, to identify the most relevant attributes for fraud detection. 
Moreover, there is a growing emphasis on developing real-time fraud detection systems that can adapt to evolving fraud patterns and provide timely alerts. Despite these advancements, the literature highlights ongoing challenges in credit card fraud detection, including the need for more representative and up-to-date datasets, addressing class imbalance, and developing interpretable models that can provide insights into fraudulent behaviour patterns.
D. Review of Machine Learning Approach on Credit Card Fraud Detection, DOI: 10.1007/s44230-022-00004-0 (Rejwan Bin Sulaiman, 2022)
This review examines various machine learning techniques for credit card fraud detection (CCFD), focusing on their effectiveness, limitations, and privacy considerations. The paper discusses several algorithms, including Random Forest (RF), Artificial Neural Networks (ANN), Support Vector Machines (SVM), and K-Nearest Neighbors (KNN). Each method has its own strengths and weaknesses in handling CCFD tasks. For instance, Random Forest shows promise in handling large datasets but may be slower in real-time scenarios. ANN, particularly when used in unsupervised learning, demonstrates high accuracy and fault tolerance, making it a strong contender for CCFD applications. SVM performs well with smaller feature sets but struggles with larger volumes of data, while KNN offers high accuracy and efficiency but faces challenges with memory usage and performance degradation on extensive datasets. The review highlights a critical challenge in CCFD: balancing effective fraud detection with data privacy and confidentiality. Traditional centralized approaches to fraud detection face limitations due to data sharing restrictions imposed by regulations like GDPR. Even anonymized datasets stored locally on servers risk being reverse-engineered, potentially compromising user privacy. 
This privacy concern is a recurring theme across the machine learning approaches discussed in the paper, emphasizing the need for more secure and privacy-preserving methods in CCFD. To address these challenges, the paper proposes a hybrid approach combining Federated Learning (FL) with Artificial Neural Networks. This model trains on data locally on edge devices and shares only the trained model among participating institutions. The approach potentially enhances fraud detection accuracy while maintaining strict privacy standards. By allowing banks and financial centres to collaborate without directly sharing sensitive customer data, the proposed method offers a promising solution to the privacy-accuracy trade-off in CCFD. The authors suggest that this hybrid model could significantly improve fraud detection capabilities while ensuring compliance with data protection regulations.
E. A Review Paper on Feature Selection in Credit Card Fraud Detection (Surbhi Bansal, 2024)
Credit card fraud detection has been a subject of extensive research due to its significant economic impact. Researchers have compared the performance of various machine learning techniques such as Support Vector Machines, Random Forests, and Logistic Regression in detecting credit card fraud, highlighting the importance of feature selection in improving model accuracy. The challenge of class imbalance in fraud detection has also been addressed, with proposed methods combining techniques like SMOTE and random undersampling. These works have emphasized the need for adaptive learning techniques in handling evolving fraud patterns. Feature selection in fraud detection has seen increasing attention, with researchers exploring various approaches. The effectiveness of transaction aggregation for creating behavioural features has been demonstrated, significantly improving fraud detection rates. 
Scalable real-time fraud detection systems using feature engineering and hybrid methods have been proposed, showcasing the importance of both domain expertise and machine learning in feature creation. More recently, Swarm Intelligence techniques have been applied for feature selection in fraud detection, demonstrating improved model performance and interpretability compared to traditional methods.
F. Credit card fraud detection using machine learning (Mr. Thirunavukkarasu.M, 2021)
Credit card fraud detection has been an active area of research due to its significant economic impact. Previous studies have compared the performance of various machine learning techniques such as Support Vector Machines, Random Forests, and Logistic Regression for detecting credit card fraud, with Random Forests often outperforming other methods. Research has also demonstrated the effectiveness of transaction aggregation combined with Random Forests for fraud detection, showing improved results over single transaction analysis. In recent years, machine learning approaches have gained prominence in fraud detection. Researchers have addressed the challenge of class imbalance in credit card fraud detection datasets, proposing methods that combine undersampling with different algorithms to improve overall performance. Comprehensive reviews of intelligent fraud detection techniques have highlighted the potential of ensemble methods like Random Forests in handling the complex, high-dimensional data typical of financial transactions. The application of deep learning to credit card fraud detection has also emerged as a promising direction. Studies have explored the use of Long Short-Term Memory (LSTM) networks for sequence classification in credit card fraud detection, showing that incorporating transaction sequences can enhance detection accuracy compared to traditional methods. 
However, while deep learning models can offer improved performance, they often lack the interpretability of simpler models like Random Forests, which remains an important consideration in the financial industry.
III. OBJECTIVES
A. Understanding various ML models with respect to credit card fraud detection
We aim to explore and analyze different machine learning models, specifically logistic regression, isolation forest, k-means clustering, and a convolutional neural network, with respect to credit card fraud detection. We examine the principle behind each model and how it is used to identify fraudulent transactions.
B. Performance analysis of ML models
We evaluate each model's performance in detecting credit card fraud. This includes assessing its ability to correctly identify fraudulent transactions while minimizing false positives. The analysis is based on accuracy, precision, and recall, to provide an overall view of each model's effectiveness.
C. Assessing the effectiveness of each model using different metrics
To check that each model performs well, we use performance metrics beyond basic accuracy, including confusion matrices, AUC-ROC curves, and F1 scores. With these metrics we aim to learn more about the strengths and weaknesses of each detection model.
D. Providing a recommendation for the ML model
Based on our analysis, we provide insights and recommendations on which model performs best for credit card fraud detection. These recommendations consider model performance, computational requirements, and ease of implementation, providing guidance to future researchers.
E. Understanding features that affect model development
We examine the importance of different features in the dataset and their impact on the performance of each model. 
This involves conducting feature importance analysis to identify which transaction characteristics are most crucial in determining whether a transaction is fraudulent or legitimate.
IV. PROPOSED METHODOLOGY
To develop this credit card fraud detection project using various machine learning models, we took the following steps, which describe the project from the ground up:
A. System overview
Our credit card fraud detection system follows this workflow:
Data Ingestion: Raw data downloaded from Kaggle is fed to the system without any preprocessing or scaling.
Preprocessing: The data is cleaned through various methods and techniques so that modelling is done on meaningful data.
Data Scaling: The numerical features are normalized so that the models produce consistent outputs.
Applying Models: Four different machine learning models are applied to the preprocessed data.
Classification Report and Metrics: Performance metrics and reports are produced for each model.
B. Dataset Description
The dataset was downloaded from Kaggle; it was created by Brandon Harris using a transaction simulator (Brandon, 2022). It contains details of legitimate and fraudulent transactions from January 2019 to December 2020, covering the cards of over 1,000 customers and 800 merchants. The simulated data is an easy-to-use representation of real-life transactions. It comprises two files, “fraudTrain” and “fraudTest”, which together contain over 1.5 million transactions.
C. Data Preprocessing
In our preprocessing pipeline we:
Convert date to datetime: The time feature is converted to datetime for better interpretability.
Extract features from datetime: We extract additional features such as hour, day, and month to capture temporal patterns. 
Drop unnecessary columns: Redundant and non-informative columns are removed, which helps model interpretability.
Scale the data: Numerical features are standardized with a standard scaler so that all features contribute on a comparable scale during model development.
D. Model Description
We use four different types of model, each of which learns from the data differently:
Logistic Regression Model: In statistics, logistic regression estimates the probability of an event occurring based on the provided data and helps analyze the relationship between factors. It suits this task because it models the log-odds of a transaction being fraudulent, which can then be used for prediction.
Isolation Forest Model: This algorithm detects anomalies in the data with the help of binary trees. It is well suited to credit card fraud detection because it has low time complexity and memory use and therefore works well with large amounts of data.
K-Means Clustering: This is an unsupervised machine learning algorithm that groups unlabeled data into multiple clusters. It places centroids in the data and assigns each point to a cluster based on its distance to them. In theory this fits the task, because the model can create two clusters, genuine and fraudulent, and predict using their centroids.
Convolutional Neural Network: A CNN is a deep learning model, a type of neural network organized into input, hidden (including convolutional), and output layers. It supports local pattern detection and feature extraction and generally works well with large volumes of data.
E. Training Process
Although the preprocessing is the same for all four models, each undergoes a different training process:
Logistic Regression and Isolation Forest: These are trained directly on the preprocessed data with default hyperparameters.
K-Means: The number of clusters is determined using the elbow method before training. 
CNN: The network architecture is adapted to tabular data and uses multiple convolutional layers. Training runs for 10 rounds (epochs) with early stopping to prevent overfitting.
F. Evaluation and Analysis
We evaluate the models using the following metrics:
Accuracy: Overall correctness of the model.
Precision and Recall: To assess model performance on the minority class.
F1-score: The harmonic mean of precision and recall.
ROC-AUC: To check the model's ability to distinguish between the classes.
Confusion Matrix: To visualize model performance across all outcomes.
Feature Importance Analysis: To check which features in the dataset are most important for fraud detection.
Finally, we record all results, compare how each model performs on the various metrics, and note the time and computational power each model required to produce its final predictions.
TABLE I Requirements for the credit card fraud detection model
Hardware Requirements:
Graphics Card (Recommended): NVIDIA GPU with CUDA support (optional but recommended).
Compute Resources: 8-core CPU; adequate RAM (8 GB or above).
Storage: SSD with at least 20 GB free space.
Network Infrastructure: High-speed Internet connection.
Software Requirements:
Operating System: Windows 10/11, macOS, or Linux (Ubuntu 18.04 or later recommended).
CUDA Toolkit: Version compatible with PyTorch and the GPU (optional; applies only when a graphics card is used).
Necessary Libraries: numpy, scikit-learn, matplotlib, seaborn, pandas.
Development Tools: Anaconda, Jupyter Notebook.
These are general requirements.
V. RESULTS AND DISCUSSION
Evaluating each model is necessary to understand and rank the models. As discussed earlier, we evaluate all four models on the following metrics:
Training and Testing accuracy: The proportion of correct predictions out of the total number of cases. Comparing training and testing accuracy is used to assess overfitting. 
Classification Report: A summary of the key classification metrics, including precision, recall, and F1-score for each class. It provides a comprehensive view of the model's performance.
AUC-ROC and Average Precision Score: The AUC-ROC measures the model's ability to differentiate between classes across different thresholds. The average precision score summarizes the precision-recall curve as the weighted mean of the precisions achieved at each threshold.
Confusion Matrix: A table showing the number of correct and incorrect predictions made by the model. It provides a breakdown of the model's performance and helps identify error types.
Feature Importance: A measure of how much each feature contributes to the model's predictions. It highlights the transaction characteristics that matter most and provides insights for feature engineering and model interpretation.
A. Logistic Regression Model Performance
Images are available in the Figures carousel.
B. Isolation Forest Model Performance
Images are available in the Figures carousel.
C. K-Means Model Performance
Images are available in the Figures carousel.
D. Convolutional Neural Network Model Performance
Images are available in the Figures carousel.
Final analysis of all four models and their performance:
Logistic Regression: Accuracy 0.88; F1 score 0.94; Recall 0.88; Computational Power: Low; AUC-ROC score 0.91; Average Precision Score 0.15; Top Feature: Amount.
K-Means: Accuracy 0.55; F1 score 0.71; Recall 0.55; Computational Power: Low; AUC-ROC score 0.52; Average Precision Score 0.005; Top Feature: Gender_M.
Isolation Forest: Accuracy 0.98; F1 score 0.99; Recall 0.99; Computational Power: Medium; AUC-ROC score 0.54; Average Precision Score 0.006; Top Feature: Category_personal_care.
CNN: Accuracy 0.98; F1 score 0.99; Recall 0.99; Computational Power: High; AUC-ROC score 0.99; Average Precision Score 0.80; Top Feature: Amount.
VI. CONCLUSION
This study compares four machine learning models for credit card fraud detection: Logistic Regression, K-Means clustering, Isolation Forest, and a Convolutional Neural Network (CNN). We evaluated these models using a set of metrics including accuracy, F1 score, recall, and AUC-ROC score. The CNN achieves an accuracy of 0.98, an F1 score of 0.99, and an AUC-ROC of 0.99, but this superior performance comes at the cost of high computational power. Logistic regression also performs well, with an accuracy of 0.88, an F1 score of 0.94, and an AUC-ROC of 0.91, while requiring little computational power. Both models therefore emerge as viable options for fraud detection, including real-time detection: the CNN where accuracy is the priority and computational power is available, and logistic regression where computational resources are limited. Interestingly, the Isolation Forest model achieves an accuracy of 0.98, comparable to the CNN, but its low AUC-ROC score (0.54) points to a potential problem with class separation. This shows that it is important to consider multiple metrics when evaluating model performance, particularly in imbalanced classification problems like fraud detection. K-Means clustering performs poorly across all metrics, showing that it is not an ideal model for credit card fraud detection; this also suggests that unsupervised learning methods may not fit problems of this kind well. These results illustrate the trade-off between model complexity and performance in credit card fraud detection: models like the CNN provide higher detection capability, while models like logistic regression offer a strong balance between accuracy and computational efficiency. Finally, the choice of model should be based on the specific requirements and constraints of the fraud detection system being developed. 
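The observation above that Isolation Forest reaches high accuracy but a near-chance AUC-ROC can be illustrated with a small sketch. It is not the author's code: the labels are synthetic, the 1% fraud rate is an assumed figure, and the helper functions accuracy and roc_auc are hypothetical names written here in plain Python. The sketch compares a detector that never flags anything with one that catches every fraud at the cost of some false alarms.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true labels.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    # AUC-ROC as the probability that a random fraud case is scored above a
    # random genuine case, with ties counted as one half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic labels: 10 fraudulent (1) and 990 genuine (0) transactions.
# The 1% fraud rate is an assumption, not the rate in the Kaggle dataset.
y_true = [1] * 10 + [0] * 990

# Detector 1: never flags anything, so every transaction gets score 0.
pred_trivial = [0] * 1000
acc_trivial = accuracy(y_true, pred_trivial)
auc_trivial = roc_auc(y_true, [float(p) for p in pred_trivial])

# Detector 2: flags all 10 frauds plus 20 genuine transactions (false alarms).
pred_useful = [1] * 10 + [1] * 20 + [0] * 970
acc_useful = accuracy(y_true, pred_useful)
auc_useful = roc_auc(y_true, [float(p) for p in pred_useful])

print(f'trivial: accuracy={acc_trivial:.2f}, AUC-ROC={auc_trivial:.2f}')
print(f'useful:  accuracy={acc_useful:.2f}, AUC-ROC={auc_useful:.2f}')
```

Under these assumptions the detector that flags nothing has the higher accuracy (0.99 versus 0.98) yet an AUC-ROC of only 0.5, while the second detector's AUC-ROC is about 0.99, which is why accuracy alone is a weak guide on imbalanced data.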
This study contributes to ongoing research and development in credit card fraud detection. Future work can explore ensemble modelling techniques that combine the strengths of different models to improve detection, and develop computationally efficient models that can run on any device with minimal requirements.
Declarations
ACKNOWLEDGEMENT
I would like to extend my sincere and heartfelt thanks to my professor, Dr. Pallavi, who guided me by reviewing this research and providing feedback throughout, and who actively encouraged me to complete this work. I would also like to thank Mr. Arun Samanta for taking the time to review my work and provide a plagiarism report on the paper to keep this work original. This journey could not have been completed without the resources and knowledge provided by these faculty members at Presidency University. I am also grateful to OpenAI and Anthropic for their resources and tools, which helped with resource collection, content paraphrasing, and debugging coding errors. Finally, I would like to thank Google Scholar and ResearchGate for providing relevant articles that helped during the development of the project.
References
AlEmad M (2022) Credit Card Fraud Detection Using Machine Learning. RIT Digital Institutional Repository
BORA MEHAR SRI SATYA TEJA BM (2022) A Research Paper on Credit Card Fraud Detection. Int Res J Eng Technol, 1–4
Brandon H (2022) Synthetic Credit Card Transaction Generator used in the Sparkov program. Retrieved from GitHub: https://github.com/namebrandon/Sparkov_Data_Generation
Emmanuel Ileberi YS (2022) A machine learning based credit card fraud detection using the GA algorithm for feature selection. J Big Data, 2–15
Harris B (2020) Credit Card Transactions Fraud Detection Dataset. Retrieved from Kaggle: https://www.kaggle.com/datasets/kartik2112/fraud-detection
Mr. Thirunavukkarasu.M AN (2021) Credit Card Fraud Detection Using Machine Learning. Int J Comput Sci Mob Comput, 2–7
Rejwan B, Sulaiman VS (2022) Review of Machine Learning Approach on Credit Card Fraud Detection. Human-Centric Intell Syst, 1–12
Maniraj SP, A. S (2019) Credit Card Fraud Detection using Machine Learning and Data Science. Int J Eng Res Technol, 2–4
Surbhi Bansal RH (2024) A Review Paper on Feature Selection in Credit Card Fraud Detection. International Joint Conference on Computing Sciences, 1–5
Vaishnavi N, Dornadulaa GS (2019) Credit Card Fraud Detection using Machine Learning Algorithms. International Conference on Recent Trends in Advanced Computing, 3–9
Additional Declarations
The authors declare potential competing interests as follows: Nil 
Author: Harsh Mehta (corresponding author), Presidency University, Bangalore.
Figure legends:
Figure 1: Fig. 1.1 Workflow diagram of credit card fraud detection model.
Figure 2: Fig. 1.2 Workflow diagram of credit card fraud detection model.
Figure 3: Fig. 1.3 Workflow diagram of credit card fraud detection model.
Figure 4: Fig. 2. Architecture of credit card fraud detection model.
Figure 5: Fig. 3.1. Training and Testing scores using Logistic regression model.
Figure 6: Fig. 3.2. 
Classification report of Logistic regression model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage6.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/61b8e8892569fd9e0f1fde00.png\"},{\"id\":67416087,\"identity\":\"325679c0-2d9d-4d61-a5e2-ea50d612da69\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":7,\"title\":\"Figure 7\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":52906,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 3.3. ROC Curve of Logistic regression model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage7.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/74677ef95afdd42d48e072ec.png\"},{\"id\":67416674,\"identity\":\"f90154a1-bdde-42de-81cd-2b347597f41f\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:44:01\",\"extension\":\"png\",\"order_by\":8,\"title\":\"Figure 8\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":3987,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 3.4. AUC-ROC and Average Precision score of Logistic regression\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage8.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/19521bca527d8962c8e4f6d7.png\"},{\"id\":67417607,\"identity\":\"f923aef3-cf26-4e61-9282-61a7852fcb7e\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 17:08:01\",\"extension\":\"png\",\"order_by\":9,\"title\":\"Figure 9\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":23026,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 3.5. 
Confusion Matrix of Logistic regression model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage9.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/2ec4be08d99c5a6a5f94bcb9.png\"},{\"id\":67416817,\"identity\":\"21364e4c-2399-44ee-84cd-43f1c3946531\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:52:01\",\"extension\":\"png\",\"order_by\":10,\"title\":\"Figure 10\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":47989,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 3.6. Feature Importance Analysis of Logistic regression model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage10.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/1123c78a9bd89bd00d600a7d.png\"},{\"id\":67416679,\"identity\":\"7ccdce8c-eb24-4dd5-a8e5-dc3a39526a26\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:44:01\",\"extension\":\"png\",\"order_by\":11,\"title\":\"Figure 11\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":5626,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 4.1. Training and Testing scores using Isolation Forest model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage11.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/c60aa8cdf00bc9fb06db114a.png\"},{\"id\":67416671,\"identity\":\"027aaa32-c8b8-4694-b24a-83668616e1de\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:44:01\",\"extension\":\"png\",\"order_by\":12,\"title\":\"Figure 12\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":14610,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 4.2. 
Classification report of Isolation Forest model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage12.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/a83d5790713871cb661b4c9c.png\"},{\"id\":67417608,\"identity\":\"4afb369b-6723-4402-8dff-621b5d2bbfda\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 17:08:01\",\"extension\":\"png\",\"order_by\":13,\"title\":\"Figure 13\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":43876,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 4.3. ROC Curve of Isolation Forest model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage13.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/19691cd3417dd48f9efbcab9.png\"},{\"id\":67416097,\"identity\":\"5a693bd1-085c-49af-8218-d5a686491601\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":14,\"title\":\"Figure 14\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":3087,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 4.4. AUC-ROC and Average Precision score of Isolation Forest model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage14.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/a253e85608bfa485c44b2521.png\"},{\"id\":67416103,\"identity\":\"18cfdb5a-cc24-42a2-82a4-335f9b3cb3da\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":15,\"title\":\"Figure 15\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":39672,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 4.5. 
Confusion Matrix of Isolation Forest model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage15.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/dc668e8cda3697dcce716ca2.png\"},{\"id\":67417408,\"identity\":\"2a5985eb-a3aa-442a-9fba-70dc08b6fab9\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 17:00:01\",\"extension\":\"png\",\"order_by\":16,\"title\":\"Figure 16\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":41992,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 4.6. Feature Importance Analysis of Isolation Forest model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage16.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/e995d0b58d69bd5f767d2452.png\"},{\"id\":67416672,\"identity\":\"d4d79100-e1cf-4e00-8116-53c3794b4b59\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:44:01\",\"extension\":\"png\",\"order_by\":17,\"title\":\"Figure 17\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":3672,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 5.1. Training and Testing scores using K-Means model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage17.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/44425fcfe7411d4027295b0f.png\"},{\"id\":67417410,\"identity\":\"9e71d810-13dc-4b06-9d00-0a5801ca6ad8\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 17:00:01\",\"extension\":\"png\",\"order_by\":18,\"title\":\"Figure 18\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":13965,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 5.2. 
Classification report of K-Means model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage18.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/c50644f942f8b07f517a21aa.png\"},{\"id\":67416107,\"identity\":\"ac8330b1-aa4d-423a-b5ea-9858da557e0d\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":19,\"title\":\"Figure 19\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":76316,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 5.3. Clustering of K-means model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage19.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/76e6a7ea81fe2cd04a57bb69.png\"},{\"id\":67416089,\"identity\":\"3c6374ca-ed7f-4c4b-973c-c953a145ed84\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":20,\"title\":\"Figure 20\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":4052,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 5.4. AUC-ROC and Average Precision score of K-means model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage20.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/15b05eaa2fe201f08d867f99.png\"},{\"id\":67416085,\"identity\":\"a4bc9d9e-70d6-4d2f-931a-ae46ec17159d\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":21,\"title\":\"Figure 21\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":24657,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 5.5. 
Confusion Matrix of K-Means model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage21.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/d93dba9dba27879b96edb972.png\"},{\"id\":67416106,\"identity\":\"b3cfa251-8333-45eb-80ff-fc8436b9d754\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":22,\"title\":\"Figure 22\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":39717,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 5.6. Feature Importance Analysis of K-Means model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage22.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/2b001fc2d1a2773cdff1a291.png\"},{\"id\":67416683,\"identity\":\"7bb64eeb-55a2-4cf7-8b42-98e4932d2583\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:44:01\",\"extension\":\"png\",\"order_by\":23,\"title\":\"Figure 23\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":6735,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 6.1. Training and Testing scores using CNN model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage23.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/878aa1f92206e55902b31976.png\"},{\"id\":67416818,\"identity\":\"9a2649d1-f6f2-4044-9fca-936dba1c11da\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:52:01\",\"extension\":\"png\",\"order_by\":24,\"title\":\"Figure 24\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":27264,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 6.2. 
Classification report of CNN model.\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage24.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/cec2cda1aa5596d3c8d51c09.png\"},{\"id\":67416104,\"identity\":\"5218b3c4-cd9c-4dc0-9612-f88862f834eb\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":25,\"title\":\"Figure 25\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":49500,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 6.3. ROC Curve of CNN model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage25.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/70072402a414ff313b55e88f.png\"},{\"id\":67416105,\"identity\":\"d740e527-1925-483e-8552-03be782d098a\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":26,\"title\":\"Figure 26\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":8047,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 6.4. AUC-ROC and Average Precision score of CNN model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage26.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/92b2360166c3ae1e0476625b.png\"},{\"id\":67416092,\"identity\":\"81e4af5c-ac61-4db2-ab1d-a4a780c18eb0\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 16:36:01\",\"extension\":\"png\",\"order_by\":27,\"title\":\"Figure 27\",\"display\":\"\",\"copyAsset\":false,\"role\":\"figure\",\"size\":33954,\"visible\":true,\"origin\":\"\",\"legend\":\"\\u003cp\\u003eFig. 6.5. 
Feature Importance Analysis of CNN model\\u003c/p\\u003e\",\"description\":\"\",\"filename\":\"floatimage27.png\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/1369b7c889a11bc021d7c78a.png\"},{\"id\":67417635,\"identity\":\"1d4793db-18c9-4eb2-8595-1822d98d89d4\",\"added_by\":\"auto\",\"created_at\":\"2024-10-24 17:08:10\",\"extension\":\"pdf\",\"order_by\":0,\"title\":\"\",\"display\":\"\",\"copyAsset\":false,\"role\":\"manuscript-pdf\",\"size\":1133892,\"visible\":true,\"origin\":\"\",\"legend\":\"\",\"description\":\"\",\"filename\":\"manuscript.pdf\",\"url\":\"https://assets-eu.researchsquare.com/files/rs-5314340/v1/af30b570-5aec-466d-9f87-4f5b89b56756.pdf\"}],\"financialInterests\":\"The authors declare potential competing interests as follows: Nil\",\"formattedTitle\":\"\\u003cp\\u003eAnalysis of Different Machine Learning Models for Credit Card Fraud Detection\\u003c/p\\u003e\",\"fulltext\":[{\"header\":\"I. INTRODUCTION\",\"content\":\"\\u003cp\\u003eOnline financial transaction methods have grown rapidly in recent times and have been widely adopted because they are easy, reliable, and faster in many respects than traditional payment methods. Alongside this growth, online credit card fraud has become a concerning issue that challenges the security and integrity of information circulated over the internet. This paper aims to help future peers understand and choose \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodels\\u003c/em\\u003e according to their build requirements.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eA. Background on credit card frauds\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eCredit card fraud has become a significant threat in the digital age, posing an enormous financial risk to individuals, businesses and the global financial system. 
As e-commerce and digital transactions grow, so do fraudulent activities. Credit card fraud generally occurs when unauthorized individuals gain access to card information through means such as data breaches, skimming devices, or phishing attacks. These scammers then use the stolen information to make unauthorized purchases or even cash withdrawals, often resulting in financial losses for cardholders and merchants. The problem goes beyond the loss of money: it erodes trust in digital payment systems and can lead to long-term economic instability if left unchecked.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eB. Current challenges in detection\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eThe detection and prevention of credit card fraud present several challenges for developers and organizations. One of the \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"primary\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eprimary\\u003c/em\\u003e obstacles is the highly dynamic nature of fraudulent activity, with scammers constantly changing and adopting new methods to evade detection systems. This makes it necessary to keep evolving detection methods to stay ahead of emerging threats and to stop fraud before it takes place. Genuine transactions vastly outnumber fraudulent ones, so fraudulent transactions make up only a very small fraction of the dataset. This imbalance creates biased \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodels\\u003c/em\\u003e that prioritize the majority class and may miss critical fraudulent transactions. 
Additionally, the sensitive nature of financial data often limits access to real-world datasets, making it very difficult for researchers and developers to build and test a \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel.\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eC. Our approach and its significance\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eOur approach to this issue is a performance analysis of multiple machine learning \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodels\\u003c/em\\u003e applied to credit card fraud detection. Using a dataset from \\u003cem class=\\\"Highlight ht6bbde3a5-ff65-4ca4-808a-27bcf7eafcf3\\\" highlight=\\\"true\\\" htmatch=\\\"kaggle\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eKaggle\\u003c/em\\u003e named “Credit Card Transactions Fraud Detection Dataset” (\\u003cem class=\\\"Highlight htf42ccfb9-5a20-4c00-a242-49e5af408730\\\" highlight=\\\"true\\\" htmatch=\\\"bra*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eBrandon,\\u003c/em\\u003e \\u003cspan citationid=\\\"CR3\\\" class=\\\"CitationRef\\\"\\u003e2022\\u003c/span\\u003e), which mimics real-world transaction patterns while preserving users’ privacy, we implemented a methodology in which we evaluate the effectiveness of different \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodels\\u003c/em\\u003e, such as a regression \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" 
htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel,\\u003c/em\\u003e decision tree \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel,\\u003c/em\\u003e clustering \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel\\u003c/em\\u003e and convolutional neural network (CNN). We compare the performance of these \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodels\\u003c/em\\u003e across multiple metrics, such as the classification report, confusion matrix, AUC-ROC score and feature importance analysis. Through this comparison we aim to identify their relative strengths and weaknesses in the context of credit card fraud detection. This performance evaluation supports ongoing efforts to improve fraud detection systems and offers guidance to future peers in selecting and implementing an appropriate \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel\\u003c/em\\u003e for their needs in similar security applications.\\u003c/p\\u003e\"},{\"header\":\"II. LITERATURE REVIEW\",\"content\":\"\\u003cp\\u003e\\u003cem\\u003eA. 
Credit Card Fraud Detection using Machine Learning and Data Science, ISSN: 2278-0181\\u003c/em\\u003e, (S P Maniraj, \\u003cspan citationid=\\\"CR8\\\" class=\\\"CitationRef\\\"\\u003e2019\\u003c/span\\u003e)\\u003c/p\\u003e\\u003cp\\u003eFraud detection in credit card transactions has been a subject of extensive research due to its significant financial implications. Previous studies have explored various data mining applications and machine learning techniques for automated fraud detection. Supervised and unsupervised learning methods have been applied to this domain, with varying degrees of success. Some researchers have utilized outlier mining and distance sum algorithms to predict fraudulent transactions in emulated credit card transaction datasets. While these methods have shown promise in certain areas, they have not provided a consistent and permanent solution to the fraud detection problem.\\u003c/p\\u003e\\u003cp\\u003eMore recent approaches have incorporated advanced techniques such as hybrid data mining/complex network classification algorithms. These methods have demonstrated effectiveness in detecting \\u003cem class=\\\"Highlight htf42ccfb9-5a20-4c00-a242-49e5af408730\\\" highlight=\\\"true\\\" htmatch=\\\"illegal\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eillegal\\u003c/em\\u003e instances in real card transaction datasets, particularly for medium-sized online transactions. Efforts have also been made to improve the alert feedback interaction in fraudulent transaction detection systems. Artificial Genetic Algorithms have been explored as a novel approach, showing accuracy in identifying fraudulent transactions while minimizing false alerts. However, these methods often face challenges related to classification problems with variable misclassification costs. 
The ongoing research in this field continues to seek more robust and adaptable solutions to address the evolving nature of credit card fraud.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eB. A Research Paper on Credit Card Fraud Detection\\u003c/em\\u003e, (BORA MEHAR SRI SATYA TEJA, \\u003cspan citationid=\\\"CR2\\\" class=\\\"CitationRef\\\"\\u003e2022\\u003c/span\\u003e)\\u003c/p\\u003e\\u003cp\\u003eThe paper explores various techniques used in credit card fraud detection, including outlier detection, unsupervised outlier detection, peer group analysis, and breakpoint analysis. Outlier detection identifies abnormal transactions that deviate from a user's typical behaviour, but it may misclassify legitimate unusual transactions. Unsupervised outlier detection focuses on understanding customer transaction patterns without predicting specific outcomes. Peer group analysis compares entities with similar characteristics to identify anomalies. Breakpoint analysis examines structural changes in data to detect anomalies.\\u003c/p\\u003e\\u003cp\\u003eThe authors note that while supervised learning methods are commonly used in fraud detection, they may fail in certain \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"case*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003ecases.\\u003c/em\\u003e The paper highlights the challenge of class imbalance in fraud detection datasets, where genuine transactions significantly outnumber fraudulent ones. This imbalance can lead to difficulties in accurately identifying fraudulent activities. The researchers also discuss the concept of \\\"concept drift,\\\" where transaction patterns change over time, further complicating the fraud detection process. 
To address these challenges, the paper proposes using machine learning algorithms such as Decision Trees and \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"random\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eRandom\\u003c/em\\u003e Forests, along with techniques like oversampling to mitigate class imbalance issues.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eC. A machine learning based credit card fraud detection using the GA algorithm for feature selection, DOI\\u003c/em\\u003e: \\u003cspan class=\\\"ExternalRef\\\"\\u003e\\u003cspan class=\\\"RefSource\\\"\\u003e10.1186/s40537-022-00573-8\\u003c/span\\u003e\\u003cspan address=\\\"10.1186/s40537-022-00573-8\\\" targettype=\\\"DOI\\\" class=\\\"RefTarget\\\"\\u003e\\u003c/span\\u003e\\u003c/span\\u003e, (Emmanuel Ileberi, \\u003cspan citationid=\\\"CR4\\\" class=\\\"CitationRef\\\"\\u003e2022\\u003c/span\\u003e)\\u003c/p\\u003e\\u003cp\\u003eThe literature \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"survey\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003esurvey\\u003c/em\\u003e on credit card fraud detection reveals a growing interest in machine learning techniques to address this critical issue in financial security. Researchers have explored various approaches, including supervised and unsupervised learning methods, to improve the accuracy and efficiency of fraud detection systems. Several studies have focused on the application of traditional machine learning algorithms such as Support Vector Machines (SVM), Decision Trees, and Neural Networks. 
These methods have shown promising results in identifying fraudulent transactions, although they often face challenges related to imbalanced datasets and the dynamic nature of fraud patterns.\\u003c/p\\u003e\\u003cp\\u003eRecent research has increasingly turned towards ensemble methods and hybrid approaches to enhance fraud detection capabilities. \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"random\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eRandom\\u003c/em\\u003e Forest and Gradient Boosting algorithms have gained popularity due to their ability to handle complex, high-dimensional data and their robustness against overfitting. Additionally, some studies have explored the potential of deep learning techniques, including Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) networks, to capture intricate patterns in transaction data. These advanced methods have demonstrated improved performance in detecting subtle fraud patterns that may be missed by traditional approaches.\\u003c/p\\u003e\\u003cp\\u003eA significant trend in the literature is the focus on feature engineering and selection techniques to improve \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel\\u003c/em\\u003e performance. Researchers have employed various methods, including Principal Component Analysis (PCA), Genetic Algorithms, and domain-specific feature extraction, to identify the most relevant attributes for fraud detection. Moreover, there is a growing emphasis on developing real-time fraud detection systems that can adapt to evolving fraud patterns and provide timely alerts. 
Despite these advancements, the literature highlights ongoing challenges in credit card fraud detection, including the need for more representative and up-to-date datasets, addressing class imbalance issues, and developing interpretable \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodels\\u003c/em\\u003e that can provide insights into fraudulent behaviour patterns.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eD. Review of Machine Learning Approach on Credit Card Fraud Detection, DOI\\u003c/em\\u003e: \\u003cspan class=\\\"ExternalRef\\\"\\u003e\\u003cspan class=\\\"RefSource\\\"\\u003e10.1007/s44230-022-00004-0\\u003c/span\\u003e\\u003cspan address=\\\"10.1007/s44230-022-00004-0\\\" targettype=\\\"DOI\\\" class=\\\"RefTarget\\\"\\u003e\\u003c/span\\u003e\\u003c/span\\u003e, (Rejwan Bin Sulaiman, 2022)\\u003c/p\\u003e\\u003cp\\u003eThis review examines various machine learning techniques for credit card fraud detection (CCFD), focusing on their effectiveness, limitations, and privacy considerations. The paper discusses several algorithms, including \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"random\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eRandom\\u003c/em\\u003e Forest (RF), Artificial Neural Networks (ANN), Support Vector Machines (SVM), and K-Nearest Neighbors (KNN). Each method demonstrates unique strengths and weaknesses in handling CCFD tasks. For instance, \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"random\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003eRandom\\u003c/em\\u003e Forest shows promise in handling large datasets but may be slower in real-time scenarios. 
ANN, particularly when used in unsupervised learning, demonstrates high accuracy and fault tolerance, making it a strong contender for CCFD applications. SVM performs well with smaller feature sets but struggles with larger volumes of data, while KNN offers high accuracy and efficiency but faces challenges with memory usage and performance degradation on extensive datasets.\\u003c/p\\u003e\\u003cp\\u003eThe review highlights a critical challenge in CCFD: balancing effective fraud detection with data privacy and confidentiality. Traditional centralized approaches to fraud detection face limitations due to data sharing restrictions imposed by regulations like GDPR. Even anonymized datasets stored locally on servers risk being reverse-engineered, potentially compromising user privacy. This privacy concern is a recurring theme across various machine learning approaches discussed in the paper, emphasizing the need for more secure and privacy-preserving methods in CCFD.\\u003c/p\\u003e\\u003cp\\u003eTo address these challenges, the paper proposes a hybrid approach combining Federated Learning (FL) with Artificial Neural Networks. This innovative \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel\\u003c/em\\u003e aims to train on data locally on edge devices, sharing only the trained \\u003cem class=\\\"Highlight ht2ecd8aa4-09dc-4ddc-8bb0-28c2efee0ea2\\\" highlight=\\\"true\\\" htmatch=\\\"model*\\\" htloopnumber=\\\"559071958\\\" style=\\\"font-style: inherit;\\\"\\u003emodel\\u003c/em\\u003e among participating institutions. This approach potentially enhances fraud detection accuracy while maintaining strict privacy standards. By allowing banks and financial centres to collaborate without directly sharing sensitive customer data, the proposed method offers a promising solution to the privacy-accuracy trade-off in CCFD. 
The authors suggest that this hybrid model could significantly improve fraud detection capabilities while ensuring compliance with data protection regulations, marking a potential advancement in the field of credit card fraud detection.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eE. A Review Paper on Feature Selection in Credit Card Fraud Detection\\u003c/em\\u003e, (Surbhi Bansal, \\u003cspan citationid=\\\"CR9\\\" class=\\\"CitationRef\\\"\\u003e2024\\u003c/span\\u003e)\\u003c/p\\u003e\\u003cp\\u003eCredit card fraud detection has been a subject of extensive research due to its significant economic impact. Researchers have compared the performance of various machine learning techniques such as Support Vector Machines, Random Forests, and Logistic Regression in detecting credit card fraud, highlighting the importance of feature selection in improving model accuracy. The challenge of class imbalance in fraud detection has also been addressed, with proposed methods combining techniques like SMOTE and random undersampling. 
These works have emphasized the need for adaptive learning techniques in handling evolving fraud patterns.\\u003c/p\\u003e\\u003cp\\u003eFeature selection in fraud detection has seen increasing attention, with researchers exploring various approaches. The effectiveness of transaction aggregation for creating behavioural features has been demonstrated, significantly improving fraud detection rates. Scalable real-time fraud detection systems using feature engineering and hybrid methods have been proposed, showcasing the importance of both domain expertise and machine learning in feature creation. More recently, swarm intelligence techniques have been applied for feature selection in fraud detection, demonstrating improved model performance and interpretability compared to traditional methods.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eF. Credit card fraud detection using machine learning\\u003c/em\\u003e, (Mr. Thirunavukkarasu.M, \\u003cspan citationid=\\\"CR6\\\" class=\\\"CitationRef\\\"\\u003e2021\\u003c/span\\u003e)\\u003c/p\\u003e\\u003cp\\u003eCredit card fraud detection has been an active area of research due to its significant economic impact. 
Previous studies have compared the performance of various machine learning techniques such as Support Vector Machines, Random Forests, and Logistic Regression for detecting credit card fraud, with Random Forests often outperforming other methods. Research has also demonstrated the effectiveness of transaction aggregation combined with Random Forests for fraud detection, showing improved results over single-transaction analysis.\\u003c/p\\u003e\\u003cp\\u003eIn recent years, machine learning approaches have gained prominence in fraud detection. Researchers have addressed the challenge of class imbalance in credit card fraud detection datasets, proposing methods that combine undersampling with different algorithms to improve overall performance. Comprehensive reviews of intelligent fraud detection techniques have highlighted the potential of ensemble methods like Random Forests in handling the complex, high-dimensional data typical of financial transactions.\\u003c/p\\u003e\\u003cp\\u003eThe application of deep learning to credit card fraud detection has also emerged as a promising direction. 
Studies have explored the use of Long Short-Term Memory (LSTM) networks for sequence classification in credit card fraud detection, showing that incorporating transaction sequences can enhance detection accuracy compared to traditional methods. However, while deep learning models can offer improved performance, they often lack the interpretability of simpler models like Random Forests, and interpretability remains an important consideration in the financial industry.\\u003c/p\\u003e\"},{\"header\":\"III. OBJECTIVES\",\"content\":\"\\u003cp\\u003e\\u003cem\\u003eA. Understanding various ML models with respect to credit card fraud detection\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eWe aim to explore and analyze different machine learning models, specifically logistic regression, isolation forest, k-means clustering, and convolutional neural networks, with respect to credit card fraud detection. 
We will examine the principle behind each model and how it is used to identify fraudulent transactions.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eB. Performance analysis of ML models\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eWe will evaluate each model’s performance in detecting credit card fraud. This includes assessing its ability to correctly identify fraudulent transactions while minimizing false positives. The analysis is based on metrics such as accuracy, precision, and recall, to give an overall view of each model’s effectiveness.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eC. 
Assessing the effectiveness of each model using different metrics\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eTo judge how well each model performs, we will use performance metrics beyond basic accuracy. These include confusion matrices, AUC-ROC curves, and F1 scores; with them we aim to learn more about the strengths and weaknesses of each detection model.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eD. Providing recommendations for ML models\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eBased on our analysis, we will provide insights and recommendations on which model performs best for credit card fraud detection. 
These recommendations will consider factors such as model performance, computational requirements, and ease of implementation, providing guidance to future peers.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eE. Understanding features that affect the model development\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eWe will study the importance of different features in the dataset and their impact on the performance of each model. This involves conducting feature importance analysis to identify which transaction characteristics are most crucial in determining whether a transaction is fraudulent or legitimate.\\u003c/p\\u003e\"},{\"header\":\"IV. PROPOSED METHODOLOGY\",\"content\":\"\\u003cp\\u003eTo develop this credit card fraud detection project using various machine learning models, we took the following steps, which build up the project from scratch:\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eA. 
System overview\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eOur credit card fraud detection system follows this workflow:\\u003c/p\\u003e\\u003cul\\u003e \\u003cli\\u003e \\u003cp\\u003eData Ingestion: Raw data downloaded from Kaggle is fed to the system without any preprocessing or scaling.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003ePreprocessing: The data is cleaned using various methods and techniques so that modelling is done on meaningful data.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eData Scaling: The numerical features are normalized so that the models produce consistent outputs.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eApplying Pretrained Models: We use four different pre-trained machine learning models on the preprocessed data.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eClassification 
Report and Metrics: Performance metrics and reports are produced for each model.\\u003c/p\\u003e \\u003c/li\\u003e \\u003c/ul\\u003e\\u003cp\\u003e\\u003cem\\u003eB. Dataset Description\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eThe dataset used in this project was downloaded from Kaggle; it originally belongs to Brandon Harris and was generated using a simulator (Brandon, \\u003cspan citationid=\\\"CR3\\\" class=\\\"CitationRef\\\"\\u003e2022\\u003c/span\\u003e). The data consists of legitimate and fraudulent transaction details from Jan 2019 to Dec 2020 and covers the card details of over 1000 customers and 800 merchants. The simulator produces an easy-to-use fraud transaction dataset that represents real-life transactions. It contains two files, “fraudTrain” and “fraudTest”, which together contain over 1.5\\u0026nbsp;million transactions.\\u003c/p\\u003e\\u003cp\\u003e\\u003cem\\u003eC. 
Data Preprocessing\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eIn our preprocessing pipeline we:\\u003c/p\\u003e\\u003cul\\u003e \\u003cli\\u003e \\u003cp\\u003eConvert date to datetime: The time feature is converted to datetime for better interpretability.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eExtract features from datetime: We extract additional features like hour, day, and month to capture temporal patterns.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eDrop unnecessary columns: Removing redundant and non-informative columns helps model interpretability.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eScale the data: Numerical features are scaled using a standard scaler so that all features contribute to model development.\\u003c/p\\u003e \\u003c/li\\u003e \\u003c/ul\\u003e\\u003cp\\u003e\\u003cem\\u003eD. 
Model Description\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eWe use four different types of models, each of which works with and learns from the data differently:\\u003c/p\\u003e\\u003cul\\u003e \\u003cli\\u003e \\u003cp\\u003eLogistic Regression Model: In statistics, the logistic regression model estimates the probability of an event taking place based on the provided dataset and helps analyze the relationship between factors. It fits this task well because it models the log-odds of a transaction being fraudulent and uses them for future predictions.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eIsolation Forest Model: This algorithm is used for anomaly detection in the data with the help of binary trees. 
This algorithm is well suited to credit card fraud detection because it has low time complexity and memory use, so it also works well with large amounts of data.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eK-Means clustering: This is an unsupervised machine learning algorithm that groups unlabeled data into multiple groups or clusters. It places centroids in the data and assigns each data point to a cluster based on its distance to them. In theory this model fits well, as it can create two clusters, real and fake, and predict using their centroids.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eConvolutional Neural Network: A CNN is a deep learning model, a type of neural network that usually has three kinds of layers: input, hidden, and output. It helps with local pattern detection and feature extraction and generally works well with large volumes of data.\\u003c/p\\u003e \\u003c/li\\u003e \\u003c/ul\\u003e\\u003cp\\u003e\\u003cem\\u003eE. 
Training Process\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eAlthough the preprocessing method is the same for all four models, each of them undergoes a different training process:\\u003c/p\\u003e\\u003cul\\u003e \\u003cli\\u003e \\u003cp\\u003eLogistic Regression and Isolation Forest: These are trained directly on the preprocessed data with default hyperparameters.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eK-Means: The number of clusters is determined using the elbow method before training.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eCNN: The network architecture is adapted to the tabular data and uses multiple convolutional layers. Training runs for 10 rounds with early stopping to prevent overfitting.\\u003c/p\\u003e \\u003c/li\\u003e \\u003c/ul\\u003e\\u003cp\\u003e\\u003cem\\u003eF. 
Evaluation and Analysis\\u003c/em\\u003e\\u003c/p\\u003e\\u003cp\\u003eWe will evaluate the models using metrics such as:\\u003c/p\\u003e\\u003cul\\u003e \\u003cli\\u003e \\u003cp\\u003eAccuracy: Overall correctness of the model.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003ePrecision and Recall: To assess model performance on the minority class.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eF1-score: The harmonic mean of precision and recall.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eROC-AUC: To check the models’ ability to distinguish between different classes.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eConfusion Matrix: To visualize model performance across all outcomes.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eFeature Importance Analysis: To check which feature in the dataset is most important for fraud detection.\\u003c/p\\u003e 
\\u003c/li\\u003e \\u003c/ul\\u003e\\u003cp\\u003eFinally, we will record all the results, compare how each model performs across the metrics, and note the time and computational power each model required to produce its final predictions.\\u003c/p\\u003e\\u003cp\\u003eTABLE\\u0026nbsp;I\\u003c/p\\u003e\\n\\u003cp\\u003eRequirements for credit card fraud detection model\\u003c/p\\u003e\\n\\u003ctable border=\\\"1\\\" cellspacing=\\\"0\\\" cellpadding=\\\"0\\\" width=\\\"331\\\"\\u003e\\n \\u003cthead\\u003e\\n \\u003ctr\\u003e\\n \\u003ctd style=\\\"width: 42.9003%;\\\"\\u003e\\n \\u003cp\\u003eHardware Requirements\\u003c/p\\u003e\\n \\u003c/td\\u003e\\n \\u003ctd style=\\\"width: 57.0997%;\\\"\\u003e\\n \\u003cp\\u003eSoftware Requirements\\u003c/p\\u003e\\n \\u003c/td\\u003e\\n \\u003c/tr\\u003e\\n \\u003c/thead\\u003e\\n \\u003ctbody\\u003e\\n \\u003ctr\\u003e\\n \\u003ctd style=\\\"width: 42.9003%;\\\"\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eGraphics Card (Recommended):\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- NVIDIA GPU with CUDA support (Optional but recommended).\\u003c/p\\u003e\\n \\u003cp\\u003e\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eCompute Resources:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- 8 core CPU.\\u003c/p\\u003e\\n \\u003cp\\u003e- Adequate RAM (8GB or 
above)\\u003c/p\\u003e\\n \\u003cp\\u003e\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eStorage:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- SSD with at least 20GB free space\\u003c/p\\u003e\\n \\u003cp\\u003e\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eNetwork Infrastructure:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- High-speed Internet Connection.\\u003c/p\\u003e\\n \\u003c/td\\u003e\\n \\u003ctd style=\\\"width: 57.0997%;\\\"\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eOperating System:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- Windows 10/11, macOS, or Linux (Ubuntu 18.04 or later recommended)\\u003c/p\\u003e\\n \\u003cp\\u003e\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eCUDA Toolkit:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- Version compatible with PyTorch and GPU (Optional: works only with graphic cards)\\u003c/p\\u003e\\n \\u003cp\\u003e\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eNecessary Library:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e- numpy\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e- scikit-learn \\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e- matplotlib\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e- seaborn\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e- pandas\\u003c/p\\u003e\\n \\u003cp\\u003e\\u0026nbsp;\\u003c/p\\u003e\\n \\u003cp\\u003e\\u003cstrong\\u003eDevelopment Tools:\\u003c/strong\\u003e\\u003c/p\\u003e\\n \\u003cp\\u003e-\\u0026nbsp;\\u0026nbsp;Anaconda\\u003c/p\\u003e\\n \\u003cp\\u003e-\\u0026nbsp;Jupyter Notebook\\u003c/p\\u003e\\n \\u003c/td\\u003e\\n \\u003c/tr\\u003e\\n \\u003c/tbody\\u003e\\n\\u003c/table\\u003e\\n\\u003cp\\u003eThese are general requirements\\u003c/p\\u003e\"},{\"header\":\"V. RESULTS AND DISCUSSION\",\"content\":\"\\u003cp\\u003eEvaluation of each model is necessary to understand and rank the models accordingly. 
As discussed earlier, we will evaluate all four models on different metrics:\\u003c/p\\u003e \\u003cp\\u003e \\u003cul\\u003e \\u003cli\\u003e \\u003cp\\u003eTraining and Testing accuracy: This is the proportion of correct predictions out of the total number of cases. Comparing accuracy on the training and testing sets helps assess overfitting.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eClassification Report: This is a summary of the key classification metrics, including precision, recall, and F1-score for each class. It provides a comprehensive view of the model\\u0026rsquo;s performance.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eAUC-ROC and Average Precision Score: The AUC-ROC measures the model\\u0026rsquo;s ability to differentiate between classes across different thresholds. The average precision score summarizes the precision-recall curve as the weighted mean of the precisions achieved at each threshold.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eConfusion Matrix: This is a table showing the number of correct and incorrect predictions made by the model. It provides a breakdown of the model\\u0026rsquo;s performance and helps us understand the error types.\\u003c/p\\u003e \\u003c/li\\u003e \\u003cli\\u003e \\u003cp\\u003eFeature Importance: This measures how much each feature contributes to the model\\u0026rsquo;s predictions. It identifies the most influential transaction characteristics and provides insights for feature engineering and model interpretation.\\u003c/p\\u003e \\u003c/li\\u003e \\u003c/ul\\u003e \\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eA. Logistic Regression Model Performance\\u003c/em\\u003e \\u003c/p\\u003e\\u003cp\\u003eImages are available in the Figures carousel.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eB. 
Isolation Forest Model Performance\\u003c/em\\u003e \\u003c/p\\u003e\\u003cp\\u003eImages are available in the Figures carousel.\\u003c/p\\u003e\\u003cp\\u003e \\u003cem\\u003eC. K-Mean Model Performance\\u003c/em\\u003e \\u003c/p\\u003e \\u003cp\\u003eImages are available in the Figures carousel.\\u003c/p\\u003e \\u003cp\\u003e \\u003cem\\u003eD. Convolutional Neural Network Model Performance\\u003c/em\\u003e \\u003c/p\\u003e \\u003cp\\u003eImages are available in the Figures carousel.\\u003c/p\\u003e\\u003cp\\u003e \\u003c/p\\u003e \\u003cp\\u003eFinal Analysis of all the 4 models and their performance.\\u003c/p\\u003e \\u003cp\\u003e \\u003cdiv class=\\\"gridtable\\\"\\u003e\\u003ctable float=\\\"No\\\" id=\\\"Tabb\\\" border=\\\"1\\\"\\u003e \\u003ccolgroup cols=\\\"4\\\"\\u003e \\u003cdiv align=\\\"left\\\" class=\\\"colspec\\\" colname=\\\"c1\\\" colnum=\\\"1\\\"\\u003e\\u003c/div\\u003e \\u003cdiv align=\\\"left\\\" class=\\\"colspec\\\" colname=\\\"c2\\\" colnum=\\\"2\\\"\\u003e\\u003c/div\\u003e \\u003cdiv align=\\\"left\\\" class=\\\"colspec\\\" colname=\\\"c3\\\" colnum=\\\"3\\\"\\u003e\\u003c/div\\u003e \\u003cdiv align=\\\"left\\\" class=\\\"colspec\\\" colname=\\\"c4\\\" colnum=\\\"4\\\"\\u003e\\u003c/div\\u003e \\u003cthead\\u003e \\u003ctr\\u003e \\u003cth align=\\\"left\\\" colname=\\\"c1\\\"\\u003e \\u003cp\\u003eLogistic Regression\\u003c/p\\u003e \\u003c/th\\u003e \\u003cth align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eK- Mean\\u003c/p\\u003e \\u003c/th\\u003e \\u003cth align=\\\"left\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003eIsolation forest\\u003c/p\\u003e \\u003c/th\\u003e \\u003cth align=\\\"left\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003eCNN\\u003c/p\\u003e \\u003c/th\\u003e \\u003c/tr\\u003e \\u003c/thead\\u003e \\u003ctbody\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c1\\\"\\u003e \\u003cp\\u003e- Accuracy: 0.88\\u003c/p\\u003e \\u003cp\\u003e- F1 score: 0.94\\u003c/p\\u003e \\u003cp\\u003e- Recall: 
0.88\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003e- Accuracy: 0.55\\u003c/p\\u003e \\u003cp\\u003e- F1 score: 0.71\\u003c/p\\u003e \\u003cp\\u003e- Recall: 0.55\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e- Accuracy: 0.98\\u003c/p\\u003e \\u003cp\\u003e- F1 score: 0.99\\u003c/p\\u003e \\u003cp\\u003e- Recall: 0.99\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e- Accuracy: 0.98\\u003c/p\\u003e \\u003cp\\u003e- F1 score: 0.99\\u003c/p\\u003e \\u003cp\\u003e- Recall: 0.99\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c1\\\"\\u003e \\u003cp\\u003eComputational Power: Low\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eComputational Power: Low\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003eComputational Power: Medium\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003eComputational Power: High\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c1\\\"\\u003e \\u003cp\\u003e- AUC-ROC score: 0.91\\u003c/p\\u003e \\u003cp\\u003e- Average Precision Score: 0.15\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003e- AUC-ROC score: 0.52\\u003c/p\\u003e \\u003cp\\u003e- Average Precision Score: 0.005\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003e- AUC-ROC score: 0.54\\u003c/p\\u003e \\u003cp\\u003e- Average Precision Score: 0.006\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003e- AUC-ROC score: 0.99\\u003c/p\\u003e \\u003cp\\u003e- Average Precision Score: 
0.80\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003ctr\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c1\\\"\\u003e \\u003cp\\u003eTop Feature:\\u003c/p\\u003e \\u003cp\\u003eAmount\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c2\\\"\\u003e \\u003cp\\u003eTop Feature:\\u003c/p\\u003e \\u003cp\\u003eGender_M\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c3\\\"\\u003e \\u003cp\\u003eTop Feature:\\u003c/p\\u003e \\u003cp\\u003eCategory_personal_care\\u003c/p\\u003e \\u003c/td\\u003e \\u003ctd align=\\\"left\\\" colname=\\\"c4\\\"\\u003e \\u003cp\\u003eTop Feature:\\u003c/p\\u003e \\u003cp\\u003eAmount\\u003c/p\\u003e \\u003c/td\\u003e \\u003c/tr\\u003e \\u003c/tbody\\u003e \\u003c/colgroup\\u003e \\u003c/table\\u003e\\u003c/div\\u003e \\u003c/p\\u003e \"},{\"header\":\"VI. CONCLUSION\",\"content\":\"\\u003cp\\u003eThis study compares four machine learning models for credit card fraud detection: Logistic Regression, K-Means clustering, Isolation Forest, and a Convolutional Neural Network (CNN). We evaluated these models on several metrics, including accuracy, F1 score, recall, and AUC-ROC score.\\u003c/p\\u003e \\u003cp\\u003eOur results show clear differences between the models. The CNN achieves an accuracy of 0.98, an F1 score of 0.99, and an AUC-ROC of 0.99. However, this superior performance comes at the cost of high computational power. Logistic Regression also performs well, with an accuracy of 0.88, an F1 score of 0.94, and an AUC-ROC of 0.91, while requiring little computational power. 
Therefore, these two models emerge as viable options for real-time fraud detection, where accuracy is the priority and computational cost is a secondary concern.\\u003c/p\\u003e \\u003cp\\u003eInterestingly, the Isolation Forest model achieves an accuracy of 0.98, matching the CNN, but its low AUC-ROC score of 0.54 suggests that it separates the two classes poorly. This shows that it\\u0026rsquo;s important to consider multiple metrics when evaluating model performance, particularly in imbalanced classification problems like fraud detection. K-Means clustering performs poorly across all metrics, which shows that it is not a suitable model for credit card fraud detection. This also indicates that unsupervised learning methods may not fit problems like credit card fraud detection well.\\u003c/p\\u003e \\u003cp\\u003eThese results illustrate the trade-off between model complexity and performance in credit card fraud detection. Models like the CNN provide higher detection capability, while models like Logistic Regression offer a strong balance between accuracy and computational efficiency. Ultimately, the choice of model should be based on the specific requirements and constraints of the fraud detection system being developed.\\u003c/p\\u003e \\u003cp\\u003eThis study contributes to ongoing research and development in credit card fraud detection. Future work can explore ensemble modelling techniques that combine the strengths of different models to improve detection, and develop computationally efficient models that can run on devices with minimal resources.\\u003c/p\\u003e \"},{\"header\":\"Declarations\",\"content\":\"\\u003cp\\u003eACKNOWLEDGEMENT\\u003c/p\\u003e \\u003cp\\u003eI would like to extend my sincere and heartfelt thanks to my professor, Dr. Pallavi, who guided me by reviewing and providing feedback throughout this research and actively encouraged me to complete this work. 
I would also like to thank Mr. Arun Samanta for taking the time to review my work and for providing a plagiarism report on the paper to keep this work original. This journey could not have been completed without the resources and knowledge provided by these faculty members at Presidency University.\\u003c/p\\u003e \\u003cp\\u003eI am also very grateful to OpenAI and Anthropic for their tools and resources, which helped with collecting resources, paraphrasing content, and debugging code. Finally, I would like to thank Google Scholar and ResearchGate for providing relevant articles, which helped during the development of the project.\\u003c/p\\u003e\"},{\"header\":\"References\",\"content\":\"\\u003col\\u003e\\u003cli\\u003e\\u003cspan\\u003eAlEmad M (2022) Credit Card Fraud Detection Using Machine Learning. RIT Digital Institutional Repository\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eBORA MEHAR SRI SATYA TEJA BM (2022) A Research Paper on Credit Card Fraud Detection. Int Res J Eng Technol, 1\\u0026ndash;4\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eBrandon H (2022) \\u003cem\\u003eSynthetic Credit Card Transaction Generator used in the Sparkov program\\u003c/em\\u003e. Retrieved from GitHub: \\u003cspan class=\\\"ExternalRef\\\"\\u003e\\u003cspan class=\\\"RefSource\\\"\\u003ehttps://github.com/namebrandon/Sparkov_Data_Generation\\u003c/span\\u003e\\u003cspan address=\\\"https://github.com/namebrandon/Sparkov_Data_Generation\\\" targettype=\\\"URL\\\" class=\\\"RefTarget\\\"\\u003e\\u003c/span\\u003e\\u003c/span\\u003e\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eEmmanuel Ileberi YS (2022) A machine learning based credit card fraud detection using the GA algorithm for feature selection. 
J Big data, 2\\u0026ndash;15\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eHarris B (2020) \\u003cem\\u003eCredit Card Transactions Fraud Detection Dataset.\\u003c/em\\u003e Retrieved from Kaggle: \\u003cspan class=\\\"ExternalRef\\\"\\u003e\\u003cspan class=\\\"RefSource\\\"\\u003ehttps://www.kaggle.com/datasets/kartik2112/fraud-detection\\u003c/span\\u003e\\u003cspan address=\\\"https://www.kaggle.com/datasets/kartik2112/fraud-detection\\\" targettype=\\\"URL\\\" class=\\\"RefTarget\\\"\\u003e\\u003c/span\\u003e\\u003c/span\\u003e\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eMr. Thirunavukkarasu.M AN (2021) CREDIT CARD FRAUD DETECTION USING MACHINE LEARNING. Int J Comput Sci Mob Comput, 2\\u0026ndash;7\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eRejwan B, Sulaiman VS (2022) Review of Machine Learning Approach on Credit Card Fraud Detection. Human-Centric Intell Syst, 1\\u0026ndash;12\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eManiraj SP, A. S (2019) Credit Card Fraud Detection using Machine Learning and Data Science. Int J Eng Res Technol, 2\\u0026ndash;4\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eSurbhi Bansal RH (2024) A Review Paper on Feature Selection in Credit Card Fraud Detection. \\u003cem\\u003eInternational joint conference on computing sciences\\u003c/em\\u003e, 1\\u0026ndash;5\\u003c/span\\u003e\\u003c/li\\u003e \\u003cli\\u003e\\u003cspan\\u003eVaishnavi N, Dornadulaa GS (2019) Credit Card fraud Detection using Machine Learning Algorithms. 
\\u003cem\\u003eInternational Conference on Recent Trends in Advanced Computing\\u003c/em\\u003e, 3\\u0026ndash;9\\u003c/span\\u003e\\u003c/li\\u003e\\u003c/ol\\u003e\"}],\"fulltextSource\":\"\",\"fullText\":\"\",\"funders\":[],\"hasAdminPriorityOnWorkflow\":false,\"hasManuscriptDocX\":true,\"hasOptedInToPreprint\":true,\"hasPassedJournalQc\":\"\",\"hasAnyPriority\":true,\"hideJournal\":true,\"highlight\":\"\",\"institution\":\"Presidency University\",\"isAcceptedByJournal\":false,\"isAuthorSuppliedPdf\":false,\"isDeskRejected\":\"\",\"isHiddenFromSearch\":false,\"isInQc\":false,\"isInWorkflow\":false,\"isPdf\":false,\"isPdfUpToDate\":true,\"isWithdrawnOrRetracted\":false,\"journal\":{\"display\":true,\"email\":\"info@researchsquare.com\",\"identity\":\"researchsquare\",\"isNatureJournal\":false,\"hasQc\":true,\"allowDirectSubmit\":true,\"externalIdentity\":\"\",\"sideBox\":\"\",\"snPcode\":\"\",\"submissionUrl\":\"/submission\",\"title\":\"Research Square\",\"twitterHandle\":\"researchsquare\",\"acdcEnabled\":true,\"dfaEnabled\":false,\"editorialSystem\":\"\",\"reportingPortfolio\":\"\",\"inReviewEnabled\":false,\"inReviewRevisionsEnabled\":true},\"keywords\":\"Credit card fraud, machine learning, logistic regression, isolation forest, k-mean clustering, convolutional neural network, financial security\",\"lastPublishedDoi\":\"10.21203/rs.3.rs-5314340/v1\",\"lastPublishedDoiUrl\":\"https://doi.org/10.21203/rs.3.rs-5314340/v1\",\"license\":{\"name\":\"CC BY 4.0\",\"url\":\"https://creativecommons.org/licenses/by/4.0/\"},\"manuscriptAbstract\":\"\\u003cp\\u003eThe increase in number of online transactions has led to a significant amount of credit card fraud over the past decade. Unauthorized use of one\\u0026rsquo;s credit card information by stealing the information through dark web or scam calls, poses a major risk to both customer and businesses, particularly in e-commerce setting. 
This paper presents a comparative analysis of multiple machine learning models for credit card fraud detection, including logistic regression, isolation forest, K \\u0026ndash; mean clustering, and convolutional neural networks. With a highly unbalanced dataset we aim to evaluate these models\\u0026rsquo; performance in differentiating between genuine and fraudulent transactions based on features such as transaction history, user details, and merchant information. Our experiment results will help provide insights into effectiveness of each model for finding patterns to distinguish between real and fake that can be applied to real world data. This research contributes to the field of financial security by offering guidance on model selection for credit card fraud detection and related applications. View this project here.\\u003c/p\\u003e\",\"manuscriptTitle\":\"Analysis of Different Machine Learning Models for Credit Card Fraud Detection\",\"msid\":\"\",\"msnumber\":\"\",\"nonDraftVersions\":[{\"code\":1,\"date\":\"2024-10-24 16:35:56\",\"doi\":\"10.21203/rs.3.rs-5314340/v1\",\"editorialEvents\":[{\"type\":\"communityComments\",\"content\":0}],\"status\":\"published\",\"journal\":{\"display\":true,\"email\":\"info@researchsquare.com\",\"identity\":\"researchsquare\",\"isNatureJournal\":false,\"hasQc\":true,\"allowDirectSubmit\":true,\"externalIdentity\":\"\",\"sideBox\":\"\",\"snPcode\":\"\",\"submissionUrl\":\"/submission\",\"title\":\"Research Square\",\"twitterHandle\":\"researchsquare\",\"acdcEnabled\":true,\"dfaEnabled\":false,\"editorialSystem\":\"\",\"reportingPortfolio\":\"\",\"inReviewEnabled\":false,\"inReviewRevisionsEnabled\":true}}],\"origin\":\"\",\"ownerIdentity\":\"6ab0a75f-8767-439a-8348-6f282a564134\",\"owner\":[],\"postedDate\":\"October 24th, 
2024\",\"published\":true,\"recentEditorialEvents\":[],\"rejectedJournal\":[],\"revision\":\"\",\"amendment\":\"\",\"status\":\"posted\",\"subjectAreas\":[],\"tags\":[],\"updatedAt\":\"2024-10-24T16:35:56+00:00\",\"versionOfRecord\":[],\"versionCreatedAt\":\"2024-10-24 16:35:56\",\"video\":\"\",\"vorDoi\":\"\",\"vorDoiUrl\":\"\",\"workflowStages\":[]},\"version\":\"v1\",\"identity\":\"rs-5314340\",\"journalConfig\":\"researchsquare\"},\"__N_SSP\":true},\"page\":\"/article/[identity]/[[...version]]\",\"query\":{\"redirect\":\"/article/rs-5314340\",\"identity\":\"rs-5314340\",\"version\":[\"v1\"]},\"buildId\":\"qtupq5eGEP_6zYnWcrvyt\",\"isFallback\":false,\"isExperimentalCompile\":false,\"dynamicIds\":[84888],\"gssp\":true,\"scriptLoader\":[]}","source_license":"CC-BY-4.0","license_restricted":false}