100+ datasets found
  1. Credit Card Fraud Detection Dataset

    • kaggle.com
    zip
    Updated Dec 17, 2025
    Cite
    Arif Miah (2025). Credit Card Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/miadul/credit-card-fraud-detection-dataset
    Available download formats
    zip (113829 bytes)
    Dataset updated
    Dec 17, 2025
    Authors
    Arif Miah
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    📌 Dataset Description

    **Credit Card Fraud Detection Dataset**

    This dataset contains 10,000 credit card transactions designed to support research and experimentation in fraud detection using machine learning. The data realistically simulates both legitimate and fraudulent transactions, while maintaining a naturally imbalanced class distribution, which reflects real-world financial systems.

    The dataset is suitable for binary classification tasks, where the objective is to predict whether a transaction is fraudulent (1) or legitimate (0) based on transaction behavior, risk indicators, and cardholder information.

    🎯 Objective

    To build and evaluate machine learning models that can identify fraudulent credit card transactions using transaction-level features such as amount, time, location mismatch, device trust, and transaction velocity.

    📊 Dataset Characteristics

    • Total Records: 10,000
    • Total Features: 10
    • Target Variable: is_fraud
    • Class Distribution: Highly imbalanced (fraud ≈ 4–5%)
    • Data Type: Synthetic (privacy-safe)

    🧾 Feature Description

    | Feature Name | Description |
    | --- | --- |
    | transaction_id | Unique identifier for each transaction |
    | amount | Transaction amount |
    | transaction_hour | Hour of transaction (0–23) |
    | merchant_category | Type of merchant |
    | foreign_transaction | Indicates if transaction is international (0/1) |
    | location_mismatch | Billing vs transaction location mismatch (0/1) |
    | device_trust_score | Trust score of the device (0–100) |
    | velocity_last_24h | Number of transactions in last 24 hours |
    | cardholder_age | Age of the cardholder |
    | is_fraud | Target variable (0 = Normal, 1 = Fraud) |
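
    The feature list above can be sketched as a typed record with a few range checks. This is a minimal illustration only: the column names and the stated ranges (0–23, 0–100, 0/1) come from the listing, while the Python types, the `Transaction` and `validate` names, and the sample values are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Field names follow the listing; the types are assumptions.
    transaction_id: int
    amount: float
    transaction_hour: int
    merchant_category: str
    foreign_transaction: int
    location_mismatch: int
    device_trust_score: float
    velocity_last_24h: int
    cardholder_age: int
    is_fraud: int


def validate(tx: Transaction) -> list[str]:
    """Return a list of range violations for one record (empty if none)."""
    problems = []
    if not 0 <= tx.transaction_hour <= 23:
        problems.append("transaction_hour outside 0-23")
    if not 0 <= tx.device_trust_score <= 100:
        problems.append("device_trust_score outside 0-100")
    for name in ("foreign_transaction", "location_mismatch", "is_fraud"):
        if getattr(tx, name) not in (0, 1):
            problems.append(f"{name} is not 0/1")
    return problems


# Invented sample rows, for illustration only.
ok = Transaction(1, 42.5, 13, "grocery", 0, 0, 87.0, 3, 35, 0)
bad = Transaction(2, 10.0, 24, "travel", 2, 0, 120.0, 1, 50, 1)
print(validate(ok))
print(validate(bad))
```

    Checks like these are cheap to run once after loading the file, before any modeling.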

    🔍 Potential Use Cases

    • Credit card fraud detection systems
    • Imbalanced classification problems
    • Feature engineering and risk analysis
    • Model comparison (Logistic Regression, Random Forest, XGBoost, etc.)
    • Anomaly detection research

    📈 Recommended Evaluation Metrics

    Due to class imbalance, the following metrics are recommended:

    • Precision
    • Recall
    • F1-score
    • ROC-AUC
    • Confusion Matrix
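
    As a rough sketch of how most of these metrics relate, the snippet below derives precision, recall, and F1 from the confusion-matrix counts in plain Python. The function name `fraud_metrics` and the toy labels are hypothetical. ROC-AUC is left out because it needs ranked scores rather than hard predictions.

```python
def fraud_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and the metrics derived from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion_matrix": [[tn, fp], [fn, tp]],  # rows: actual 0, actual 1
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy labels (invented): 95 legitimate and 5 fraudulent transactions.
y_true = [0] * 95 + [1] * 5
# The hypothetical model flags 2 legitimate and 3 fraudulent transactions.
y_pred = [0] * 93 + [1] * 2 + [1] * 3 + [0] * 2

report = fraud_metrics(y_true, y_pred)
print(report)
```

    In this toy case accuracy is 0.96 while precision and recall are both 0.6, which is why accuracy alone is a poor guide on imbalanced data.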

    ⚠️ Notes

    • This dataset is synthetically generated and does not contain any real customer data.
    • It is intended for educational, research, and practice purposes only.
  2. Credit Card Fraud Detection

    • test.researchdata.tuwien.ac.at
    • zenodo.org
    • +1 more
    csv, json, pdf +2
    Updated Apr 28, 2025
    Cite
    Ajdina Grizhja (2025). Credit Card Fraud Detection [Dataset]. http://doi.org/10.82556/yvxj-9t22
    Available download formats
    text/markdown, csv, pdf, txt, json
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    TU Wien
    Authors
    Ajdina Grizhja
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Apr 28, 2025
    Description

    1. Dataset Description

    Research Domain
    This work resides in the domain of financial fraud detection and applied machine learning. We focus on detecting anomalous credit‐card transactions in real time to reduce financial losses and improve trust in digital payment systems.

    Purpose
    The goal is to train and evaluate a binary classification model that flags potentially fraudulent transactions. By publishing both the code and data splits via FAIR repositories, we enable reproducible benchmarking of fraud‐detection algorithms and support future research on anomaly detection in transaction data.

    Data Sources
    We used the publicly available credit‐card transaction dataset from Kaggle (original source: https://www.kaggle.com/mlg-ulb/creditcardfraud), which contains anonymized transactions made by European cardholders over two days in September 2013. The dataset includes 284 807 transactions, of which 492 are fraudulent.

    Method of Dataset Preparation

    1. Schema validation: Renamed columns to snake_case (e.g. transaction_amount, is_declined) so they conform to DBRepo’s requirements.

    2. Data import: Uploaded the full CSV into DBRepo, assigned persistent identifiers (PIDs).

    3. Splitting: Programmatically derived three subsets—training (70%), validation (15%), test (15%)—using range‐based filters on the primary key actionnr. Each subset was materialized in DBRepo and assigned its own PID for precise citation.

    4. Cleaning: Converted the categorical flags (is_declined, isforeigntransaction, ishighriskcountry, isfradulent) from “Y”/“N” to 1/0 and dropped non‐feature identifiers (actionnr, merchant_id).

    5. Modeling: Trained a RandomForest classifier on the training split, tuned on validation, and evaluated on the held‐out test set.
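
    Steps 3 and 4 above can be sketched in plain Python as follows. This is an illustration under assumptions, not the authors' code: the description does not give the exact key ranges, so the sketch sorts by actionnr and cuts contiguous 70/15/15 ranges. The helper names `split_by_key` and `clean` and the toy rows are hypothetical; the column names are taken from the description.

```python
FLAG_COLUMNS = ("is_declined", "isforeigntransaction",
                "ishighriskcountry", "isfradulent")
ID_COLUMNS = ("actionnr", "merchant_id")


def split_by_key(rows, key="actionnr", train_pct=70, val_pct=15):
    """Split rows into train/validation/test by contiguous ranges of the key."""
    ordered = sorted(rows, key=lambda row: row[key])
    n = len(ordered)
    train_end = n * train_pct // 100  # integer arithmetic avoids float rounding
    val_end = train_end + n * val_pct // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


def clean(row):
    """Map "Y"/"N" flags to 1/0 and drop the non-feature identifier columns."""
    cleaned = {k: v for k, v in row.items() if k not in ID_COLUMNS}
    for column in FLAG_COLUMNS:
        cleaned[column] = 1 if cleaned[column] == "Y" else 0
    return cleaned


# Hypothetical toy rows; the values are invented for illustration only.
rows = [
    {
        "actionnr": i,
        "merchant_id": f"M{i:03d}",
        "transaction_amount": 10.0 * i,
        "is_declined": "N",
        "isforeigntransaction": "Y" if i % 4 == 0 else "N",
        "ishighriskcountry": "N",
        "isfradulent": "Y" if i % 10 == 0 else "N",
    }
    for i in range(1, 21)
]

train_rows, val_rows, test_rows = split_by_key(rows)
train_clean = [clean(row) for row in train_rows]
print(len(train_rows), len(val_rows), len(test_rows))
```

    Splitting on the primary key keeps each subset reproducible from a simple filter, which matches the idea of giving every subset its own citable identifier.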

    2. Technical Details

    Dataset Structure

    • The raw data is a single CSV with columns:

      • actionnr (integer transaction ID)

      • merchant_id (string)

      • average_amount_transaction_day (float)

      • transaction_amount (float)

      • is_declined, isforeigntransaction, ishighriskcountry, isfradulent (binary flags)

      • total_number_of_declines_day, daily_chargeback_avg_amt, sixmonth_avg_chbk_amt, sixmonth_chbk_freq (numeric features)

    Naming Conventions

    • All columns use lowercase snake_case.

    • Subsets are named creditcard_training, creditcard_validation, creditcard_test in DBRepo.

    • Files in the code repo follow a clear structure:

      ├── data/         # local copies only; raw data lives in DBRepo
      ├── notebooks/Task.ipynb
      ├── models/rf_model_v1.joblib
      ├── outputs/        # confusion_matrix.png, roc_curve.png, predictions.csv
      ├── README.md
      ├── requirements.txt
      └── codemeta.json
      

    Required Software

    • Python 3.9+

    • pandas, numpy (data handling)

    • scikit-learn (modeling, metrics)

    • matplotlib (visualizations)

    • dbrepo‐client.py (DBRepo API)

    • requests (TU WRD API)

    Additional Resources

    3. Further Details

    Data Limitations

    • Highly imbalanced: only ~0.17% of transactions are fraudulent.

    • The anonymized PCA features (V1–V28) are hidden; we extended the data with domain features but cannot reverse-engineer the raw variables.

    • Time‐bounded: only covers two days of transactions, may not capture seasonal patterns.

    Licensing and Attribution

    • Raw data: CC-0 (per Kaggle terms)

    • Code & notebooks: MIT License

    • Model artifacts & outputs: CC-BY 4.0

    • DUWRD records include ORCID identifiers for the author.

    Recommended Uses

    • Benchmarking new fraud‐detection algorithms on a standard imbalanced dataset.

    • Educational purposes: demonstrating model‐training pipelines, FAIR data practices.

    • Extension: adding time‐series or deep‐learning models.

    Known Issues

    • Possible temporal leakage if date/time features are not handled correctly.

    • Model performance may degrade on live data due to concept drift.

    • Binary flags may oversimplify nuanced transaction outcomes.

  3. Credit Card Fraud Detection Dataset

    • kaggle.com
    zip
    Updated May 15, 2025
    Cite
    Ghanshyam Saini (2025). Credit Card Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/credit-card-fraud-detection-dataset
    Available download formats
    zip (69155672 bytes)
    Dataset updated
    May 15, 2025
    Authors
    Ghanshyam Saini
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Credit Card Fraud Detection Dataset (European Cardholders, September 2013)

    As a data contributor, I'm sharing this crucial dataset focused on the detection of fraudulent credit card transactions. Recognizing these illicit activities is paramount for protecting customers and the integrity of financial systems.

    About the Dataset:

    This dataset encompasses credit card transactions made by European cardholders during a two-day period in September 2013. It presents a real-world scenario with a significant class imbalance, where fraudulent transactions are considerably less frequent than legitimate ones. Out of a total of 284,807 transactions, only 492 are instances of fraud, representing a mere 0.172% of the entire dataset.

    Content of the Data:

    Due to confidentiality concerns, the majority of the input features in this dataset have undergone a Principal Component Analysis (PCA) transformation. This means the original meaning and context of features V1, V2, ..., V28 are not directly provided. However, these principal components capture the variance in the underlying transaction data.

    The only features that have not been transformed by PCA are:

    • Time: Numerical. Represents the number of seconds elapsed between each transaction and the very first transaction recorded in the dataset.
    • Amount: Numerical. The transaction amount in Euros (€). This feature could be valuable for cost-sensitive learning approaches.

    The target variable for this classification task is:

    • Class: Integer. Takes the value 1 in the case of a fraudulent transaction and 0 otherwise.

    Important Note on Evaluation:

    Given the substantial class imbalance (far more legitimate transactions than fraudulent ones), traditional accuracy metrics based on the confusion matrix can be misleading. It is strongly recommended to evaluate models using the Area Under the Precision-Recall Curve (AUPRC), as this metric is more sensitive to the performance on the minority class (fraudulent transactions).

    How to Use This Dataset:

    1. Download the dataset file (likely in CSV format).
    2. Load the data using libraries like Pandas.
    3. Understand the class imbalance: Be aware that fraudulent transactions are rare.
    4. Explore the features: Analyze the distributions of 'Time', 'Amount', and the PCA-transformed features (V1-V28).
    5. Address the class imbalance: Consider using techniques like oversampling the minority class, undersampling the majority class, or using specialized algorithms designed for imbalanced datasets.
    6. Build and train binary classification models to predict the 'Class' variable.
    7. Evaluate your models using AUPRC to get a meaningful assessment of performance in detecting fraud.
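
    To make step 7 concrete, here is a minimal sketch of average precision, a common step-wise estimate of the area under the precision-recall curve. It assumes no tied scores; library implementations (for example scikit-learn's `average_precision_score`) handle ties differently. The function name `average_precision` and the toy labels and scores are hypothetical.

```python
def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each positive example."""
    total_positives = sum(y_true)
    if total_positives == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank transactions from most to least suspicious.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_positives += 1
            precision_sum += true_positives / rank
    return precision_sum / total_positives


# Invented labels and model scores, for illustration only.
y_true = [0, 0, 1, 0, 1, 0, 0, 0, 1, 0]
scores = [0.10, 0.20, 0.90, 0.30, 0.80, 0.05, 0.40, 0.15, 0.35, 0.25]

ap = average_precision(y_true, scores)
baseline = sum(y_true) / len(y_true)  # what a random ranking achieves on average
print(ap, baseline)
```

    The useful comparison is against the fraud prevalence: a random ranking scores about the prevalence, so on the real data the baseline would be roughly 0.0017 rather than 0.5.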

    Acknowledgements and Citation:

    This dataset has been collected and analyzed through a research collaboration between Worldline and the Machine Learning Group (MLG) of ULB (Université Libre de Bruxelles).

    When using this dataset in your research or projects, please cite the following works as appropriate:

    • Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015.
    • Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon.
    • Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE.
    • Andrea Dal Pozzolo. Adaptive Machine learning for credit card fraud detection ULB MLG PhD thesis (supervised by G. Bontempi).
    • Fabrizio Carcillo, Andrea Dal Pozzolo, Yann-Aël Le Borgne, Olivier Caelen, Yannis Mazzer, Gianluca Bontempi. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier.
    • Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Gianluca Bontempi. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing.
    • Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019.
    • Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi *Combining Unsupervised and Supervised...
  4. Credit Card Fraud Detection

    • kaggle.com
    zip
    Updated Oct 21, 2024
    Cite
    Bhadra Mohit (2024). Credit Card Fraud Detection [Dataset]. https://www.kaggle.com/datasets/bhadramohit/credit-card-fraud-detection
    Available download formats
    zip (1830392 bytes)
    Dataset updated
    Oct 21, 2024
    Authors
    Bhadra Mohit
    License

    https://cdla.io/sharing-1-0/

    Description

    Credit Card Fraud: Analysis and Prevention

    Overview

    Credit card fraud represents a significant threat to the integrity of financial transactions and consumer trust in digital commerce. As the reliance on credit cards for everyday purchases continues to grow, so does the sophistication of fraudsters exploiting vulnerabilities in the system. This project aims to analyze patterns of credit card fraud, understand the factors contributing to fraudulent activities, and explore effective methods for detection and prevention.

    Dataset Description

    The dataset comprises 100,000 transactions generated to simulate real-world credit card activity. Each entry includes the following features:

    • TransactionID: A unique identifier for each transaction, ensuring traceability.
    • TransactionDate: The date and time when the transaction occurred, allowing for temporal analysis.
    • Amount: The monetary value of the transaction, which can help identify unusually large transactions that may indicate fraud.
    • MerchantID: An identifier for the merchant involved in the transaction, useful for assessing merchant-related fraud patterns.
    • TransactionType: Indicates whether the transaction was a purchase or a refund, providing context for the activity.
    • Location: The geographic location of the transaction, facilitating analysis of fraud trends by region.
    • IsFraud: A binary target variable indicating whether the transaction is fraudulent (1) or legitimate (0), essential for supervised learning models.

    Analysis Objectives

    Exploratory Data Analysis (EDA):

    • Examine the distribution of transaction amounts and types.
    • Identify trends in transaction dates and locations.
    • Analyze the ratio of fraudulent to legitimate transactions.

    Pattern Recognition:

    • Use clustering techniques to group transactions and identify unusual patterns.
    • Explore correlations between transaction features and the occurrence of fraud.

    Fraud Detection Modeling:

    • Implement machine learning algorithms (e.g., logistic regression, decision trees, random forests) to build predictive models that can classify transactions as fraudulent or legitimate.
    • Evaluate model performance using metrics such as accuracy, precision, recall, and the F1 score.

    Feature Importance Analysis:

    • Determine which features contribute most significantly to the detection of fraud, aiding in the refinement of fraud detection systems.

    Potential Solutions

    • Real-time Monitoring Systems: Develop systems capable of analyzing transactions in real-time, flagging suspicious activities based on learned patterns and thresholds.
    • Consumer Education: Promote awareness among consumers about the signs of credit card fraud and best practices for safeguarding personal information.
    • Collaboration with Merchants: Work closely with merchants to implement better security measures, such as enhanced verification processes for high-risk transactions.
    • Regulatory Compliance: Ensure compliance with regulations and standards (e.g., PCI DSS) to enhance security protocols across the payment ecosystem.

    Conclusion

    Understanding and addressing credit card fraud is vital for maintaining consumer confidence and the overall health of the financial system. Through rigorous analysis and the application of advanced machine learning techniques, this project aims to contribute valuable insights and practical solutions for combating credit card fraud effectively.

  5. creditcard-fraud-detection

    • huggingface.co
    Updated Aug 29, 2024
    Cite
    Keith Gauvin (2024). creditcard-fraud-detection [Dataset]. https://huggingface.co/datasets/kgauvin603/creditcard-fraud-detection
    Dataset updated
    Aug 29, 2024
    Authors
    Keith Gauvin
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    kgauvin603/creditcard-fraud-detection dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. Credit Card Fraud Detection Dataset

    • agentsfordata.com
    Updated Dec 27, 2025
    Cite
    Agents for Data (2025). Credit Card Fraud Detection Dataset [Dataset]. https://www.agentsfordata.com/datasets/dat_019b59d0-a0ca-75e2-9e20-d6a8e4c27c00/credit-card-fraud-detection-dataset
    Available download formats
    csv, application/x-parquet
    Dataset updated
    Dec 27, 2025
    Dataset authored and provided by
    Agents for Data
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    284,807 European credit card transactions from September 2013 with PCA-anonymized features. Contains 492 frauds (0.172% fraud rate) - the gold standard benchmark for imbalanced classification and anomaly detection ML research.

  7. Credit Card Fraud Detection

    • gts.ai
    json
    Updated Aug 10, 2024
    Cite
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED (2024). Credit Card Fraud Detection [Dataset]. https://gts.ai/dataset-download/page/61/
    Available download formats
    json
    Dataset updated
    Aug 10, 2024
    Dataset authored and provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    An anonymized Credit Card Fraud Detection dataset designed for building, training, and evaluating machine learning models for financial fraud prevention.

  8. Fraud Detection

    • kaggle.com
    zip
    Updated Sep 12, 2022
    + more versions
    Cite
    Aman Chauhan (2022). Fraud Detection [Dataset]. https://www.kaggle.com/datasets/whenamancodes/fraud-detection
    Available download formats
    zip (69155672 bytes)
    Dataset updated
    Sep 12, 2022
    Authors
    Aman Chauhan
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.

    About Data

    The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.

    It contains only numerical input variables which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable and it takes value 1 in case of fraud and 0 otherwise.
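
    One common way to read "example-dependent cost-sensitive learning" is to score predictions by money rather than by error counts: a missed fraud costs its own 'Amount', while every flagged transaction costs a fixed review fee. The sketch below assumes that formulation. The `review_cost` value, the function name `total_cost`, and the toy data are hypothetical and do not come from the dataset description.

```python
def total_cost(y_true, y_pred, amounts, review_cost=5.0):
    """Sum per-transaction costs under a simple example-dependent cost model."""
    cost = 0.0
    for label, prediction, amount in zip(y_true, y_pred, amounts):
        if prediction == 1:
            cost += review_cost  # every alert is reviewed, fraud or not
        elif label == 1:
            cost += amount  # missed fraud: the full amount is lost
    return cost


# Invented toy transactions, for illustration only.
y_true = [0, 1, 1, 0]
y_pred = [0, 0, 1, 1]
amounts = [20.0, 300.0, 150.0, 40.0]

print(total_cost(y_true, y_pred, amounts))
```

    Under this model, missing one large fraud can outweigh many false alarms, which plain error counts cannot express.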

    Given the class imbalance ratio, we recommend measuring the accuracy using the Area Under the Precision-Recall Curve (AUPRC). Confusion matrix accuracy is not meaningful for unbalanced classification.
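
    A short worked calculation shows why accuracy is uninformative here. Using only the counts quoted above, a trivial classifier that labels every transaction as legitimate is right about 99.8% of the time while catching no fraud at all. The variable names are illustrative.

```python
total_transactions = 284_807
frauds = 492
legitimate = total_transactions - frauds

# Trivial baseline: predict "legitimate" (class 0) for every transaction.
accuracy_all_legit = legitimate / total_transactions
recall_all_legit = 0 / frauds  # no fraud is ever flagged
fraud_rate = frauds / total_transactions

print(f"fraud rate: {fraud_rate:.5%}")
print(f"baseline accuracy: {accuracy_all_legit:.5%}")
print(f"baseline recall: {recall_all_legit:.0%}")
```

    Any model therefore has to beat this baseline on recall and precision, not on accuracy.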

    Acknowledgements

    The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project.

    Please cite the following works:

    Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015

    Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon

    Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE

    Dal Pozzolo, Andrea Adaptive Machine learning for credit card fraud detection ULB MLG PhD thesis (supervised by G. Bontempi)

    Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier

    Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing

    Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019

    Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi. Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection, Information Sciences, 2019

    Yann-Aël Le Borgne, Gianluca Bontempi. Reproducible machine Learning for Credit Card Fraud Detection - Practical Handbook

    Bertrand Lebichot, Gianmarco Paldino, Wissam Siblini, Liyun He, Frederic Oblé, Gianluca Bontempi. Incremental learning strategies for credit cards fraud detection, International Journal of Data Science and Analytics

  9. Credit Card Fraud Detection

    • zenodo.org
    csv
    Updated Dec 5, 2022
    + more versions
    Cite
    Luqi Liu (2022). Credit Card Fraud Detection [Dataset]. http://doi.org/10.5281/zenodo.7395559
    Available download formats
    csv
    Dataset updated
    Dec 5, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luqi Liu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The dataset is from https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

    The dataset contains transactions made by credit cards in September 2013 by European cardholders.
    This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.

  10. Fraud Detection Dataset

    • kaggle.com
    zip
    Updated Nov 9, 2024
    Cite
    Sameerk (2024). Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/sameerk2004/fraud-detection-dataset
    Available download formats
    zip (169109 bytes)
    Dataset updated
    Nov 9, 2024
    Authors
    Sameerk
    Description

    The dataset is generated using the Faker library to simulate transaction data. It contains several columns that represent both user and transaction information, including features for detecting fraudulent activities. The data includes a mix of categorical, numerical, and datetime values, which need to be processed for machine learning.

  11. Credit Card Fraud Detection

    • kaggle.com
    zip
    Updated Mar 10, 2026
    Cite
    Kaushal Nandaniya (2026). Credit Card Fraud Detection [Dataset]. https://www.kaggle.com/datasets/kaushalnandania/credit-card-fraud-detection
    Available download formats
    zip (211766642 bytes)
    Dataset updated
    Mar 10, 2026
    Authors
    Kaushal Nandaniya
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Context

    This dataset contains simulated credit card transactions featuring both legitimate and fraudulent activities from January 1st, 2019, through December 31st, 2020. It captures the transaction behaviors of 1,000 synthetic customers interacting with a pool of 800 merchants. With roughly 1.85 million transactions in total, the dataset reflects the highly imbalanced nature of real-world financial fraud, providing features such as transaction time, amount, merchant category, and geographical coordinates.

    Sources

    The data was synthetically generated using the Sparkov Data Generation tool created by Brandon Harris. It simulates normal user transaction habits, merchant locations, and specific fraud scenarios to create a highly realistic environment for predictive modeling. The data is available via HuggingFace (NeerajCodz/creditCardFraudDetection).

    Inspiration

    The primary inspiration behind this dataset is to provide a robust, large-scale, and accessible resource for evaluating anomaly detection and machine learning algorithms. Credit card fraud is a massive challenge in financial cybersecurity; this dataset allows researchers, students, and data scientists to tackle extreme class imbalances, test novel transaction-monitoring strategies, and ultimately advance the state-of-the-art in automated fraud detection without compromising sensitive personal data.

  12. Data from: Credit Card Transactions Dataset

    • gts.ai
    json
    Updated Jun 20, 2024
    Cite
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED (2024). Credit Card Transactions Dataset [Dataset]. https://gts.ai/dataset-download/page/39/
    Available download formats
    json
    Dataset updated
    Jun 20, 2024
    Dataset authored and provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Access the Credit Card Transactions Dataset featuring structured transaction records suitable for fraud detection, risk modeling, anomaly detection, and financial analytics. Designed for machine learning and AI-driven fintech applications.

  13. Credit Card Fraud Detection

    • kaggle.com
    zip
    Updated Dec 24, 2019
    Cite
    shayan naveed (2019). Credit Card Fraud Detection [Dataset]. https://www.kaggle.com/datasets/shayannaveed/credit-card-fraud-detection
    Available download formats
    zip (69155672 bytes)
    Dataset updated
    Dec 24, 2019
    Authors
    shayan naveed
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase. The dataset contains transactions made by credit cards. It has 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the frauds account for 0.172% of all transactions. It contains only numerical input variables which are the result of a PCA transformation. Features V1, V2, ... V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are Time and Amount. The feature Amount can be used for example-dependent cost-sensitive learning.

    • Time: Number of seconds elapsed between this transaction and the first transaction in the dataset
    • V1…V28: May be the result of a PCA dimensionality reduction to protect user identities and sensitive features
    • Amount: Transaction amount
    • Class: The response variable; the value 1 is for fraudulent transactions, value 0 is for non-fraudulent transactions

  14. Credit Card Fraud Detection

    • gomask.ai
    csv, json
    Updated Jan 15, 2026
    + more versions
    Cite
    GoMask.ai (2026). Credit Card Fraud Detection [Dataset]. https://gomask.ai/marketplace/datasets/credit-card-fraud-detection
    Explore at:
    csv(10 MB), jsonAvailable download formats
    Dataset updated
    Jan 15, 2026
    Dataset provided by
    GoMask.ai
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2024 - 2025
    Area covered
    Global
    Variables measured
    is_fraud, entry_mode, card_number, merchant_id, cardholder_id, currency_code, cardholder_age, transaction_id, is_international, transaction_city, and 7 more
    Description

    This dataset provides detailed, labeled records of simulated credit card transactions, including transaction amounts, merchant and cardholder information, and fraud indicators. It is ideal for developing and benchmarking machine learning models aimed at detecting fraudulent activity and reducing financial risk in payment systems. The inclusion of transaction context and cardholder demographics supports advanced analytics and feature engineering.

  15. Credit Card Transactions Fraud Detection Dataset

    • kaggle.com
    zip
    Updated Oct 21, 2023
    Cite
    Rupeswara Babu Sangoju (2023). Credit Card Transactions Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/rupeswarababusangoju/credit-card-transactions-fraud-detection-dataset
    Explore at:
    Available download formats: zip (63490622 bytes)
    Dataset updated
    Oct 21, 2023
    Authors
    Rupeswara Babu Sangoju
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Rupeswara Babu Sangoju

    Released under Apache 2.0

    Contents

  16. credit-card-fraud-detection

    • huggingface.co
    Updated Jan 4, 2026
    Cite
    Kimberly Lin (2026). credit-card-fraud-detection [Dataset]. https://huggingface.co/datasets/jyunyilin/credit-card-fraud-detection
    Explore at:
    Dataset updated
    Jan 4, 2026
    Authors
    Kimberly Lin
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Credit Card Fraud Detection – Processed Dataset

    This dataset contains preprocessed credit card transaction data prepared for fraud detection tasks.

    Data Description

    The dataset is derived from anonymized transaction records and includes numerical features (V1–V28), transaction amount, and time-based information.

    Preprocessing Steps

      • Feature scaling and normalization
      • Handling class imbalance
      • Feature selection based on correlation analysis
      • Removal of irrelevant…

    See the full description on the dataset page: https://huggingface.co/datasets/jyunyilin/credit-card-fraud-detection.

  17. Credit Card Fraud Detection Platform Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Jan 9, 2026
    + more versions
    Cite
    Archive Market Research (2026). Credit Card Fraud Detection Platform Report [Dataset]. https://www.archivemarketresearch.com/reports/credit-card-fraud-detection-platform-14208
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    Jan 9, 2026
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2026 - 2034
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The size of the Credit Card Fraud Detection Platform market was valued at USD XXX million in 2024 and is projected to reach USD XXX million by 2033, with an expected CAGR of XX % during the forecast period.

  18. Credit_Card_Fraud_Analysis_Project

    • huggingface.co
    Updated Feb 8, 2026
    + more versions
    Cite
    Bar Vero (2026). Credit_Card_Fraud_Analysis_Project [Dataset]. https://huggingface.co/datasets/Barvero/Credit_Card_Fraud_Analysis_Project
    Explore at:
    Dataset updated
    Feb 8, 2026
    Authors
    Bar Vero
    Description

    Credit Card Fraud Detection Analysis and Preprocessing

    Introduction, Data Source, and Project Goal

    This project presents an Exploratory Data Analysis (EDA) and strategic data preparation for a credit card fraud detection dataset. The dataset, sourced from Kaggle, contains over 280,000 records. The primary challenge identified is extreme class imbalance, as less than 0.2% of transactions are fraudulent. The goal is to prepare the data for a classification model capable of predicting whether… See the full description on the dataset page: https://huggingface.co/datasets/Barvero/Credit_Card_Fraud_Analysis_Project.

  19. Credit Card Fraud Detection Platform Market Forecast to 2033

    • univdatos.com
    Updated Nov 5, 2025
    Cite
    UnivDatos (2025). Credit Card Fraud Detection Platform Market Forecast to 2033 [Dataset]. https://univdatos.com/reports/credit-card-fraud-detection-platform-market
    Explore at:
    Dataset updated
    Nov 5, 2025
    Dataset authored and provided by
    UnivDatos
    License

    https://univdatos.com/privacy-policy

    Description

    The Global Credit Card Fraud Detection Platform Market was valued at USD 4,332.80 million in 2024 and is expected to grow at a CAGR of around 13.86% during 2025-2033.

  20. Credit Card Fraud Detection Dataset

    • kaggle.com
    zip
    Updated Feb 17, 2024
    + more versions
    Cite
    Arshiya Kishore (2024). Credit Card Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/arshiyakishore/credit-card-fraud-detection-dataset
    Explore at:
    Available download formats: zip (69076754 bytes)
    Dataset updated
    Feb 17, 2024
    Authors
    Arshiya Kishore
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Arshiya Kishore

    Released under MIT

    Contents

Cite
Arif Miah (2025). Credit Card Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/miadul/credit-card-fraud-detection-dataset

Credit Card Fraud Detection Dataset

High-quality financial dataset for machine learning classification

Explore at:
Available download formats: zip (113829 bytes)
Dataset updated
Dec 17, 2025
Authors
Arif Miah
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

📌 Dataset Description

**Credit Card Fraud Detection Dataset**

This dataset contains 10,000 credit card transactions designed to support research and experimentation in fraud detection using machine learning. The data realistically simulates both legitimate and fraudulent transactions, while maintaining a naturally imbalanced class distribution, which reflects real-world financial systems.

The dataset is suitable for binary classification tasks, where the objective is to predict whether a transaction is fraudulent (1) or legitimate (0) based on transaction behavior, risk indicators, and cardholder information.

🎯 Objective

To build and evaluate machine learning models that can identify fraudulent credit card transactions using transaction-level features such as amount, time, location mismatch, device trust, and transaction velocity.

📊 Dataset Characteristics

  • Total Records: 10,000
  • Total Features: 10
  • Target Variable: is_fraud
  • Class Distribution: Highly imbalanced (fraud ≈ 4–5%)
  • Data Type: Synthetic (privacy-safe)

🧾 Feature Description

Feature NameDescription
transaction_idUnique identifier for each transaction
amountTransaction amount
transaction_hourHour of transaction (0–23)
merchant_categoryType of merchant
foreign_transactionIndicates if transaction is international (0/1)
location_mismatchBilling vs transaction location mismatch (0/1)
device_trust_scoreTrust score of the device (0–100)
velocity_last_24hNumber of transactions in last 24 hours
cardholder_ageAge of the cardholder
is_fraudTarget variable (0 = Normal, 1 = Fraud)
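
The feature list above can be read as a simple record schema. The sketch below is one way to express it and to check the ranges the table states (hour 0–23, trust score 0–100, 0/1 flags). The Python types, the `Transaction` class, the `validate` helper, and the sample rows are assumptions made for illustration; the dataset page does not specify column types or provide this code.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    # Field names follow the feature table; the types are assumed.
    transaction_id: int
    amount: float
    transaction_hour: int
    merchant_category: str
    foreign_transaction: int
    location_mismatch: int
    device_trust_score: float
    velocity_last_24h: int
    cardholder_age: int
    is_fraud: int


def validate(tx: Transaction) -> list[str]:
    """Return a list of problems; an empty list means the row passes."""
    problems = []
    if not 0 <= tx.transaction_hour <= 23:
        problems.append("transaction_hour must be in 0-23")
    if not 0 <= tx.device_trust_score <= 100:
        problems.append("device_trust_score must be in 0-100")
    for name in ("foreign_transaction", "location_mismatch", "is_fraud"):
        if getattr(tx, name) not in (0, 1):
            problems.append(f"{name} must be 0 or 1")
    if tx.velocity_last_24h < 0:
        problems.append("velocity_last_24h is a count and cannot be negative")
    return problems


# Hypothetical rows, not taken from the dataset.
ok_row = Transaction(1, 84.5, 14, "Grocery", 0, 0, 72, 3, 35, 0)
bad_row = Transaction(2, 19.9, 24, "Travel", 0, 1, 40, 1, 52, 2)

print(validate(ok_row))
print(validate(bad_row))
```

A check like this is cheap to run before training and catches rows that fall outside the documented ranges.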

🔍 Potential Use Cases

  • Credit card fraud detection systems
  • Imbalanced classification problems
  • Feature engineering and risk analysis
  • Model comparison (Logistic Regression, Random Forest, XGBoost, etc.)
  • Anomaly detection research

📈 Recommended Evaluation Metrics

Due to class imbalance, the following metrics are recommended:

  • Precision
  • Recall
  • F1-score
  • ROC-AUC
  • Confusion Matrix
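
To make the metrics listed above concrete, here is a minimal pure-Python sketch that derives the confusion-matrix counts, precision, recall, and F1-score from hard 0/1 predictions. The function names and the ten sample labels are hypothetical and are not drawn from the dataset. ROC-AUC is left out because it needs predicted scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = fraud."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for 10 transactions (1 = fraud, 0 = legitimate).
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
accuracy = (tp + tn) / len(y_true)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"accuracy={accuracy:.3f}")
```

In this toy sample accuracy is higher than precision and recall, because the correctly classified legitimate transactions dominate the accuracy figure. That gap is why the fraud-focused metrics are preferred on imbalanced data.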

⚠️ Notes

  • This dataset is synthetically generated and does not contain any real customer data.
  • It is intended for educational, research, and practice purposes only.