98 datasets found

Card fraud in the U.S. versus rest of the world 2014-2023, with global...
statista.com
Updated Jun 25, 2025
Cite
Statista (2025). Card fraud in the U.S. versus rest of the world 2014-2023, with global forecasts 2028 [Dataset]. https://www.statista.com/statistics/1264329/value-fraudulent-card-transactions-worldwide/
Explore at:
Dataset updated
Jun 25, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Dec 2024
Area covered
United States
Description
Payment card fraud - including both credit cards and debit cards - is forecast to grow by over ** billion U.S. dollars between 2022 and 2028. Especially outside the United States, the amount of fraudulent payments almost doubled from 2014 to 2021. In total, fraudulent card payments reached ** billion U.S. dollars in 2021. Card fraud losses across the world increased by more than ** percent between 2020 and 2021, the largest increase since 2018.
Annual card fraud - credit cards and debit cards combined - worldwide...
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). Annual card fraud - credit cards and debit cards combined - worldwide 2014-2023 [Dataset]. https://www.statista.com/statistics/1394119/global-card-fraud-losses/
Explore at:
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Dec 2024
Area covered
Worldwide
Description
Card fraud losses across the world increased by more than ** percent between 2020 and 2021, the largest increase since 2018. It was estimated that merchants and card acquirers lost well over ** billion U.S. dollars, with, according to the source, roughly ** billion U.S. dollars coming from the United States alone. Note that the figures provided here include both credit card fraud and debit card fraud. The source does not separate the two, and also did not provide figures on the United States - a country known for its reliance on credit cards.
Fraud Detection Dataset
kaggle.com
Updated Nov 9, 2024
Cite
Sameerk (2024). Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/sameerk2004/fraud-detection-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 9, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sameerk
Description
The dataset is generated using the Faker library to simulate transaction data. It contains several columns that represent both user and transaction information, including features for detecting fraudulent activities. The data includes a mix of categorical, numerical, and datetime values, which need to be processed for machine learning.
Credit card fraud detection
kaggle.com
Updated Jun 19, 2019
+ more versions
Cite
Dileep (2019). Credit card fraud detection [Dataset]. https://www.kaggle.com/datasets/dileep070/anomaly-detection
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 19, 2019
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dileep
Description
Dataset

This dataset was created by Dileep

Contents
Credit Card Fraud Detection
test.researchdata.tuwien.ac.at
zenodo.org
+1more
csv, json, pdf +2
Updated Apr 28, 2025
Cite
Ajdina Grizhja (2025). Credit Card Fraud Detection [Dataset]. http://doi.org/10.82556/yvxj-9t22
Explore at:
text/markdown, csv, pdf, txt, json (available download formats)
Unique identifier
https://doi.org/10.82556/yvxj-9t22
Dataset updated
Apr 28, 2025
Dataset provided by
TU Wien
Authors
Ajdina Grizhja
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Apr 28, 2025
Description
Below is a draft DMP–style description of your credit‐card fraud detection experiment, modeled on the antiquities example:

1. Dataset Description

Research Domain
This work resides in the domain of financial fraud detection and applied machine learning. We focus on detecting anomalous credit‐card transactions in real time to reduce financial losses and improve trust in digital payment systems.

Purpose
The goal is to train and evaluate a binary classification model that flags potentially fraudulent transactions. By publishing both the code and data splits via FAIR repositories, we enable reproducible benchmarking of fraud‐detection algorithms and support future research on anomaly detection in transaction data.

Data Sources
We used the publicly available credit‐card transaction dataset from Kaggle (original source: https://www.kaggle.com/mlg-ulb/creditcardfraud), which contains anonymized transactions made by European cardholders over two days in September 2013. The dataset includes 284 807 transactions, of which 492 are fraudulent.

Method of Dataset Preparation

Schema validation: Renamed columns to snake_case (e.g. transaction_amount, is_declined) so they conform to DBRepo’s requirements.

Data import: Uploaded the full CSV into DBRepo, assigned persistent identifiers (PIDs).

Splitting: Programmatically derived three subsets—training (70%), validation (15%), test (15%)—using range‐based filters on the primary key actionnr. Each subset was materialized in DBRepo and assigned its own PID for precise citation.

Cleaning: Converted the categorical flags (is_declined, isforeigntransaction, ishighriskcountry, isfradulent) from “Y”/“N” to 1/0 and dropped non‐feature identifiers (actionnr, merchant_id).

Modeling: Trained a RandomForest classifier on the training split, tuned on validation, and evaluated on the held‐out test set.
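
The splitting and cleaning steps above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the authors' DBRepo workflow: the function names `split_by_key` and `clean`, the toy rows, and the use of integer percentages are assumptions made for the example. The RandomForest modeling step is left out.

```python
# Hypothetical sketch of the described preparation steps (not the authors' code).

FLAG_COLUMNS = ("is_declined", "isforeigntransaction", "ishighriskcountry", "isfradulent")
ID_COLUMNS = ("actionnr", "merchant_id")


def split_by_key(rows, key="actionnr", train_pct=70, val_pct=15):
    """Range-based split on the sorted primary key: first 70%, next 15%, remainder."""
    ordered = sorted(rows, key=lambda r: r[key])
    n = len(ordered)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


def clean(row):
    """Map "Y"/"N" flags to 1/0 and drop non-feature identifier columns."""
    out = {}
    for name, value in row.items():
        if name in ID_COLUMNS:
            continue
        # A KeyError here signals an unexpected flag value.
        out[name] = {"Y": 1, "N": 0}[value] if name in FLAG_COLUMNS else value
    return out


# Toy rows standing in for the real CSV (values are made up).
rows = [
    {
        "actionnr": i,
        "merchant_id": f"m{i}",
        "transaction_amount": 10.0 * i,
        "is_declined": "N",
        "isforeigntransaction": "Y" if i % 2 else "N",
        "ishighriskcountry": "N",
        "isfradulent": "Y" if i % 10 == 0 else "N",
    }
    for i in range(1, 21)
]

train_rows, val_rows, test_rows = split_by_key(rows)
train_clean = [clean(r) for r in train_rows]
print(len(train_rows), len(val_rows), len(test_rows))
```

Splitting before cleaning keeps `actionnr` available for the range filter; the identifier is dropped only afterwards.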

2. Technical Details

Dataset Structure

The raw data is a single CSV with columns:

actionnr (integer transaction ID)

merchant_id (string)

average_amount_transaction_day (float)

transaction_amount (float)

is_declined, isforeigntransaction, ishighriskcountry, isfradulent (binary flags)

total_number_of_declines_day, daily_chargeback_avg_amt, sixmonth_avg_chbk_amt, sixmonth_chbk_freq (numeric features)

Naming Conventions

All columns use lowercase snake_case.

Subsets are named creditcard_training, creditcard_validation, creditcard_test in DBRepo.

Files in the code repo follow a clear structure:

├── data/                  # local copies only; raw data lives in DBRepo
├── notebooks/Task.ipynb
├── models/rf_model_v1.joblib
├── outputs/               # confusion_matrix.png, roc_curve.png, predictions.csv
├── README.md
├── requirements.txt
└── codemeta.json

Required Software

Python 3.9+

pandas, numpy (data handling)

scikit-learn (modeling, metrics)

matplotlib (visualizations)

dbrepo‐client.py (DBRepo API)

requests (TU WRD API)

Additional Resources

Original dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud

Scikit-learn docs: https://scikit-learn.org/stable

DBRepo API guide: via the starter notebook’s dbrepo_client.py template

TU WRD REST API spec: https://test.researchdata.tuwien.ac.at/api/docs

3. Further Details

Data Limitations

Highly imbalanced: only ~0.17% of transactions are fraudulent.

Anonymized PCA features (V1–V28) are hidden; we extended the data with domain features but cannot reverse-engineer the raw variables.

Time‐bounded: only covers two days of transactions, may not capture seasonal patterns.

Licensing and Attribution

Raw data: CC-0 (per Kaggle terms)

Code & notebooks: MIT License

Model artifacts & outputs: CC-BY 4.0

TU WRD records include ORCID identifiers for the author.

Recommended Uses

Benchmarking new fraud‐detection algorithms on a standard imbalanced dataset.

Educational purposes: demonstrating model‐training pipelines, FAIR data practices.

Extension: adding time‐series or deep‐learning models.

Known Issues

Possible temporal leakage if date/time features are not handled correctly.

Model performance may degrade on live data due to concept drift.

Binary flags may oversimplify nuanced transaction outcomes.
Identity theft complaints, by nature of crime U.S. 2022
statista.com
Updated Jul 10, 2025
Cite
Statista (2025). Identity theft complaints, by nature of crime U.S. 2022 [Dataset]. https://www.statista.com/statistics/194017/identity-theft-complaints-in-the-us-by-nature-of-crime/
Explore at:
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2022
Area covered
United States
Description
In 2022, about ******* complaints filed with the Federal Trade Commission (FTC) were due to credit card fraud in the United States. An additional ****** complaints were filed with the FTC due to government documents/benefits fraud.
U.S. most common financial cybercrime or fraud victims 2023
statista.com
Updated Sep 15, 2023
Cite
Statista (2023). U.S. most common financial cybercrime or fraud victims 2023 [Dataset]. https://www.statista.com/statistics/1460422/financial-cybercrime-common-fraud-us/
Explore at:
Dataset updated
Sep 15, 2023
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Sep 15, 2023 - Sep 18, 2023
Area covered
United States
Description
A September 2023 survey of American adults found that the most frequently experienced type of financial cybercrime was credit card fraud, reported by roughly 64 percent of respondents. The breach of financial data was ranked second, followed by account hacking.
Credit Card Fraud Detection Platform Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jul 7, 2025
+ more versions
Cite
Data Insights Market (2025). Credit Card Fraud Detection Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/credit-card-fraud-detection-platform-1982870
Explore at:
pdf, doc, ppt (available download formats)
Dataset updated
Jul 7, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global credit card fraud detection platform market is experiencing robust growth, driven by the increasing prevalence of digital transactions and the sophistication of fraudulent activities. The market, estimated at $15 billion in 2025, is projected to maintain a healthy Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $45 billion by 2033. This expansion is fueled by several key factors: the rising adoption of e-commerce and mobile payments, the increasing volume of online transactions, the growing need for robust security measures among businesses to protect customer data and prevent financial losses, and the continuous evolution of fraud techniques necessitating advanced detection capabilities. Furthermore, the increasing regulatory scrutiny and compliance requirements are pushing organizations to invest heavily in sophisticated fraud detection systems. The market is segmented by deployment (cloud-based and on-premise), by organization size (small, medium, and large enterprises), and by industry vertical (banking, financial services, and insurance, retail, healthcare, and others). Key players in this dynamic market include established companies like Kount, ClearSale, Stripe Radar, Riskified, and FICO, alongside emerging technology providers like Akkio and Dataiku. These companies are constantly innovating to improve detection accuracy, reduce false positives, and offer seamless integration with existing payment processing systems. While challenges remain, such as the rising complexity of fraud schemes and the need to balance security with user experience, the market is poised for continued strong growth, driven by technological advancements in machine learning, artificial intelligence, and big data analytics. The increasing adoption of real-time fraud detection and advanced analytics capabilities will further shape the market landscape in the coming years, creating opportunities for both established and emerging players.
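
As a quick arithmetic check on the projection quoted above, compound growth can be computed directly. The helper names `project` and `implied_cagr` are illustrative only; the inputs ($15 billion in 2025, 15% CAGR, through 2033) come from the description.

```python
def project(start_value, cagr, years):
    """Value after compounding `cagr` annually for `years` years."""
    return start_value * (1 + cagr) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that takes start_value to end_value in `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# $15 billion in 2025 growing at 15% per year until 2033 (8 compounding periods).
market_2033 = project(15.0, 0.15, 2033 - 2025)
print(f"{market_2033:.1f} billion USD")
```

The result is close to 46, in line with the "approximately $45 billion" stated in the report summary.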
Fraud Detection in Financial Transactions
kaggle.com
Updated Jan 17, 2025
Cite
Darshan Dalvi (2025). Fraud Detection in Financial Transactions [Dataset]. https://www.kaggle.com/datasets/darshandalvi12/fraud-detection-in-financial-transactions/discussion?sort=undefined
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 17, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Darshan Dalvi
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Credit Card Fraud Detection Dataset (Updated)

This dataset contains 284,807 transactions from a credit card company, where 492 transactions are fraudulent. The data is highly imbalanced, with only a small fraction of transactions being fraudulent. The dataset is commonly used to build and evaluate fraud detection models.

Dataset Details:

Number of Transactions: 284,807

Fraudulent Transactions: 492 (Highly Imbalanced)

Features:

28 anonymized features (V1 to V28)

Transaction amount

Timestamp

Label:

0: Legitimate

1: Fraudulent

Data Preprocessing:

SMOTE (Synthetic Minority Oversampling Technique) has been applied to address the class imbalance in the dataset, generating synthetic examples for the minority class (fraudulent transactions).

Additional Operations: Various preprocessing steps were performed, including data cleaning and feature engineering, to ensure the quality of the dataset for model training.
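
To illustrate the idea behind SMOTE mentioned above, here is a minimal standard-library sketch. It is not the code used to build this dataset and not the imbalanced-learn implementation: the function name `smote`, the choice `k=3`, and the toy points are assumptions for the example. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbour)])
    return synthetic


# Toy minority-class (fraud) feature vectors; values are made up.
fraud = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(fraud, n_new=6)
print(len(new_points))
```

Because every synthetic point lies between two existing fraud samples, the new points stay inside the region already covered by the minority class.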

Processed Files:

The dataset has been split into training and testing sets and saved in the following files:
- X_train.csv: Feature data for the training set
- X_test.csv: Feature data for the testing set
- y_train.csv: Labels for the training set (fraudulent or legitimate)
- y_test.csv: Labels for the testing set

This updated dataset is ready to be used for training and evaluating machine learning models, specifically designed for credit card fraud detection tasks.

This description highlights the key aspects of the dataset, including its preprocessing steps and the availability of the processed files for ease of use.
Credit Card Generator Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Oct 5, 2024
Cite
Dataintelo (2024). Credit Card Generator Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/credit-card-generator-market
Explore at:
pdf, csv, pptx (available download formats)
Dataset updated
Oct 5, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Credit Card Generator Market Outlook

The global credit card generator market is projected to experience robust growth with a market size of approximately USD 580 million in 2023, and it is anticipated to reach USD 1.2 billion by 2032, exhibiting a compound annual growth rate (CAGR) of 8.5%. The rising need for secure and efficient credit card testing tools, driven by the expansion of e-commerce and digital transactions, forms a significant growth catalyst for this market. As online retail and digital financial services burgeon, the demand for reliable credit card generators continues to escalate, underscoring the importance of this market segment.

One of the pivotal growth drivers for the credit card generator market is the increasing complexity and sophistication of online payment systems. As e-commerce platforms and digital payment solutions proliferate worldwide, there is a growing need for comprehensive testing tools to ensure the reliability and security of these systems. Credit card generators play a crucial role in this context by providing developers and testers with the means to simulate various credit card scenarios, thereby enhancing the robustness of payment processing systems. Additionally, the rise in cyber threats and fraud necessitates stringent testing, further propelling market growth.

Another significant factor contributing to the market's expansion is the growing emphasis on fraud prevention and security. Financial institutions and businesses are increasingly investing in sophisticated tools to combat fraud and secure financial transactions. Credit card generators offer a practical solution for testing the efficacy of anti-fraud measures and ensuring that security protocols are adequately robust. By enabling the simulation of fraudulent activities and various transaction scenarios, these tools help organizations better prepare for and mitigate potential security breaches.

Furthermore, the marketing and promotional applications of credit card generators are also driving market growth. Companies leveraging digital marketing strategies use these tools to create dummy credit card numbers for various promotional activities, such as offering free trials or discounts, without exposing real customer data. This capability not only aids in marketing efforts but also ensures compliance with data privacy regulations, thereby enhancing consumer trust and brand reputation. The versatility of credit card generators in supporting both operational and marketing functions underscores their growing importance in the digital age.

Regionally, North America holds a significant share of the credit card generator market, driven by the high penetration of digital payment systems and advanced cybersecurity measures in the region. The presence of numerous financial institutions and technology companies further bolsters the market in North America. Meanwhile, Asia Pacific is expected to witness the fastest growth, fueled by the rapid digitalization of economies, increasing internet penetration, and burgeoning e-commerce activities. Europe also presents substantial opportunities due to stringent data protection regulations and the widespread adoption of digital transaction systems.

Type Analysis

The credit card generator market can be segmented by type into software and online services. Software-based credit card generators are widely used by developers and testers within organizations to simulate credit card transactions and validate payment processing systems. These tools are typically integrated into the development and testing environments, providing a controlled and secure platform for generating valid credit card numbers. The demand for software-based generators is driven by their ability to offer customizable options and advanced features, such as bulk generation and API integration, which enhance the efficiency of testing processes.

Online services, on the other hand, cater to a broader audience, including individual users, small businesses, and marketers. These services are accessible via web platforms and provide an easy-to-use interface for generating credit card numbers for various purposes, such as testing, fraud prevention, and marketing promotions. The growing popularity of online credit card generators can be attributed to their convenience, accessibility, and the increasing need for temporary and disposable credit card numbers in the digital economy. These services are particularly useful for busin
CCFD_dataset
figshare.com
xlsx
Updated May 30, 2023
Cite
Nur Amirah Ishak; Keng-Hoong Ng; Gee-Kok Tong; Suraya Nurain Kalid; Kok-Chin Khor (2023). CCFD_dataset [Dataset]. http://doi.org/10.6084/m9.figshare.16695616.v3
Explore at:
xlsx (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.16695616.v3
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Nur Amirah Ishak; Keng-Hoong Ng; Gee-Kok Tong; Suraya Nurain Kalid; Kok-Chin Khor
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The dataset was released by [1]. It was collected and analysed during a research collaboration between Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of Université Libre de Bruxelles (ULB) on big data mining and fraud detection. [1] Dal Pozzolo, A., Caelen, O., Johnson, R. A., and Bontempi, G. (2015). Calibrating Probability with Undersampling for Unbalanced Classification. 2015 IEEE Symposium Series on Computational Intelligence, pp. 159-166, doi: 10.1109/SSCI.2015.33. Open source on Kaggle: https://www.kaggle.com/mlg-ulb/creditcardfraud
Data from: Credit Card Transactions Dataset
cubig.ai
Updated May 28, 2025
Cite
CUBIG (2025). Credit Card Transactions Dataset [Dataset]. https://cubig.ai/store/products/336/credit-card-transactions-dataset
Explore at:
Dataset updated
May 28, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction • The Credit Card Transactions Dataset includes more than 20 million credit card transactions made over decades by 2,000 U.S.-resident consumers, generated by IBM's simulations, with details of each transaction and fraud labels.

2) Data Utilization
(1) Characteristics of the Credit Card Transactions Dataset:
• It provides a variety of attributes similar to those of real credit card transactions, including transaction amount, time, card information, purchase location, and store category (MCC).
(2) The Credit Card Transactions Dataset can be used for:
• Developing credit card fraud detection models: using transaction history and attributes, you can build a machine-learning-based fraud (abnormal transaction) detection model.
• Analyzing consumption patterns and risks: the long-term, diverse transaction data can be used to analyze customer consumption behavior and identify risk factors.
Data from: Measuring Crime Rates of Prisoners in Colorado, 1988-1989
catalog.data.gov
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
+1more
Updated Mar 12, 2025
+ more versions
Cite
National Institute of Justice (2025). Measuring Crime Rates of Prisoners in Colorado, 1988-1989 [Dataset]. https://catalog.data.gov/dataset/measuring-crime-rates-of-prisoners-in-colorado-1988-1989-5f9a6
Explore at:
Dataset updated
Mar 12, 2025
Dataset provided by
National Institute of Justice (http://nij.ojp.gov/)
Description
In the late 1970s, the Rand Corporation pioneered a method of collecting crime rate statistics. They obtained reports of offending behavior--types and frequencies of crimes committed--directly from offenders serving prison sentences. The current study extends this research by exploring the extent to which variation in the methodological approach affects prisoners' self-reports of criminal activity. If the crime rates reported in this survey remained constant across methods, perhaps one of the new techniques developed would be easier and/or less expensive to administer. Also, the self-reported offending rate data for female offenders in this collection represents the first time such data has been collected for females. Male and female prisoners recently admitted to the Diagnostic Unit of the Colorado Department of Corrections were selected for participation in the study. Prisoners were given one of two different survey instruments, referred to as the long form and short form. Both questionnaires dealt with the number of times respondents committed each of eight types of crimes during a 12-month measurement period. The crimes of interest were burglary, robbery, assault, theft, auto theft, forgery/credit card and check-writing crimes, fraud, and drug dealing. The long form of the instrument focused on juvenile and adult criminal activity and covered the offender's childhood and family. It also contained questions about the offender's rap sheet as one of the bases for validating the self-reported data. The crime count sections of the long form contained questions about motivation, initiative, whether the offender usually acted alone or with others, and if the crimes recorded included crimes against people he or she knew. Long-form data are given in Part 1. 
The short form of the survey had fewer or no questions compared with the long form on areas such as the respondent's rap sheet, the number of crimes committed as a juvenile, the number of times the respondent was on probation or parole, the respondent's childhood experiences, and the respondent's perception of his criminal career. These data are contained in Part 2. In addition, the surveys were administered under different conditions of confidentiality. Prisoners given what were called "confidential" interviews had their names identified with the survey. Those interviewed under conditions of anonymity did not have their names associated with the survey. The short forms were all administered anonymously, while the long forms were either anonymous or confidential. In addition to the surveys, data were collected from official records, which are presented in Part 3. The official record data collection form was designed to collect detailed criminal history information, particularly during the measurement period identified in the questionnaires, plus a number of demographic and drug-use items. This information, when compared with the self-reported offense data from the measurement period in both the short and long forms, allows a validity analysis to be performed.
Synthetic credit card fraud in the U.S. 2015-2017, with forecasts up to 2020...
statista.com
Updated Jul 8, 2025
Cite
Statista (2025). Synthetic credit card fraud in the U.S. 2015-2017, with forecasts up to 2020 [Dataset]. https://www.statista.com/statistics/942383/synthetic-credit-card-fraud-usa/
Explore at:
Dataset updated
Jul 8, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
This statistic presents the value of losses due to synthetic credit card fraud in the United States from 2015 to 2017, with projections extending to 2020. Such fraud led to *** million U.S. dollars in damages in 2017, an amount which was expected to increase to nearly **** trillion U.S. dollars in 2020.
Credit Card Fraud Detection Dataset
kaggle.com
Updated May 15, 2025
Cite
Ghanshyam Saini (2025). Credit Card Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/ghnshymsaini/credit-card-fraud-detection-dataset/suggestions
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 15, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ghanshyam Saini
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Credit Card Fraud Detection Dataset (European Cardholders, September 2013)

As a data contributor, I'm sharing this crucial dataset focused on the detection of fraudulent credit card transactions. Recognizing these illicit activities is paramount for protecting customers and the integrity of financial systems.

About the Dataset:

This dataset encompasses credit card transactions made by European cardholders during a two-day period in September 2013. It presents a real-world scenario with a significant class imbalance, where fraudulent transactions are considerably less frequent than legitimate ones. Out of a total of 284,807 transactions, only 492 are instances of fraud, representing a mere 0.172% of the entire dataset.

Content of the Data:

Due to confidentiality concerns, the majority of the input features in this dataset have undergone a Principal Component Analysis (PCA) transformation. This means the original meaning and context of features V1, V2, ..., V28 are not directly provided. However, these principal components capture the variance in the underlying transaction data.

The only features that have not been transformed by PCA are:

Time: Numerical. Represents the number of seconds elapsed between each transaction and the very first transaction recorded in the dataset.

Amount: Numerical. The transaction amount in Euros (€). This feature could be valuable for cost-sensitive learning approaches.

The target variable for this classification task is:

Class: Integer. Takes the value 1 in the case of a fraudulent transaction and 0 otherwise.

Important Note on Evaluation:

Given the substantial class imbalance (far more legitimate transactions than fraudulent ones), traditional accuracy metrics based on the confusion matrix can be misleading. It is strongly recommended to evaluate models using the Area Under the Precision-Recall Curve (AUPRC), as this metric is more sensitive to the performance on the minority class (fraudulent transactions).
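
One common way to summarize the precision-recall curve is average precision, which averages the precision obtained at each rank where a fraud case is retrieved. The sketch below is a simplified standard-library version, not a library implementation: the function name `average_precision` and the toy labels and scores are assumptions, and tied scores are simply ordered as given in the input.

```python
def average_precision(y_true, scores):
    """Mean of precision values at the ranks where a positive (fraud) appears."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    true_pos = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            ap += true_pos / rank  # precision at this recall step
    return ap / total_pos


# Toy example: 2 fraud cases among 10 transactions (values are made up).
y_true = [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]
scores = [0.1, 0.2, 0.9, 0.8, 0.3, 0.05, 0.15, 0.25, 0.12, 0.02]
ap = average_precision(y_true, scores)
print(round(ap, 4))
```

In this toy case a model that labels everything as legitimate would reach 80% accuracy while catching no fraud, whereas the score above depends only on how highly the fraud cases are ranked.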

How to Use This Dataset:

Download the dataset file (likely in CSV format).

Load the data using libraries like Pandas.

Understand the class imbalance: Be aware that fraudulent transactions are rare.

Explore the features: Analyze the distributions of 'Time', 'Amount', and the PCA-transformed features (V1-V28).

Address the class imbalance: Consider using techniques like oversampling the minority class, undersampling the majority class, or using specialized algorithms designed for imbalanced datasets.

Build and train binary classification models to predict the 'Class' variable.

Evaluate your models using AUPRC to get a meaningful assessment of performance in detecting fraud.
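
As one illustration of the imbalance-handling step in the list above, random undersampling of the majority class can be sketched with the standard library alone. The function name `undersample`, the `ratio` parameter, and the toy rows are assumptions for the example; in practice such resampling is usually applied to the training split only.

```python
import random


def undersample(rows, label_key="Class", ratio=1.0, seed=0):
    """Keep every fraud row plus a random sample of about ratio * n_fraud legitimate rows."""
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]
    keep = min(len(legit), int(ratio * len(fraud)))
    balanced = fraud + rng.sample(legit, keep)
    rng.shuffle(balanced)  # avoid having all fraud rows first
    return balanced


# Toy data: 1000 transactions, every 50th one labelled as fraud (values are made up).
rows = [{"Amount": float(i), "Class": 1 if i % 50 == 0 else 0} for i in range(1, 1001)]
balanced = undersample(rows)
print(len(balanced), sum(r["Class"] for r in balanced))
```

Undersampling discards most legitimate transactions, so it trades information for balance; oversampling or class weighting are the alternatives named in the list.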

Acknowledgements and Citation:

This dataset has been collected and analyzed through a research collaboration between Worldline and the Machine Learning Group (MLG) of ULB (Université Libre de Bruxelles).

When using this dataset in your research or projects, please cite the following works as appropriate:

Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015.

Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon.

Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE.

Andrea Dal Pozzolo. Adaptive Machine learning for credit card fraud detection ULB MLG PhD thesis (supervised by G. Bontempi).

Fabrizio Carcillo, Andrea Dal Pozzolo, Yann-Aël Le Borgne, Olivier Caelen, Yannis Mazzer, Gianluca Bontempi. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier.

Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Gianluca Bontempi. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing.

Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection. INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp. 78-88, 2019.

Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi. Combining Unsupervised and Supervised...
Example of the data set used in this article.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Oct 28, 2024
Feng, Yuming; Gao, Zilin; Onasanya, Babatunde Oluwaseun; Liao, Xiaofeng; Jiang, Shan (2024). Example of the data set used in this article. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001376103
Authors
Feng, Yuming; Gao, Zilin; Onasanya, Babatunde Oluwaseun; Liao, Xiaofeng; Jiang, Shan
Description
Credit card fraud identification is an important issue in risk prevention and control for banks and financial institutions. To establish an efficient credit card fraud identification model, this article studies the factors that affect fraud identification and constructs a model based on neural networks. First, the network is deepened by adding layers to improve prediction accuracy; second, the width of the hidden layers is increased, also to improve prediction accuracy. The article then proposes a new fusion neural network model that combines deep and wide neural networks, and applies it to credit card fraud identification. The model is characterized by relatively high prediction accuracy and F1 score. The model is trained with stochastic gradient descent. On the test set, the proposed method achieves an accuracy of 96.44% and an F1 value of 96.17%, demonstrating good fraud recognition performance. In the authors' comparison, the proposed method outperforms machine learning models, ensemble learning models, and deep learning models.
Credit Card Transaction Fraud Flags
gomask.ai
csv, json
Updated Jul 12, 2025
GoMask.ai (2025). Credit Card Transaction Fraud Flags [Dataset]. https://gomask.ai/marketplace/datasets/credit-card-transaction-fraud-flags
Available download formats: json, csv (10 MB)
Dataset provided by
GoMask.ai
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2024 - 2025
Area covered
Global
Variables measured
amount, currency, entry_mode, fraud_flag, fraud_score, merchant_id, terminal_id, card_present, merchant_name, transaction_id, and 11 more
Description
This dataset provides detailed credit card transaction records enriched with fraud suspicion flags, risk scores, and contextual information such as merchant, location, and transaction method. It is ideal for developing, training, and evaluating fraud detection models, as well as for analyzing transaction patterns and identifying emerging fraud tactics in the financial sector.
Malaysia Consumers: Security: Credit Card/Debit Card/Bank Fraud
ceicdata.com
Updated Jan 15, 2025
CEICdata.com (2025). Malaysia Consumers: Security: Credit Card/Debit Card/Bank Fraud [Dataset]. https://www.ceicdata.com/en/malaysia/ecommerce-consumer-survey/consumers-security-credit-carddebit-cardbank-fraud
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Dec 1, 2018
Area covered
Malaysia
Description
Malaysia Consumers: Security: Credit Card/Debit Card/Bank Fraud data was reported at 63.900 % in 2018. The data is updated yearly, averaging 63.900 % (median) from Dec 2018 to 2018, with 1 observation. The data remains in active status in CEIC and is reported by the Malaysian Communications and Multimedia Commission. It is categorized under Global Database’s Malaysia – Table MY.S026: E-Commerce Consumer Survey.
Credit Card Fraud Alerts
gomask.ai
csv, json
Updated Aug 20, 2025
GoMask.ai (2025). Credit Card Fraud Alerts [Dataset]. https://gomask.ai/marketplace/datasets/credit-card-fraud-alerts
Available download formats: json, csv (10 MB)
Dataset provided by
GoMask.ai
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2024 - 2025
Area covered
Global
Variables measured
alert_id, currency, account_id, alert_status, pattern_type, merchant_city, merchant_name, alert_datetime, alert_priority, merchant_state, and 11 more
Description
This dataset provides detailed records of credit card fraud alerts, including suspicious transaction details, merchant information, alert status, and investigator actions. It enables financial institutions to detect, investigate, and respond to fraudulent activities efficiently, supporting enterprise risk management and loss mitigation.
Credit Card Fraud Patterns
gomask.ai
csv, json
Updated Jul 12, 2025
GoMask.ai (2025). Credit Card Fraud Patterns [Dataset]. https://gomask.ai/marketplace/datasets/credit-card-fraud-patterns
Available download formats: json, csv (10 MB)
Dataset provided by
GoMask.ai
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2024 - 2025
Area covered
Global
Variables measured
is_fraud, device_id, is_online, entry_mode, fraud_type, card_number, merchant_id, cardholder_id, currency_code, location_city, and 11 more
Description
This dataset contains simulated credit card transaction records, including detailed information on transaction amounts, merchant details, geolocation, device usage, and fraud labels. It is designed for training and evaluating fraud detection models, supporting the identification of both typical and anomalous transaction patterns. The dataset is ideal for fintech AI development, security analytics, and research into payment fraud behaviors.


Card fraud in the U.S. versus rest of the world 2014-2023, with global forecasts 2028


6 scholarly articles cite this dataset (View in Google Scholar)
