Payment card fraud - including both credit cards and debit cards - is forecast to grow by over ** billion U.S. dollars between 2022 and 2028. Especially outside the United States, the amount of fraudulent payments almost doubled from 2014 to 2021. In total, fraudulent card payments reached ** billion U.S. dollars in 2021. Card fraud losses across the world increased by more than ** percent between 2020 and 2021, the largest increase since 2018.
https://creativecommons.org/publicdomain/zero/1.0/
Digital payments are evolving, but so are cyber criminals.
According to the Data Breach Index, more than 5 million records are stolen every day, a concerning statistic showing that fraud is still very common for both card-present and card-not-present payments.
In today's digital world, where a vast number of card transactions happen every day, detecting fraud is challenging.
This dataset was sourced from an unnamed institute.
Feature Explanation:
distance_from_home - the distance from home at which the transaction happened.
distance_from_last_transaction - the distance from the location of the previous transaction.
ratio_to_median_purchase_price - the ratio of the transaction's purchase price to the median purchase price.
repeat_retailer - whether the transaction was made at the same retailer as a previous one.
used_chip - whether the transaction was made using a chip (credit card).
used_pin_number - whether the transaction was made using a PIN.
online_order - whether the transaction was an online order.
fraud - whether the transaction is fraudulent.
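As a rough illustration of how a table with these columns could be explored, the sketch below builds a few made-up rows and computes the fraud share for each value of a binary flag. The row values and the helper name fraud_rate_by are assumptions for illustration only; they are not taken from the dataset.

```python
import pandas as pd

# Hypothetical stand-in rows; the values are invented for illustration.
rows = pd.DataFrame({
    "distance_from_home": [2.1, 310.5, 0.8, 45.0, 5.3, 120.0],
    "distance_from_last_transaction": [0.3, 15.2, 0.1, 80.4, 1.1, 2.2],
    "ratio_to_median_purchase_price": [0.9, 6.2, 1.1, 4.8, 0.4, 1.0],
    "repeat_retailer": [1, 0, 1, 0, 1, 1],
    "used_chip": [1, 0, 1, 0, 0, 1],
    "used_pin_number": [0, 0, 1, 0, 0, 0],
    "online_order": [0, 1, 0, 1, 1, 0],
    "fraud": [0, 1, 0, 1, 0, 0],
})


def fraud_rate_by(df: pd.DataFrame, flag: str) -> pd.Series:
    """Share of fraudulent rows for each value of a binary flag column."""
    return df.groupby(flag)["fraud"].mean()


rates = fraud_rate_by(rows, "online_order")
print(rates)
```

The same helper can be pointed at used_chip, used_pin_number, or repeat_retailer to compare flags side by side.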
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Below is a draft DMP–style description of your credit‐card fraud detection experiment, modeled on the antiquities example:
Research Domain
This work resides in the domain of financial fraud detection and applied machine learning. We focus on detecting anomalous credit‐card transactions in real time to reduce financial losses and improve trust in digital payment systems.
Purpose
The goal is to train and evaluate a binary classification model that flags potentially fraudulent transactions. By publishing both the code and data splits via FAIR repositories, we enable reproducible benchmarking of fraud‐detection algorithms and support future research on anomaly detection in transaction data.
Data Sources
We used the publicly available credit‐card transaction dataset from Kaggle (original source: https://www.kaggle.com/mlg-ulb/creditcardfraud), which contains anonymized transactions made by European cardholders over two days in September 2013. The dataset includes 284 807 transactions, of which 492 are fraudulent.
Method of Dataset Preparation
Schema validation: Renamed columns to snake_case (e.g. transaction_amount, is_declined) so they conform to DBRepo’s requirements.
Data import: Uploaded the full CSV into DBRepo, assigned persistent identifiers (PIDs).
Splitting: Programmatically derived three subsets—training (70%), validation (15%), test (15%)—using range‐based filters on the primary key actionnr. Each subset was materialized in DBRepo and assigned its own PID for precise citation.
Cleaning: Converted the categorical flags (is_declined, isforeigntransaction, ishighriskcountry, isfradulent) from “Y”/“N” to 1/0 and dropped non‐feature identifiers (actionnr, merchant_id).
Modeling: Trained a RandomForest classifier on the training split, tuned on validation, and evaluated on the held‐out test set.
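The preparation steps above can be sketched roughly as follows. This is a minimal illustration on synthetic rows, not the notebook's actual code: the DBRepo upload and PID assignment are left out, and make_synthetic, split_by_key, and clean are hypothetical helper names. Only the column names and the 70/15/15 proportions come from the description.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

FLAG_COLUMNS = ["is_declined", "isforeigntransaction", "ishighriskcountry", "isfradulent"]
ID_COLUMNS = ["actionnr", "merchant_id"]


def make_synthetic(n: int = 200, seed: int = 0) -> pd.DataFrame:
    """Synthetic stand-in for the raw CSV (not the real data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "actionnr": np.arange(1, n + 1),
        "merchant_id": [f"M{i:04d}" for i in range(n)],
        "transaction_amount": rng.gamma(2.0, 50.0, n).round(2),
        "total_number_of_declines_day": rng.integers(0, 10, n),
    })
    for col in FLAG_COLUMNS:
        df[col] = rng.choice(["Y", "N"], size=n, p=[0.2, 0.8])
    return df


def split_by_key(df, key="actionnr", train=0.70, val=0.15):
    """Range-based train/validation/test split on the primary key."""
    ordered = df.sort_values(key)
    n = len(ordered)
    first_cut = round(n * train)
    second_cut = round(n * (train + val))
    return ordered.iloc[:first_cut], ordered.iloc[first_cut:second_cut], ordered.iloc[second_cut:]


def clean(df):
    """Map "Y"/"N" flags to 1/0 and drop non-feature identifiers."""
    out = df.copy()
    out[FLAG_COLUMNS] = (out[FLAG_COLUMNS] == "Y").astype(int)
    return out.drop(columns=ID_COLUMNS)


raw = make_synthetic()
train_raw, val_raw, test_raw = split_by_key(raw)
train, val, test = clean(train_raw), clean(val_raw), clean(test_raw)

target = "isfradulent"
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(train.drop(columns=target), train[target])
test_pred = model.predict(test.drop(columns=target))
```

Hyperparameter tuning on the validation split is omitted here; in this sketch the val frame is only produced, not used.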
Dataset Structure
The raw data is a single CSV with columns:
actionnr (integer transaction ID)
merchant_id (string)
average_amount_transaction_day (float)
transaction_amount (float)
is_declined, isforeigntransaction, ishighriskcountry, isfradulent (binary flags)
total_number_of_declines_day, daily_chargeback_avg_amt, sixmonth_avg_chbk_amt, sixmonth_chbk_freq (numeric features)
Naming Conventions
All columns use lowercase snake_case.
Subsets are named creditcard_training, creditcard_validation, creditcard_test in DBRepo.
Files in the code repo follow a clear structure:
├── data/ # local copies only; raw data lives in DBRepo
├── notebooks/Task.ipynb
├── models/rf_model_v1.joblib
├── outputs/ # confusion_matrix.png, roc_curve.png, predictions.csv
├── README.md
├── requirements.txt
└── codemeta.json
Required Software
Python 3.9+
pandas, numpy (data handling)
scikit-learn (modeling, metrics)
matplotlib (visualizations)
dbrepo‐client.py (DBRepo API)
requests (TU WRD API)
Additional Resources
Original dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
Scikit-learn docs: https://scikit-learn.org/stable
DBRepo API guide: via the starter notebook’s dbrepo_client.py template
TU WRD REST API spec: https://test.researchdata.tuwien.ac.at/api/docs
Data Limitations
Highly imbalanced: only ~0.17% of transactions are fraudulent.
The original variables behind the anonymized PCA features (V1–V28) are hidden; we extended the data with domain features but cannot reverse-engineer the raw variables.
Time‐bounded: only covers two days of transactions, may not capture seasonal patterns.
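The imbalance figure can be checked directly from the counts quoted earlier (284 807 transactions, 492 fraudulent). The short sketch below also shows why plain accuracy is a poor metric here, and computes class weights with the common "balanced" heuristic, n_samples / (n_classes * class_count). The use of class weights is an assumption for illustration, not something the description states was done.

```python
# Counts quoted in the dataset description.
n_total = 284_807
n_fraud = 492
n_legit = n_total - n_fraud

fraud_share = n_fraud / n_total

# Accuracy of a trivial model that labels every transaction as legitimate.
baseline_accuracy = n_legit / n_total

# "Balanced" weighting heuristic: n_samples / (n_classes * class_count).
weight_fraud = n_total / (2 * n_fraud)
weight_legit = n_total / (2 * n_legit)

print(f"fraud share: {fraud_share:.4%}")
print(f"always-legitimate accuracy: {baseline_accuracy:.4%}")
print(f"class weights: legit={weight_legit:.3f}, fraud={weight_fraud:.1f}")
```

Because a model that never flags fraud already scores above 99.8% accuracy, metrics such as precision, recall, or the area under the precision-recall curve are usually more informative on data like this.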
Licensing and Attribution
Raw data: CC-0 (per Kaggle terms)
Code & notebooks: MIT License
Model artifacts & outputs: CC-BY 4.0
TU WRD records include ORCID identifiers for the author.
Recommended Uses
Benchmarking new fraud‐detection algorithms on a standard imbalanced dataset.
Educational purposes: demonstrating model‐training pipelines, FAIR data practices.
Extension: adding time‐series or deep‐learning models.
Known Issues
Possible temporal leakage if date/time features are not handled correctly.
Model performance may degrade on live data due to concept drift.
Binary flags may oversimplify nuanced transaction outcomes.
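One common way to limit temporal leakage is to split chronologically, so that every training row precedes every test row. The sketch below assumes a Time column holding seconds since the first transaction and uses synthetic values; chronological_split is a hypothetical helper, not part of the described pipeline.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)

# Synthetic stand-in: "Time" is assumed to be seconds since the first transaction.
df = pd.DataFrame({
    "Time": rng.integers(0, 172_800, size=1_000),  # two days in seconds
    "Amount": rng.gamma(2.0, 40.0, size=1_000).round(2),
})


def chronological_split(frame, time_col="Time", train_frac=0.8):
    """Earlier rows go to training, later rows to testing."""
    ordered = frame.sort_values(time_col, kind="stable").reset_index(drop=True)
    cut = round(len(ordered) * train_frac)
    return ordered.iloc[:cut], ordered.iloc[cut:]


train, test = chronological_split(df)
```

Evaluating on the later slice also gives a first, rough impression of how much performance drops as transaction patterns shift over time.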
https://coinlaw.io/privacy-policy/
Imagine this: you're sitting at a coffee shop, enjoying a latte, and casually checking your email. Suddenly, a notification pops up: your credit card has been charged $500 for something you didn’t buy. Scenarios like this are becoming alarmingly common. Credit card fraud is a modern menace, evolving with every...
In 2024, damage caused by credit card fraud reported by Japanese companies amounted to **** billion Japanese yen, reaching a new decade high. Losses caused by illegal credit card use increased from about **** billion yen in the previous year.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Credit Card Transactions Dataset includes more than 20 million credit card transactions, spanning decades, for 2,000 U.S.-resident consumers. The data was generated by an IBM simulation and provides the details of each transaction together with fraud labels.
2) Data Utilization
(1) Characteristics of the Credit Card Transactions Dataset:
• The dataset provides a variety of properties similar to those of real credit card transactions, including transaction amount, time, card information, purchase location, and store category (MCC).
(2) The Credit Card Transactions Dataset can be used for:
• Development of credit card fraud detection models: using transaction history and properties, you can build a machine-learning model that detects fraudulent (abnormal) transactions.
• Analysis of consumption patterns and risks: long-term and diverse transaction data can be used to analyze customer consumption behavior and identify risk factors.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset, commonly known as creditcard.csv, contains anonymized credit card transactions made by European cardholders in September 2013. It includes 284,807 transactions, with 492 labeled as fraudulent. Due to confidentiality constraints, features have been transformed using PCA, except for 'Time' and 'Amount'.
This dataset was used in the research article titled "A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning for Credit Card Fraud Detection". The study proposes an ensemble model integrating techniques such as Autoencoders, Isolation Forest, Local Outlier Factor, and supervised classifiers including XGBoost and Random Forest, aiming to improve the detection of rare fraudulent patterns while maintaining efficiency and scalability.
Key Features:
30 numerical input features (V1–V28, Time, Amount)
Class label indicating fraud (1) or normal (0)
Imbalanced class distribution typical in real-world fraud detection
Use Case: Ideal for benchmarking and evaluating anomaly detection and classification algorithms in highly imbalanced data scenarios.
Source: Originally published by the Machine Learning Group at Université Libre de Bruxelles.
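To give a feel for what combining unsupervised and supervised learning can look like, the sketch below feeds an Isolation Forest anomaly score into a Random Forest as an extra feature. This is only one simple way to combine the two families and is not the article's framework: the data is synthetic, the five features stand in for the PCA components, and with_anomaly_score is a hypothetical helper.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in: 980 "normal" rows and 20 shifted "fraud" rows.
X_normal = rng.normal(0.0, 1.0, size=(980, 5))
X_fraud = rng.normal(4.0, 1.0, size=(20, 5))
X = np.vstack([X_normal, X_fraud])
y = np.concatenate([np.zeros(980, dtype=int), np.ones(20, dtype=int)])

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Unsupervised step: fit the anomaly detector on training features only.
iso = IsolationForest(n_estimators=100, random_state=0).fit(X_train)


def with_anomaly_score(features):
    """Append the Isolation Forest score (lower means more anomalous)."""
    return np.column_stack([features, iso.score_samples(features)])


# Supervised step: classifier sees the original features plus the score.
clf = RandomForestClassifier(n_estimators=100, class_weight="balanced", random_state=0)
clf.fit(with_anomaly_score(X_train), y_train)
pred = clf.predict(with_anomaly_score(X_test))
```

Fitting the anomaly detector on the training split only keeps information from the test rows out of the added feature.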
Data on the total annual value of fraud losses on debit and credit cards issued in the United Kingdom (UK) from 2002 to 2020, covering fraud both abroad and in the UK, show that losses fluctuated over the period, reaching 574.2 million British pounds in 2020. The smallest value occurred in 2011, when credit and debit card fraud losses of *** million British pounds were recorded.
This statistic presents the value of losses due to synthetic credit card fraud in the United States from 2015 to 2017, with projections extending to 2020. Such fraud led to *** million U.S. dollars in damages in 2017, an amount which was expected to increase to nearly **** trillion U.S. dollars in 2020.
Card fraud losses across the world increased by more than 10 percent between 2020 and 2021, the largest increase since 2018. It was estimated that merchants and card acquirers lost well over 30 billion U.S. dollars, with - so the source adds - roughly 12 billion U.S. dollars coming from the United States alone. Note that the figures provided here include both credit card fraud and debit card fraud. The source does not separate between the two, and also did not provide figures on the United States - a country known for its reliance on credit cards.
In 2024, damage caused by credit card fraud reported by Japanese companies amounted to **** billion Japanese yen. With about **** billion yen, fraudulent use of credit card numbers accounted for the largest amount of losses.
https://www.archivemarketresearch.com/privacy-policy
The global credit card fraud detection platform market is experiencing robust growth, driven by the escalating volume of digital transactions and the increasing sophistication of fraud techniques. The market, valued at approximately $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033. This substantial growth is fueled by several key factors. The rising adoption of e-commerce and mobile payments creates a larger attack surface for fraudsters, necessitating advanced detection solutions. Furthermore, the increasing prevalence of sophisticated fraud schemes, such as synthetic identity theft and account takeover, demands more intelligent and adaptive fraud detection systems. The market is segmented by screening type (manual and automatic) and application (personal and enterprise), with automatic screening and enterprise applications driving the majority of growth due to their scalability and efficiency. The competitive landscape is dynamic, with established players like FICO, Mastercard, and Visa competing alongside innovative startups such as Forter and Feedzai. These companies continuously develop AI-powered solutions leveraging machine learning and big data analytics to identify and prevent fraudulent transactions effectively. Regional growth varies, with North America and Europe currently holding significant market share, but Asia-Pacific is expected to experience rapid expansion in the coming years due to rising digital adoption and economic growth in countries like India and China.
The continued growth of the credit card fraud detection platform market hinges on several factors. The increasing demand for real-time fraud detection capabilities is driving the adoption of cloud-based solutions and the integration of advanced analytics. Regulatory compliance requirements, particularly around data privacy and security, also contribute to market growth. However, challenges remain. The cost of implementing and maintaining these sophisticated systems can be prohibitive for smaller businesses. Moreover, the constant evolution of fraud techniques necessitates ongoing investment in research and development to stay ahead of emerging threats. The market’s future trajectory will depend on the continued innovation in fraud detection technologies, the ability to adapt to evolving fraud tactics, and the successful integration of these solutions across various industries and geographies.
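For readers who want to see what an 18% CAGR implies, the quoted figures can be compounded with the standard formula, end value = start value × (1 + rate)^years. The result below is arithmetic on the two quoted numbers, assuming eight full years of compounding from 2025 to 2033; it is not a figure stated by the source, and compound_growth is a hypothetical helper.

```python
def compound_growth(start_value: float, annual_rate: float, years: int) -> float:
    """Value after compounding annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years


# Quoted inputs: about 15 billion USD in 2025, 18% CAGR through 2033.
projected_2033 = compound_growth(15.0, 0.18, 2033 - 2025)
print(f"Implied 2033 market size: about {projected_2033:.1f} billion USD")
```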
Unlike many other countries, the United States did not see a surge in the “card-not-present” fraud rate immediately after migrating to chip-card technology. Instead, the U.S. card-not-present fraud rate of non-prepaid debit cards has increased gradually over the past decade. Merchants’ and cardholders’ card-not-present fraud loss rates have increased for both dual- and single-message networks, while issuers’ card-not-present fraud loss rate has increased for single-message networks.
This statistic shows the total damage due to credit card fraud in the Netherlands from 2013 to 2018 (in million euros). In 2018, credit card fraud in the Netherlands led to a total damage of approximately 3.4 million euros. As in other European countries, credit and debit cards are a popular form of digital payment, both for physical purchases in brick-and-mortar stores and for online purchases.
Consumers from the Benelux countries are familiar with credit cards and possess them. Ever since the launch of digital payments, the payments industry tried to create a secure environment for financial transactions. Debit and credit card fraud comes in different kinds, of which phishing, skimming and identity theft are the most common ones.
It is predicted that technology such as EMV (Europay, MasterCard, Visa: a credit card technology that uses computer chips to authenticate chip-card transactions) should make some payments safer, but fraud could still remain a problem in the near future. During a survey in 2015, approximately one third of Dutch respondents indicated they were fairly concerned about surveillance via payment cards.
This dataset was created by Dileep
This dataset was created by Shubham Lipare
Between 2016 and 2018, over ** percent of credit card fraud prosecutions in Chinese courts concerned malicious overdraft. The financial fraud cases here refer to fraud crimes in which credit cards, insurance, securities or other financial products are involved.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Malaysia Consumers: Security: Credit Card/Debit Card/Bank Fraud data was reported at 63.900 % in 2018. Malaysia Consumers: Security: Credit Card/Debit Card/Bank Fraud data is updated yearly, averaging 63.900 % from Dec 2018 (Median) to 2018, with 1 observation. Malaysia Consumers: Security: Credit Card/Debit Card/Bank Fraud data remains in active status in CEIC and is reported by the Malaysian Communications and Multimedia Commission. The data is categorized under Global Database’s Malaysia – Table MY.S026: E-Commerce Consumer Survey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Title: Credit Card Transactions Dataset for Fraud Detection (Used in: A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning)
Description:
This dataset, commonly known as creditcard.csv, contains anonymized credit card transactions made by European cardholders in September 2013. It includes 284,807 transactions, with 492 labeled as fraudulent. Due to confidentiality constraints, features have been transformed using PCA, except for 'Time' and 'Amount'.
This dataset was used in the research article titled "A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning for Credit Card Fraud Detection". The study proposes an ensemble model integrating techniques such as Autoencoders, Isolation Forest, Local Outlier Factor, and supervised classifiers including XGBoost and Random Forest, aiming to improve the detection of rare fraudulent patterns while maintaining efficiency and scalability.
Key Features:
30 numerical input features (V1–V28, Time, Amount)
Class label indicating fraud (1) or normal (0)
Imbalanced class distribution typical in real-world fraud detection
Use Case:
Ideal for benchmarking and evaluating anomaly detection and classification algorithms in highly imbalanced data scenarios.
Source:
Originally published by the Machine Learning Group at Université Libre de Bruxelles.
https://www.kaggle.com/mlg-ulb/creditcardfraud
License:
This dataset is distributed for academic and research purposes only. Please cite the original source when using the dataset.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed records of credit card fraud alerts, including suspicious transaction details, merchant information, alert status, and investigator actions. It enables financial institutions to detect, investigate, and respond to fraudulent activities efficiently, supporting enterprise risk management and loss mitigation.