100+ datasets found
  1. Nature of crime: fraud and computer misuse

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Apr 8, 2025
    Cite
    Office for National Statistics (2025). Nature of crime: fraud and computer misuse [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/crimeandjustice/datasets/natureofcrimefraudandcomputermisuse
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Annual data on the nature of fraud and computer misuse offences. Data for the year ending March 2021 and March 2022 are from the Telephone-operated Crime Survey for England and Wales (TCSEW).

  2. Card fraud in the U.S. versus rest of the world 2014-2023, with global...

    • statista.com
    Updated Jun 25, 2025
    Cite
    Statista (2025). Card fraud in the U.S. versus rest of the world 2014-2023, with global forecasts 2028 [Dataset]. https://www.statista.com/statistics/1264329/value-fraudulent-card-transactions-worldwide/
    Explore at:
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Dec 2024
    Area covered
    United States
    Description

    Payment card fraud - including both credit cards and debit cards - is forecast to grow by over ** billion U.S. dollars between 2022 and 2028. Especially outside the United States, the amount of fraudulent payments almost doubled from 2014 to 2021. In total, fraudulent card payments reached ** billion U.S. dollars in 2021. Card fraud losses across the world increased by more than ** percent between 2020 and 2021, the largest increase since 2018.

  3. Crime in England and Wales: Additional tables on fraud and cybercrime

    • ons.gov.uk
    xlsx
    Updated Apr 25, 2019
    Cite
    Office for National Statistics (2019). Crime in England and Wales: Additional tables on fraud and cybercrime [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/crimeandjustice/datasets/crimeinenglandandwalesexperimentaltables
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Apr 25, 2019
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Estimates from Crime Survey for England and Wales (CSEW) on fraud and computer misuse. Also data from Home Office police recorded crime on the number of online offences recorded by the police and Action Fraud figures broken down by police force area.

    These tables were formerly known as Experimental tables.

    Please note: This set of tables is no longer produced. All content previously released within these tables has been, or will be, redistributed among other sets of tables.

  4. Fraud Detection Dataset

    • kaggle.com
    Updated Nov 9, 2024
    Cite
    Sameerk (2024). Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/sameerk2004/fraud-detection-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sameerk
    Description

    The dataset is generated using the Faker library to simulate transaction data. It contains several columns that represent both user and transaction information, including features for detecting fraudulent activities. The data includes a mix of categorical, numerical, and datetime values, which need to be processed for machine learning.

  5. Consumer fraud report rate, by state U.S. 2022

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Consumer fraud report rate, by state U.S. 2022 [Dataset]. https://www.statista.com/statistics/302313/consumer-fraud-report-rate-in-the-us/
    Explore at:
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2022
    Area covered
    United States
    Description

    In 2022, the District of Columbia was the state with the highest rate of consumer fraud and other related problems, with a rate of ***** reports per 100,000 of the population. North Dakota had the lowest rate of consumer fraud reports in that year, at *** reports per 100,000 of the population.

  6. Fraud Statistics - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Dec 19, 2016
    Cite
    ckan.publishing.service.gov.uk (2016). Fraud Statistics - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/plymouth-city-council-fraud-statistics-2015
    Explore at:
    Dataset updated
    Dec 19, 2016
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Data showing fraud statistics in Plymouth.

  7. Bank Account Fraud Dataset Suite (NeurIPS 2022)

    • kaggle.com
    Updated Nov 29, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sérgio Jesus (2023). Bank Account Fraud Dataset Suite (NeurIPS 2022) [Dataset]. https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sérgio Jesus
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    The Bank Account Fraud (BAF) suite of datasets was published at NeurIPS 2022 and comprises a total of 6 different synthetic bank account fraud tabular datasets. BAF is a realistic, complete, and robust test bed to evaluate novel and existing methods in ML and fair ML, and the first of its kind!

    This suite of datasets is:
    - Realistic, based on a present-day real-world dataset for fraud detection;
    - Biased, each dataset has distinct controlled types of bias;
    - Imbalanced, this setting presents an extremely low prevalence of the positive class;
    - Dynamic, with temporal data and observed distribution shifts;
    - Privacy preserving, to protect the identity of potential applicants we have applied differential privacy techniques (noise addition), feature encoding and trained a generative model (CTGAN).


    Each dataset is composed of:
    - 1 million instances;
    - 30 realistic features used in the fraud detection use-case;
    - A column of “month”, providing temporal information about the dataset;
    - Protected attributes (age group, employment status and % income).

    Detailed information (datasheet) on the suite: https://github.com/feedzai/bank-account-fraud/blob/main/documents/datasheet.pdf

    Check out the github repository for more resources and some example notebooks: https://github.com/feedzai/bank-account-fraud

    Read the NeurIPS 2022 paper here: https://arxiv.org/abs/2211.13358

    Learn more about Feedzai Research here: https://research.feedzai.com/

    Please use the following citation for the BAF dataset suite:

      @article{jesusTurningTablesBiased2022,
        title={Turning the {{Tables}}: {{Biased}}, {{Imbalanced}}, {{Dynamic Tabular Datasets}} for {{ML Evaluation}}},
        author={Jesus, S{\'e}rgio and Pombal, Jos{\'e} and Alves, Duarte and Cruz, Andr{\'e} and Saleiro, Pedro and Ribeiro, Rita P. and Gama, Jo{\~a}o and Bizarro, Pedro},
        journal={Advances in Neural Information Processing Systems},
        year={2022}
      }

  8. Fraud Detection 2022-23 - Dataset - data.sa.gov.au

    • data.sa.gov.au
    Updated Jul 1, 2022
    + more versions
    Cite
    (2022). Fraud Detection 2022-23 - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/fraud-detection-2022-23-defencesa
    Explore at:
    Dataset updated
    Jul 1, 2022
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    South Australia
    Description

    Fraud detected in Defence SA for 2022-23 Financial Year.

  9. Fraud Statistics

    • data.wu.ac.at
    • data.gov.uk
    csv
    Updated Dec 19, 2016
    Cite
    Plymouth City Council (2016). Fraud Statistics [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/ZGQwMTRiNDctYWFmNC00Mjk2LThkMWMtZTY4MzBjMDAzZWI0
    Explore at:
    Available download formats: csv
    Dataset updated
    Dec 19, 2016
    Dataset provided by
    Plymouth City Council
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Data showing fraud statistics in Plymouth.

  10. Telecommunication scam criminal data

    • data.gov.tw
    api, csv
    Cite
    National Police Administration, Telecommunication scam criminal data [Dataset]. https://data.gov.tw/en/datasets/98176
    Explore at:
    Available download formats: api, csv
    Dataset authored and provided by
    National Police Administration
    License

    https://data.gov.tw/license

    Description

    Provides telecommunications fraud case data. (These data are preliminary statistics compiled at the beginning of each quarter and are for reference only; the accurate statistics are those in this department's annual crime statistics.)

  11. Annual card fraud - credit cards and debit cards combined - worldwide...

    • statista.com
    Updated Jun 30, 2025
    Cite
    Statista (2025). Annual card fraud - credit cards and debit cards combined - worldwide 2014-2023 [Dataset]. https://www.statista.com/statistics/1394119/global-card-fraud-losses/
    Explore at:
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Dec 2024
    Area covered
    Worldwide
    Description

    Card fraud losses across the world increased by more than ** percent between 2020 and 2021, the largest increase since 2018. It was estimated that merchants and card acquirers lost well over ** billion U.S. dollars, with - so the source adds - roughly ** billion U.S. dollars coming from the United States alone. Note that the figures provided here include both credit card fraud and debit card fraud. The source does not separate between the two, and also did not provide figures on the United States - a country known for its reliance on credit cards.

  12. Nature of fraud and computer misuse in England and Wales: appendix tables

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Nov 6, 2024
    Cite
    Office for National Statistics (2024). Nature of fraud and computer misuse in England and Wales: appendix tables [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/crimeandjustice/datasets/natureoffraudandcomputermisuseinenglandandwalesappendixtables
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 6, 2024
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Data from the Crime Survey for England and Wales (CSEW) and the National Fraud Intelligence Bureau (NFIB), including numbers of incidents and characteristics of victims.

  13. E-Commerce Fraud Statistics And Facts (2025)

    • sci-tech-today.com
    Updated Aug 19, 2025
    Cite
    Sci-Tech Today (2025). E-Commerce Fraud Statistics And Facts (2025) [Dataset]. https://www.sci-tech-today.com/stats/e-commerce-fraud-statistics/
    Explore at:
    Dataset updated
    Aug 19, 2025
    Dataset authored and provided by
    Sci-Tech Today
    License

    https://www.sci-tech-today.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Introduction

    E-Commerce Fraud Statistics: When you shop online, you probably think about getting the best deal, fast delivery, or whether the product will match the description. But there’s a whole other side to e-commerce that most shoppers never see: the world of fraud. And trust me, these numbers will shock you, because they shocked me. These e-commerce fraud statistics aren’t just random figures in a report; they show the real damage that scammers are causing to businesses and even regular customers like us.

    Over the years, fraud in online shopping has gone from the occasional stolen credit card to a multi-billion-dollar global problem. We’re talking billions lost every single year, and it’s only getting worse. These statistics tell a story about how criminals work, where they strike the most, and which types of fraud cost businesses the most money. If you’ve ever wondered just how big the problem is, or what kinds of tricks fraudsters are using, let’s get started.

  14. Credit Card Fraud Detection

    • test.researchdata.tuwien.ac.at
    • zenodo.org
    • +1 more
    csv, json, pdf +2
    Updated Apr 28, 2025
    Cite
    Ajdina Grizhja (2025). Credit Card Fraud Detection [Dataset]. http://doi.org/10.82556/yvxj-9t22
    Explore at:
    Available download formats: text/markdown, csv, pdf, txt, json
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    TU Wien
    Authors
    Ajdina Grizhja
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Apr 28, 2025
    Description

    Below is a draft DMP–style description of your credit‐card fraud detection experiment, modeled on the antiquities example:

    1. Dataset Description

    Research Domain
    This work resides in the domain of financial fraud detection and applied machine learning. We focus on detecting anomalous credit‐card transactions in real time to reduce financial losses and improve trust in digital payment systems.

    Purpose
    The goal is to train and evaluate a binary classification model that flags potentially fraudulent transactions. By publishing both the code and data splits via FAIR repositories, we enable reproducible benchmarking of fraud‐detection algorithms and support future research on anomaly detection in transaction data.

    Data Sources
    We used the publicly available credit‐card transaction dataset from Kaggle (original source: https://www.kaggle.com/mlg-ulb/creditcardfraud), which contains anonymized transactions made by European cardholders over two days in September 2013. The dataset includes 284 807 transactions, of which 492 are fraudulent.

    Method of Dataset Preparation

    1. Schema validation: Renamed columns to snake_case (e.g. transaction_amount, is_declined) so they conform to DBRepo’s requirements.

    2. Data import: Uploaded the full CSV into DBRepo, assigned persistent identifiers (PIDs).

    3. Splitting: Programmatically derived three subsets—training (70%), validation (15%), test (15%)—using range‐based filters on the primary key actionnr. Each subset was materialized in DBRepo and assigned its own PID for precise citation.

    4. Cleaning: Converted the categorical flags (is_declined, isforeigntransaction, ishighriskcountry, isfradulent) from “Y”/“N” to 1/0 and dropped non‐feature identifiers (actionnr, merchant_id).

    5. Modeling: Trained a RandomForest classifier on the training split, tuned on validation, and evaluated on the held‐out test set.
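
    The splitting and cleaning steps above could be sketched roughly as follows. This is a dependency-free illustration, not the author's notebook code: the sample rows, the helper names `split_by_key_range` and `clean`, and the integer-percentage cut points are assumptions made for the example. Only the column names and the 70/15/15 proportions come from the description.

```python
# Hypothetical rows using the column names listed in this description.
rows = [
    {
        "actionnr": i,
        "merchant_id": f"M{i:03d}",
        "transaction_amount": 10.0 * i,
        "is_declined": "N",
        "isforeigntransaction": "Y" if i % 2 else "N",
        "ishighriskcountry": "N",
        "isfradulent": "Y" if i % 10 == 0 else "N",
    }
    for i in range(1, 21)
]

FLAG_COLUMNS = ["is_declined", "isforeigntransaction", "ishighriskcountry", "isfradulent"]
ID_COLUMNS = ["actionnr", "merchant_id"]


def split_by_key_range(rows, key="actionnr", train_pct=70, val_pct=15):
    """Split rows into train/validation/test using contiguous ranges of the key."""
    ordered = sorted(rows, key=lambda r: r[key])
    n = len(ordered)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


def clean(row):
    """Map "Y"/"N" flags to 1/0 and drop the non-feature identifiers."""
    out = {k: v for k, v in row.items() if k not in ID_COLUMNS}
    for col in FLAG_COLUMNS:
        out[col] = 1 if out[col] == "Y" else 0
    return out


# Split first (the key is still needed), then clean each subset.
train, val, test = split_by_key_range(rows)
train_clean = [clean(r) for r in train]
val_clean = [clean(r) for r in val]
test_clean = [clean(r) for r in test]
print(len(train), len(val), len(test))
```

    Splitting on the key before dropping it keeps the three subsets disjoint and reproducible, which is what allows each one to be cited separately.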

    2. Technical Details

    Dataset Structure

    • The raw data is a single CSV with columns:

      • actionnr (integer transaction ID)

      • merchant_id (string)

      • average_amount_transaction_day (float)

      • transaction_amount (float)

      • is_declined, isforeigntransaction, ishighriskcountry, isfradulent (binary flags)

      • total_number_of_declines_day, daily_chargeback_avg_amt, sixmonth_avg_chbk_amt, sixmonth_chbk_freq (numeric features)

    Naming Conventions

    • All columns use lowercase snake_case.

    • Subsets are named creditcard_training, creditcard_validation, creditcard_test in DBRepo.

    • Files in the code repo follow a clear structure:

      ├── data/         # local copies only; raw data lives in DBRepo 
      ├── notebooks/Task.ipynb 
      ├── models/rf_model_v1.joblib 
      ├── outputs/        # confusion_matrix.png, roc_curve.png, predictions.csv 
      ├── README.md 
      ├── requirements.txt 
      └── codemeta.json 
      

    Required Software

    • Python 3.9+

    • pandas, numpy (data handling)

    • scikit-learn (modeling, metrics)

    • matplotlib (visualizations)

    • dbrepo‐client.py (DBRepo API)

    • requests (TU WRD API)

    Additional Resources

    3. Further Details

    Data Limitations

    • Highly imbalanced: only ~0.17% of transactions are fraudulent.

    • Anonymized PCA features (V1 to V28) hide the original variables; we extended them with domain features but cannot reverse-engineer the raw variables.

    • Time‐bounded: only covers two days of transactions, may not capture seasonal patterns.
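
    To make the imbalance concrete, the short calculation below derives the fraud rate from the two counts quoted under Data Sources (284 807 transactions, 492 fraudulent). It also shows the accuracy of a trivial classifier that labels everything legitimate, which is an illustration added here rather than a result reported by the author.

```python
TOTAL_TRANSACTIONS = 284_807  # count quoted in the description
FRAUDULENT = 492              # count quoted in the description

fraud_rate = FRAUDULENT / TOTAL_TRANSACTIONS

# Accuracy of a trivial classifier that labels every transaction legitimate.
baseline_accuracy = 1 - fraud_rate

# Number of legitimate transactions for each fraudulent one.
legit_per_fraud = (TOTAL_TRANSACTIONS - FRAUDULENT) / FRAUDULENT

print(f"fraud rate: {fraud_rate:.4%}")
print(f"always-legitimate accuracy: {baseline_accuracy:.4%}")
print(f"legitimate transactions per fraud: {legit_per_fraud:.0f}")
```

    Because the do-nothing baseline already scores above 99.8% accuracy, accuracy alone says little about how well a model finds fraud on this data.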

    Licensing and Attribution

    • Raw data: CC-0 (per Kaggle terms)

    • Code & notebooks: MIT License

    • Model artifacts & outputs: CC-BY 4.0

    • TU WRD records include ORCID identifiers for the author.

    Recommended Uses

    • Benchmarking new fraud‐detection algorithms on a standard imbalanced dataset.

    • Educational purposes: demonstrating model‐training pipelines, FAIR data practices.

    • Extension: adding time‐series or deep‐learning models.

    Known Issues

    • Possible temporal leakage if date/time features not handled correctly.

    • Model performance may degrade on live data due to concept drift.

    • Binary flags may oversimplify nuanced transaction outcomes.

  15. Medicaid Fraud Control Units (MFCUs)

    • catalog.data.gov
    • healthdata.gov
    • +2 more
    Updated Aug 11, 2025
    + more versions
    Cite
    Department of Health & Human Services (2025). Medicaid Fraud Control Units (MFCUs) [Dataset]. https://catalog.data.gov/dataset/medicaid-fraud-control-units-mfcu-annual-spending-and-performance-statistics-ddfe3
    Explore at:
    Dataset updated
    Aug 11, 2025
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Description

    Medicaid Fraud Control Units (MFCU or Unit) investigate and prosecute Medicaid fraud as well as patient abuse and neglect in health care facilities. OIG certifies, and annually recertifies, each MFCU. OIG collects information about MFCU operations and assesses whether they comply with statutes, regulations, and OIG policy. OIG also analyzes MFCU performance based on 12 published performance standards and recommends program improvements where appropriate.

  16. Fraud Detection in Financial Transactions

    • kaggle.com
    Updated Jan 17, 2025
    Cite
    Darshan Dalvi (2025). Fraud Detection in Financial Transactions [Dataset]. https://www.kaggle.com/datasets/darshandalvi12/fraud-detection-in-financial-transactions/discussion?sort=undefined
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Darshan Dalvi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Credit Card Fraud Detection Dataset (Updated)

    This dataset contains 284,807 transactions from a credit card company, where 492 transactions are fraudulent. The data is highly imbalanced, with only a small fraction of transactions being fraudulent. The dataset is commonly used to build and evaluate fraud detection models.

    Dataset Details:

    • Number of Transactions: 284,807
    • Fraudulent Transactions: 492 (Highly Imbalanced)
    • Features:
      • 28 anonymized features (V1 to V28)
      • Transaction amount
      • Timestamp
    • Label:
      • 0: Legitimate
      • 1: Fraudulent

    Data Preprocessing:

    • SMOTE (Synthetic Minority Oversampling Technique) has been applied to address the class imbalance in the dataset, generating synthetic examples for the minority class (fraudulent transactions).
    • Additional Operations: Various preprocessing steps were performed, including data cleaning and feature engineering, to ensure the quality of the dataset for model training.
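
    The core idea behind SMOTE is to create new minority-class points by interpolating between an existing fraud example and one of its nearest fraud neighbours. The sketch below illustrates that idea in plain Python. It is not the code used to build this dataset and not the imbalanced-learn implementation; the function name `smote_like_oversample` and the sample points are hypothetical.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to the base point.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        lam = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + lam * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature fraud examples (not taken from the dataset).
fraud_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like_oversample(fraud_points, n_new=6)
for point in new_points:
    print(point)
```

    Synthetic points of this kind are normally generated from the training split only, so that the test split still reflects the original class balance.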

    Processed Files:

    The dataset has been split into training and testing sets and saved in the following files:
    - X_train.csv: Feature data for the training set
    - X_test.csv: Feature data for the testing set
    - y_train.csv: Labels for the training set (fraudulent or legitimate)
    - y_test.csv: Labels for the testing set

    This updated dataset is ready to be used for training and evaluating machine learning models, specifically designed for credit card fraud detection tasks.

    This description highlights the key aspects of the dataset, including its preprocessing steps and the availability of the processed files for ease of use.

  17. Credit Card Fraud Detection Platform Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 14, 2025
    + more versions
    Cite
    Archive Market Research (2025). Credit Card Fraud Detection Platform Report [Dataset]. https://www.archivemarketresearch.com/reports/credit-card-fraud-detection-platform-57120
    Explore at:
    Available download formats: ppt, doc, pdf
    Dataset updated
    Mar 14, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global credit card fraud detection platform market is experiencing robust growth, driven by the escalating volume of digital transactions and the increasing sophistication of fraud techniques. The market, valued at approximately $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033. This substantial growth is fueled by several key factors. The rising adoption of e-commerce and mobile payments creates a larger attack surface for fraudsters, necessitating advanced detection solutions. Furthermore, the increasing prevalence of sophisticated fraud schemes, such as synthetic identity theft and account takeover, demands more intelligent and adaptive fraud detection systems. The market is segmented by screening type (manual and automatic) and application (personal and enterprise), with automatic screening and enterprise applications driving the majority of growth due to their scalability and efficiency. The competitive landscape is dynamic, with established players like FICO, Mastercard, and Visa competing alongside innovative startups such as Forter and Feedzai. These companies continuously develop AI-powered solutions leveraging machine learning and big data analytics to identify and prevent fraudulent transactions effectively. Regional growth varies, with North America and Europe currently holding significant market share, but Asia-Pacific is expected to experience rapid expansion in the coming years due to rising digital adoption and economic growth in countries like India and China.

    The continued growth of the credit card fraud detection platform market hinges on several factors. The increasing demand for real-time fraud detection capabilities is driving the adoption of cloud-based solutions and the integration of advanced analytics. Regulatory compliance requirements, particularly around data privacy and security, also contribute to market growth. However, challenges remain. The cost of implementing and maintaining these sophisticated systems can be prohibitive for smaller businesses. Moreover, the constant evolution of fraud techniques necessitates ongoing investment in research and development to stay ahead of emerging threats. The market’s future trajectory will depend on the continued innovation in fraud detection technologies, the ability to adapt to evolving fraud tactics, and the successful integration of these solutions across various industries and geographies.
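
    The description gives a 2025 base value and a growth rate but no end-of-period figure. As a rough check, the sketch below compounds the stated $15 billion at 18% per year over the eight years from 2025 to 2033. The resulting number is only what the two quoted figures imply under the assumption of steady annual compounding; it is not a figure from the report, and the helper name `project_value` is illustrative.

```python
def project_value(base_value, annual_growth_rate, years):
    """Compound a base value at a constant annual growth rate."""
    return base_value * (1 + annual_growth_rate) ** years


BASE_2025_BILLION_USD = 15.0  # "approximately $15 billion in 2025"
CAGR = 0.18                   # "CAGR of 18% from 2025 to 2033"

implied_2033 = project_value(BASE_2025_BILLION_USD, CAGR, 2033 - 2025)
growth_multiple = implied_2033 / BASE_2025_BILLION_USD

print(f"implied 2033 market size: ${implied_2033:.1f} billion")
print(f"growth multiple over the period: {growth_multiple:.2f}x")
```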

  18. CerditCard fraud dataset

    • kaggle.com
    Updated Aug 2, 2025
    Cite
    Wasiq Ali (2025). CerditCard fraud dataset [Dataset]. https://www.kaggle.com/datasets/wasiqaliyasir/cerditcard-fraud-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Wasiq Ali
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Credit Card Fraud Detection Dataset

    Uncover fraudulent transactions with this anonymized, PCA-transformed dataset. Perfect for building and testing fraud detection algorithms!

    Dataset Overview

    • Objective: Detect fraudulent credit card transactions using anonymized features

    • Samples: 1,000 transactions

    • Features: 7 columns (5 PCA components + Transaction Amount + Target)

    Class Distribution:

    • Legit (Class 0): 993 transactions (~99.3%)

    • Fraud (Class 1): 7 transactions (~0.7%)

    • Key Challenge: Extreme class imbalance – realistic representation of fraud patterns

    Features Description

    | Feature | Description | Characteristics |
    |---------|-------------|-----------------|
    | V1-V5 | Anonymized principal components | PCA-transformed numerical features; preserves transaction patterns while hiding sensitive details |
    | Amount | Transaction value | Highly variable (min: $0.20, max: $1,916.06); critical for fraud analysis |
    | Class | Target variable | Binary labels: 0 = Legitimate transaction, 1 = Fraudulent transaction |

    Key Insights & Patterns

    Fraud Indicators:

    • Fraudulent transactions occur across diverse amounts (low: $1.83 → high: $1,916)

    • No obvious amount threshold for fraud – requires nuanced modeling

    Sample fraud cases:

    1. V1:0.579, V2:-0.384, Amount:1916.06

    2. V1:1.023, V2:-0.638, Amount:1094.42

    Data Characteristics:
    V1-V5 Distributions:

    1. V1: Concentrated near zero (mean ≈ -0.1)

    2. V2: Wider spread (mean ≈ 0.05)

    3. V3-V5: Asymmetric distributions

    Amount Distribution:

    1. Right-skewed – most transactions < $500

    2. Fraud cases span low and high values

    Class Imbalance:

     - Severe skew: 993:7 legit-to-fraud ratio
    
     - Models must optimize for recall/precision over accuracy
    
    Analysis Challenges

    ⚠️ Class Imbalance: Standard accuracy metrics misleading

    🔍 Feature Interpretation: PCA components lack real-world context

    📊 Non-linear Patterns: Complex interactions between V1-V5

    ⚡ High Stakes: False negatives (missed fraud) costlier than false positives

    Recommended Applications

    Fraud Detection Models:

    Logistic Regression (with class weighting)

    Random Forests / XGBoost (handle non-linearities)

    Isolation Forests (anomaly detection)

    Evaluation Focus:

    Precision-Recall Curves > ROC-AUC

    F2-Score (prioritize recall)

    Confusion matrix analysis
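
    The evaluation metrics listed above can all be derived from confusion-matrix counts. The sketch below computes precision, recall, and the F-beta score (beta = 2 weights recall more heavily) for two hypothetical outcomes on a 993:7 split like this dataset's. The counts are invented for illustration and are not results from any model trained on this data.

```python
def precision_recall_fbeta(tp, fp, fn, beta=2.0):
    """Return (precision, recall, F-beta) computed from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta ** 2
    fbeta = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, fbeta


# Hypothetical model: flags 10 transactions, 5 of which are among the 7 real frauds.
tp, fp, fn, tn = 5, 5, 2, 988
model_accuracy = (tp + tn) / (tp + fp + fn + tn)
model_precision, model_recall, model_f2 = precision_recall_fbeta(tp, fp, fn)

# Trivial baseline: labels every transaction legitimate.
base_accuracy = (0 + 993) / 1000
_, _, base_f2 = precision_recall_fbeta(tp=0, fp=0, fn=7)

print(f"model:    accuracy={model_accuracy:.3f}  F2={model_f2:.3f}")
print(f"baseline: accuracy={base_accuracy:.3f}  F2={base_f2:.3f}")
```

    In this made-up case the two classifiers have the same accuracy, while the F2-score separates them clearly, which is the reason for preferring recall-weighted metrics here.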

    Advanced Techniques:

    SMOTE/ADASYN for oversampling

    Autoencoders for anomaly detection

    Feature engineering: Amount-to-Var ratios

    Dataset Source & Ethics

    Origin: Synthetic dataset mirroring real-world financial patterns

    Anonymization: Original features transformed via PCA for privacy compliance

    Bias Consideration: Geographic/cultural biases possible in source data

    Potential Use Cases

    🏦 Banking: Real-time transaction monitoring systems

    📱 FinTech Apps: Fraud detection APIs for payment gateways

    🎓 Education: Imbalanced classification tutorials

    🏆 Kaggle Competitions: Lightweight fraud detection challenge

    Example Project Idea "Minimalist Fraud Detector":

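    # python
    from imblearn.over_sampling import SMOTE
    from imblearn.pipeline import make_pipeline
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.preprocessing import RobustScaler

    # Scale, oversample the fraud class, then fit a class-weighted forest.
    # The imblearn pipeline applies SMOTE only during fit, not at predict time.
    model = make_pipeline(
      RobustScaler(),
      SMOTE(sampling_strategy=0.3),
      RandomForestClassifier(class_weight={0:1, 1:15})
    )
    # Optimize for: Recall @ Precision > 0.85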
    

    Dataset Summary

    | Feature | Mean | Std | Min | Max |
    |----------|----------|----------|-----------|-----------|
    | V1 | -0.11 | 1.02 | -3.24 | 3.85 |
    | V2 | 0.05 | 1.01 | -2.94 | 2.60 |
    | V3 | 0.02 | 0.98 | -3.02 | 2.95 |
    | Amount | 250.32 | 190.19 | 0.20 | 1916.06 |

  19. Credit card fraud detection

    • kaggle.com
    Updated Jun 19, 2019
    + more versions
    Cite
    Dileep (2019). Credit card fraud detection [Dataset]. https://www.kaggle.com/datasets/dileep070/anomaly-detection
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 19, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dileep
    Description

    Dataset

    This dataset was created by Dileep

    Contents

  20. Financial Payment Services Fraud Dataset

    • cubig.ai
    Updated Jun 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CUBIG (2025). Financial Payment Services Fraud Dataset [Dataset]. https://cubig.ai/store/products/547/financial-payment-services-fraud-dataset
    Explore at:
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Financial Payment Services Fraud Dataset is based on a real-world financial transaction simulation and was collected to detect fraudulent activities across various types of payments and transfers. It includes key financial data such as transaction time, type, amount, sender and recipient information, and account balances before and after each transaction. Each transaction is labeled as either fraudulent or legitimate.

    2) Data Utilization
    (1) Characteristics of the Financial Payment Services Fraud Dataset:
    • With its large-scale transaction records, detailed account information, and diverse transaction types, this dataset is well-suited for developing and testing financial fraud detection models.

    (2) Applications of the Financial Payment Services Fraud Dataset:
    • Real-time Fraud Detection: The dataset can be used to train machine learning classification models that quickly detect and prevent fraudulent transactions in real-world financial service environments.
    • Risky Transaction Pattern Analysis: By analyzing patterns according to transaction type, amount, and account, the dataset can support the advancement of fraud prevention policies and anomaly monitoring systems.

Nature of crime: fraud and computer misuse

Explore at:
11 scholarly articles cite this dataset (View in Google Scholar)