100+ datasets found
  1. Fraud Detection Transactions Dataset

    • kaggle.com
    zip
    Updated Feb 21, 2025
    Cite
    Samay Ashar (2025). Fraud Detection Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/samayashar/fraud-detection-transactions-dataset
    Available download formats: zip (2104444 bytes)
    Dataset updated
    Feb 21, 2025
    Authors
    Samay Ashar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is designed to help data scientists and machine learning enthusiasts develop robust fraud detection models. It contains realistic synthetic transaction data, including user information, transaction types, risk scores, and more, making it ideal for binary classification tasks with models like XGBoost and LightGBM.

    📌 Key Features

    1. 21 features capturing various aspects of a financial transaction
    2. Realistic structure with numerical, categorical, and temporal data
    3. Binary fraud labels (0 = Not Fraud, 1 = Fraud)
    4. Designed for high accuracy with XGBoost and other ML models
    5. Useful for anomaly detection, risk analysis, and security research

    📌 Columns in the Dataset

    | Column Name | Description |
    | --- | --- |
    | Transaction_ID | Unique identifier for each transaction |
    | User_ID | Unique identifier for the user |
    | Transaction_Amount | Amount of money involved in the transaction |
    | Transaction_Type | Type of transaction (Online, In-Store, ATM, etc.) |
    | Timestamp | Date and time of the transaction |
    | Account_Balance | User's current account balance before the transaction |
    | Device_Type | Type of device used (Mobile, Desktop, etc.) |
    | Location | Geographical location of the transaction |
    | Merchant_Category | Type of merchant (Retail, Food, Travel, etc.) |
    | IP_Address_Flag | Whether the IP address was flagged as suspicious (0 or 1) |
    | Previous_Fraudulent_Activity | Number of past fraudulent activities by the user |
    | Daily_Transaction_Count | Number of transactions made by the user that day |
    | Avg_Transaction_Amount_7d | User's average transaction amount in the past 7 days |
    | Failed_Transaction_Count_7d | Count of failed transactions in the past 7 days |
    | Card_Type | Type of payment card used (Credit, Debit, Prepaid, etc.) |
    | Card_Age | Age of the card in months |
    | Transaction_Distance | Distance between the user's usual location and transaction location |
    | Authentication_Method | How the user authenticated (PIN, Biometric, etc.) |
    | Risk_Score | Fraud risk score computed for the transaction |
    | Is_Weekend | Whether the transaction occurred on a weekend (0 or 1) |
    | Fraud_Label | Target variable (0 = Not Fraud, 1 = Fraud) |

    📌 Potential Use Cases

    1. Fraud detection model training
    2. Anomaly detection in financial transactions
    3. Risk scoring systems for banks and fintech companies
    4. Feature engineering and model explainability research
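As a minimal sketch of the feature engineering use case, the snippet below takes one record shaped like the columns listed above, re-derives a weekend flag from `Timestamp`, and compares `Transaction_Amount` with `Avg_Transaction_Amount_7d`. The sample values, the timestamp format, and the helper name `derive_features` are assumptions made for illustration; they are not taken from the dataset.

```python
from datetime import datetime


def derive_features(row: dict) -> dict:
    """Return a copy of `row` with two illustrative derived features."""
    # Assumed timestamp format; the real file may use a different one.
    ts = datetime.strptime(row["Timestamp"], "%Y-%m-%d %H:%M:%S")
    amount = float(row["Transaction_Amount"])
    avg_7d = float(row["Avg_Transaction_Amount_7d"])

    out = dict(row)
    # weekday() returns 5 for Saturday and 6 for Sunday.
    out["Is_Weekend_Derived"] = 1 if ts.weekday() >= 5 else 0
    # Guard against division by zero for users with no recent history.
    out["Amount_To_7d_Avg"] = amount / avg_7d if avg_7d > 0 else None
    return out


# Hypothetical record; values are invented for the example.
sample = {
    "Transaction_ID": "TXN_0001",
    "Timestamp": "2023-08-12 14:30:00",  # a Saturday
    "Transaction_Amount": "150.0",
    "Avg_Transaction_Amount_7d": "50.0",
    "Fraud_Label": "0",
}
features = derive_features(sample)
```

A ratio well above 1 marks a transaction that is large relative to the user's recent spending, which is one plausible signal a model could use alongside `Risk_Score`.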
  2. Fraud Detection Dataset

    • kaggle.com
    zip
    Updated Mar 28, 2025
    Cite
    Aman Ali Siddiqui (2025). Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/amanalisiddiqui/fraud-detection-dataset
    Available download formats: zip (186385521 bytes)
    Dataset updated
    Mar 28, 2025
    Authors
    Aman Ali Siddiqui
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    The dataset contains the records of financial transactions for fraud detection. (6.3 Million Records)

    Some of these records were flagged false by existing algorithms.

    Further feature engineering could strengthen fraud detection algorithms and help reveal where the existing algorithm falls short.

    CASH-IN: the process of increasing the account balance by paying cash in to a merchant.

    CASH-OUT: the opposite of CASH-IN; cash is withdrawn from a merchant, which decreases the account balance.

    DEBIT: a process similar to CASH-OUT that involves sending money from the mobile money service to a bank account.

    PAYMENT: the process of paying merchants for goods or services, which decreases the balance of the account and increases the balance of the receiver.

    TRANSFER: the process of sending money to another user of the service through the mobile money platform.
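The balance effects described above can be sketched as a small lookup table. This is only an illustration: the negative signs for DEBIT and TRANSFER are an assumption (money leaving the originating account), and `apply_transaction` is a hypothetical helper, not part of the dataset or its tooling.

```python
# Sign of the change to the originating account's balance, per transaction type.
BALANCE_EFFECT = {
    "CASH-IN": +1,    # balance increases (stated in the description)
    "CASH-OUT": -1,   # balance decreases (stated in the description)
    "DEBIT": -1,      # assumption: money leaves the mobile money account
    "PAYMENT": -1,    # payer's balance decreases (stated in the description)
    "TRANSFER": -1,   # assumption: the sender's balance decreases
}


def apply_transaction(balance: float, tx_type: str, amount: float) -> float:
    """Return the originating account's balance after one transaction."""
    if tx_type not in BALANCE_EFFECT:
        raise ValueError(f"unknown transaction type: {tx_type}")
    return balance + BALANCE_EFFECT[tx_type] * amount


# Hypothetical sequence of transactions on one account.
balance = 100.0
for tx_type, amount in [("CASH-IN", 50.0), ("PAYMENT", 30.0), ("CASH-OUT", 20.0)]:
    balance = apply_transaction(balance, tx_type, amount)
```

Comparing a recomputed balance of this kind with the balance recorded in the data is one way to engineer a consistency feature.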

    Citation for original work

  3. Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset

    • huggingface.co
    Cite
    Electric Sheep, Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset [Dataset]. https://huggingface.co/datasets/electricsheepafrica/Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset
    Dataset authored and provided by
    Electric Sheep
    License

    https://choosealicense.com/licenses/gpl/

    Area covered
    Nigeria
    Description

    Nigerian Financial Fraud Detection Dataset (Enhanced)

    Overview

    This is a comprehensive synthetic financial fraud detection dataset specifically engineered for the Nigerian fintech ecosystem. The dataset contains 5,000,000 transactions with 45 advanced features including sophisticated user behaviour analytics, device intelligence, risk scoring, and temporal patterns tailored for Nigerian financial fraud detection.

      We have found that a lot of people are unable to use the full… See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/Nigerian-Financial-Transactions-and-Fraud-Detection-Dataset.
    
  4. Financial Transactions Dataset for Fraud Detection

    • kaggle.com
    zip
    Updated May 2, 2025
    Cite
    Aryan Kumar (2025). Financial Transactions Dataset for Fraud Detection [Dataset]. https://www.kaggle.com/datasets/aryan208/financial-transactions-dataset-for-fraud-detection
    Available download formats: zip (290256858 bytes)
    Dataset updated
    May 2, 2025
    Authors
    Aryan Kumar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains 5 million synthetically generated financial transactions designed to simulate real-world behavior for fraud detection research and machine learning applications. Each transaction record includes fields such as:

    Transaction Details: ID, timestamp, sender/receiver accounts, amount, type (deposit, transfer, etc.)

    Behavioral Features: time since last transaction, spending deviation score, velocity score, geo-anomaly score

    Metadata: location, device used, payment channel, IP address, device hash

    Fraud Indicators: binary fraud label (is_fraud) and type of fraud (e.g., money laundering, account takeover)

    The dataset follows realistic fraud patterns and behavioral anomalies, making it suitable for:

    Binary and multiclass classification models

    Fraud detection systems

    Time-series anomaly detection

    Feature engineering and model explainability
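One of the behavioral features listed above, time since last transaction, can be sketched as follows. The field names `sender_account` and `timestamp` and the function `time_since_last` are assumptions for illustration; the dataset's actual column names and its own feature definitions may differ.

```python
from datetime import datetime


def time_since_last(transactions):
    """Annotate each transaction with seconds since the same sender's previous one."""
    last_seen = {}
    annotated = []
    # Process in chronological order so each gap only looks backwards in time.
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        previous = last_seen.get(tx["sender_account"])
        gap = (tx["timestamp"] - previous).total_seconds() if previous else None
        annotated.append({**tx, "time_since_last_s": gap})
        last_seen[tx["sender_account"]] = tx["timestamp"]
    return annotated


# Hypothetical transactions; values are invented for the example.
txs = [
    {"sender_account": "A", "timestamp": datetime(2024, 1, 1, 10, 0)},
    {"sender_account": "B", "timestamp": datetime(2024, 1, 1, 10, 5)},
    {"sender_account": "A", "timestamp": datetime(2024, 1, 1, 10, 30)},
]
annotated = time_since_last(txs)
gaps = [tx["time_since_last_s"] for tx in annotated]
```

The first transaction of each sender has no predecessor, so its gap is left as `None` rather than filled with an arbitrary value.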

  5. Fraud Detection - Dataset - data.sa.gov.au

    • data.sa.gov.au
    + more versions
    Cite
    Fraud Detection - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/fraud-detection-defencesa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    South Australia
    Description

    Fraud detected in Defence SA since 2012-13.

  6. Credit Card Fraud Detection

    • test.researchdata.tuwien.ac.at
    • zenodo.org
    • +1more
    csv, json, pdf +2
    Updated Apr 28, 2025
    Cite
    Ajdina Grizhja (2025). Credit Card Fraud Detection [Dataset]. http://doi.org/10.82556/yvxj-9t22
    Available download formats: text/markdown, csv, pdf, txt, json
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    TU Wien
    Authors
    Ajdina Grizhja
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 28, 2025
    Description

    Below is a draft DMP–style description of your credit‐card fraud detection experiment, modeled on the antiquities example:

    1. Dataset Description

    Research Domain
    This work resides in the domain of financial fraud detection and applied machine learning. We focus on detecting anomalous credit‐card transactions in real time to reduce financial losses and improve trust in digital payment systems.

    Purpose
    The goal is to train and evaluate a binary classification model that flags potentially fraudulent transactions. By publishing both the code and data splits via FAIR repositories, we enable reproducible benchmarking of fraud‐detection algorithms and support future research on anomaly detection in transaction data.

    Data Sources
    We used the publicly available credit‐card transaction dataset from Kaggle (original source: https://www.kaggle.com/mlg-ulb/creditcardfraud), which contains anonymized transactions made by European cardholders over two days in September 2013. The dataset includes 284 807 transactions, of which 492 are fraudulent.

    Method of Dataset Preparation

    1. Schema validation: Renamed columns to snake_case (e.g. transaction_amount, is_declined) so they conform to DBRepo’s requirements.

    2. Data import: Uploaded the full CSV into DBRepo, assigned persistent identifiers (PIDs).

    3. Splitting: Programmatically derived three subsets—training (70%), validation (15%), test (15%)—using range‐based filters on the primary key actionnr. Each subset was materialized in DBRepo and assigned its own PID for precise citation.

    4. Cleaning: Converted the categorical flags (is_declined, isforeigntransaction, ishighriskcountry, isfradulent) from “Y”/“N” to 1/0 and dropped non‐feature identifiers (actionnr, merchant_id).

    5. Modeling: Trained a RandomForest classifier on the training split, tuned on validation, and evaluated on the held‐out test set.
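The splitting and cleaning steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's code: the exact key ranges used in DBRepo are not given, so the split points here are derived from row counts, and the helper names `split_by_key` and `clean_row` are hypothetical.

```python
FLAG_COLUMNS = ("is_declined", "isforeigntransaction", "ishighriskcountry", "isfradulent")
ID_COLUMNS = ("actionnr", "merchant_id")


def split_by_key(rows, key="actionnr", train_pct=70, val_pct=15):
    """Split rows into contiguous train/validation/test ranges ordered by `key`."""
    ordered = sorted(rows, key=lambda r: r[key])
    n = len(ordered)
    train_end = n * train_pct // 100          # integer arithmetic avoids float rounding
    val_end = train_end + n * val_pct // 100
    return ordered[:train_end], ordered[train_end:val_end], ordered[val_end:]


def clean_row(row):
    """Map "Y"/"N" flags to 1/0 and drop the non-feature identifier columns."""
    out = {k: v for k, v in row.items() if k not in ID_COLUMNS}
    for col in FLAG_COLUMNS:
        out[col] = 1 if out[col] == "Y" else 0
    return out


# Hypothetical rows; values are invented for the example.
rows = [
    {
        "actionnr": i,
        "merchant_id": f"M{i}",
        "transaction_amount": 10.0 * i,
        "is_declined": "N",
        "isforeigntransaction": "Y" if i % 2 else "N",
        "ishighriskcountry": "N",
        "isfradulent": "Y" if i == 20 else "N",
    }
    for i in range(1, 21)
]
train_rows, val_rows, test_rows = split_by_key(rows)
cleaned_test = [clean_row(r) for r in test_rows]
```

Splitting before dropping `actionnr` mirrors the order of the steps above, since the key is needed to define the ranges.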

    2. Technical Details

    Dataset Structure

    • The raw data is a single CSV with columns:

      • actionnr (integer transaction ID)

      • merchant_id (string)

      • average_amount_transaction_day (float)

      • transaction_amount (float)

      • is_declined, isforeigntransaction, ishighriskcountry, isfradulent (binary flags)

      • total_number_of_declines_day, daily_chargeback_avg_amt, sixmonth_avg_chbk_amt, sixmonth_chbk_freq (numeric features)

    Naming Conventions

    • All columns use lowercase snake_case.

    • Subsets are named creditcard_training, creditcard_validation, creditcard_test in DBRepo.

    • Files in the code repo follow a clear structure:

      ├── data/         # local copies only; raw data lives in DBRepo
      ├── notebooks/Task.ipynb
      ├── models/rf_model_v1.joblib
      ├── outputs/        # confusion_matrix.png, roc_curve.png, predictions.csv
      ├── README.md
      ├── requirements.txt
      └── codemeta.json
      

    Required Software

    • Python 3.9+

    • pandas, numpy (data handling)

    • scikit-learn (modeling, metrics)

    • matplotlib (visualizations)

    • dbrepo‐client.py (DBRepo API)

    • requests (TU WRD API)

    Additional Resources

    3. Further Details

    Data Limitations

    • Highly imbalanced: only ~0.17% of transactions are fraudulent.

    • Anonymized PCA features (V1–V28) hidden; we extended with domain features but cannot reverse engineer raw variables.

    • Time‐bounded: only covers two days of transactions, may not capture seasonal patterns.
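To make the imbalance concrete, the counts quoted under Data Sources (492 frauds out of 284 807 transactions) can be turned into a fraud rate and a pair of class weights. The weighting formula below is the common "balanced" heuristic, n_samples / (n_classes * n_samples_in_class); whether the author's RandomForest used any class weighting is not stated, so treat this as an illustrative sketch only.

```python
total = 284_807          # transactions, from the description above
fraud = 492              # fraudulent transactions, from the description above
legit = total - fraud

# Share of transactions that are fraudulent (roughly 0.17%).
fraud_rate = fraud / total

# "Balanced" class weights: n_samples / (n_classes * n_samples_in_class).
n_classes = 2
class_weight = {
    0: total / (n_classes * legit),   # close to 0.5 for the majority class
    1: total / (n_classes * fraud),   # several hundred for the minority class
}
```

With weights like these, each fraudulent example counts for far more than a legitimate one during training, which offsets the scarcity of the positive class.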

    Licensing and Attribution

    • Raw data: CC-0 (per Kaggle terms)

    • Code & notebooks: MIT License

    • Model artifacts & outputs: CC-BY 4.0

    • TU WRD records include ORCID identifiers for the author.

    Recommended Uses

    • Benchmarking new fraud‐detection algorithms on a standard imbalanced dataset.

    • Educational purposes: demonstrating model‐training pipelines, FAIR data practices.

    • Extension: adding time‐series or deep‐learning models.

    Known Issues

    • Possible temporal leakage if date/time features not handled correctly.

    • Model performance may degrade on live data due to concept drift.

    • Binary flags may oversimplify nuanced transaction outcomes.
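One common way to reduce the temporal leakage mentioned above is to split on a time cutoff, so that every training row precedes every test row, and then verify that property. The sketch below assumes a numeric time column named `seconds_elapsed`; that name and both helper functions are hypothetical and not taken from the dataset.

```python
def chronological_split(rows, time_key, cutoff):
    """Put rows strictly before `cutoff` in train and the rest in test."""
    train = [r for r in rows if r[time_key] < cutoff]
    test = [r for r in rows if r[time_key] >= cutoff]
    return train, test


def has_temporal_overlap(train, test, time_key):
    """Return True if any training row is at or after the earliest test row."""
    if not train or not test:
        return False
    return max(r[time_key] for r in train) >= min(r[time_key] for r in test)


# Hypothetical rows; values are invented for the example.
rows = [{"seconds_elapsed": t, "is_fraud": 0} for t in (0, 50, 100, 150, 200)]

train, test = chronological_split(rows, "seconds_elapsed", cutoff=150)
clean_split_overlaps = has_temporal_overlap(train, test, "seconds_elapsed")

# An interleaved split, by contrast, mixes past and future rows.
interleaved_overlaps = has_temporal_overlap(rows[::2], rows[1::2], "seconds_elapsed")
```

The same check can be run on any existing split as a quick guard before training.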

  7. Synthetic-Financial-Datasets-For-Fraud-Detection

    • huggingface.co
    Updated Aug 11, 2024
    + more versions
    Cite
    Purshotam Lalwani (2024). Synthetic-Financial-Datasets-For-Fraud-Detection [Dataset]. https://huggingface.co/datasets/purulalwani/Synthetic-Financial-Datasets-For-Fraud-Detection
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 11, 2024
    Authors
    Purshotam Lalwani
    Description

    purulalwani/Synthetic-Financial-Datasets-For-Fraud-Detection dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. Claim Fraud Detection Dataset

    • gomask.ai
    csv, json
    Updated Nov 11, 2025
    Cite
    GoMask.ai (2025). Claim Fraud Detection Dataset [Dataset]. https://gomask.ai/marketplace/datasets/claim-fraud-detection-dataset
    Available download formats: csv (10 MB), json
    Dataset updated
    Nov 11, 2025
    Dataset provided by
    GoMask.ai
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2024 - 2025
    Area covered
    Global
    Variables measured
    claim_id, policy_id, claim_date, claimant_id, policy_type, claim_amount, claim_status, claimant_age, incident_date, incident_type, and 16 more
    Description

    This synthetic insurance claim fraud detection dataset contains detailed records of claims, including incident specifics, claimant demographics, policy details, and fraud indicators. Designed for developing and testing machine learning models, it enables insurers and researchers to identify patterns of fraudulent activity and improve risk assessment strategies.

  9. Fastag Fraud Detection Datasets

    • kaggle.com
    zip
    Updated Jan 16, 2024
    Cite
    Prathamesh Pradeep Dessai (2024). Fastag Fraud Detection Datasets [Dataset]. https://www.kaggle.com/datasets/thegoanpanda/fastag-fraud-detection-datesets-fictitious
    Available download formats: zip (108830 bytes)
    Dataset updated
    Jan 16, 2024
    Authors
    Prathamesh Pradeep Dessai
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Nature of Data: This dataset contains fictitious data designed for educational and testing purposes in fraud detection algorithms. It does not represent real-world financial transactions or individuals.

    Purpose of Creation: The dataset was generated to provide a realistic example for developing and evaluating fraud detection models without relying on sensitive real-world data. It's intended for students, researchers, and practitioners to practice data analysis and machine learning techniques in a safe environment.

  10. Fraud Data - Dataset - York Open Data

    • data.yorkopendata.org
    • ckan.york.staging.datopian.com
    Updated Feb 15, 2016
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2016). Fraud Data - Dataset - York Open Data [Dataset]. https://data.yorkopendata.org/dataset/fraud
    Dataset updated
    Feb 15, 2016
    License

    Open Government Licence 2.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/2/)
    License information was derived automatically

    Area covered
    York
    Description

    Data on fraud investigation carried out by City of York Council – use of powers, number of investigators and investigations, and monetary value of fraud identified. This data is required to be published in order to meet the requirements of the Local Government Transparency Code legislation.

  11. Fraud Detection Dataset

    • universe.roboflow.com
    zip
    Updated Feb 18, 2025
    + more versions
    Cite
    auduod (2025). Fraud Detection Dataset [Dataset]. https://universe.roboflow.com/auduod/fraud-detection-485ip
    Available download formats: zip
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    auduod
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Variables measured
    Mobile Bounding Boxes
    Description

    Fraud Detection

    ## Overview
    
    Fraud Detection is a dataset for object detection tasks - it contains Mobile annotations for 2,590 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [MIT license](https://opensource.org/licenses/MIT).
    
  12. creditcard-fraud-detection

    • huggingface.co
    Updated Aug 29, 2024
    Cite
    Keith Gauvin (2024). creditcard-fraud-detection [Dataset]. https://huggingface.co/datasets/kgauvin603/creditcard-fraud-detection
    Dataset updated
    Aug 29, 2024
    Authors
    Keith Gauvin
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    kgauvin603/creditcard-fraud-detection dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. cifer-fraud-detection-mini-dataset

    • huggingface.co
    Updated Jun 4, 2025
    Cite
    Cifer (2025). cifer-fraud-detection-mini-dataset [Dataset]. https://huggingface.co/datasets/CiferAI/cifer-fraud-detection-mini-dataset
    Dataset updated
    Jun 4, 2025
    Dataset authored and provided by
    Cifer
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    📊 Cifer Fraud Detection Mini Dataset

    🧠 Overview

    (cifer-fraud-detection-mini-dataset) The Cifer-Fraud-Detection-Mini-Dataset is a lightweight sample containing 20 transaction records, extracted from the full 21 million-row Cifer-Fraud-Detection-Dataset-AF. It is designed for quick experimentation of encrypted model training with Fully Homomorphic Encryption (FHE). Though small in size, this mini dataset retains the original schema and data structure inspired by… See the full description on the dataset page: https://huggingface.co/datasets/CiferAI/cifer-fraud-detection-mini-dataset.

  14. Fraud Detection And Prevention Market Analysis, Size, and Forecast...

    • technavio.com
    pdf
    Updated Jul 10, 2025
    Cite
    Technavio (2025). Fraud Detection And Prevention Market Analysis, Size, and Forecast 2025-2029: North America (US and Canada), Europe (France, Germany, Italy, Russia, and UK), APAC (China, India, and Japan), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/fraud-detection-and-prevention-market-analysis
    Available download formats: pdf
    Dataset updated
    Jul 10, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2025 - 2029
    Area covered
    United States
    Description

    Fraud Detection And Prevention Market Size 2025-2029

    The fraud detection and prevention market size is forecast to increase by USD 122.65 billion, at a CAGR of 30.1% between 2024 and 2029.
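As a worked check of how the two quoted figures relate, the sketch below assumes the USD 122.65 billion increase is the difference between the 2029 and 2024 market sizes, compounding at 30.1% over five annual periods. The implied base and end values are derived under that assumption only; they are not figures stated in the report.

```python
cagr = 0.301               # compound annual growth rate quoted above
years = 5                  # 2024 -> 2029, assumed to be five compounding periods
increase_usd_bn = 122.65   # forecast increase quoted above, in USD billion

# Growth multiple over the period: (1 + r)^n
growth_multiple = (1 + cagr) ** years

# If end = base * multiple and end - base = increase, then
# base = increase / (multiple - 1).
implied_base_usd_bn = increase_usd_bn / (growth_multiple - 1)
implied_end_usd_bn = implied_base_usd_bn * growth_multiple
```

Under these assumptions the market would grow to a little under four times its starting size over the period.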

    The market is witnessing significant growth, driven by the increasing adoption of cloud-based services. Businesses are recognizing the benefits of cloud solutions, such as real-time fraud detection, scalability, and cost savings. Additionally, technological advancements in fraud detection and prevention solutions and services are enabling organizations to better protect their assets from sophisticated fraud schemes. However, the complex IT infrastructure of modern businesses poses a challenge in implementing and integrating these solutions effectively. The complexity of the IT infrastructure, which integrates cloud computing, big data, and mobile devices, creates a vast network of devices with insufficient security features.
    To capitalize on market opportunities, companies must stay abreast of these trends and invest in advanced fraud detection technologies. Effective implementation and integration of these solutions, coupled with continuous innovation, will be crucial for businesses seeking to mitigate fraud risks and protect their reputation and financial stability. Furthermore, the constant evolution of fraud techniques necessitates continuous innovation and adaptation from solution providers. Encryption techniques and network security protocols form the foundation of robust cybersecurity defenses, while compliance regulations and penetration testing help identify vulnerabilities and strengthen security posture.
    

    What will be the Size of the Fraud Detection And Prevention Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

    The market continues to evolve, driven by the constant emergence of new threats and the need for advanced technologies to mitigate risks across various sectors. Real-time fraud alerts, anomaly detection systems, forensic accounting tools, and risk mitigation strategies are integrated into comprehensive solutions that adapt to the ever-changing fraud landscape. Entities rely on these tools to maintain regulatory compliance frameworks and incident response planning, ensuring access control management and vulnerability assessments are up-to-date. Machine learning algorithms and transaction monitoring tools enable the detection of suspicious activity, providing valuable insights into potential threats.

    Intrusion detection systems and behavioral biometrics offer real-time protection against cyberattacks and payment fraud, while identity verification methods and risk scoring models help prevent account takeover and data loss. Cybersecurity threat intelligence and authentication protocols enhance the overall security strategy, providing a layered approach to fraud prevention. Fraud investigation techniques and loss prevention metrics enable entities to respond effectively to incidents and minimize the impact of data breaches. Social engineering countermeasures and payment fraud detection solutions further fortify the fraud prevention arsenal, ensuring continuous protection against evolving threats.

    The ongoing dynamism of the market demands a proactive approach, with entities staying informed and agile to maintain a strong defense against fraudulent activities.

    How is this Fraud Detection And Prevention Industry segmented?

    The fraud detection and prevention industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Component
      Solutions
      Services

    End-user
      Large enterprise
      SMEs

    Application
      Transaction monitoring
      Compliance and risk management
      Identity verification
      Behavioral analytics
      Others

    Geography
      North America
        US
        Canada
      Europe
        France
        Germany
        Italy
        Russia
        UK
      APAC
        China
        India
        Japan
      Rest of World (ROW)

    By Component Insights

    The Solutions segment is estimated to witness significant growth during the forecast period. The market is experiencing significant growth due to escalating cyber threats, increasing regulatory compliance requirements, and the need to mitigate financial losses. Biometric authentication, encryption techniques, machine learning algorithms, and intrusion detection systems are among the key solutions driving market expansion. Regulatory frameworks, such as the General Data Protection Regulation (GDPR) and the Health Insurance Portability and Accountability Act (HIPAA), are mandating robust incident response planning, access control management, and data breach prevention strategies. Vulnerability assessments and

  15. RiskVault: Financial Fraud Detection Dataset

    • kaggle.com
    zip
    Updated Jul 10, 2025
    Cite
    Jainil Patel (2025). RiskVault: Financial Fraud Detection Dataset [Dataset]. https://www.kaggle.com/datasets/jainilspatel/fraud-detection-dataset
    Available download formats: zip (401045 bytes)
    Dataset updated
    Jul 10, 2025
    Authors
    Jainil Patel
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset simulates real-world banking transactions, each labeled as fraudulent or legitimate. It includes detailed features such as transaction amount, time, type, location, device type, and historical user behavior, and is designed for training and evaluating binary classification models for fraud detection. It includes the following features:

    transaction_id: Unique identifier for each transaction

    customer_id: Anonymized customer ID

    transaction_amount: Value of the transaction in currency units

    transaction_type: Type of transaction (e.g., payment, transfer)

    transaction_time: Timestamp of when the transaction occurred

    transaction_location: Region where the transaction was initiated

    device_type: Device used (e.g., mobile, POS, desktop)

    previous_transactions_count: Number of recent transactions by the same customer

    is_fraud: Target label indicating fraud (1) or not (0)

    This dataset is ideal for binary classification tasks such as fraud detection using machine learning.
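A minimal sketch of loading records with the schema listed above, using only the Python standard library. The two sample rows, the timestamp format, and the function name `load_transactions` are assumptions for illustration; the real file's formatting may differ.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample in the column order listed above; values are invented.
SAMPLE_CSV = """transaction_id,customer_id,transaction_amount,transaction_type,transaction_time,transaction_location,device_type,previous_transactions_count,is_fraud
T1,C1,120.50,payment,2024-01-05 09:15:00,North,mobile,3,0
T2,C2,9800.00,transfer,2024-01-05 23:40:00,South,desktop,0,1
"""


def load_transactions(text):
    """Parse CSV text and coerce numeric, timestamp, and label columns."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            **raw,
            "transaction_amount": float(raw["transaction_amount"]),
            # Assumed timestamp format.
            "transaction_time": datetime.strptime(
                raw["transaction_time"], "%Y-%m-%d %H:%M:%S"
            ),
            "previous_transactions_count": int(raw["previous_transactions_count"]),
            "is_fraud": int(raw["is_fraud"]),
        })
    return rows


transactions = load_transactions(SAMPLE_CSV)
fraud_count = sum(tx["is_fraud"] for tx in transactions)
```

Coercing types at load time means later steps can compare amounts and timestamps directly instead of working with strings.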

  16. Credit Card Fraud Detection

    • zenodo.org
    csv
    Updated Dec 5, 2022
    + more versions
    Cite
    Luqi Liu (2022). Credit Card Fraud Detection [Dataset]. http://doi.org/10.5281/zenodo.7395559
    Available download formats: csv
    Dataset updated
    Dec 5, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luqi Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset from https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud

    The dataset contains transactions made by credit cards in September 2013 by European cardholders.
    This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
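The practical consequence of this imbalance is that plain accuracy says little. As a short illustration using only the counts quoted above, a trivial model that labels every transaction as legitimate is almost always "right" while catching no fraud at all:

```python
total = 284_807   # transactions, from the description above
fraud = 492       # fraudulent transactions, from the description above

# Confusion counts for a model that always predicts "not fraud".
true_positives = 0
false_positives = 0
false_negatives = fraud
true_negatives = total - fraud

accuracy = (true_positives + true_negatives) / total
recall = true_positives / (true_positives + false_negatives)
```

Metrics that focus on the positive class, such as recall and precision, are therefore usually more informative for data like this.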

  17. Vehicle Insurance Claim Fraud Detection Dataset

    • cubig.ai
    zip
    Updated May 28, 2025
    Cite
    CUBIG (2025). Vehicle Insurance Claim Fraud Detection Dataset [Dataset]. https://cubig.ai/store/products/374/vehicle-insurance-claim-fraud-detection-dataset
    Available download formats: zip
    Dataset updated
    May 28, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Vehicle Insurance Claim Fraud Detection Dataset is a tabular insurance fraud detection dataset that includes vehicle information, accident and insurance details, and claim details for vehicle insurance claims, and labels each claim as fraudulent or not.

    2) Data Utilization
    (1) Characteristics of the Vehicle Insurance Claim Fraud Detection Dataset:
    • Each row contains a variety of variables, including vehicle attributes, models, accident details, insurance type and duration, and claim history, as well as the target variable, FraudFound_P.
    • The data are based on real insurance claim cases and are designed to be suitable for insurance fraud detection and classification model development.
    (2) Uses of the Vehicle Insurance Claim Fraud Detection Dataset:
    • Development of insurance fraud detection models: build a machine learning-based insurance fraud classification and prediction model by leveraging various vehicle, accident, and insurance attributes.
    • Analysis of fraud patterns and risk factors: use billing and fraud data to analyze fraud patterns, risk factors, insurance policy improvements, and more.

  18. Cctv Fraud Detection Dataset

    • universe.roboflow.com
    zip
    Updated Apr 6, 2024
    Cite
    Daffodil International University (2024). Cctv Fraud Detection Dataset [Dataset]. https://universe.roboflow.com/daffodil-international-university-8plfu/cctv-fraud-detection
    Available download formats: zip
    Dataset updated
    Apr 6, 2024
    Dataset authored and provided by
    Daffodil International University
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Fraud Bounding Boxes
    Description

    Cctv Fraud Detection

    ## Overview
    
    Cctv Fraud Detection is a dataset for object detection tasks - it contains Fraud annotations for 3,000 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License

    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  19. Payment Fraud Detection Dataset

    • gomask.ai
    csv, json
    Updated Oct 21, 2025
    Cite
    GoMask.ai (2025). Payment Fraud Detection Dataset [Dataset]. https://gomask.ai/marketplace/datasets/payment-fraud-detection-dataset
    Explore at:
    Available download formats: json, csv (10 MB)
    Dataset updated
    Oct 21, 2025
    Dataset provided by
    GoMask.ai
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024 - 2025
    Area covered
    Global
    Variables measured
    amount, currency, device_id, entry_mode, ip_address, customer_id, fraud_label, merchant_id, customer_age, fraud_reason, and 9 more
    Description

    This dataset contains detailed synthetic payment transaction records, each labeled with ground-truth indicators of fraud. It includes transaction metadata, customer and merchant identifiers, payment methods, device and location context, and fraud reasons, making it ideal for developing and benchmarking machine learning models for payment fraud detection and risk mitigation.
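
Benchmarking a model against the ground-truth `fraud_label` usually means reporting precision and recall rather than plain accuracy, because fraud tends to be the minority class. The following is a minimal sketch under that assumption; the label and prediction lists are invented for illustration and are not drawn from the dataset.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for the positive (fraud = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical values: y_true stands in for fraud_label, y_pred for model output.
y_true = [0, 0, 1, 1, 0, 1]
y_pred = [0, 1, 1, 0, 0, 1]

precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```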

  20. Banking Transaction Graphs Dataset

    • gomask.ai
    csv, json
    Updated Nov 24, 2025
    Cite
    GoMask.ai (2025). Banking Transaction Graphs Dataset [Dataset]. https://gomask.ai/marketplace/datasets/banking-transaction-graphs-dataset
    Explore at:
    Available download formats: csv (10 MB), json
    Dataset updated
    Nov 24, 2025
    Dataset provided by
    GoMask.ai
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024 - 2025
    Area covered
    Global
    Variables measured
    amount, channel, currency, timestamp, origin_country, reference_note, transaction_id, is_international, transaction_type, sender_account_id, and 6 more
    Description

    This dataset provides detailed, interconnected banking transaction records, capturing sender and receiver relationships, transaction metadata, and anomaly flags. Designed for network analytics, it enables advanced anti-money laundering (AML) detection, fraud analysis, and financial behavior modeling by representing transactions as a directed graph. The flat structure ensures easy integration with machine learning and graph analytics tools.
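
As a rough illustration of how a flat transaction table can be read as a directed graph, the sketch below builds a sender-to-receiver adjacency map. The field names `sender_account_id`, `transaction_id`, and `amount` come from the listing; `receiver_account_id` is an assumed name for the counterpart field, and the three records are invented.

```python
from collections import defaultdict

# Hypothetical flat records; receiver_account_id is an assumed column name.
transactions = [
    {"transaction_id": "t1", "sender_account_id": "A", "receiver_account_id": "B", "amount": 500.0},
    {"transaction_id": "t2", "sender_account_id": "B", "receiver_account_id": "C", "amount": 480.0},
    {"transaction_id": "t3", "sender_account_id": "A", "receiver_account_id": "C", "amount": 20.0},
]


def build_graph(records):
    """Return a mapping sender -> receiver -> total amount sent along that edge."""
    graph = defaultdict(lambda: defaultdict(float))
    for rec in records:
        graph[rec["sender_account_id"]][rec["receiver_account_id"]] += rec["amount"]
    return graph


def out_degree(graph, account):
    """Number of distinct accounts that `account` has sent money to."""
    return len(graph.get(account, {}))


graph = build_graph(transactions)
print(out_degree(graph, "A"))
print(graph["A"]["B"])
```

Aggregating amounts per edge keeps the structure small; a per-transaction edge list would be the alternative if individual transfers need to stay distinguishable.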

Cite
Samay Ashar (2025). Fraud Detection Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/samayashar/fraud-detection-transactions-dataset

Fraud Detection Transactions Dataset

A high-quality synthetic dataset for fraud detection using XGBoost

Explore at:
35 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (2104444 bytes)
Dataset updated
Feb 21, 2025
Authors
Samay Ashar
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

This dataset is designed to help data scientists and machine learning enthusiasts develop robust fraud detection models. It contains realistic synthetic transaction data, including user information, transaction types, risk scores, and more, making it ideal for binary classification tasks with models like XGBoost and LightGBM.

📌 Key Features

  1. 21 features capturing various aspects of a financial transaction
  2. Realistic structure with numerical, categorical, and temporal data
  3. Binary fraud labels (0 = Not Fraud, 1 = Fraud)
  4. Designed for high accuracy with XGBoost and other ML models
  5. Useful for anomaly detection, risk analysis, and security research

📌 Columns in the Dataset

| Column Name | Description |
| --- | --- |
| Transaction_ID | Unique identifier for each transaction |
| User_ID | Unique identifier for the user |
| Transaction_Amount | Amount of money involved in the transaction |
| Transaction_Type | Type of transaction (Online, In-Store, ATM, etc.) |
| Timestamp | Date and time of the transaction |
| Account_Balance | User's current account balance before the transaction |
| Device_Type | Type of device used (Mobile, Desktop, etc.) |
| Location | Geographical location of the transaction |
| Merchant_Category | Type of merchant (Retail, Food, Travel, etc.) |
| IP_Address_Flag | Whether the IP address was flagged as suspicious (0 or 1) |
| Previous_Fraudulent_Activity | Number of past fraudulent activities by the user |
| Daily_Transaction_Count | Number of transactions made by the user that day |
| Avg_Transaction_Amount_7d | User's average transaction amount in the past 7 days |
| Failed_Transaction_Count_7d | Count of failed transactions in the past 7 days |
| Card_Type | Type of payment card used (Credit, Debit, Prepaid, etc.) |
| Card_Age | Age of the card in months |
| Transaction_Distance | Distance between the user's usual location and transaction location |
| Authentication_Method | How the user authenticated (PIN, Biometric, etc.) |
| Risk_Score | Fraud risk score computed for the transaction |
| Is_Weekend | Whether the transaction occurred on a weekend (0 or 1) |
| Fraud_Label | Target variable (0 = Not Fraud, 1 = Fraud) |
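
The columns above could be prepared for a binary classifier roughly as follows. This is a minimal, standard-library sketch: the two sample rows and their values are invented, only a subset of the documented columns is used, and the real file name and value formats are not stated in the listing.

```python
import csv
import io

# Hypothetical two-row sample using a subset of the documented column names.
SAMPLE_CSV = """Transaction_ID,Transaction_Amount,Transaction_Type,Is_Weekend,Risk_Score,Fraud_Label
TXN_1,39.79,Online,0,0.85,0
TXN_2,1250.00,ATM,1,0.92,1
"""

NUMERIC = ("Transaction_Amount", "Risk_Score", "Is_Weekend")
CATEGORICAL = ("Transaction_Type",)


def load_rows(handle):
    """Parse CSV rows into (features, label) pairs."""
    rows = []
    for record in csv.DictReader(handle):
        features = {name: float(record[name]) for name in NUMERIC}
        # One-hot encode categoricals as "<column>=<value>" indicator keys.
        for name in CATEGORICAL:
            features[f"{name}={record[name]}"] = 1.0
        rows.append((features, int(record["Fraud_Label"])))
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
fraud_rate = sum(label for _, label in rows) / len(rows)
print(f"{len(rows)} rows, fraud rate {fraud_rate:.2f}")
```

Transaction_ID is left out of the feature dictionary on purpose, since a unique identifier carries no signal a model should learn from.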

📌 Potential Use Cases

  1. Fraud detection model training
  2. Anomaly detection in financial transactions
  3. Risk scoring systems for banks and fintech companies
  4. Feature engineering and model explainability research