MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
With this dataset, you could perform various analyses, such as:
Detecting anomalies in transaction amounts (e.g., unusually high transactions).
Identifying irregular transaction types for specific accounts.
Recognizing unusual patterns based on transaction timestamps or locations.
Tracking spending behaviors based on merchants.
This dataset contains the following columns:
Timestamp: This column records the date and time when the transaction occurred. It helps in understanding the temporal aspect of transactions, such as patterns over time, frequency, and clustering of activities.
TransactionID: An identification number assigned to each transaction. It serves as a unique identifier for referencing or tracking specific transactions.
AccountID: This field represents the unique identifier associated with the bank account involved in the transaction. It links multiple transactions to a specific account, enabling analysis on a per-account basis.
Amount: The monetary value involved in the transaction. This column provides information about the financial magnitude of each transaction, which is crucial for anomaly detection since unusually high or low values might signify irregularities.
Merchant: Specifies the entity or business involved in the transaction. This information helps in categorizing transactions (e.g., retail, online, restaurant) and identifying patterns related to specific merchants.
TransactionType: Describes the nature or category of the transaction, whether it's a withdrawal, deposit, transfer, payment, etc. This column helps in understanding the purpose or direction of the transaction.
Location: Indicates the place where the transaction occurred. It could be a physical location (e.g., city, country) or an identifier (e.g., store code, online portal), aiding in analyzing geographical spending patterns or detecting anomalies based on unusual transaction locations.
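As a rough illustration of the amount-based anomaly detection described above, the sketch below flags transactions whose Amount sits far from the median for the same AccountID. It uses the modified z-score (median and median absolute deviation); the threshold of 3.5, the function name, and the sample rows are assumptions made for the example, not part of the dataset.

```python
from collections import defaultdict
from statistics import median


def flag_amount_outliers(transactions, threshold=3.5):
    """Return TransactionIDs whose Amount is far from the account's median.

    Uses the modified z-score (median and MAD), which is less distorted by
    the outliers themselves than a mean/standard-deviation z-score.
    """
    by_account = defaultdict(list)
    for tx in transactions:
        by_account[tx["AccountID"]].append(tx)

    flagged = []
    for account_txs in by_account.values():
        amounts = [tx["Amount"] for tx in account_txs]
        med = median(amounts)
        mad = median([abs(a - med) for a in amounts])
        if mad == 0:
            continue  # no spread in this account, so nothing can be scored
        for tx in account_txs:
            score = 0.6745 * abs(tx["Amount"] - med) / mad
            if score > threshold:
                flagged.append(tx["TransactionID"])
    return flagged


# Hypothetical rows using the column names listed above.
sample_transactions = [
    {"TransactionID": "T1", "AccountID": "A1", "Amount": 20.0},
    {"TransactionID": "T2", "AccountID": "A1", "Amount": 25.0},
    {"TransactionID": "T3", "AccountID": "A1", "Amount": 22.0},
    {"TransactionID": "T4", "AccountID": "A1", "Amount": 24.0},
    {"TransactionID": "T5", "AccountID": "A1", "Amount": 950.0},
    {"TransactionID": "T6", "AccountID": "A2", "Amount": 500.0},
    {"TransactionID": "T7", "AccountID": "A2", "Amount": 510.0},
    {"TransactionID": "T8", "AccountID": "A2", "Amount": 495.0},
]
flagged_ids = flag_amount_outliers(sample_transactions)
print(flagged_ids)
```

Scoring per account rather than globally means a 500-unit payment is not flagged for an account that routinely spends that much, while a 950-unit payment stands out for an account that usually spends about 20.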
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 5 million synthetically generated financial transactions designed to simulate real-world behavior for fraud detection research and machine learning applications. Each transaction record includes fields such as:
Transaction Details: ID, timestamp, sender/receiver accounts, amount, type (deposit, transfer, etc.)
Behavioral Features: time since last transaction, spending deviation score, velocity score, geo-anomaly score
Metadata: location, device used, payment channel, IP address, device hash
Fraud Indicators: binary fraud label (is_fraud) and type of fraud (e.g., money laundering, account takeover)
The dataset follows realistic fraud patterns and behavioral anomalies, making it suitable for:
Binary and multiclass classification models
Fraud detection systems
Time-series anomaly detection
Feature engineering and model explainability
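A behavioral feature such as "time since last transaction" can be derived from the raw transaction details. The sketch below shows one possible way to compute it per sender account; the field names (transaction_id, sender_account, timestamp) are assumed for illustration and may not match the dataset's actual column names.

```python
from datetime import datetime


def time_since_last_transaction(transactions):
    """Map each transaction ID to the seconds since the sender's previous one.

    The first transaction seen for an account maps to None.
    """
    last_seen = {}
    gaps = {}
    for tx in sorted(transactions, key=lambda t: t["timestamp"]):
        previous = last_seen.get(tx["sender_account"])
        if previous is None:
            gaps[tx["transaction_id"]] = None
        else:
            gaps[tx["transaction_id"]] = (tx["timestamp"] - previous).total_seconds()
        last_seen[tx["sender_account"]] = tx["timestamp"]
    return gaps


# Hypothetical records; real field names may differ.
sample_log = [
    {"transaction_id": "T1", "sender_account": "ACC1", "timestamp": datetime(2024, 1, 1, 9, 0)},
    {"transaction_id": "T2", "sender_account": "ACC2", "timestamp": datetime(2024, 1, 1, 9, 5)},
    {"transaction_id": "T3", "sender_account": "ACC1", "timestamp": datetime(2024, 1, 1, 9, 30)},
]
gaps = time_since_last_transaction(sample_log)
print(gaps)
```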
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is designed to help data scientists and machine learning enthusiasts develop robust fraud detection models. It contains realistic synthetic transaction data, including user information, transaction types, risk scores, and more, making it ideal for binary classification tasks with models like XGBoost and LightGBM.
| Column Name | Description |
|---|---|
| Transaction_ID | Unique identifier for each transaction |
| User_ID | Unique identifier for the user |
| Transaction_Amount | Amount of money involved in the transaction |
| Transaction_Type | Type of transaction (Online, In-Store, ATM, etc.) |
| Timestamp | Date and time of the transaction |
| Account_Balance | User's account balance before the transaction |
| Device_Type | Type of device used (Mobile, Desktop, etc.) |
| Location | Geographical location of the transaction |
| Merchant_Category | Type of merchant (Retail, Food, Travel, etc.) |
| IP_Address_Flag | Whether the IP address was flagged as suspicious (0 or 1) |
| Previous_Fraudulent_Activity | Number of past fraudulent activities by the user |
| Daily_Transaction_Count | Number of transactions made by the user that day |
| Avg_Transaction_Amount_7d | User's average transaction amount in the past 7 days |
| Failed_Transaction_Count_7d | Count of failed transactions in the past 7 days |
| Card_Type | Type of payment card used (Credit, Debit, Prepaid, etc.) |
| Card_Age | Age of the card in months |
| Transaction_Distance | Distance between the user's usual location and transaction location |
| Authentication_Method | How the user authenticated (PIN, Biometric, etc.) |
| Risk_Score | Fraud risk score computed for the transaction |
| Is_Weekend | Whether the transaction occurred on a weekend (0 or 1) |
| Fraud_Label | Target variable (0 = Not Fraud, 1 = Fraud) |
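
Several columns in the table, such as Is_Weekend and Daily_Transaction_Count, can be derived from Timestamp and User_ID. The sketch below shows one plausible derivation; it assumes ISO 8601 timestamps and counts every transaction a user makes on the same calendar day, which may differ from how the dataset's authors computed these columns.

```python
from collections import Counter
from datetime import datetime


def add_calendar_features(rows):
    """Return copies of rows with Is_Weekend and Daily_Transaction_Count added."""
    parsed = [(row, datetime.fromisoformat(row["Timestamp"])) for row in rows]
    # Count transactions per (user, calendar day).
    per_user_day = Counter((row["User_ID"], ts.date()) for row, ts in parsed)
    enriched = []
    for row, ts in parsed:
        enriched.append({
            **row,
            "Is_Weekend": 1 if ts.weekday() >= 5 else 0,  # 5 = Saturday, 6 = Sunday
            "Daily_Transaction_Count": per_user_day[(row["User_ID"], ts.date())],
        })
    return enriched


# Hypothetical rows; the timestamp format is an assumption.
sample_rows = [
    {"Transaction_ID": "TXN_1", "User_ID": "USER_1", "Timestamp": "2024-03-02 10:15:00"},
    {"Transaction_ID": "TXN_2", "User_ID": "USER_1", "Timestamp": "2024-03-02 18:40:00"},
    {"Transaction_ID": "TXN_3", "User_ID": "USER_1", "Timestamp": "2024-03-04 09:00:00"},
]
enriched_rows = add_calendar_features(sample_rows)
for row in enriched_rows:
    print(row["Transaction_ID"], row["Is_Weekend"], row["Daily_Transaction_Count"])
```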
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides a detailed, feature-rich record of synthetic banking transactions, including transaction metadata, account and merchant information, contextual behavioral features, and fraud labels. It is ideal for developing, training, and benchmarking machine learning models for fraud detection and anomaly analysis in financial services.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Below is a draft DMP-style description of your credit-card fraud detection experiment, modeled on the antiquities example:
Research Domain
This work resides in the domain of financial fraud detection and applied machine learning. We focus on detecting anomalous credit-card transactions in real time to reduce financial losses and improve trust in digital payment systems.
Purpose
The goal is to train and evaluate a binary classification model that flags potentially fraudulent transactions. By publishing both the code and data splits via FAIR repositories, we enable reproducible benchmarking of fraud-detection algorithms and support future research on anomaly detection in transaction data.
Data Sources
We used the publicly available credit-card transaction dataset from Kaggle (original source: https://www.kaggle.com/mlg-ulb/creditcardfraud), which contains anonymized transactions made by European cardholders over two days in September 2013. The dataset includes 284 807 transactions, of which 492 are fraudulent.
Method of Dataset Preparation
Schema validation: Renamed columns to snake_case (e.g. transaction_amount, is_declined) so they conform to DBRepo's requirements.
Data import: Uploaded the full CSV into DBRepo, assigned persistent identifiers (PIDs).
Splitting: Programmatically derived three subsets, training (70%), validation (15%), and test (15%), using range-based filters on the primary key actionnr. Each subset was materialized in DBRepo and assigned its own PID for precise citation.
Cleaning: Converted the categorical flags (is_declined, isforeigntransaction, ishighriskcountry, isfradulent) from "Y"/"N" to 1/0 and dropped non-feature identifiers (actionnr, merchant_id).
Modeling: Trained a RandomForest classifier on the training split, tuned on validation, and evaluated on the held-out test set.
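The splitting and cleaning steps above can be sketched in plain Python as follows. This is only an approximation under stated assumptions: the real splits were materialized in DBRepo, the helper names split_by_key and clean_row are hypothetical, and the sample rows are made up for the example.

```python
FLAG_COLUMNS = ("is_declined", "isforeigntransaction", "ishighriskcountry", "isfradulent")
ID_COLUMNS = ("actionnr", "merchant_id")


def split_by_key(rows, key="actionnr", train_pct=70, validation_pct=15):
    """Split rows into train/validation/test by contiguous ranges of the key."""
    ordered = sorted(rows, key=lambda r: r[key])
    n = len(ordered)
    train_end = n * train_pct // 100
    validation_end = train_end + n * validation_pct // 100
    return ordered[:train_end], ordered[train_end:validation_end], ordered[validation_end:]


def clean_row(row):
    """Map 'Y'/'N' flags to 1/0 and drop the non-feature identifier columns."""
    cleaned = {}
    for column, value in row.items():
        if column in ID_COLUMNS:
            continue
        cleaned[column] = {"Y": 1, "N": 0}[value] if column in FLAG_COLUMNS else value
    return cleaned


# Twenty made-up rows with the column names from the description.
raw_rows = [
    {
        "actionnr": i,
        "merchant_id": f"M{i}",
        "transaction_amount": 10.0 * i,
        "is_declined": "N",
        "isforeigntransaction": "Y" if i % 2 else "N",
        "ishighriskcountry": "N",
        "isfradulent": "Y" if i == 20 else "N",
    }
    for i in range(1, 21)
]
train_rows, validation_rows, test_rows = split_by_key(raw_rows)
train_clean = [clean_row(r) for r in train_rows]
print(len(train_rows), len(validation_rows), len(test_rows))
```

Splitting on the key before dropping it keeps the subsets reproducible from the same range filters, which is what allows each subset to be cited by its own PID.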
Dataset Structure
The raw data is a single CSV with columns:
actionnr (integer transaction ID)
merchant_id (string)
average_amount_transaction_day (float)
transaction_amount (float)
is_declined, isforeigntransaction, ishighriskcountry, isfradulent (binary flags)
total_number_of_declines_day, daily_chargeback_avg_amt, sixmonth_avg_chbk_amt, sixmonth_chbk_freq (numeric features)
Naming Conventions
All columns use lowercase snake_case.
Subsets are named creditcard_training, creditcard_validation, creditcard_test in DBRepo.
Files in the code repo follow a clear structure:
├── data/ # local copies only; raw data lives in DBRepo
├── notebooks/Task.ipynb
├── models/rf_model_v1.joblib
├── outputs/ # confusion_matrix.png, roc_curve.png, predictions.csv
├── README.md
├── requirements.txt
└── codemeta.json
Required Software
Python 3.9+
pandas, numpy (data handling)
scikit-learn (modeling, metrics)
matplotlib (visualizations)
dbrepo-client.py (DBRepo API)
requests (TU WRD API)
Additional Resources
Original dataset: https://www.kaggle.com/mlg-ulb/creditcardfraud
Scikit-learn docs: https://scikit-learn.org/stable
DBRepo API guide: via the starter notebook's dbrepo_client.py template
TU WRD REST API spec: https://test.researchdata.tuwien.ac.at/api/docs
Data Limitations
Highly imbalanced: only ~0.17% of transactions are fraudulent.
Anonymized PCA features (V1-V28) hidden; we extended with domain features but cannot reverse engineer raw variables.
Time-bounded: only covers two days of transactions, so it may not capture seasonal patterns.
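The imbalance figure above follows directly from the counts given under Data Sources (492 fraudulent out of 284 807 transactions). The short calculation below reproduces it and shows why plain accuracy is a weak metric here; the variable names are illustrative.

```python
total_transactions = 284_807
fraud_transactions = 492

# Share of fraudulent transactions in the full dataset.
fraud_rate = fraud_transactions / total_transactions
print(f"Fraud rate: {fraud_rate * 100:.2f}%")

# A model that labels every transaction as legitimate is wrong only on the
# fraudulent ones, so its accuracy equals one minus the fraud rate.
always_legit_accuracy = 1 - fraud_rate
print(f"Accuracy of always predicting 'not fraud': {always_legit_accuracy:.4f}")
```

Because a trivial classifier already scores above 99.8% accuracy, metrics such as precision, recall, or ROC-AUC are more informative for this dataset.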
Licensing and Attribution
Raw data: CC-0 (per Kaggle terms)
Code & notebooks: MIT License
Model artifacts & outputs: CC-BY 4.0
TU WRD records include ORCID identifiers for the author.
Recommended Uses
Benchmarking new fraud-detection algorithms on a standard imbalanced dataset.
Educational purposes: demonstrating model-training pipelines, FAIR data practices.
Extension: adding time-series or deep-learning models.
Known Issues
Possible temporal leakage if date/time features are not handled correctly.
Model performance may degrade on live data due to concept drift.
Binary flags may oversimplify nuanced transaction outcomes.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed, labeled records of simulated financial transactions, including account, customer, merchant, and contextual data, with explicit fraud indicators and scenario types. It enables robust development, benchmarking, and evaluation of fraud detection models for banking and fintech applications, supporting both supervised and unsupervised analytics.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset simulates realistic financial transaction patterns and was generated with Python code for the purpose of developing and testing fraud detection models. It mimics a wide range of transactional scenarios across multiple categories, including retail, grocery, dining, travel, and more, making it ideal for exploring patterns that distinguish legitimate transactions from fraudulent ones.
Financial fraud is an increasingly prevalent issue, with organizations constantly seeking advanced solutions to detect and prevent suspicious activity. This dataset was inspired by real-world transaction data but was generated synthetically to avoid privacy concerns. It includes key features that play a critical role in fraud detection, such as transaction amounts, device types, geographic locations, currency, card type, and a "fraud" label indicating whether a transaction is suspicious.
Comprehensive Transaction Categories: Transactions span categories like retail (online and in-store), groceries, restaurants (fast food to premium), entertainment (streaming, gaming, events), healthcare, education, gas, and travel.
Geographic and Demographic Variety: The dataset includes diverse geographic data (countries, cities) and currency types, allowing for analysis on a global scale with varying risk profiles.
Detailed Customer Profiles: Each transaction is linked to a customer profile that includes characteristics like account age, preferred devices, typical spending range, and fraud-protection features.
Feature-Rich Data for ML and Fraud Analysis: Features like transaction velocity, merchant risk, card presence, and device fingerprints provide an enriched environment for machine learning models to detect anomalies and suspicious patterns.
Use Cases:
This dataset is designed for data scientists, analysts, and machine learning practitioners interested in: Building and training fraud detection models. Exploring financial transaction patterns and consumer behaviors. Developing and testing machine learning algorithms for anomaly detection. With this dataset, users can delve into advanced topics like feature engineering, model evaluation, and performance optimization, especially relevant to fraud detection applications in finance and e-commerce.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed, interconnected banking transaction records, capturing sender and receiver relationships, transaction metadata, and anomaly flags. Designed for network analytics, it enables advanced anti-money laundering (AML) detection, fraud analysis, and financial behavior modeling by representing transactions as a directed graph. The flat structure ensures easy integration with machine learning and graph analytics tools.
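To make the directed-graph view concrete, the sketch below turns flat transaction rows into an adjacency map and counts in- and out-degrees, a common first step in network analytics. The field names (sender, receiver, amount) and the sample rows are assumptions; the dataset's real column names may differ.

```python
from collections import defaultdict


def build_transaction_graph(transactions):
    """Return {sender: {receiver: total_amount}} built from flat rows."""
    graph = defaultdict(lambda: defaultdict(float))
    for tx in transactions:
        graph[tx["sender"]][tx["receiver"]] += tx["amount"]
    return graph


def degree_counts(graph):
    """Return (out_degree, in_degree) counting distinct counterparties."""
    out_degree = {node: len(receivers) for node, receivers in graph.items()}
    in_degree = defaultdict(int)
    for receivers in graph.values():
        for receiver in receivers:
            in_degree[receiver] += 1
    return out_degree, dict(in_degree)


# Hypothetical flat records, one row per transaction.
edges = [
    {"sender": "A", "receiver": "B", "amount": 100.0},
    {"sender": "A", "receiver": "B", "amount": 50.0},
    {"sender": "A", "receiver": "C", "amount": 20.0},
    {"sender": "B", "receiver": "C", "amount": 70.0},
    {"sender": "D", "receiver": "C", "amount": 5.0},
]
graph = build_transaction_graph(edges)
out_degree, in_degree = degree_counts(graph)
print(out_degree, in_degree)
```

An account that receives funds from many distinct senders (high in-degree) is one simple signal that could be examined further in an AML analysis.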
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed, labeled records of financial transactions, including transaction amounts, types, geolocation, merchant details, and fraud indicators. Designed for robust fraud detection model development and benchmarking, it supports advanced analytics and machine learning in banking and payment processing. The inclusion of comprehensive transaction attributes and fraud labels makes it ideal for supervised learning and anomaly detection research.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed records of financial transactions, enriched with fraud detection indicators, device and location metadata, and merchant information. It is designed to help financial institutions identify and analyze fraudulent activities, supporting both real-time monitoring and historical pattern analysis for risk mitigation and compliance.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed, labeled records of simulated credit card transactions, including transaction amounts, merchant and cardholder information, and fraud indicators. It is ideal for developing and benchmarking machine learning models aimed at detecting fraudulent activity and reducing financial risk in payment systems. The inclusion of transaction context and cardholder demographics supports advanced analytics and feature engineering.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed credit card transaction records enriched with fraud suspicion flags, risk scores, and contextual information such as merchant, location, and transaction method. It is ideal for developing, training, and evaluating fraud detection models, as well as for analyzing transaction patterns and identifying emerging fraud tactics in the financial sector.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is a fabricated dataset made by merging two datasets, Dataset1.csv and Dataset2.csv.
The final dataset, merged_dataset.csv, is synthetic and uses probabilistic imputation to handle missing values.
Balancing the Dataset: The dataset, which was initially imbalanced, was balanced using the ROSE (Random Over-Sampling Examples) package to ensure equal representation of fraudulent and non-fraudulent transactions.
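For readers who want to see what class balancing does in principle, here is a minimal sketch of plain random oversampling, which duplicates minority-class rows until the classes are equal in size. It is not ROSE itself (ROSE is an R package with its own resampling scheme); the label name is_fraud, the seed, and the sample rows are assumptions for the example.

```python
import random


def random_oversample(rows, label_key="is_fraud", seed=0):
    """Duplicate minority-class rows at random until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Sample with replacement to fill the gap up to the largest class.
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced data: 8 legitimate rows and 2 fraudulent ones.
transactions = (
    [{"amount": 10.0 + i, "is_fraud": 0} for i in range(8)]
    + [{"amount": 900.0 + i, "is_fraud": 1} for i in range(2)]
)
balanced = random_oversample(transactions)
print(len(balanced), sum(row["is_fraud"] for row in balanced))
```

When a dataset has been balanced this way, evaluation results are easier to interpret if the test split is kept at its original class ratio.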
This dataset was used for my group's school project report. The code for this project is available at https://github.com/slothislazy/DM_AOL
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Credit_Card_Transactions Dataset is a representative sample dataset for building fraud detection models. It includes anonymized real-world transaction data such as financial transaction type, amount, sender/receiver account balance, and fraud indicators.
2) Data Utilization
(1) Characteristics of the Credit_Card_Transactions Dataset:
• The dataset provides individual transaction records on a row-by-row basis and reflects the real-world class imbalance problem: the percentage of fraudulent transactions (isFraud=1) is extremely low.
• It is an unprocessed raw data structure, so key variables such as transaction time, amount, and account change can be used directly.
(2) Uses of the Credit_Card_Transactions Dataset:
• Binary classification modeling: Fraud detection models can be developed by applying imbalance-handling techniques such as SMOTE and undersampling, together with appropriate evaluation metrics such as F1-score and ROC-AUC.
• Real-time anomaly detection: It can be used to build a real-time anomaly detection system through analysis of transaction patterns (amount, frequency, account change).
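The F1-score mentioned above can be computed directly from predicted and true isFraud labels. The sketch below uses made-up labels (95 legitimate, 5 fraudulent) purely to illustrate how precision, recall, and F1 behave on imbalanced data; the function name is hypothetical.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up labels: 95 legitimate rows followed by 5 fraudulent ones.
y_true = [0] * 95 + [1] * 5
# The hypothetical model raises 2 false alarms and catches 3 of the 5 frauds.
y_pred = [0] * 93 + [1] * 2 + [1] * 3 + [0] * 2

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
print(precision, recall, f1, accuracy)
```

In this toy case accuracy is 0.96 while precision, recall, and F1 are all 0.6, which shows why accuracy alone can overstate performance when fraud is rare.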
https://dataintelo.com/privacy-and-policy
As per our latest research, the global Charging Session Fraud Anomaly Detection market size reached USD 685.9 million in 2024, driven by the rapid expansion of electric vehicle (EV) infrastructure and the increasing sophistication of cyber threats targeting charging networks. The market is poised to grow at a robust CAGR of 19.8% from 2025 to 2033, with the market size expected to reach USD 2.97 billion by 2033. This impressive growth is fueled by rising investments in EV charging ecosystems, growing regulatory scrutiny around energy transactions, and the critical need for robust fraud prevention mechanisms in an increasingly digitalized mobility landscape.
One of the primary growth factors for the Charging Session Fraud Anomaly Detection market is the exponential rise in EV adoption globally. As governments and private enterprises invest heavily in expanding EV charging infrastructure, the volume of transactions occurring at charging stations is surging. This increase in transaction volume inherently raises the risk of fraudulent activities, such as unauthorized access, billing manipulations, and identity theft. The need for advanced anomaly detection solutions becomes paramount, as stakeholders seek to protect revenue streams, maintain customer trust, and ensure the seamless operation of charging networks. Furthermore, the growing integration of smart grid technologies and IoT devices into EV charging infrastructure has expanded the attack surface, necessitating more sophisticated fraud detection tools that leverage artificial intelligence and machine learning to identify and mitigate evolving threats in real time.
Another significant driver for market growth is the tightening regulatory environment surrounding energy transactions and data privacy. With the proliferation of digital payment methods and the transmission of sensitive user data through charging platforms, regulators across regions such as North America, Europe, and Asia Pacific are enacting stringent guidelines to ensure transparency, security, and accountability. Compliance with standards such as the General Data Protection Regulation (GDPR) and the Payment Card Industry Data Security Standard (PCI DSS) has become a non-negotiable requirement for charging operators and service providers. As a result, organizations are increasingly investing in fraud anomaly detection solutions that not only safeguard against illicit activities but also provide comprehensive audit trails and reporting capabilities to meet regulatory mandates. This compliance-driven adoption is further accelerating the penetration of advanced fraud detection technologies across the EV charging value chain.
Technological advancements are also playing a pivotal role in shaping the Charging Session Fraud Anomaly Detection market. The integration of artificial intelligence, machine learning, and big data analytics is enabling the development of highly adaptive and predictive fraud detection systems. These systems can analyze vast datasets in real time, identify subtle patterns indicative of fraudulent behavior, and trigger automated responses to mitigate risks. Additionally, the emergence of cloud-based deployment models has democratized access to sophisticated fraud detection tools, allowing even small and medium-sized charging operators to implement robust security measures without significant upfront investments. The convergence of these technological trends is fostering innovation and driving the adoption of anomaly detection solutions across diverse segments of the EV charging ecosystem.
From a regional perspective, North America and Europe are currently leading the Charging Session Fraud Anomaly Detection market, owing to their advanced EV infrastructure, proactive regulatory frameworks, and high levels of digitalization. However, the Asia Pacific region is rapidly emerging as a key growth engine, supported by aggressive government initiatives to promote electric mobility, burgeoning urban populations, and the proliferation of smart city projects. As the competitive landscape intensifies and the threat landscape evolves, stakeholders across all regions are prioritizing investments in fraud anomaly detection technologies to safeguard their assets and ensure the long-term sustainability of their operations.
The Charging Session Fraud Anomaly Detection market is segmented by component into Software, Hardware, and
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains simulated credit card transaction records, including detailed information on transaction amounts, merchant details, geolocation, device usage, and fraud labels. It is designed for training and evaluating fraud detection models, supporting the identification of both typical and anomalous transaction patterns. The dataset is ideal for fintech AI development, security analytics, and research into payment fraud behaviors.
https://www.technavio.com/content/privacy-notice
Anomaly Detection Market Size 2025-2029
The anomaly detection market size is valued to increase by USD 4.44 billion, at a CAGR of 14.4% from 2024 to 2029. Anomaly detection tools gaining traction in BFSI will drive the anomaly detection market.
Major Market Trends & Insights
North America dominated the market and accounted for a 43% growth during the forecast period.
By Deployment - Cloud segment was valued at USD 1.75 billion in 2023
By Component - Solution segment accounted for the largest market revenue share in 2023
Market Size & Forecast
Market Opportunities: USD 173.26 million
Market Future Opportunities: USD 4441.70 million
CAGR from 2024 to 2029 : 14.4%
Market Summary
Anomaly detection, a critical component of advanced analytics, is witnessing significant adoption across various industries, with the financial services sector leading the charge. The increasing incidence of internal threats and cybersecurity frauds necessitates the need for robust anomaly detection solutions. These tools help organizations identify unusual patterns and deviations from normal behavior, enabling proactive response to potential threats and ensuring operational efficiency. For instance, in a supply chain context, anomaly detection can help identify discrepancies in inventory levels or delivery schedules, leading to cost savings and improved customer satisfaction. In the realm of compliance, anomaly detection can assist in maintaining regulatory adherence by flagging unusual transactions or activities, thereby reducing the risk of penalties and reputational damage.
According to recent research, organizations that implement anomaly detection solutions experience a reduction in error rates by up to 25%. This improvement not only enhances operational efficiency but also contributes to increased customer trust and satisfaction. Despite these benefits, challenges persist, including data quality and the need for real-time processing capabilities. As the market continues to evolve, advancements in machine learning and artificial intelligence are expected to address these challenges and drive further growth.
What will be the Size of the Anomaly Detection Market during the forecast period?
How is the Anomaly Detection Market Segmented?
The anomaly detection industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Deployment: Cloud, On-premises
Component: Solution, Services
End-user: BFSI, IT and telecom, Retail and e-commerce, Manufacturing, Others
Technology: Big data analytics, AI and ML, Data mining and business intelligence
Geography: North America (US, Canada, Mexico); Europe (France, Germany, Spain, UK); APAC (China, India, Japan); Rest of World (ROW)
By Deployment Insights
The cloud segment is estimated to witness significant growth during the forecast period.
The market is witnessing significant growth, driven by the increasing adoption of advanced technologies such as machine learning algorithms, predictive modeling tools, and real-time monitoring systems. Businesses are increasingly relying on anomaly detection solutions to enhance their root cause analysis, improve system health indicators, and reduce false positives. This is particularly true in sectors where data is generated in real-time, such as cybersecurity threat detection, network intrusion detection, and fraud detection systems. Cloud-based anomaly detection solutions are gaining popularity due to their flexibility, scalability, and cost-effectiveness.
This growth is attributed to cloud-based solutions' quick deployment, real-time data visibility, and customization capabilities, which are offered at flexible payment options like monthly subscriptions and pay-as-you-go models. Companies like Anodot, Ltd, Cisco Systems Inc, IBM Corp, and SAS Institute Inc provide both cloud-based and on-premise anomaly detection solutions. Anomaly detection methods include outlier detection, change point detection, and statistical process control. Data preprocessing steps, such as data mining techniques and feature engineering processes, are crucial in ensuring accurate anomaly detection. Data visualization dashboards and alert fatigue mitigation techniques help in managing and interpreting the vast amounts of data generated.
Network traffic analysis, log file analysis, and sensor data integration are essential components of anomaly detection systems. Additionally, risk management frameworks, drift detection algorithms, time series forecasting, and performance degradation detection are vital in maintaining system performance and capacity planning.
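Of the methods named above, statistical process control is the simplest to illustrate. The sketch below derives three-sigma control limits from a baseline window and reports which new readings fall outside them. The baseline values, the three-sigma choice, and the function names are assumptions for the example, not figures from the report.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lower, upper) control limits from a baseline sample."""
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation
    return center - sigmas * spread, center + sigmas * spread


def out_of_control(values, limits):
    """Return the indices of values that fall outside the control limits."""
    lower, upper = limits
    return [i for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical metric readings, e.g. transactions per minute.
baseline_readings = [100, 102, 98, 101, 99, 100, 103, 97]
limits = control_limits(baseline_readings)
new_readings = [101, 99, 120, 100, 80]
alerts = out_of_control(new_readings, limits)
print(limits, alerts)
```

Computing the limits from a separate baseline window, rather than from the data being checked, keeps a large anomaly from widening the limits that are meant to catch it.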
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Credit Card Transactions Dataset includes more than 20 million credit card transactions, spanning decades, for 2,000 U.S. resident consumers. The data was generated by IBM's simulations and provides the details of each transaction along with fraud labels.
2) Data Utilization
(1) Characteristics of the Credit Card Transactions Dataset:
• The dataset provides a variety of properties similar to those of real credit card transactions, including transaction amount, time, card information, purchase location, and store category (MCC).
(2) Uses of the Credit Card Transactions Dataset:
• Development of credit card fraud detection models: Using transaction history and properties, you can build a machine-learning-based fraud (abnormal transaction) detection model.
• Analysis of consumption patterns and risks: Long-term and diverse transaction data can be used to analyze customer consumption behavior and identify risk factors.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Title: Credit Card Transactions Dataset for Fraud Detection (Used in: A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning)
Description:
This dataset, commonly known as creditcard.csv, contains anonymized credit card transactions made by European cardholders in September 2013. It includes 284,807 transactions, with 492 labeled as fraudulent. Due to confidentiality constraints, features have been transformed using PCA, except for 'Time' and 'Amount'.
This dataset was used in the research article titled "A Hybrid Anomaly Detection Framework Combining Supervised and Unsupervised Learning for Credit Card Fraud Detection". The study proposes an ensemble model integrating techniques such as Autoencoders, Isolation Forest, Local Outlier Factor, and supervised classifiers including XGBoost and Random Forest, aiming to improve the detection of rare fraudulent patterns while maintaining efficiency and scalability.
Key Features:
30 numerical input features (V1-V28, Time, Amount)
Class label indicating fraud (1) or normal (0)
Imbalanced class distribution typical in real-world fraud detection
Use Case:
Ideal for benchmarking and evaluating anomaly detection and classification algorithms in highly imbalanced data scenarios.
Source:
Originally published by the Machine Learning Group at Université Libre de Bruxelles.
https://www.kaggle.com/mlg-ulb/creditcardfraud
License:
This dataset is distributed for academic and research purposes only. Please cite the original source when using the dataset.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed synthetic microtransaction logs, capturing both legitimate and fraudulent activities with realistic attack scenarios. It includes user, transaction, device, location, and fraud scenario fields, making it ideal for developing, benchmarking, and validating fraud detection models in financial microtransaction environments.