Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a detailed look into transactional behavior and financial activity patterns, ideal for exploring fraud detection and anomaly identification. It contains 2,512 samples of transaction data, covering various transaction attributes, customer demographics, and usage patterns. Each entry offers comprehensive insights into transaction behavior, enabling analysis for financial security and fraud detection applications.
Key Features:
This dataset is ideal for data scientists, financial analysts, and researchers looking to analyze transactional patterns, detect fraud, and build predictive models for financial security applications. The dataset was designed for machine learning and pattern analysis tasks and is not intended as a primary data source for academic publications.
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by AnjaliGupta
Released under CC0: Public Domain
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
To identify online payment fraud with machine learning, we need to train a model that classifies payments as fraudulent or non-fraudulent. For this, we need a dataset containing information about online payment fraud, so that we can understand what types of transactions lead to fraud. For this task, I collected a dataset from Kaggle containing historical records of fraudulent transactions, which can be used to detect fraud in online payments. Below are all the columns from the dataset I'm using here:
* step: represents a unit of time, where 1 step equals 1 hour
* type: type of online transaction
* amount: the amount of the transaction
* nameOrig: customer starting the transaction
* oldbalanceOrg: sender's balance before the transaction
* newbalanceOrig: sender's balance after the transaction
* nameDest: recipient of the transaction
* oldbalanceDest: recipient's balance before the transaction
* newbalanceDest: recipient's balance after the transaction
* isFraud: whether the transaction is fraudulent
I hope you now know about the data I am using for the online payment fraud detection task. Now in the section below, I’ll explain how we can use machine learning to detect online payment fraud using Python.
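The workflow described above can be sketched as follows. This is a minimal illustration, assuming pandas and scikit-learn are available; the tiny inline frame stands in for the Kaggle CSV, and the decision tree is just one reasonable model choice, not a prescribed one.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# In practice you would load the Kaggle CSV, e.g.:
#   data = pd.read_csv("onlinefraud.csv")
# A tiny hand-made frame stands in here so the sketch runs on its own.
data = pd.DataFrame({
    "type": ["PAYMENT", "TRANSFER", "CASH_OUT", "TRANSFER", "PAYMENT", "CASH_OUT"],
    "amount": [9839.64, 181.0, 181.0, 215310.3, 7107.77, 311685.89],
    "oldbalanceOrg": [170136.0, 181.0, 181.0, 705.0, 183195.0, 10835.0],
    "newbalanceOrig": [160296.36, 0.0, 0.0, 0.0, 176087.23, 0.0],
    "isFraud": [0, 1, 1, 1, 0, 0],
})

# The transaction type is categorical, so map it to integer codes
# (the code assignment is arbitrary, chosen here for illustration).
data["type"] = data["type"].map(
    {"CASH_OUT": 1, "PAYMENT": 2, "CASH_IN": 3, "TRANSFER": 4, "DEBIT": 5}
)

features = data[["type", "amount", "oldbalanceOrg", "newbalanceOrig"]]
labels = data["isFraud"]

model = DecisionTreeClassifier(random_state=0)
model.fit(features, labels)

# Score a new transaction: [type, amount, oldbalanceOrg, newbalanceOrig]
prediction = model.predict([[4, 9000.60, 9000.60, 0.0]])
```

On the real dataset you would also split into train/test sets and evaluate on held-out data rather than scoring on the training frame.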
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Mohamed NIANG
Released under CC0: Public Domain
This dataset is designed to support the creation and detection of fake reviews for online products. It comprises a collection of 40,000 product reviews, equally split between 20,000 authentic, human-generated reviews and 20,000 computer-generated fake reviews. The dataset includes information on review content, categorisation, and associated ratings, making it a valuable resource for developing and testing review integrity solutions within e-commerce and other online platforms.
The dataset contains 40,412 unique entries in total, roughly balanced between fake and real product reviews (about 20,000 each). Data is provided in CSV file format.
The distribution of ratings is as follows:
* 1.00 - 1.20: 2,155 entries
* 2.00 - 2.20: 1,967 entries
* 3.00 - 3.20: 3,786 entries
* 4.00 - 4.20: 7,965 entries
* 4.80 - 5.00: 24,559 entries
The dataset categorisation includes:
* Kindle_Store_5: 12%
* Books_5: 11%
* Other: 77% (31,332 entries)
This dataset is ideal for training machine learning models to identify and flag fraudulent or computer-generated product reviews. It can be utilised for:
* Developing Natural Language Processing (NLP) models for sentiment analysis and text classification.
* Building AI & Machine Learning solutions for fraud detection in online marketplaces.
* Researching the characteristics and patterns of authentic versus fabricated consumer feedback.
* Enhancing the trustworthiness and reliability of online review systems.
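A text-classification baseline of the kind described above could be sketched like this, assuming scikit-learn. The toy reviews, the labels (1 = computer-generated, 0 = human-written), and the choice of TF-IDF plus logistic regression are all illustrative assumptions, not details taken from the dataset itself.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the dataset's review text.
reviews = [
    "Great product, works exactly as described and arrived quickly.",
    "I love this item it is the best item I have ever bought best item.",
    "The battery life is shorter than advertised but overall decent.",
    "Amazing amazing amazing product best purchase amazing quality.",
]
labels = [0, 1, 0, 1]  # 1 = computer-generated, 0 = human-written (assumed labels)

# TF-IDF features feeding a linear classifier -- a common baseline for
# detecting machine-generated text; the model choice is not prescribed.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(reviews, labels)

pred = clf.predict(["best item best item amazing amazing"])
```

With the full 40,000-review corpus, a held-out test split and stronger models (e.g., gradient boosting or transformer encoders) would be the natural next steps.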
The dataset has global coverage, making it applicable for systems and research worldwide. While specific time ranges for the reviews themselves are not explicitly detailed, the data's utility is broad across various product categories and review contexts within e-commerce.
CC-BY
This dataset is suitable for:
* Data Scientists and Machine Learning Engineers: to develop and fine-tune models for fake review detection and NLP tasks.
* Researchers: studying consumer behaviour, online trust, and adversarial attacks in digital platforms.
* E-commerce Businesses: to implement internal systems for maintaining review authenticity and improving customer trust.
* Academics and Students: for educational purposes, projects, and academic studies in AI, NLP, and data science.
Original Data Source: 🚨 Fake Reviews Dataset
This demonstration used a fraud detection dataset and kernel, referenced below, to showcase the accuracy and safety of using the products of the Kymera fabrication machine.
The original dataset we used is the Synthetic Financial Datasets For Fraud Detection. This file accurately mimics the original dataset's features while in fact generating the entire dataset from scratch.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
In 2000, Enron was one of the largest companies in the United States. By 2002, it had collapsed into bankruptcy due to widespread corporate fraud. The data has been made public and presents a diverse set of email information ranging from internal, marketing emails to spam and fraud attempts.
In the early 2000s, Leslie Kaelbling at MIT purchased the dataset and noted that, although it contained scam emails, it also had several integrity problems. The dataset was later updated, but it remains essential to ensure privacy in the data when it is used to train a deep neural network model.
Though the Enron Email Dataset contains over 500K emails, one problem with it is the scarcity of labeled fraud examples. Label annotation is needed to detect the full umbrella of fraud emails accurately. Since fraud emails fall into several types, such as phishing, financial, romance, subscription, and Nigerian Prince scams, multiple heuristics are required to label all types of fraudulent emails effectively.
To tackle this problem, heuristics have been used to label the Enron data corpus using email signals, and automated labeling has been performed using simple ML models on other smaller email datasets available online. These fraud annotation techniques are discussed in detail below.
To perform fraud annotation on the Enron dataset, as well as to provide more fraud examples for modeling, two additional fraud data sources have been used:
* Phishing Email Dataset: https://www.kaggle.com/dsv/6090437
* Social Engineering Dataset: http://aclweb.org/aclwiki
To label the Enron email dataset, two signals are used to filter suspicious emails and sort them into fraud and non-fraud classes:
* Automated ML labeling
* Email signals
The following heuristics are used to annotate labels for Enron email data using the other two data sources,
Phishing Model Annotation: a high-precision SVM model trained on the Phishing Email dataset, used to annotate the Phishing label on the Enron dataset.
Social Engineering Model Annotation: a high-precision SVM model trained on the Social Engineering dataset, used to annotate the Social Engineering label on the Enron dataset.
The two ML Annotator models use Term Frequency Inverse Document Frequency (TF-IDF) to embed the input text and make use of SVM models with Gaussian Kernel.
If either model predicted that an email was fraudulent, the email's metadata was checked for several email signals. If these heuristics met the requirements of a high-probability fraud email, it was labeled as a fraud email.
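The annotator design described above (TF-IDF embedding feeding an SVM with a Gaussian kernel) can be sketched with scikit-learn. The tiny corpus below is invented for illustration; in the actual pipeline the model would be trained on the external Phishing or Social Engineering dataset and then applied to Enron emails.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Tiny stand-in training corpus; 1 = phishing, 0 = legitimate.
emails = [
    "Your account has been suspended, verify your password at this link now",
    "Urgent: confirm your bank details to release your pending transfer",
    "Attached are the meeting notes from Tuesday's budget review",
    "Can you forward the Q3 revenue spreadsheet before the call?",
]
labels = [1, 1, 0, 0]

# TF-IDF embedding + SVM with a Gaussian (RBF) kernel, matching the
# annotator design described in the text.
annotator = make_pipeline(TfidfVectorizer(), SVC(kernel="rbf", probability=True))
annotator.fit(emails, labels)

# Annotate an unseen Enron-style email with a phishing probability.
proba = annotator.predict_proba(
    ["Please verify your account password immediately via this link"]
)[0][1]
```

A high-precision setup like the one described would additionally tune the decision threshold on a validation set so that only confident predictions feed the labeling step.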
Email signal-based heuristics are used to filter and specifically target suspicious emails for fraud labeling. The signals used were:
Person Of Interest: There is a publicly available list of email addresses of employees who were liable for the massive data leak at Enron. These user mailboxes have a higher chance of containing quality fraud emails.
Suspicious Folders: The Enron data is dumped into several folders for every employee, such as inbox, deleted_items, junk, and calendar. Folders with a higher chance of containing fraud emails, such as deleted_items and junk, were targeted.
Sender Type: The sender type was categorized as ‘Internal’ and ‘External’ based on their email address.
Low Communication: A threshold of 4 emails, based on the table below, was used to define low communication. A user qualifies as a low-comm sender if they sent fewer emails than this threshold. Mails from low-comm senders were assigned a high probability of being fraud.
Contains Replies and Forwards: If an email contains forwards or replies, it was assigned a low probability of being a fraud email.
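One hypothetical way to combine the signals above into a single filter is a simple score, as sketched below. The weights, the score cutoff, and the field names are all illustrative assumptions; the text does not specify how the authors combined the signals.

```python
# Assumed constants: the threshold comes from the text; the folder set and
# score cutoff are illustrative choices, not the authors' exact values.
LOW_COMM_THRESHOLD = 4
SUSPICIOUS_FOLDERS = {"deleted_items", "junk"}

def is_high_probability_fraud(email):
    """Return True when the signal heuristics mark an email as likely fraud."""
    score = 0
    if email["folder"] in SUSPICIOUS_FOLDERS:
        score += 1                      # suspicious folder signal
    if email["sender_type"] == "External":
        score += 1                      # external senders are more suspect
    if email["sender_email_count"] < LOW_COMM_THRESHOLD:
        score += 1                      # low-communication sender
    if email["in_poi_mailbox"]:
        score += 1                      # mailbox of a person of interest
    if email["has_replies_or_forwards"]:
        score -= 1                      # replies/forwards lower the probability
    return score >= 3

suspect = {
    "folder": "junk",
    "sender_type": "External",
    "sender_email_count": 1,
    "has_replies_or_forwards": False,
    "in_poi_mailbox": True,
}
```

An email like `suspect` above trips four of the positive signals and would be passed on for fraud labeling.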
To ensure high-quality labels, the mismatch examples from ML Annotation have been manually inspected for Enron dataset relabeling.
| Fraud | Non-Fraud |
|-------|-----------|
| 2,327 | 445,090   |
* Enron Email Dataset. URL: https://www.cs.cmu.edu/~enron/. Publisher: MIT, CMU. Authors: Leslie Kaelbling, William W. Cohen. Year: 2015.
* Phishing Email Detection. URL: https://www.kaggle.com/dsv/6090437. DOI: 10.34740/KAGGLE/DSV/6090437. Publisher: Kaggle. Author: Subhadeep Chakraborty. Year: 2023.
* CLAIR Collection of Fraud Email. URL: http://aclweb.org/aclwiki. Author: Radev, D. Year: 2008.
This dataset was created by Arturo Garcia
GPL v2.0: http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
The purpose of this dataset is to find fraudulent credit cards by analyzing their features. Detection of fraudulent credit cards can be done using ML or DL.
The data was collected from the Weka Repository: https://weka.8497.n7.nabble.com/file/n23121/credit_fruad.arff
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This synthetic dataset was specifically designed to support machine learning research and development in counterfeit product detection and anti-fraud systems. The dataset mimics real-world patterns found in e-commerce platforms while containing no actual sensitive or proprietary information, making it ideal for educational purposes, algorithm development, and public research.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Overview:
* Total Records: 749
* Original Records: 700
* Duplicate Records: 49 (7% of total)
* File Name: synthetic_claims_with_duplicates.csv

Key Features:
* Claim Information: unique claim IDs (CLAIM000001 to CLAIM000700), employee IDs (EMP0001 to EMP0700), realistic employee names
* Financial Data: amounts ranging from 100.00 to 20,000.00; service codes SVC001, SVC002, SVC003, SVC004; departments: Finance, HR, IT, Marketing, Operations
* Transaction Details: dates within the last 2 years, timestamps for submission, statuses (Submitted, Approved, Paid), random UUIDs for submitter IDs
* Fraud Detection: 49 exact duplicates (7%), randomly distributed throughout the dataset, with a boolean is_duplicate flag for identification

Purpose: The dataset is designed to test fraud detection systems, particularly for identifying duplicate transactions. It simulates real-world scenarios where duplicate entries might occur due to fraud or data entry errors.
Usage:
* Testing duplicate transaction detection
* Training fraud detection models
* Data validation and cleaning
* Algorithm benchmarking

The dataset is now ready for analysis in your fraud detection system.
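Exact-duplicate detection of the kind this dataset targets can be sketched with pandas. The small frame below is a made-up stand-in for synthetic_claims_with_duplicates.csv, using column names inferred from the overview; note that `duplicated(keep=False)` flags both copies of a repeated row, whereas the dataset's own is_duplicate flag may mark only the injected copy.

```python
import pandas as pd

# Stand-in for synthetic_claims_with_duplicates.csv; values are made up.
claims = pd.DataFrame({
    "claim_id": ["CLAIM000001", "CLAIM000002", "CLAIM000002", "CLAIM000003"],
    "employee_id": ["EMP0001", "EMP0002", "EMP0002", "EMP0003"],
    "amount": [1500.00, 250.75, 250.75, 19999.99],
    "service_code": ["SVC001", "SVC003", "SVC003", "SVC002"],
})

# Flag every row that exactly repeats another row across all columns --
# the same kind of duplicate the dataset injects.
claims["flagged_duplicate"] = claims.duplicated(keep=False)

duplicates = claims[claims["flagged_duplicate"]]
```

On the real file, comparing `flagged_duplicate` against the provided is_duplicate column gives a quick sanity check of a detector's recall.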
This dataset is a sample from the TalkingData AdTracking competition. I kept all the positive examples (where is_attributed == 1), while discarding 99% of the negative samples. The sample has roughly 20% positive examples.
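The sampling step described above (keep all positives, keep 1% of negatives) can be sketched with pandas. The toy clicks frame is invented for the sketch; on the real competition data you would read the full training CSV instead.

```python
import pandas as pd

# Toy clicks frame standing in for the full TalkingData training data:
# 10 positive clicks and 1,000 negatives.
clicks = pd.DataFrame({
    "ip": range(1010),
    "is_attributed": [1] * 10 + [0] * 1000,
})

# Keep every positive example and a random 1% of the negatives,
# then shuffle the combined sample.
positives = clicks[clicks["is_attributed"] == 1]
negatives = clicks[clicks["is_attributed"] == 0].sample(frac=0.01, random_state=0)
sample = pd.concat([positives, negatives]).sample(frac=1, random_state=0)

positive_share = sample["is_attributed"].mean()
```

Because the real data is far more imbalanced than this toy frame, the same procedure there yields a sample that is only roughly 20% positive rather than balanced.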
For this competition, your objective was to predict whether a user will download an app after clicking a mobile app advertisement.
train_sample.csv
- Sampled data
Each row of the training data contains a click record, with the following features.
* ip: IP address of the click
* app: app id for marketing
* device: device type id of the user's mobile phone (e.g., iphone 6 plus, iphone 7, huawei mate 7, etc.)
* os: os version id of the user's mobile phone
* channel: channel id of the mobile ad publisher
* click_time: timestamp of the click (UTC)
* attributed_time: if the user downloaded the app after clicking an ad, the time of the download
* is_attributed: the target to be predicted, indicating whether the app was downloaded

Note that ip, app, device, os, and channel are encoded.
I'm also including Parquet files with various features for use within the course.