44 datasets found

Bank Transaction Dataset for Fraud Detection
kaggle.com
Updated Nov 4, 2024
Cite
vala khorasani (2024). Bank Transaction Dataset for Fraud Detection [Dataset]. https://www.kaggle.com/datasets/valakhorasani/bank-transaction-dataset-for-fraud-detection
Dataset updated
Nov 4, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
vala khorasani
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
This dataset provides a detailed look into transactional behavior and financial activity patterns, ideal for exploring fraud detection and anomaly identification. It contains 2,512 samples of transaction data, covering various transaction attributes, customer demographics, and usage patterns. Each entry offers comprehensive insights into transaction behavior, enabling analysis for financial security and fraud detection applications.

Key Features:

TransactionID: Unique alphanumeric identifier for each transaction.

AccountID: Unique identifier for each account, with multiple transactions per account.

TransactionAmount: Monetary value of each transaction, ranging from small everyday expenses to larger purchases.

TransactionDate: Timestamp of each transaction, capturing date and time.

TransactionType: Categorical field indicating 'Credit' or 'Debit' transactions.

Location: Geographic location of the transaction, represented by U.S. city names.

DeviceID: Alphanumeric identifier for devices used to perform the transaction.

IP Address: IPv4 address associated with the transaction, with occasional changes for some accounts.

MerchantID: Unique identifier for merchants, showing preferred and outlier merchants for each account.

AccountBalance: Balance in the account post-transaction, with logical correlations based on transaction type and amount.

PreviousTransactionDate: Timestamp of the last transaction for the account, aiding in calculating transaction frequency.

Channel: Channel through which the transaction was performed (e.g., Online, ATM, Branch).

CustomerAge: Age of the account holder, with logical groupings based on occupation.

CustomerOccupation: Occupation of the account holder (e.g., Doctor, Engineer, Student, Retired), reflecting income patterns.

TransactionDuration: Duration of the transaction in seconds, varying by transaction type.

LoginAttempts: Number of login attempts before the transaction, with higher values indicating potential anomalies.

This dataset is ideal for data scientists, financial analysts, and researchers looking to analyze transactional patterns, detect fraud, and build predictive models for financial security applications. The dataset was designed for machine learning and pattern analysis tasks and is not intended as a primary data source for academic publications.
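The PreviousTransactionDate and LoginAttempts fields described above can be turned into simple anomaly signals. The following is a minimal sketch, not the dataset author's method: the sample rows, the timestamp format, and the login-attempt threshold are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical rows using the column names listed above; values are invented.
rows = [
    {"TransactionID": "TX000001", "TransactionDate": "2023-04-11 16:29:14",
     "PreviousTransactionDate": "2023-04-10 16:29:14", "LoginAttempts": 1},
    {"TransactionID": "TX000002", "TransactionDate": "2023-06-27 16:44:19",
     "PreviousTransactionDate": "2023-06-27 16:40:19", "LoginAttempts": 4},
]

FMT = "%Y-%m-%d %H:%M:%S"   # assumed timestamp format
MAX_LOGIN_ATTEMPTS = 3      # assumed threshold, not taken from the dataset


def seconds_since_previous(row):
    """Seconds elapsed between the previous transaction and this one."""
    current = datetime.strptime(row["TransactionDate"], FMT)
    previous = datetime.strptime(row["PreviousTransactionDate"], FMT)
    return (current - previous).total_seconds()


def has_many_login_attempts(row):
    """True when the login-attempt count exceeds the assumed threshold."""
    return row["LoginAttempts"] > MAX_LOGIN_ATTEMPTS


gaps = [seconds_since_previous(r) for r in rows]
flags = [has_many_login_attempts(r) for r in rows]
```

A short gap combined with a raised flag is only a candidate for review; any real threshold would need to be tuned on the actual data.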
Synthetic Bank Transactions
kaggle.com
zip
Updated Mar 20, 2021
Cite
John Harris (2021). Synthetic Bank Transactions [Dataset]. https://www.kaggle.com/radistaleks/synthetic-bank-transactions
Available download formats: zip (13820207 bytes)
Dataset updated
Mar 20, 2021
Authors
John Harris
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
License information was derived automatically
Description
Inspiration

Many projects require datasets about bank transactions to test their systems. Unfortunately, it is hard to find a dataset that includes transaction product categorization, which is important for many analytical projects.

Content

There are four datasets:
Clients: basic information about bank users.
Categories: standard transaction categories which are used by many banks worldwide.
Transactions: the core of the dataset, with basic information about each transaction such as the second account involved, category, amount, etc.
Subscriptions: information about subscriptions, in other words, transactions which are made automatically.
Credit card fraud detection Date 25th of June 2015
kaggle.com
Updated Oct 29, 2023
+ more versions
Cite
Zohair ahmed (2023). Credit card fraud detection Date 25th of June 2015 [Dataset]. https://www.kaggle.com/datasets/qnqfbqfqo/credit-card-fraud-detection-date-25th-of-june-2015
Dataset updated
Oct 29, 2023
Dataset provided by
Kaggle
Authors
Zohair ahmed
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.

It contains only numerical input variables which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, ... V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable and it takes value 1 in case of fraud and 0 otherwise.

The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on http://mlg.ulb.ac.be/BruFence and http://mlg.ulb.ac.be/ARTML.
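To make the class imbalance concrete, the counts quoted in the description (492 frauds out of 284,807 transactions) can be turned into a fraud rate and into inverse-frequency class weights. This is only an illustrative sketch; the weighting heuristic is one common choice and is not something the dataset authors prescribe.

```python
# Counts taken from the description above.
n_total = 284_807
n_fraud = 492
n_legit = n_total - n_fraud

# Share of the positive class (roughly 0.17% of all transactions).
fraud_rate = n_fraud / n_total

# Inverse-frequency ("balanced") weights: n_total / (n_classes * n_class).
# Chosen here for illustration; other weighting schemes are possible.
n_classes = 2
weight_fraud = n_total / (n_classes * n_fraud)
weight_legit = n_total / (n_classes * n_legit)

print(f"fraud rate: {fraud_rate:.4%}")
print(f"weights: fraud={weight_fraud:.1f}, legit={weight_legit:.3f}")
```

With weights like these, each fraud example counts several hundred times more than a legitimate one, which is why accuracy alone is a poor metric for this data.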
Brazil Bank Account Spending Dataset
kaggle.com
Updated Jul 20, 2020
Cite
Süfyan Taşkın (2020). Brazil Bank Account Spending Dataset [Dataset]. https://www.kaggle.com/sufyant/brazil-bank-account-spending-dataset/tasks
Dataset updated
Jul 20, 2020
Dataset provided by
Kaggle
Authors
Süfyan Taşkın
Description
Dataset

This dataset was created by Süfyan Taşkın
BitClout 50K Profiles Dump
kaggle.com
Updated Mar 28, 2021
Cite
Miguel Esteban Gómez (2021). BitClout 50K Profiles Dump [Dataset]. https://www.kaggle.com/michaelstevan/bitclout-50000-profiles-dump/activity
Dataset updated
Mar 28, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Miguel Esteban Gómez
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Miguel Esteban Gómez

Released under CC0: Public Domain
‘Phishing Dataset for Machine Learning’ analyzed by Analyst-2
analyst-2.ai
Updated Nov 5, 2019
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Phishing Dataset for Machine Learning’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-phishing-dataset-for-machine-learning-2690/f1656d17/?iid=043-921&v=presentation
Dataset updated
Nov 5, 2019
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Phishing Dataset for Machine Learning’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shashwatwork/phishing-dataset-for-machine-learning on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

Anti-phishing refers to efforts to block phishing attacks. Phishing is a kind of cybercrime where attackers pose as known or trusted entities and contact individuals through email, text or telephone and ask them to share sensitive information. Typically, in a phishing email attack, the message will suggest that there is a problem with an invoice, that there has been suspicious activity on an account, or that the user must log in to verify an account or password. Users may also be prompted to enter credit card information or bank account details as well as other sensitive data. Once this information is collected, attackers may use it to access accounts, steal data and identities, and download malware onto the user’s computer.

Content

This dataset contains 48 features extracted from 5000 phishing webpages and 5000 legitimate webpages, which were downloaded from January to May 2015 and from May to June 2017. An improved feature extraction technique is employed by leveraging the browser automation framework (i.e., Selenium WebDriver), which is more precise and robust compared to the parsing approach based on regular expressions.

Anti-phishing researchers and experts may find this dataset useful for phishing features analysis, conducting rapid proof of concept experiments or benchmarking phishing classification models.

Acknowledgements

Tan, Choon Lin (2018), “Phishing Dataset for Machine Learning: Feature Evaluation”, Mendeley Data, V1, doi: 10.17632/h3cgnj8hft.1 Source of the Dataset.

--- Original source retains full ownership of the source dataset ---

Customer_Data

kaggle.com

Updated Mar 12, 2023

Cite

Alireza Rastegar (2023). Customer_Data [Dataset]. https://www.kaggle.com/datasets/alirezaai/customer-data/discussion

Dataset updated

Mar 12, 2023

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Alireza Rastegar

Description

Item	Description
BALANCE	Outstanding balance on the credit card account
BALANCE_FREQUENCY	How often the balance is updated
PURCHASES	Total amount of purchases made on the credit card
ONEOFF_PURCHASES	Total amount of one-time purchases made on the credit card
INSTALLMENTS_PURCHASES	Total amount of purchases made on the credit card that were paid back in installments
CASH_ADVANCE	Amount of cash withdrawn from the credit card account as a cash advance
PURCHASES_FREQUENCY	How often purchases are made on the credit card
ONEOFF_PURCHASES_FREQUENCY	How often one-time purchases are made on the credit card
PURCHASES_INSTALLMENTS_FREQUENCY	How often purchases that are paid back in installments are made on the credit card
CASH_ADVANCE_FREQUENCY	How often cash advances are taken out on the credit card
CASH_ADVANCE_TRX	Number of cash advance transactions made on the credit card account
PURCHASES_TRX	Number of purchase transactions made on the credit card account
CREDIT_LIMIT	Maximum amount of credit the customer is allowed to use on the credit card
PAYMENTS	Total amount of payments made on the credit card account
MINIMUM_PAYMENTS	Minimum amount of payments required on the credit card account
PRC_FULL_PAYMENT	Percentage of the balance that is paid in full by the customer each month
TENURE	Number of years the customer has been using the credit card account

‘UPI apps Transactions in 2021’ analyzed by Analyst-2
analyst-2.ai
Updated Dec 31, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘UPI apps Transactions in 2021’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-upi-apps-transactions-in-2021-c503/a356228b/?iid=002-537&v=presentation
Dataset updated
Dec 31, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘UPI apps Transactions in 2021’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ramjasmaurya/upi-apps-transactions-in-2021 on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Unified Payments Interface (UPI) is an instant real-time payment system developed by the National Payments Corporation of India (NPCI), facilitating inter-bank peer-to-peer (P2P) and person-to-merchant (P2M) transactions. NPCI is the umbrella organisation for all digital payments. The interface is regulated by the Reserve Bank of India (RBI) and works by instantly transferring funds between two bank accounts on a mobile platform. As of November 2021, there are 274 banks available on UPI with a monthly volume of 4.18 billion transactions and a value of ₹7.1 trillion (US$94 billion). UPI witnessed 68 billion transactions till November 2021. The mobile-only payment system helped transact a total of ₹34.95 lakh crore (US$460 billion) during the 67 months of operation starting from 2016. As of May 2021, the platform has 150 million monthly active users in India, with plans to reach 500 million by 2025. IIT Madras is also working to integrate a voice command feature that can support English and Indian vernacular languages in the future. The proportion of UPI transactions in the total volume of digital transactions grew from 23% in 2018-19 to 55% in 2020-21, with an average value of ₹1,849 per transaction.

--- Original source retains full ownership of the source dataset ---
Mt.Gox Leaked Transaction
kaggle.com
Updated Mar 23, 2020
Cite
XBlock (2020). Mt.Gox Leaked Transaction [Dataset]. https://www.kaggle.com/xblock/mtgox-leaked-transaction/activity
Dataset updated
Mar 23, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
XBlock
Description
This data set is the transaction data leaked from the Mt.Gox exchange.

First, we combine the buy and sell transaction fields of the same transaction, and then deduplicate them by transaction time, transaction account, etc. to ensure the uniqueness of each transaction record. This transaction data is very useful for analyzing user behavior in the bitcoin market.

We have done a market manipulation study using this data set.
Data from: Network Activity Anomaly Detection
kaggle.com
Updated Aug 3, 2024
Cite
Presha Monga (2024). Network Activity Anomaly Detection [Dataset]. https://www.kaggle.com/datasets/preshamonga/network-activity-anomaly-detection/versions/1
Dataset updated
Aug 3, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Presha Monga
Description
In the Target column, Normal (No Attack) = 0 and Neptune Attack = 1.

Description of the columns present in the Dataset:

duration: Length (in seconds) of the connection.

protocoltype: The protocol used in the connection (e.g., TCP, UDP, ICMP).

service: The network service on the destination (e.g., HTTP, FTP, SMTP).

flag: Status flag of the connection (e.g., SF for normal termination).

srcbytes: Number of data bytes sent from the source to the destination.

dstbytes: Number of data bytes sent from the destination to the source.

land: A binary flag indicating if the connection is to the same host (source IP equals destination IP).

wrongfragment: Number of wrong fragments in the connection.

urgent: Number of urgent packets in the connection.

hot: Number of "hot" indicators (e.g., access to sensitive files).

numfailedlogins: Number of failed login attempts.

loggedin: A binary flag indicating if the user logged in successfully (1 if yes, 0 if no).

numcompromised: Number of compromised conditions.

rootshell: A binary flag indicating if a root shell was obtained.

suattempted: A binary flag indicating if the "su" command was attempted (used for switching user privileges).

numroot: Number of "root" accesses.

numfilecreations: Number of file creation operations.

numshells: Number of shell prompts invoked.

numaccessfiles: Number of accesses to control files.

numoutboundcmds: Number of outbound commands in an FTP session.

ishostlogin: A binary flag indicating if the login belongs to a "host" user.

isguestlogin: A binary flag indicating if the login belongs to a "guest" user.

count: Number of connections to the same host as the current connection in the past two seconds.

srvcount: Number of connections to the same service as the current connection in the past two seconds.

serrorrate: Percentage of connections that have "SYN" errors.

srvserrorrate: Percentage of connections to the same service that have "SYN" errors.

rerrorrate: Percentage of connections that have "REJ" errors.

srvrerrorrate: Percentage of connections to the same service that have "REJ" errors.

samesrvrate: Percentage of connections to the same service.

diffsrvrate: Percentage of connections to different services.

srvdiffhostrate: Percentage of connections to different hosts in the same service.

dsthostcount: Number of connections to the same destination host.

dsthostsrvcount: Number of connections to the same service at the destination host.

dsthostsamesrvrate: Percentage of connections to the same service at the destination host.

dsthostdiffsrvrate: Percentage of connections to different services at the destination host.

dsthostsamesrcportrate: Percentage of connections from the same source port to the destination host.

dsthostsrvdiffhostrate: Percentage of connections to different destination hosts in the same service.

dsthostserrorrate: Percentage of connections to the destination host that have "SYN" errors.

dsthostsrvserrorrate: Percentage of connections to the same service at the destination host that have "SYN" errors.

dsthostrerrorrate: Percentage of connections to the destination host that have "REJ" errors.

dsthostsrvrerrorrate: Percentage of connections to the same service at the destination host that have "REJ" errors.

lastflag: The status of the last connection in this session.

attack: The target label, indicating if the connection is normal (0) or a Neptune attack (1).
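As an illustration of how a time-window feature such as count ("connections to the same host as the current connection in the past two seconds") could be derived from raw connection records, here is a minimal sketch. The input layout (timestamp, destination host) and the choice to count only earlier connections, excluding the current one, are assumptions; the dataset itself ships the feature precomputed.

```python
from collections import deque


def same_host_counts(connections, window=2.0):
    """For each (timestamp, dst_host) record, sorted by time, count earlier
    connections to the same host within the previous `window` seconds."""
    recent = deque()  # records still inside the time window
    counts = []
    for ts, host in connections:
        # Drop records that are older than the window.
        while recent and ts - recent[0][0] > window:
            recent.popleft()
        counts.append(sum(1 for _, h in recent if h == host))
        recent.append((ts, host))
    return counts


# Hypothetical connection log: (seconds, destination host).
conns = [(0.0, "a"), (0.5, "a"), (1.0, "b"), (2.2, "a"), (5.0, "a")]
counts = same_host_counts(conns)
```

A flood-style attack such as Neptune would typically show up as very large values of this kind of counter for a single destination.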
Details of feature variables of the data set.
plos.figshare.com
xls
Updated Dec 8, 2023
Cite
Ke Peng; Yan Peng; Wenguang Li (2023). Details of feature variables of the data set. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0289724.t002
Dataset updated
Dec 8, 2023
Dataset provided by
PLOS ONE
Authors
Ke Peng; Yan Peng; Wenguang Li
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
In recent years, with the continuous improvement of the financial system and the rapid development of the banking industry, the competition of the banking industry itself has intensified. At the same time, with the rapid development of information technology and Internet technology, customers’ choice of financial products is becoming more and more diversified, and customers’ dependence and loyalty to banking institutions is becoming less and less, and the problem of customer churn in commercial banks is becoming more and more prominent. How to predict customer behavior and retain existing customers has become a major challenge for banks to solve. Therefore, this study takes a bank’s business data on Kaggle platform as the research object, uses multiple sampling methods to compare the data for balancing, constructs a bank customer churn prediction model for churn identification by GA-XGBoost, and conducts interpretability analysis on the GA-XGBoost model to provide decision support and suggestions for the banking industry to prevent customer churn. The results show that: (1) The applied SMOTEENN is more effective than SMOTE and ADASYN in dealing with the imbalance of banking data. (2) The F1 and AUC values of the model improved and optimized by XGBoost using genetic algorithm can reach 90% and 99%, respectively, which are optimal compared to other six machine learning models. The GA-XGBoost classifier was identified as the best solution for the customer churn problem. (3) Using Shapley values, we explain how each feature affects the model results, and analyze the features that have a high impact on the model prediction, such as the total number of transactions in the past year, the amount of transactions in the past year, the number of products owned by customers, and the total sales balance. 
The contribution of this paper is mainly in two aspects: (1) this study can provide useful information from the black box model based on the accurate identification of churned customers, which can provide reference for commercial banks to improve their service quality and retain customers; (2) it can provide reference for customer churn early warning models of other related industries, which can help the banking industry to maintain customer stability, maintain market position and reduce corporate losses.
Credit Card Defaulter
kaggle.com
Updated Jun 10, 2021
Cite
Arsh Anwar (2021). Credit Card Defaulter [Dataset]. https://www.kaggle.com/d4rklucif3r/defaulter/activity
Dataset updated
Jun 10, 2021
Dataset provided by
Kaggle
Authors
Arsh Anwar
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
This dataset is all about credit card defaulters. It contains 5 columns:
1) ID: ID of the customer
2) Default: whether the person is a loan defaulter
3) Student: whether the person is a student
4) Balance: balance in his/her account
5) Income: his/her income
Financial Transaction and Risk Management Dataset
kaggle.com
Updated Jan 8, 2025
Cite
Ziya (2025). Financial Transaction and Risk Management Dataset [Dataset]. https://www.kaggle.com/datasets/ziya07/financial-transaction-and-risk-management-dataset
Dataset updated
Jan 8, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ziya
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About the Dataset

This dataset contains financial transaction records and risk management data for accounting systems. It includes a variety of transactional data, such as transaction IDs, amounts, categories, and payment methods, alongside associated risk incidents like fraud, errors, and misstatements. The dataset also captures system metadata, such as user activity, transaction processing time, login frequency, and geographical region of the IP. The data is designed to simulate real-world accounting system operations and risk events, enabling the development and testing of AI-driven risk prediction models. The dataset can be used for research in real-time financial risk management, fraud detection, and improving decision-making processes in accounting systems using artificial intelligence.
Facebook Spam Dataset
kaggle.com
Updated Apr 11, 2021
Cite
Khaja Hussain SK (2021). Facebook Spam Dataset [Dataset]. https://www.kaggle.com/khajahussainsk/facebook-spam-dataset/activity
Dataset updated
Apr 11, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Khaja Hussain SK
Description
Context

Collection of Facebook spam-legit profile and content-based data. It can be used for classification tasks.

Content

The dataset can be used for building machine learning models. The data was collected from public profiles using the Facebook API and the Facebook Graph API. There are 500 legit profiles and 100 spam profiles. The list of features is as follows, with Label (0-legit, 1-spam):
1. Number of friends
2. Number of followings
3. Number of Community
4. The age of the user account (in days)
5. Total number of posts shared
6. Total number of URLs shared
7. Total number of photos/videos shared
8. Fraction of the posts containing URLs
9. Fraction of the posts containing photos/videos
10. Average number of comments per post
11. Average number of likes per post
12. Average number of tags in a post (Rate of tagging)
13. Average number of hashtags present in a post

Inspiration

The dataset helps the community understand how these features can help distinguish legitimate Facebook users from spam users.
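Several of the content-based features listed above are simple per-profile aggregates over posts. As a rough sketch of how two of them (fraction of posts containing URLs, average number of hashtags per post) might be computed, assuming a hypothetical list of post records with invented values:

```python
# Hypothetical posts for one profile; field names and values are invented.
posts = [
    {"urls": 1, "hashtags": 2},
    {"urls": 0, "hashtags": 0},
    {"urls": 2, "hashtags": 1},
    {"urls": 0, "hashtags": 1},
]


def fraction_with_urls(posts):
    """Fraction of posts that contain at least one URL."""
    if not posts:
        return 0.0
    return sum(1 for p in posts if p["urls"] > 0) / len(posts)


def average_hashtags(posts):
    """Average number of hashtags per post."""
    if not posts:
        return 0.0
    return sum(p["hashtags"] for p in posts) / len(posts)


url_fraction = fraction_with_urls(posts)
avg_hashtags = average_hashtags(posts)
```

The dataset already provides these values per profile; the sketch only shows what such aggregates mean.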
Cryptocurrency Historical Prices [Updated Daily]
kaggle.com
Updated May 25, 2023
Cite
Usama Buttar (2023). Cryptocurrency Historical Prices [Updated Daily] [Dataset]. https://www.kaggle.com/datasets/usamabuttar/cryptocurrency-historical-prices-updated-daily
Dataset updated
May 25, 2023
Dataset provided by
Kaggle
Authors
Usama Buttar
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset contains a comprehensive collection of historical price records for the top 1000 cryptocurrencies. The data in this dataset is updated daily, providing a reliable and up-to-date source of information for cryptocurrency traders, researchers, and enthusiasts.

Each file in the dataset includes the following columns: date, open price, high price, low price, closing price, adjusted closing price, and trading volume. These columns provide a detailed picture of the daily price movements and trading activity of each cryptocurrency in the dataset.

The "date" column indicates the day on which the price data was recorded, while the "open" column provides the opening price of the cryptocurrency for that day. The "high" and "low" columns indicate the highest and lowest prices of the cryptocurrency on that day, respectively. The "close" column represents the closing price of the cryptocurrency on that day, while the "adjusted close" column takes into account any dividends or other corporate actions that may have affected the price. Finally, the "volume" column shows the trading volume of the cryptocurrency on that day.

With this dataset, users can analyze and visualize the performance of individual cryptocurrencies, compare them to one another, and track trends over time. The data is ideal for use in machine learning models, predictive analytics, and other data-driven applications.
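For example, tracking trends over time often starts with daily returns computed from the closing price. The sketch below assumes a small, invented list of (date, close) pairs rather than the dataset's actual files, whose exact column names are not confirmed here.

```python
# Hypothetical (date, close) pairs in chronological order; values are invented.
closes = [
    ("2023-05-23", 100.0),
    ("2023-05-24", 110.0),
    ("2023-05-25", 99.0),
]


def daily_returns(rows):
    """Simple day-over-day return: close_t / close_(t-1) - 1."""
    returns = []
    for (_, prev_close), (date, close) in zip(rows, rows[1:]):
        returns.append((date, close / prev_close - 1.0))
    return returns


returns = daily_returns(closes)
for date, r in returns:
    print(f"{date}: {r:+.2%}")
```

The first day has no return because there is no earlier close to compare against.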
CC_Fraud
kaggle.com
zip
Updated Mar 2, 2021
+ more versions
Cite
Sam Kowitt (2021). CC_Fraud [Dataset]. https://www.kaggle.com/samkowitt/cc-fraud
Available download formats: zip (69155672 bytes)
Dataset updated
Mar 2, 2021
Authors
Sam Kowitt
Description
Context

This data-set contains >300,000 anonymized transactions. The variables are anonymized to protect the consumers' information, but they represent fields such as how long the consumer has had the account. Each row represents a user's transaction. This data-set was built so that the classifier column can be used to train a model that predicts, from the anonymized variables, which transactions are potentially fraudulent.

Content

The data-set contains a fraud rate of ~0.1% and thus is highly unbalanced.

The variables are as follows: Time, anonymized variables (30 variables), $ Amount, Class (Fraud Classifier)

Securitisation Vehicles
kaggle.com
zip
Updated Sep 12, 2020
+ more versions
Cite
Praveen Kumar (2020). Securitisation Vehicles [Dataset]. https://www.kaggle.com/penchalaiah123/securitisation-vehicles
Available download formats: zip (16518 bytes)
Dataset updated
Sep 12, 2020
Authors
Praveen Kumar
Description
Context

The data are detailed series underlying the Financial Accounts, ABS Cat No 5232.0. They cover special purpose vehicles registered or incorporated in Australia to securitise selected assets, and whose issues are independently rated by a recognised rating agency. See 'Changes to Tables' in the December 1996 issue of the Bulletin for a further discussion of securitisation vehicles. Some data prior to June 1993 are partly estimated.

Content

'Mortgages' include both residential and non-residential mortgages.

'Other loans and placements' include operating lease and lease finance receivables, secured loans to originators and loans secured by other types of assets.

Holdings of 'Asset-backed bonds' refers to individual securitisation vehicles' holdings of asset-backed bonds issued by other securitisation vehicles.

'All other assets' include cash and deposits with Australian banks and corporations registered under the Financial Sector (Collection of Data) Act 2001 and all other claims not already included.

'Other liabilities' include loans and advances from Australian banks, corporations registered under the Financial Sector (Collection of Data) Act 2001 and other financial institutions, along with all other liabilities not already included.
DS4 Work - Marketing Dataset
kaggle.com
Updated May 1, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Beytullah Soylev (2024). DS4 Work - Marketing Dataset [Dataset]. https://www.kaggle.com/datasets/soylevbeytullah/customer-dataset
Dataset updated
May 1, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Beytullah Soylev
Description
The dataset includes various features about the bank's customers:

Customer ID: Unique identifier for each credit card holder.
Balance: Remaining balance in the customer's account.
Balance Frequency: How often the balance is updated (score between 0 and 1, with 0 indicating infrequent updates and 1 signifying frequent updates).
Purchases: Total amount of purchases made from the account.
One-Off Purchases: Maximum purchase amount made in a single transaction.
Installment Purchases: Amount of purchases made in installments.
Cash Advance: Amount of cash advanced using the credit card.
Purchases Frequency: How often purchases are made (score between 0 and 1, similar to balance frequency).
One-Off Purchases Frequency: How often customers make one-time purchases.
Installment Purchases Frequency: How often customers make installment purchases.
Cash Advance Frequency: How often customers take cash advances.
Cash Advance Transactions: Number of cash advance transactions.
Purchases Transactions: Number of purchase transactions.
Credit Limit: Maximum credit limit for the specific user.
Payments: Total amount of payments made by the user.
Minimum Payment: Minimum payment amount required by the user.
Percentage of Full Payment: Percentage of the total balance paid by the user (0 indicates no payment, 100 indicates full payment).
Tenure: Length of time the customer has been a credit card user.
Paid In Contributions to IBRD/IDA/IFC Trust Funds
kaggle.com
zip
Updated Nov 27, 2019
Cite
World Bank (2019). Paid In Contributions to IBRD/IDA/IFC Trust Funds [Dataset]. https://www.kaggle.com/theworldbank/paid-in-contributions-to-ibrd-ida-ifc-trust-funds
Explore at:
Available download formats: zip (229396 bytes)
Dataset updated
Nov 27, 2019
Dataset authored and provided by
World Bank
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Content

A Recipient-executed Grant is a Trust Fund Grant that is provided to a third party under a grant agreement, and for which the Bank plays an operational role - i.e., the Bank normally appraises and supervises activities financed by these funds. This dataset provides data on the amount of grant funds committed in the course of a fiscal year and payments made out of a Trust Fund account to eligible recipients, in accordance with the legal agreements. In fulfilling its responsibilities, the World Bank as Trustee complies with all sanctions applicable to World Bank transactions. All definitions should be regarded at present as provisional and not final, and are subject to revision at any time. Data is provided at the individual Trust Fund level and is updated as of 04/02/2015. No further updates are planned for this particular dataset. For more details, please visit the Global Partnership and Trust Fund Operations website: http://go.worldbank.org/GABMG2YEI0

Context

This is a dataset hosted by the World Bank. The organization has an open data platform, and it updates its information according to the amount of data that is brought in. Explore the World Bank's financial data using Kaggle and all of the data sources available through the World Bank organization page!

Update Frequency: This dataset is updated daily.

Acknowledgements

This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.

This dataset is distributed under a Creative Commons Attribution 3.0 IGO license.

Cover photo by Joseph Gonzalez on Unsplash
Unsplash Images are distributed under a unique Unsplash License.

Iris Species
kaggle.com
zip
Updated Sep 27, 2016
Cite
UCI Machine Learning (2016). Iris Species [Dataset]. https://www.kaggle.com/datasets/uciml/iris
Explore at:
Available download formats: zip (3687 bytes)
Dataset updated
Sep 27, 2016
Dataset authored and provided by
UCI Machine Learning
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Iris dataset was used in R.A. Fisher's classic 1936 paper, The Use of Multiple Measurements in Taxonomic Problems, and can also be found on the UCI Machine Learning Repository.

It includes three iris species with 50 samples each as well as some properties about each flower. One flower species is linearly separable from the other two, but the other two are not linearly separable from each other.

The columns in this dataset are:

Id

SepalLengthCm

SepalWidthCm

PetalLengthCm

PetalWidthCm

Species
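
A minimal sketch of reading data in this column layout, using only the Python standard library. The three rows and the species label spelling (`Iris-setosa` and so on) are illustrative assumptions that stand in for the real file, whose name and path are not given here.

```python
import csv
import io
from collections import defaultdict

# Three illustrative rows in the column layout listed above.
# They stand in for the real CSV file; the label spelling is an assumption.
SAMPLE_CSV = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,7.0,3.2,4.7,1.4,Iris-versicolor
3,6.3,3.3,6.0,2.5,Iris-virginica
"""


def mean_petal_length_by_species(csv_file):
    """Return {species: mean PetalLengthCm} for rows read from a CSV file object."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        species = row["Species"]
        totals[species] += float(row["PetalLengthCm"])
        counts[species] += 1
    return {species: totals[species] / counts[species] for species in totals}


means = mean_petal_length_by_species(io.StringIO(SAMPLE_CSV))
for species, mean in sorted(means.items()):
    print(f"{species}: {mean:.2f}")
```

To run this against a downloaded copy, pass an opened file object instead of the `io.StringIO` wrapper.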

vala khorasani (2024). Bank Transaction Dataset for Fraud Detection [Dataset]. https://www.kaggle.com/datasets/valakhorasani/bank-transaction-dataset-for-fraud-detection

Bank Transaction Dataset for Fraud Detection

Detailed Analysis of Transactional Behavior and Anomaly Detection

Explore at:

Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)

Dataset updated

Nov 4, 2024

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

vala khorasani

License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

This dataset provides a detailed look into transactional behavior and financial activity patterns, ideal for exploring fraud detection and anomaly identification. It contains 2,512 samples of transaction data, covering various transaction attributes, customer demographics, and usage patterns. Each entry offers comprehensive insights into transaction behavior, enabling analysis for financial security and fraud detection applications.

Key Features:

TransactionID: Unique alphanumeric identifier for each transaction.
AccountID: Unique identifier for each account, with multiple transactions per account.
TransactionAmount: Monetary value of each transaction, ranging from small everyday expenses to larger purchases.
TransactionDate: Timestamp of each transaction, capturing date and time.
TransactionType: Categorical field indicating 'Credit' or 'Debit' transactions.
Location: Geographic location of the transaction, represented by U.S. city names.
DeviceID: Alphanumeric identifier for devices used to perform the transaction.
IP Address: IPv4 address associated with the transaction, with occasional changes for some accounts.
MerchantID: Unique identifier for merchants, showing preferred and outlier merchants for each account.
AccountBalance: Balance in the account post-transaction, with logical correlations based on transaction type and amount.
PreviousTransactionDate: Timestamp of the last transaction for the account, aiding in calculating transaction frequency.
Channel: Channel through which the transaction was performed (e.g., Online, ATM, Branch).
CustomerAge: Age of the account holder, with logical groupings based on occupation.
CustomerOccupation: Occupation of the account holder (e.g., Doctor, Engineer, Student, Retired), reflecting income patterns.
TransactionDuration: Duration of the transaction in seconds, varying by transaction type.
LoginAttempts: Number of login attempts before the transaction, with higher values indicating potential anomalies.
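
Two of the features above lend themselves to a short sketch: the gap between TransactionDate and PreviousTransactionDate, and a simple check on LoginAttempts. The timestamp format, the threshold of 3 attempts, and the sample record are all assumptions made for illustration; none of them come from the dataset itself.

```python
from datetime import datetime

# Assumed timestamp layout; the listing does not state the actual format.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

# Hypothetical cut-off for what counts as a "higher value" of LoginAttempts.
MAX_NORMAL_LOGIN_ATTEMPTS = 3


def hours_since_previous(record):
    """Hours elapsed between PreviousTransactionDate and TransactionDate."""
    current = datetime.strptime(record["TransactionDate"], TIMESTAMP_FORMAT)
    previous = datetime.strptime(record["PreviousTransactionDate"], TIMESTAMP_FORMAT)
    return (current - previous).total_seconds() / 3600


def has_many_login_attempts(record):
    """True when LoginAttempts exceeds the hypothetical threshold."""
    return int(record["LoginAttempts"]) > MAX_NORMAL_LOGIN_ATTEMPTS


# Made-up record for illustration only; not taken from the dataset.
example = {
    "TransactionID": "TX000001",
    "TransactionDate": "2024-01-02 12:00:00",
    "PreviousTransactionDate": "2024-01-01 00:00:00",
    "LoginAttempts": "5",
}

print(hours_since_previous(example))
print(has_many_login_attempts(example))
```

A fixed threshold is the simplest possible rule; a real analysis would more likely compare each account against its own history.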

This dataset is ideal for data scientists, financial analysts, and researchers looking to analyze transactional patterns, detect fraud, and build predictive models for financial security applications. The dataset was designed for machine learning and pattern analysis tasks and is not intended as a primary data source for academic publications.
