Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.
The dataset contains transactions made by credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features or more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. Feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable; it takes value 1 in case of fraud and 0 otherwise.
Given the class imbalance ratio, we recommend measuring the accuracy using the Area Under the Precision-Recall Curve (AUPRC). Confusion matrix accuracy is not meaningful for unbalanced classification.
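To make this recommendation concrete, here is a minimal pure-Python sketch. It first shows how a classifier that never predicts fraud scores on plain accuracy with the counts quoted above, then computes average precision, one common estimator of the area under the precision-recall curve. The function name `average_precision` and the toy labels and scores are illustrative assumptions, not part of the dataset.

```python
def average_precision(y_true, scores):
    """Mean of the precision values measured at each true positive,
    visiting examples from the highest score to the lowest."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


# Counts quoted in the dataset description.
n_total, n_fraud = 284_807, 492

# Labelling every transaction as legitimate is correct on all non-frauds,
# so accuracy looks excellent although no fraud is ever caught.
trivial_accuracy = (n_total - n_fraud) / n_total
print(f"accuracy of 'never fraud': {trivial_accuracy:.2%}")

# Toy example with made-up scores: 2 frauds among 6 transactions.
y_true = [0, 1, 0, 0, 1, 0]
scores = [0.10, 0.90, 0.80, 0.20, 0.70, 0.05]
toy_ap = average_precision(y_true, scores)
print(f"average precision on the toy data: {toy_ap:.3f}")
```

In practice a library routine would normally be used instead of this hand-written loop; the sketch is only meant to show what the metric rewards.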
The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project
Please cite the following works:
Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015
Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Ael; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective, Expert systems with applications,41,10,4915-4928,2014, Pergamon
Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy, IEEE transactions on neural networks and learning systems,29,8,3784-3797,2018,IEEE
Dal Pozzolo, Andrea Adaptive Machine learning for credit card fraud detection ULB MLG PhD thesis (supervised by G. Bontempi)
Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark, Information fusion,41, 182-194,2018,Elsevier
Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization, International Journal of Data Science and Analytics, 5,4,285-300,2018,Springer International Publishing
Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection, INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019
Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection Information Sciences, 2019
Yann-Aël Le Borgne, Gianluca Bontempi Reproducible machine Learning for Credit Card Fraud Detection - Practical Handbook
Bertrand Lebichot, Gianmarco Paldino, Wissam Siblini, Liyun He, Frederic Oblé, Gianluca Bontempi Incremental learning strategies for credit cards fraud detection, International Journal of Data Science and Analytics
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset is a valuable resource for building and evaluating machine learning models to predict fraudulent transactions in an e-commerce environment. With 6.3 million rows, it provides a rich, real-world scenario for data science tasks.
The data is an excellent case study for several key challenges in machine learning, including:
Handling Imbalanced Data: The dataset is highly imbalanced, as legitimate transactions vastly outnumber fraudulent ones. This necessitates the use of specialized techniques like SMOTE or advanced models like XGBoost that can handle class imbalance effectively.
Feature Engineering: The raw data provides an opportunity to create new, more powerful features, such as transaction velocity or the ratio of account balances, which can improve model performance.
Model Evaluation: Traditional metrics like accuracy are misleading for this type of dataset. The project requires a deeper analysis using metrics such as Precision, Recall, F1-Score, and the Precision-Recall AUC to truly understand the model's effectiveness.
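The SMOTE technique named in the list above can be sketched in a few lines. This is a simplified illustration of its core idea (interpolating between a minority sample and one of its nearest minority neighbours), not the reference implementation; the function name `smote_like_oversample`, its parameters, and the toy points are assumptions made for the example.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    random minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy minority class (made-up 2-D feature vectors).
fraud_points = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote_like_oversample(fraud_points, n_new=6)
print(len(new_points), new_points[0])
```

Oversampling of this kind is usually applied to the training split only, so that the evaluation data keeps the original class ratio.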
Key Features: The dataset includes a variety of anonymized transaction details:
amount: The value of the transaction.
type: The type of transaction (e.g., TRANSFER, CASH_OUT).
oldbalance & newbalance: The balances of the origin and destination accounts before and after the transaction.
isFraud: The target variable, a binary flag indicating a fraudulent transaction.
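As a hedged illustration of the feature engineering mentioned earlier (transaction velocity and balance ratios), the sketch below derives two extra features from rows like those described in the list above. The dictionary keys `account`, `amount` and `oldbalance_orig` are hypothetical names chosen for the example; the actual column names should be checked in the file itself.

```python
from collections import defaultdict


def add_features(transactions):
    """Return copies of the rows with two illustrative engineered features.
    Rows are dicts and are assumed to be sorted by time."""
    seen_per_account = defaultdict(int)
    enriched = []
    for row in transactions:
        out = dict(row)
        old_balance = row["oldbalance_orig"]
        # Share of the origin balance moved by this transaction
        # (defined as 0 when the account was empty).
        out["amount_to_balance"] = row["amount"] / old_balance if old_balance > 0 else 0.0
        # Crude velocity proxy: number of earlier transactions by this account.
        out["prior_tx_count"] = seen_per_account[row["account"]]
        seen_per_account[row["account"]] += 1
        enriched.append(out)
    return enriched


# Made-up rows for demonstration only.
rows = [
    {"account": "A", "amount": 50.0, "oldbalance_orig": 200.0},
    {"account": "A", "amount": 150.0, "oldbalance_orig": 150.0},
    {"account": "B", "amount": 10.0, "oldbalance_orig": 0.0},
]
features = add_features(rows)
for r in features:
    print(r["account"], r["amount_to_balance"], r["prior_tx_count"])
```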
https://www.ibisworld.com/about/termsofuse/
In the rapidly evolving US fraud detection software industry, developers invest significant capital in staying ahead of increasingly sophisticated cyber threats and fraud tactics. Over the past five years, accelerated digitalization, a surge in real-time payments and the adoption of e-commerce have fueled demand for industry solutions. Emerging trends such as behavioral biometrics, deepfake detection and real-time anomaly scoring have become essential, and developers now deliver cloud-based platforms able to address emerging threats. As businesses, banks, healthcare providers and public sector organizations face rising regulatory scrutiny and compliance demands, industry revenue has grown at a CAGR of 9.1% to an estimated $26.3 billion, including anticipated growth of 5.4% in 2025 alone. The widespread adoption of contactless payment technologies, such as mobile wallets and tap-to-pay cards enabled by Near Field Communication (NFC), has introduced a fresh set of vulnerabilities. Cybercriminals are now leveraging advanced techniques to exploit weaknesses that legacy systems are not designed to detect. These threats have required fraud detection software developers to integrate novel security measures into their offerings. Meanwhile, the rapid growth of e-commerce has been a significant driver of demand for fraud detection software among retail and wholesale companies. As more consumers migrate to online shopping platforms, transaction volumes have soared, exposing retailers and wholesalers to heightened risks. This has provided industry developers with a high-growth market where they often benefit from increased pricing power, which supports profit growth. Moving forward, the industry is set for further transformation as regulatory mandates around AI-enabled fraud prevention, deepfake detection and real-time compliance reporting become widespread. Continuous M&A activity and increased demand from high-growth market segments will strengthen revenue streams. 
Despite ongoing competitive pressures and rapidly shifting threat landscapes, these factors are forecast to support a robust industry revenue CAGR of 5.2% through 2030, reaching an estimated $33.8 billion.
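The forecast figures can be checked with the standard compound-growth formula. The sketch below assumes that the $26.3 billion estimate is the 2025 base and that the 5.2% rate compounds over five annual periods to 2030; neither assumption is spelled out in the text, and the helper name `project` is illustrative.

```python
def project(value, rate, periods):
    """Compound `value` forward by `rate` (e.g. 0.052) for `periods` periods."""
    return value * (1 + rate) ** periods


# Figures quoted in the text; base year and period count are assumptions.
projected_2030 = project(26.3, 0.052, 2030 - 2025)
print(f"projected 2030 revenue: ${projected_2030:.1f} billion")
```

Under these assumptions the result lands close to the quoted $33.8 billion; a small gap is plausible because the published rate is rounded.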
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
There is a lack of publicly available datasets on financial services, especially in the emerging domain of mobile money transactions. Financial datasets are important to many researchers, and in particular to us, as we perform research in the domain of fraud detection. Part of the problem is the intrinsically private nature of financial transactions, which means that no datasets are publicly available.
We present a synthetic dataset generated using the simulator called PaySim as an approach to such a problem. PaySim uses aggregated data from the private dataset to generate a synthetic dataset that resembles the normal operation of transactions and injects malicious behaviour to later evaluate the performance of fraud detection methods.
PaySim simulates mobile money transactions based on a sample of real transactions extracted from one month of financial logs from a mobile money service implemented in an African country. The original logs were provided by a multinational company, who is the provider of the mobile financial service which is currently running in more than 14 countries all around the world.
This synthetic dataset is scaled down to 1/4 of the original dataset and was created just for Kaggle.
This is a sample row, followed by an explanation of the headers:
1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0
step - maps a unit of time in the real world. In this case 1 step is 1 hour of time. Total steps 744 (30 days simulation).
type - CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.
amount - amount of the transaction in local currency.
nameOrig - customer who started the transaction
oldbalanceOrg - initial balance before the transaction
newbalanceOrig - new balance after the transaction.
nameDest - customer who is the recipient of the transaction
oldbalanceDest - initial balance of the recipient before the transaction. Note that there is no information for customers whose names start with M (Merchants).
newbalanceDest - new balance of the recipient after the transaction. Note that there is no information for customers whose names start with M (Merchants).
isFraud - Marks the transactions made by the fraudulent agents inside the simulation. In this specific dataset, the fraudulent agents aim to profit by taking control of customers' accounts and trying to empty the funds by transferring them to another account and then cashing out of the system.
isFlaggedFraud - The business model aims to control massive transfers from one account to another and flags illegal attempts. An illegal attempt in this dataset is an attempt to transfer more than 200,000 in a single transaction.
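The sample row and header explanations above can be tied together with a short parsing sketch. It assumes that the columns appear in the file in the same order as they are explained above, and the helper `parse_row` is a hypothetical name introduced for this example. The checks at the end apply the rules stated in the descriptions: the origin balance should drop by the transaction amount, recipients whose names start with M are merchants, and a transfer above 200,000 counts as an illegal attempt.

```python
import csv
import io

# Column order assumed to follow the order of the header explanations.
HEADERS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud", "isFlaggedFraud",
]
FLOAT_FIELDS = {"amount", "oldbalanceOrg", "newbalanceOrig", "oldbalanceDest", "newbalanceDest"}
INT_FIELDS = {"step", "isFraud", "isFlaggedFraud"}


def parse_row(line):
    """Turn one CSV line into a dict with typed values."""
    values = next(csv.reader(io.StringIO(line)))
    row = dict(zip(HEADERS, values))
    for key in FLOAT_FIELDS:
        row[key] = float(row[key])
    for key in INT_FIELDS:
        row[key] = int(row[key])
    return row


sample = "1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0"
row = parse_row(sample)

# The origin balance should drop by the transaction amount.
balance_change = row["oldbalanceOrg"] - row["newbalanceOrig"]
consistent = abs(balance_change - row["amount"]) < 0.01

# Recipients starting with "M" are merchants, so no balance information is kept.
dest_is_merchant = row["nameDest"].startswith("M")

# Flagging rule as described: a single transfer of more than 200,000.
over_limit = row["type"] == "TRANSFER" and row["amount"] > 200_000

print(row["type"], consistent, dest_is_merchant, over_limit)
```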
There are 5 similar files that contain the runs of 5 different scenarios. These files are explained in more detail in chapter 7 of my PhD thesis (available at http://urn.kb.se/resolve?urn=urn:nbn:se:bth-12932).
We ran PaySim several times using random seeds for 744 steps, representing each hour of one month of real time, which matches the original logs. Each run took around 45 minutes on an Intel i7 processor with 16 GB of RAM. The final result of a run contains approximately 24 million financial records divided into the 5 categories: CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.
This work is part of the research project "Scalable resource-efficient systems for big data analytics" funded by the Knowledge Foundation (grant: 20140032) in Sweden.
Please refer to this dataset using the following citations:
PaySim first paper of the simulator:
E. A. Lopez-Rojas, A. Elmir, and S. Axelsson. "PaySim: A financial mobile money simulator for fraud detection". In: The 28th European Modeling and Simulation Symposium-EMSS, Larnaca, Cyprus. 2016
https://www.datainsightsmarket.com/privacy-policy
The global healthcare fraud detection and investigation software market is experiencing robust growth, driven by increasing healthcare expenditures, rising instances of fraudulent activities, and the escalating need for robust security measures within the healthcare sector. The market's expansion is fueled by technological advancements, including the adoption of artificial intelligence (AI), machine learning (ML), and big data analytics to identify complex fraud patterns and anomalies. These technologies allow for more proactive fraud detection, reducing financial losses and improving healthcare system efficiency. Furthermore, stringent government regulations and increased penalties for healthcare fraud are pushing healthcare providers and payers to adopt advanced software solutions. This market segment is witnessing a shift towards cloud-based solutions offering scalability, cost-effectiveness, and accessibility. However, challenges such as data privacy concerns, integration complexities with existing systems, and the high cost of implementing and maintaining these sophisticated systems act as restraints to market growth. Considering a conservative CAGR of 15% (a common rate for rapidly developing software markets) from a base year of 2025 with a market size of $2 billion, and a forecast period of 2025-2033, we can project significant expansion. The competitive landscape is dynamic, with established players like SAS and Fujitsu alongside specialized firms like DataWalk and WhiteHat AI competing for market share. These companies offer diverse solutions, catering to various healthcare stakeholders' needs. The ongoing innovation in fraud detection techniques and the increasing focus on interoperability among different healthcare systems will further shape the market's trajectory. 
The market is segmented by software type (e.g., rule-based systems, AI-powered systems), deployment mode (cloud, on-premise), and end-user (hospitals, insurance companies, government agencies). Geographic segmentation would likely show strong growth in North America and Europe initially, followed by expansion in other regions as awareness and adoption increase. The increasing sophistication of fraud schemes necessitates continuous innovation in software capabilities, ensuring the market's long-term growth potential.
https://creativecommons.org/publicdomain/zero/1.0/
Project Objectives
Provider fraud is one of the biggest problems facing Medicare. According to the government, total Medicare spending has increased exponentially because of fraud in Medicare claims. Healthcare fraud is an organized crime that involves peers of providers, physicians, and beneficiaries acting together to make fraudulent claims.
Rigorous analysis of Medicare data has identified many physicians who engage in fraud. They use ambiguous diagnosis codes to justify the costliest procedures and drugs. Insurance companies are the institutions most vulnerable to these bad practices. For this reason, insurance companies have increased their premiums, and as a result healthcare is becoming more costly day by day.
Healthcare fraud and abuse take many forms. Some of the most common types of frauds by providers are:
a) Billing for services that were not provided.
b) Duplicate submission of a claim for the same service.
c) Misrepresenting the service provided.
d) Charging for a more complex or expensive service than was actually provided.
e) Billing for a covered service when the service actually provided was not covered.
Problem Statement
The goal of this project is to predict the potentially fraudulent providers based on the claims they file. Along with this, we will also discover important variables that help in detecting the behaviour of potentially fraudulent providers. Further, we will study fraudulent patterns in the providers' claims to understand their future behaviour.
Introduction to the Dataset
For the purpose of this project, we are considering Inpatient claims, Outpatient claims and Beneficiary details of each provider. Let's see their details:
A) Inpatient Data
This data provides insights about the claims filed for patients who are admitted to hospitals. It also provides additional details such as their admission and discharge dates and admission diagnosis code.
B) Outpatient Data
This data provides details about the claims filed for patients who visit hospitals but are not admitted.
C) Beneficiary Details Data
This data contains beneficiary KYC details such as health conditions, the region they belong to, etc.
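Since the stated goal is to score providers rather than individual claims, a first step is usually to aggregate claim rows per provider. The sketch below is a minimal illustration under assumed field names (`provider`, `beneficiary`, `service`, `date`, `amount`); the real files may use different columns. It also counts repeated submissions of the same claim, one of the fraud types listed earlier.

```python
from collections import defaultdict


def provider_summary(claims):
    """Aggregate claim rows (dicts with assumed keys) into per-provider features."""
    summary = defaultdict(lambda: {"n_claims": 0, "total_amount": 0.0, "n_duplicates": 0})
    seen = set()
    for claim in claims:
        stats = summary[claim["provider"]]
        stats["n_claims"] += 1
        stats["total_amount"] += claim["amount"]
        # A claim counts as a duplicate if the same provider already billed
        # the same beneficiary for the same service on the same date.
        key = (claim["provider"], claim["beneficiary"], claim["service"], claim["date"])
        if key in seen:
            stats["n_duplicates"] += 1
        seen.add(key)
    return dict(summary)


# Made-up claims for demonstration only.
claims = [
    {"provider": "P1", "beneficiary": "B1", "service": "X-ray", "date": "2009-01-05", "amount": 120.0},
    {"provider": "P1", "beneficiary": "B1", "service": "X-ray", "date": "2009-01-05", "amount": 120.0},
    {"provider": "P1", "beneficiary": "B2", "service": "Consult", "date": "2009-01-07", "amount": 80.0},
    {"provider": "P2", "beneficiary": "B3", "service": "Consult", "date": "2009-01-07", "amount": 80.0},
]
per_provider = provider_summary(claims)
for provider, stats in per_provider.items():
    print(provider, stats)
```

Per-provider rows of this kind could then be joined with a provider-level label to train a classifier.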
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The data in this dataset was crawled from GoFundMe, the largest donation-based crowdfunding website in the United States. The cleaned project descriptions are used to extract textual clues for the fraud detection model with the LIWC tool.
According to our latest research, the global Community Rebuild Permit Fraud Detection market size reached $1.16 billion in 2024. The market is experiencing robust expansion, driven by the increasing adoption of digital permitting systems and heightened regulatory scrutiny. With a projected compound annual growth rate (CAGR) of 14.2% from 2025 to 2033, the market is forecasted to attain a value of $3.34 billion by 2033. This growth is primarily fueled by the rising incidences of permit fraud in community rebuild initiatives, advancements in fraud detection technologies, and a global push towards transparent and accountable urban development processes.
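The growth rate implied by the two market sizes can be recovered by inverting the compound-growth formula. The sketch below is only a consistency check; the number of compounding periods is an assumption, since the text gives a 2024 base value but a 2025 to 2033 forecast window, so both eight and nine periods are tried. The helper name `implied_cagr` is illustrative.

```python
def implied_cagr(start, end, periods):
    """Annual growth rate that turns `start` into `end` over `periods` periods."""
    return (end / start) ** (1 / periods) - 1


# Market sizes quoted in the text, in billions of US dollars.
start_value, end_value = 1.16, 3.34

rates = {periods: implied_cagr(start_value, end_value, periods) for periods in (8, 9)}
for periods, rate in rates.items():
    print(f"{periods} periods: {rate:.1%}")
```

With eight periods the implied rate comes out close to the quoted 14.2%; with nine it is noticeably lower, which suggests the forecast compounds over eight periods.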
One of the primary growth factors in the Community Rebuild Permit Fraud Detection market is the rapid digitalization of permitting processes across municipalities and regulatory agencies. As more cities and local governments migrate to digital platforms for managing construction and rebuild permits, the risk and complexity of fraudulent activities have increased. This shift has necessitated the deployment of advanced fraud detection solutions capable of analyzing large volumes of data, identifying anomalies, and providing real-time alerts. The integration of artificial intelligence (AI), machine learning, and data analytics into these solutions has significantly enhanced their effectiveness, enabling organizations to proactively detect and prevent fraudulent permit applications. Additionally, the growing awareness among stakeholders about the financial and reputational risks associated with permit fraud is driving investments in robust detection systems.
Another significant driver for the market is the tightening of regulatory frameworks and compliance requirements at both local and national levels. Governments worldwide are implementing stricter regulations to curb fraudulent activities in construction and community rebuild projects, particularly in the aftermath of natural disasters or large-scale urban redevelopment efforts. These regulations often mandate the use of secure, auditable, and transparent permitting systems, which in turn increases the demand for specialized fraud detection software and services. Furthermore, public pressure for accountability and transparency in government spending has made it imperative for municipalities and regulatory agencies to adopt state-of-the-art fraud detection mechanisms. This regulatory landscape is fostering innovation and encouraging solution providers to develop more sophisticated and customizable tools tailored to the unique needs of different jurisdictions.
The increasing frequency and scale of community rebuild projects, especially in disaster-prone regions, also contribute to the market’s growth. As urban areas expand and infrastructure ages, the volume of permit applications rises, creating more opportunities for fraudulent actors to exploit system vulnerabilities. In response, construction companies, municipalities, and regulatory bodies are prioritizing investments in fraud detection solutions to safeguard public funds, maintain project timelines, and uphold community trust. The collaboration between public and private sector entities, along with the emergence of public-private partnerships, is further accelerating the adoption of advanced fraud detection technologies. This trend is expected to continue as stakeholders recognize the long-term value of proactive fraud mitigation in ensuring the success and sustainability of community rebuild initiatives.
From a regional perspective, North America currently dominates the Community Rebuild Permit Fraud Detection market, accounting for the largest share in 2024. This leadership position is attributed to the region’s advanced digital infrastructure, high incidence of construction and rebuild projects, and stringent regulatory environment. Europe follows closely, driven by the European Union’s emphasis on transparency and anti-fraud measures in public sector projects. The Asia Pacific region is witnessing the fastest growth, fueled by rapid urbanization, increasing government investments in smart city initiatives, and a growing awareness of the need for fraud prevention in the construction sector. Meanwhile, Latin America and the Middle East & Africa are gradually adopting fraud detection solutions as part of broader efforts to modernize their permitting processes and improve governance standards.
http://opendatacommons.org/licenses/dbcl/1.0/
The dataset contains transactions made by credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, the original features and more background information about the data cannot be provided. Features V1, V2, … V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. Feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable; it takes value 1 in case of fraud and 0 otherwise.
Acknowledgements
The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project
It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.
Content
The dataset contains transactions made by credit cards in September 2013 by European cardholders. It covers transactions that occurred over two days, with 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
It contains only numerical input variables, which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features or more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features that have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. Feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable; it takes value 1 in case of fraud and 0 otherwise.
Inspiration
Identify fraudulent credit card transactions.
Given the class imbalance ratio, we recommend measuring the accuracy using the Area Under the Precision-Recall Curve (AUPRC). Confusion matrix accuracy is not meaningful for unbalanced classification.
Acknowledgements
The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project
https://www.datainsightsmarket.com/privacy-policy
The global Synthetic Data Solution market is experiencing robust growth, projected to reach an estimated market size of approximately $1,500 million by 2025, with a Compound Annual Growth Rate (CAGR) of around 25% from 2019 to 2033. This significant expansion is primarily propelled by the increasing demand for privacy-preserving data generation, especially within sensitive sectors like financial services and healthcare, where regulations around data privacy are stringent. The retail industry is also a key driver, leveraging synthetic data for enhanced customer analytics, personalized marketing, and fraud detection without compromising consumer privacy. Furthermore, the burgeoning adoption of AI and machine learning across various industries necessitates vast amounts of high-quality training data, a need that synthetic data effectively addresses by overcoming limitations of real-world data scarcity and bias. The shift towards cloud-based solutions is also accelerating market penetration, offering scalability, flexibility, and cost-effectiveness for businesses of all sizes. Despite the promising growth trajectory, the market faces certain restraints. The complexity and cost associated with developing sophisticated synthetic data generation models, alongside concerns regarding the potential for bias inherited from the underlying real data, pose challenges. Ensuring the statistical fidelity and representativeness of synthetic data to real-world scenarios remains a critical area of focus for solution providers. However, ongoing advancements in generative adversarial networks (GANs) and other AI techniques are continuously improving the quality and realism of synthetic data. Geographically, North America currently leads the market due to its early adoption of AI technologies and strong regulatory frameworks promoting data privacy. 
Asia Pacific is emerging as a high-growth region, fueled by rapid digital transformation and increasing investments in AI research and development by countries like China and India. The market is characterized by intense competition among established tech giants and innovative startups, driving continuous innovation in synthetic data generation methodologies and applications. This in-depth report offers a panoramic view of the global Synthetic Data Solution market, providing a meticulous analysis of its current landscape, historical trajectory, and future potential. With a study period spanning from 2019 to 2033, and a base year of 2025, the report leverages comprehensive data from the historical period (2019-2024) to project a robust growth trajectory through the forecast period (2025-2033). The estimated market size for 2025 is projected to be in the hundreds of millions of US dollars, with significant expansion anticipated in the coming years.
http://www.gnu.org/licenses/agpl-3.0.html
We perform fraud detection on this dataset mainly to complete a data science project that examines different features of the Airbnb Boston dataset. We hope everyone enjoys it 👍
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Credit Card Fraud Detection

Introduction
Credit card fraud detection is a critical challenge in the financial sector. This project aims to build a machine learning model to identify fraudulent credit card transactions using a comprehensive dataset.

Dataset Overview
The dataset contains transactions made by credit cards in September 2013 by European cardholders. It presents a significant class imbalance, with the majority of transactions being non-fraudulent.

Features:
Time: Seconds elapsed between this transaction and the first transaction in the dataset.
V1 to V28: Anonymized features resulting from a PCA transformation.
Amount: Transaction amount.
Class: Target variable (1 for fraud, 0 for non-fraud).

Steps Taken
1. Data Preprocessing
Standardization: Standardized numeric features to improve model performance.
Handling Imbalance: Applied SMOTE (Synthetic Minority Over-sampling Technique) to balance the dataset and ensure the model is well-trained on both classes.
2. Exploratory Data Analysis
Correlation Analysis: Examined correlations between features to understand relationships and their potential impact on the model.
3. Model Building
Algorithm Used: Random Forest Classifier, chosen for its robustness and high performance.
Hyperparameter Tuning: Employed RandomizedSearchCV to find the best hyperparameters and enhance model accuracy.
4. Model Evaluation
Confusion Matrix & Classification Report: Evaluated the model’s performance using key metrics such as precision, recall, F1-score, and overall accuracy.
Feature Importance: Analyzed feature importances to identify which features contribute most to detecting fraud.

Results
The model achieved outstanding performance metrics:
Accuracy: 100% (rounded)
Precision, Recall, F1-score: 1.00 for both classes (rounded)
Confusion Matrix:
True Negatives (TN): 9906
False Positives (FP): 8
False Negatives (FN): 9
True Positives (TP): 9757

Conclusion
This project demonstrates the effectiveness of machine learning in detecting fraudulent credit card transactions. The key steps, including data preprocessing, handling class imbalance, and hyperparameter tuning, were crucial in achieving high model performance. The feature importance analysis provided valuable insights into the key indicators of fraudulent activity.
Check out the full code and detailed analysis in the GitHub Repository.
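The reported metrics can be re-derived from the four confusion-matrix counts listed in the results. The following is a minimal sketch in plain Python, assuming the counts are taken exactly as listed; the variable names are illustrative and are not taken from the project's repository.

```python
# Confusion-matrix counts as listed in the project's results.
tn, fp, fn, tp = 9906, 8, 9, 9757

total = tn + fp + fn + tp

accuracy = (tp + tn) / total
precision = tp / (tp + fp)  # share of flagged transactions that were fraud
recall = tp / (tp + fn)     # share of fraud transactions that were flagged
f1 = 2 * precision * recall / (precision + recall)

# Share of fraud cases in the evaluated set.
positive_share = (tp + fn) / total

print(f"evaluated transactions: {total}")
print(f"accuracy:  {accuracy:.4f}")
print(f"precision: {precision:.4f}")
print(f"recall:    {recall:.4f}")
print(f"f1-score:  {f1:.4f}")
print(f"fraud share in evaluated set: {positive_share:.3f}")
```

The unrounded accuracy is slightly below 100%, since 17 of the 19,680 evaluated transactions are misclassified. The evaluated set is also close to balanced (a little under half of it is fraud), which suggests the metrics were computed on resampled data rather than at the original class ratio. That reading is an inference from the counts, not something the project description states.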
https://www.datainsightsmarket.com/privacy-policy
The Operational Intelligence Platform (OIP) market is experiencing robust growth, projected to reach $3.20 billion in 2025 and maintain a Compound Annual Growth Rate (CAGR) of 12.04% from 2025 to 2033. This expansion is fueled by several key factors. The increasing adoption of cloud-based solutions offers scalability and cost-effectiveness, driving significant market share in the deployment type segment. Simultaneously, the burgeoning need for real-time data analysis across diverse industries like retail (demand forecasting, supply chain optimization), manufacturing (predictive maintenance, production efficiency), BFSI (fraud detection, risk management), and healthcare (patient monitoring, operational efficiency) is significantly boosting demand. Further growth is propelled by the expanding adoption of advanced analytics techniques, including AI and machine learning, for enhanced decision-making and improved operational efficiency. While initial investments in infrastructure and integration can present challenges, the long-term return on investment (ROI) and competitive advantage derived from OIP adoption are mitigating these restraints. The competitive landscape is marked by a mix of established players like OpenText, SAP, and Splunk, alongside specialized providers, indicating a healthy ecosystem for innovation and competition. The geographic distribution of the OIP market reveals a concentration in North America and Europe, reflecting higher technological adoption rates and established IT infrastructure in these regions. However, the Asia-Pacific region exhibits high growth potential, driven by rapid digital transformation and increasing investment in advanced technologies. The market's evolution is marked by a shift towards more comprehensive platforms encompassing advanced analytics, data visualization, and integration capabilities, fostering greater operational intelligence and business insights. 
This trend indicates a move away from siloed solutions towards holistic platforms that offer enhanced value across various departments and functions within an organization. Continued innovation in areas such as AI-powered automation and improved data security will further propel the OIP market's expansion in the coming years.
This comprehensive report provides an in-depth analysis of the Operational Intelligence Platform market, offering valuable insights for businesses and investors seeking to navigate this rapidly evolving landscape. With a study period spanning from 2019 to 2033, a base year of 2025, and a forecast period from 2025 to 2033, this report utilizes data from the historical period (2019-2024) to project future market trends and opportunities. The market is valued in millions of dollars.
Recent developments include:
May 2022 - Mobileum Inc., one of the global leaders in analytics solutions for roaming and network services, security, risk management, testing and service assurance, and subscriber intelligence, and Digis Squared, one of the market leaders in network services and AI-assisted tools, announced a strategic partnership to bring to market a comprehensive set of network testing and cognitive optimization solutions. Digis Squared's deep expertise in developing cognitive tools to automate and analyze radio network and edge-to-edge performance and optimizing networks and capacity management to benefit the customer experience is combined with Mobileum's highly scalable and flexible telecom analytics portfolio, which enables operators to improve business performance, monitor customer experience, and access new monetization opportunities.
May 2022 - UST, one of the significant digital transformation solutions providers, announced an OEM agreement with SAP that would allow it to integrate SAP Business Technology Platform (SAP BTP) into its Cogniphi AI Vision platform, which would be branded as UST Sentry Vision AI. The service would use advanced video analytics to embed predictive, contextual, and analytical capabilities into retail and manufacturing processes as a SaaS-based packaged solution that can readily connect with SAP S/4HANA and RISE with SAP.
April 2022 - Quinnox, a full-spectrum IT and digital solutions provider, announced a Partner Connect agreement with Software AG, a pioneer in IoT, integration, API management, and business transformation software. This collaboration would supplement Quinnox's efforts to develop strong and highly impactful go-to-market strategies, products, and services for customers using Software AG's tools, training, and technologies to capitalize on market possibilities.
Key drivers for this market are: Growing Need for Real Time Data Analytics, Increasing Adoption of Big Data Analytics and the Internet of Things (IoT).
Potential restraints include: Combining Data from Multiple Data Sources.
Notable trends are: Cloud Deployment Segment is Expected to Hold Major Market Share.
According to our latest research, the global Householding Analytics market size reached USD 3.42 billion in 2024, reflecting robust adoption across multiple industries. The market is experiencing a healthy compound annual growth rate (CAGR) of 13.8% from 2025 to 2033. By the end of 2033, we project the Householding Analytics market to expand to USD 10.38 billion, driven by the increasing demand for advanced analytics in customer segmentation, marketing, and risk management. This growth is primarily fueled by the need for personalized customer engagement and the proliferation of big data analytics in enterprise decision-making.
The surge in demand for Householding Analytics is largely attributed to the increasing complexity of customer relationships and the growing necessity for businesses to understand household-level data. Enterprises, particularly in the BFSI and retail sectors, are leveraging these analytics to gain deeper insights into family structures, shared financial behaviors, and collective purchasing patterns. This enables organizations to tailor their products, services, and marketing strategies more effectively, thereby enhancing customer loyalty and lifetime value. The integration of artificial intelligence and machine learning algorithms with householding analytics platforms is further amplifying the accuracy and predictive capabilities of these solutions, making them indispensable for data-driven organizations.
Another key growth factor for the Householding Analytics market is the rising emphasis on fraud detection and risk assessment. Financial institutions and insurance companies are increasingly utilizing householding analytics to identify anomalous behavior patterns across related accounts, thereby mitigating the risk of fraud and improving regulatory compliance. The ability to consolidate individual data points into comprehensive household profiles allows these organizations to detect suspicious activity that might otherwise go unnoticed in isolated datasets. Additionally, regulatory requirements around data transparency and anti-money laundering are compelling organizations to adopt more sophisticated analytics tools, further accelerating market growth.
The rapid digital transformation across industries is also playing a pivotal role in propelling the adoption of Householding Analytics. As organizations transition to omnichannel engagement models, the volume of customer data generated across touchpoints has grown exponentially. Householding analytics platforms enable businesses to unify disparate data sources and extract actionable insights at the household level, facilitating targeted marketing campaigns, personalized product recommendations, and optimized resource allocation. The increasing availability of cloud-based analytics solutions is lowering the barriers to entry for small and medium enterprises (SMEs), democratizing access to advanced analytics and expanding the market’s addressable base.
From a regional perspective, North America currently dominates the Householding Analytics market, driven by the presence of leading analytics vendors and high digital maturity among enterprises. However, Asia Pacific is anticipated to witness the fastest growth over the forecast period, supported by rapid urbanization, expanding middle-class populations, and increasing investments in digital infrastructure. Europe continues to demonstrate steady growth, particularly in the BFSI and retail sectors, while Latin America and the Middle East & Africa are emerging as attractive markets due to ongoing digital transformation initiatives and rising awareness of advanced analytics solutions.
The Householding Analytics market is segmented by component into Software and Services, each playing a crucial role in the overall ecosystem. Software solutions form the backbone of the market, providing the core functionalities required for data integration, analysis,
Due to rapid growth in the field of cashless or digital transactions, credit cards are widely used all around the world. Credit card providers issue thousands of cards to their customers, and they have to ensure that all credit card users are genuine. Any mistake in issuing a card can lead to a financial crisis. With the rapid growth in cashless transactions, the number of fraudulent transactions is also increasing. A fraudulent transaction can be identified by analyzing various behaviors of credit card customers from previous transaction history datasets. If any deviation from the available patterns is noticed in spending behavior, it possibly indicates a fraudulent transaction. Data mining and machine learning techniques are widely used in credit card fraud detection. In these notebooks we present a review of various data mining and machine learning methods that are widely used for credit card fraud detection, and we complete the project end to end, from data understanding to deploying the model via an API.
You are provided a synthetic dataset for a mobile payments application. In this dataset, you are provided the sender and recipient of a transaction as well as whether transactions are tagged as fraud or not fraud.
This work is part of the research project “Scalable resource-efficient systems for big data analytics” funded by the Knowledge Foundation (grant: 20140032) in Sweden.
Please refer to this dataset using the following citations:
PaySim first paper of the simulator:
E. A. Lopez-Rojas, A. Elmir, and S. Axelsson. "PaySim: A financial mobile money simulator for fraud detection". In: The 28th European Modeling and Simulation Symposium-EMSS, Larnaca, Cyprus. 2016
According to our latest research, the global claim management service market size reached USD 41.5 billion in 2024, underpinned by robust digital transformation initiatives and increasing healthcare and insurance claims worldwide. The market is projected to expand at a CAGR of 8.2% from 2025 to 2033, reaching an estimated USD 81.1 billion by 2033. This growth is primarily driven by the rising complexity of insurance products, regulatory changes, and the urgent need for operational efficiency in claims processing. The market is experiencing a paradigm shift as organizations transition from legacy systems to advanced, automated claim management solutions, which is fostering innovation and market expansion on a global scale.
A key growth factor for the claim management service market is the increasing adoption of digital technologies such as artificial intelligence (AI), machine learning, and robotic process automation (RPA) in the insurance and healthcare sectors. These technologies are revolutionizing the way claims are processed, significantly reducing manual intervention and minimizing errors. Automated workflows, enhanced fraud detection, and real-time data analytics are empowering organizations to improve customer experience, expedite claim settlements, and comply with stringent regulatory requirements. As insurers and healthcare providers continue to embrace digital transformation, the demand for sophisticated claim management services is expected to surge, thereby propelling market growth over the forecast period.
Another significant driver for the claim management service market is the increasing volume and complexity of insurance claims, particularly in health, property, and casualty insurance segments. The global surge in chronic diseases, frequent natural disasters, and rising awareness of insurance coverage have all contributed to a higher number of claims being filed each year. This mounting pressure on insurers and third-party administrators (TPAs) to process claims swiftly and accurately has led to a growing reliance on specialized claim management services. These services not only streamline claim adjudication but also help organizations manage costs, reduce fraudulent claims, and maintain compliance with evolving industry standards and regulations.
Regulatory changes and compliance requirements are also playing a pivotal role in shaping the claim management service market. Governments and regulatory bodies across the globe are introducing new policies to ensure transparency, data privacy, and fair claim settlements. The increasing emphasis on compliance has compelled organizations to invest in advanced claim management solutions that offer robust audit trails, comprehensive reporting, and secure data handling capabilities. Furthermore, the integration of cloud-based platforms is enabling organizations to scale operations, enhance collaboration, and ensure business continuity, even in the face of unforeseen disruptions such as pandemics or natural disasters.
In this evolving landscape, the role of Construction Claims Consulting Service has become increasingly vital. As the construction industry faces complex challenges, including regulatory compliance, project delays, and cost overruns, specialized consulting services are essential for navigating these hurdles. These services provide expert guidance on claim preparation, negotiation, and resolution, ensuring that construction projects remain on track and within budget. By leveraging industry knowledge and experience, consulting firms help construction companies manage risks, optimize project outcomes, and maintain strong relationships with stakeholders. As the demand for infrastructure development continues to rise globally, the importance of construction claims consulting services is expected to grow, offering significant opportunities for market expansion.
From a regional perspective, North America continues to dominate the claim management service market, accounting for the largest share in 2024, followed by Europe and the Asia Pacific. The presence of established insurance and healthcare ecosystems, coupled with early adoption of advanced technologies, has positioned North America as a frontrunner in this market. However, the Asia Pacific region is projected to witness the highest growth rate over
https://www.technavio.com/content/privacy-notice
AI And Machine Learning In Business Market Size 2025-2029
The AI and machine learning in business market size is forecast to increase by USD 240.3 billion, at a CAGR of 24.9% from 2024 to 2029. Unprecedented advancements in AI technology, with generative AI as a catalyst, will drive the AI and machine learning in business market.
Major Market Trends & Insights
North America dominated the market and is estimated to account for 36% of its growth during the forecast period.
By Component - Solutions segment was valued at USD 24.98 billion in 2023
By Sector - Large enterprises segment accounted for the largest market revenue share in 2023
Market Size & Forecast
Market Opportunities: USD 906.25 million
Market Future Opportunities: USD 240301.30 million
CAGR from 2024 to 2029: 24.9%
Market Summary
In the realm of business innovation, Artificial Intelligence (AI) and Machine Learning (ML) have emerged as indispensable tools, shaping industries through unprecedented advancements. The market for AI in business is experiencing a surge in growth, with an estimated 1.2 billion dollars invested in AI startups in 2020 alone. This investment fuels the proliferation of generative AI copilots and embedded AI in enterprise platforms, revolutionizing processes and enhancing productivity. However, the integration of AI and ML in businesses presents a unique challenge: the scarcity of specialized talent.
As these technologies become increasingly essential, companies are compelled to invest in workforce transformation, upskilling their employees or hiring new talent to ensure they can harness the full potential of AI. This imperative for human capital development is a testament to the transformative power of AI and ML in business, driving growth and innovation across industries.
What will be the Size of the AI And Machine Learning In Business Market during the forecast period?
How is the AI And Machine Learning In Business Market Segmented?
The AI and machine learning in business industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Component
  Solutions
  Services
Sector
  Large enterprises
  SMEs
Application
  Data analytics
  Predictive analytics
  Cyber security
  Supply chain and inventory management
  Others
End-user
  IT and telecom
  BFSI
  Retail and manufacturing
  Healthcare
  Others
Geography
  North America
    US
    Canada
  Europe
    France
    Germany
    UK
  APAC
    Australia
    China
    India
    Japan
    South Korea
  Rest of World (ROW)
By Component Insights
The solutions segment is estimated to witness significant growth during the forecast period.
The market continues to evolve, driven by advancements in big data processing, algorithm performance metrics, and scalable infrastructure. API integrations, recommendation engines, and predictive analytics tools are increasingly common, with model training datasets becoming larger and more diverse. Business process automation relies on feature engineering processes, data mining techniques, and model deployment strategies. Cloud computing platforms facilitate the use of deep learning algorithms, machine learning models, and real-time data processing. In 2023, SAP introduced Joule, an AI copilot that uses natural language processing for proactive and contextualized insights, reflecting the trend towards AI-driven automation and process optimization. This includes supply chain optimization, sales forecasting models, sentiment analysis tools, and anomaly detection systems.
Furthermore, AI-powered chatbots, data visualization dashboards, and model explainability techniques support data governance frameworks. Cybersecurity protocols and fraud detection models are also essential components of this dynamic landscape. According to a recent report, the global AI in business market is projected to reach USD 267 billion by 2027, underscoring its transformative impact on industries.
The Solutions segment was valued at USD 24.98 billion in 2019 and showed a gradual increase during the forecast period.
Regional Analysis
North America is estimated to contribute 36% to the growth of the global market during the forecast period. Technavio's analysts have explained in detail the regional trends and drivers that shape the market during the forecast period.
The artificial intelligence (AI) and machine learning (ML) in business market is experiencing a significant surge, with North America leading the charge. The region, particularly the United States, h
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This synthetic dataset, "Fraudulent E-Commerce Transactions," is designed to simulate transaction data from an e-commerce platform with a focus on fraud detection. It contains a variety of features commonly found in transactional data, with additional attributes specifically engineered to support the development and testing of fraud detection algorithms.
The dataset is intended for use in developing and testing machine learning models for fraud detection in e-commerce transactions. It can also be used for exploratory data analysis, feature engineering, and benchmarking fraud detection algorithms.
The data is entirely synthetic, generated using Python's Faker library and custom logic to simulate realistic transaction patterns and fraudulent scenarios. The dataset is not based on real individuals or transactions and is created for educational and research purposes.
Feel free to use this dataset for data analysis, machine learning projects, or as a benchmark for fraud detection algorithms. If you use this dataset in your research or projects, please provide proper attribution.
https://www.wiseguyreports.com/pages/privacy-policy
| ATTRIBUTE | DETAILS |
|---|---|
| BASE YEAR | 2024 |
| HISTORICAL DATA | 2019 - 2023 |
| REGIONS COVERED | North America, Europe, APAC, South America, MEA |
| REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
| MARKET SIZE 2024 | 57.9 (USD Billion) |
| MARKET SIZE 2025 | 60.0 (USD Billion) |
| MARKET SIZE 2035 | 85.0 (USD Billion) |
| SEGMENTS COVERED | Service Type, Industry, Client Size, Engagement Type, Regional |
| COUNTRIES COVERED | US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA |
| KEY MARKET DYNAMICS | Regulatory compliance demands, Increasing digitalization, Growing fraud risks, Rise in global trade, Need for data analytics |
| MARKET FORECAST UNITS | USD Billion |
| KEY COMPANIES PROFILED | KPMG, Grant Thornton, RSM International, PwC, Baker Tilly, Mazars, Deloitte, Moore Global, Nexia International, Crowe, UHY International, Protiviti, HLB International, Smith & Williamson, BDO International, EY |
| MARKET FORECAST PERIOD | 2025 - 2035 |
| KEY MARKET OPPORTUNITIES | Increased demand for regulatory compliance, Adoption of advanced analytics technology, Growth in digital transformation initiatives, Rising need for fraud detection services, Expansion of Small and Medium Enterprises |
| COMPOUND ANNUAL GROWTH RATE (CAGR) | 3.6% (2025 - 2035) |
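The market-size and CAGR rows of the table can be cross-checked with the standard compound-growth formula. The sketch below is illustrative only: it assumes the 2025 and 2035 figures are the endpoints of the stated forecast period, and the function names `cagr` and `project` are made up for this example.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


size_2025 = 60.0   # USD billion, from the table
size_2035 = 85.0   # USD billion, from the table
stated_cagr = 0.036

implied_cagr = cagr(size_2025, size_2035, 2035 - 2025)
projected_2035 = project(size_2025, stated_cagr, 2035 - 2025)

print(f"implied CAGR 2025-2035: {implied_cagr:.2%}")
print(f"2035 size at the stated 3.6% CAGR: {projected_2035:.1f} USD billion")
```

The implied rate comes out a little below the stated 3.6%, and compounding 60.0 at 3.6% lands slightly above 85.0. A gap of that size is consistent with the table values having been rounded, although the report does not say how its figures were derived.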
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.
The dataset contains transactions made by credit cards in September 2013 by European cardholders. This dataset presents transactions that occurred in two days, where we have 492 frauds out of 284,807 transactions. The dataset is highly unbalanced: the positive class (frauds) accounts for 0.172% of all transactions.
It contains only numerical input variables which are the result of a PCA transformation. Unfortunately, due to confidentiality issues, we cannot provide the original features and more background information about the data. Features V1, V2, … V28 are the principal components obtained with PCA; the only features which have not been transformed with PCA are 'Time' and 'Amount'. Feature 'Time' contains the seconds elapsed between each transaction and the first transaction in the dataset. The feature 'Amount' is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. Feature 'Class' is the response variable and it takes value 1 in case of fraud and 0 otherwise.
Given the class imbalance ratio, we recommend measuring the accuracy using the Area Under the Precision-Recall Curve (AUPRC). Confusion matrix accuracy is not meaningful for unbalanced classification.
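To see why plain accuracy says little at this class ratio, consider a classifier that labels every transaction as non-fraud. The sketch below works that out from the dataset's counts and then computes average precision, one common step-wise estimate of AUPRC, on a small made-up ranking. The helper `average_precision` and the toy labels and scores are assumptions for illustration; they are not part of the dataset or of the authors' code, and the helper does not handle tied scores.

```python
n_total = 284_807  # transactions in the dataset
n_fraud = 492      # fraud cases in the dataset

fraud_rate = n_fraud / n_total

# A classifier that predicts "non-fraud" for everything is right on every
# legitimate transaction and wrong on every fraud.
baseline_accuracy = (n_total - n_fraud) / n_total
baseline_recall = 0 / n_fraud


def average_precision(y_true, scores):
    """Average of the precision values at each rank where a positive occurs.

    Assumes at least one positive label and no tied scores.
    """
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    hits = 0
    total_precision = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total_precision += hits / rank
    return total_precision / n_pos


# Toy example (hypothetical labels and scores): 2 frauds among 8 transactions.
toy_labels = [0, 0, 1, 0, 1, 0, 0, 0]
toy_scores = [0.10, 0.40, 0.90, 0.20, 0.35, 0.05, 0.30, 0.15]
ap_toy = average_precision(toy_labels, toy_scores)

print(f"fraud rate: {fraud_rate:.4%}")
print(f"all-negative baseline accuracy: {baseline_accuracy:.4%}")
print(f"all-negative baseline recall: {baseline_recall:.0%}")
print(f"toy average precision: {ap_toy:.4f}")
```

The all-negative baseline is correct on more than 99.8% of transactions while catching no fraud at all, which is the sense in which accuracy is not meaningful here. Average precision instead rewards ranking the rare fraud cases ahead of legitimate ones.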
The dataset has been collected and analysed during a research collaboration of Worldline and the Machine Learning Group (http://mlg.ulb.ac.be) of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on https://www.researchgate.net/project/Fraud-detection-5 and the page of the DefeatFraud project.
Please cite the following works:
Andrea Dal Pozzolo, Olivier Caelen, Reid A. Johnson and Gianluca Bontempi. Calibrating Probability with Undersampling for Unbalanced Classification. In Symposium on Computational Intelligence and Data Mining (CIDM), IEEE, 2015
Dal Pozzolo, Andrea; Caelen, Olivier; Le Borgne, Yann-Aël; Waterschoot, Serge; Bontempi, Gianluca. Learned lessons in credit card fraud detection from a practitioner perspective. Expert Systems with Applications, 41, 10, 4915-4928, 2014, Pergamon.
Dal Pozzolo, Andrea; Boracchi, Giacomo; Caelen, Olivier; Alippi, Cesare; Bontempi, Gianluca. Credit card fraud detection: a realistic modeling and a novel learning strategy. IEEE Transactions on Neural Networks and Learning Systems, 29, 8, 3784-3797, 2018, IEEE.
Dal Pozzolo, Andrea. Adaptive Machine learning for credit card fraud detection. ULB MLG PhD thesis (supervised by G. Bontempi).
Carcillo, Fabrizio; Dal Pozzolo, Andrea; Le Borgne, Yann-Aël; Caelen, Olivier; Mazzer, Yannis; Bontempi, Gianluca. Scarff: a scalable framework for streaming credit card fraud detection with Spark. Information Fusion, 41, 182-194, 2018, Elsevier.
Carcillo, Fabrizio; Le Borgne, Yann-Aël; Caelen, Olivier; Bontempi, Gianluca. Streaming active learning strategies for real-life credit card fraud detection: assessment and visualization. International Journal of Data Science and Analytics, 5, 4, 285-300, 2018, Springer International Publishing.
Bertrand Lebichot, Yann-Aël Le Borgne, Liyun He, Frederic Oblé, Gianluca Bontempi. Deep-Learning Domain Adaptation Techniques for Credit Cards Fraud Detection. INNSBDDL 2019: Recent Advances in Big Data and Deep Learning, pp 78-88, 2019.
Fabrizio Carcillo, Yann-Aël Le Borgne, Olivier Caelen, Frederic Oblé, Gianluca Bontempi. Combining Unsupervised and Supervised Learning in Credit Card Fraud Detection. Information Sciences, 2019.
Yann-Aël Le Borgne, Gianluca Bontempi. Reproducible machine Learning for Credit Card Fraud Detection - Practical Handbook.
Bertrand Lebichot, Gianmarco Paldino, Wissam Siblini, Liyun He, Frederic Oblé, Gianluca Bontempi. Incremental learning strategies for credit cards fraud detection. International Journal of Data Science and Analytics.