Details of fraud referrals relating to war pensions & compensation
This dataset was created by José Henrique Gaspar
As per our latest research, the global Graph Databases for Fraud Detection market size in 2024 stands at USD 1.12 billion, with the market demonstrating robust momentum. The sector is experiencing a compound annual growth rate (CAGR) of 22.7%, positioning it for substantial expansion. By 2033, the market is forecasted to reach a remarkable USD 8.65 billion, driven by the escalating sophistication of fraudulent activities and the increasing adoption of advanced analytics and artificial intelligence across industries. The primary growth factor is the urgent need for real-time, scalable, and highly accurate fraud detection solutions that can adapt to evolving threat landscapes, especially in sectors such as BFSI, retail, and healthcare.
The proliferation of digital transactions, e-commerce, and online banking has led to a surge in complex fraud schemes, necessitating the deployment of advanced technologies like graph databases. These databases excel at mapping intricate relationships and patterns across vast datasets, making them indispensable in detecting organized fraud rings, identity theft, and money laundering operations. Their ability to visualize and analyze interconnected data points in real time significantly enhances the accuracy and speed of fraud detection, minimizing false positives and enabling proactive risk mitigation. Moreover, the integration of graph databases with machine learning and AI algorithms has further amplified their effectiveness, allowing organizations to uncover hidden patterns and anticipate fraudulent behaviors before they result in significant financial losses.
Another key growth driver for the graph databases for fraud detection market is the increasing regulatory scrutiny and compliance requirements imposed by governments and international bodies. Financial institutions, healthcare providers, and e-commerce platforms are under mounting pressure to comply with anti-money laundering (AML), know your customer (KYC), and data privacy regulations. Graph databases provide the agility and transparency needed to trace the origin and flow of transactions, ensuring robust audit trails and facilitating regulatory reporting. As regulatory frameworks continue to evolve and penalties for non-compliance become more stringent, organizations are investing heavily in advanced fraud detection infrastructure, further fueling market growth.
The rapid advancements in cloud computing and the widespread adoption of Software-as-a-Service (SaaS) models have also played a pivotal role in democratizing access to graph database solutions for fraud detection. Cloud-based deployment offers scalability, cost-effectiveness, and ease of integration with existing IT ecosystems, making it an attractive option for both large enterprises and small-to-medium businesses. Furthermore, the rise of API-driven architectures and microservices has enabled seamless interoperability between graph databases and other analytics tools, enhancing their utility across diverse industry verticals. As digital transformation accelerates globally, the demand for flexible, cloud-native fraud detection solutions is expected to witness exponential growth.
Regionally, North America continues to dominate the graph databases for fraud detection market, accounting for the largest share in 2024, followed closely by Europe and the Asia Pacific. The presence of major financial institutions, advanced technological infrastructure, and a robust regulatory environment are key factors driving adoption in these regions. However, emerging economies in Asia Pacific and Latin America are rapidly catching up, propelled by the digitalization of banking and commerce, rising cybercrime rates, and increased investments in cybersecurity. The Middle East and Africa are also witnessing steady growth, albeit from a smaller base, as governments and enterprises prioritize fraud prevention as part of their digital agendas.
The component segment of the graph databases
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
There is a lack of publicly available datasets on financial services, especially in the emerging mobile money transactions domain. Financial datasets are important to many researchers, and in particular to us, as we perform research in the domain of fraud detection. Part of the problem is the intrinsically private nature of financial transactions, which leads to no publicly available datasets.
We present a synthetic dataset generated using the simulator called PaySim as an approach to such a problem. PaySim uses aggregated data from the private dataset to generate a synthetic dataset that resembles the normal operation of transactions and injects malicious behaviour to later evaluate the performance of fraud detection methods.
PaySim simulates mobile money transactions based on a sample of real transactions extracted from one month of financial logs from a mobile money service implemented in an African country. The original logs were provided by a multinational company, which is the provider of the mobile financial service and currently runs it in more than 14 countries around the world.
This synthetic dataset is scaled down to 1/4 of the original dataset and was created just for Kaggle.
This is a sample of one row, followed by an explanation of each header:
1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0
step - maps a unit of time in the real world. In this case 1 step is 1 hour of time. Total steps: 744 (744 hours, i.e. a 31-day simulation).
type - CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.
amount - amount of the transaction in local currency.
nameOrig - customer who started the transaction
oldbalanceOrg - initial balance before the transaction
newbalanceOrig - new balance after the transaction.
nameDest - customer who is the recipient of the transaction
oldbalanceDest - initial balance of the recipient before the transaction. Note that there is no information for customers whose names start with M (Merchants).
newbalanceDest - new balance of the recipient after the transaction. Note that there is no information for customers whose names start with M (Merchants).
isFraud - These are the transactions made by the fraudulent agents inside the simulation. In this specific dataset, the fraudulent agents aim to profit by taking control of customers' accounts and trying to empty the funds by transferring them to another account and then cashing out of the system.
isFlaggedFraud - The business model aims to control massive transfers from one account to another and flags illegal attempts. An illegal attempt in this dataset is an attempt to transfer more than 200,000 in a single transaction.
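The column descriptions above can be exercised on the sample row with a short script. This is a minimal sketch, not PaySim's own code: the helper names (`parse_row`, `simulated_day`, `is_merchant`, `exceeds_flag_threshold`) are hypothetical, the day calculation assumes steps are 1-based hours, and the threshold check is only one reading of the isFlaggedFraud rule as described.

```python
import csv
from io import StringIO

# Column order as given in the header explanation above.
COLUMNS = [
    "step", "type", "amount", "nameOrig", "oldbalanceOrg", "newbalanceOrig",
    "nameDest", "oldbalanceDest", "newbalanceDest", "isFraud", "isFlaggedFraud",
]
INT_FIELDS = {"step", "isFraud", "isFlaggedFraud"}
FLOAT_FIELDS = {
    "amount", "oldbalanceOrg", "newbalanceOrig",
    "oldbalanceDest", "newbalanceDest",
}

# Threshold taken from the isFlaggedFraud description (more than 200,000).
FLAG_THRESHOLD = 200_000.0

# The sample row quoted in the dataset description.
SAMPLE = "1,PAYMENT,1060.31,C429214117,1089.0,28.69,M1591654462,0.0,0.0,0,0"


def parse_row(line):
    """Turn one CSV line into a dict with typed values."""
    values = next(csv.reader(StringIO(line)))
    row = dict(zip(COLUMNS, values))
    for name in INT_FIELDS:
        row[name] = int(row[name])
    for name in FLOAT_FIELDS:
        row[name] = float(row[name])
    return row


def simulated_day(step):
    """Assumption: steps are 1-based hours, so steps 1 to 24 fall on day 1."""
    return (step - 1) // 24 + 1


def is_merchant(name):
    """Names starting with 'M' denote merchants, per the column notes."""
    return name.startswith("M")


def exceeds_flag_threshold(row):
    """One reading of the flagging rule: a single TRANSFER above the threshold."""
    return row["type"] == "TRANSFER" and row["amount"] > FLAG_THRESHOLD


row = parse_row(SAMPLE)
print(row["type"], simulated_day(row["step"]), is_merchant(row["nameDest"]))
```

In the sample row the recipient is a merchant, which is why both destination balances are 0.0, and the originator's old balance minus the amount matches the new balance.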
There are 5 similar files that contain the runs of 5 different scenarios. These files are explained in more detail in chapter 7 of my PhD thesis (available here: http://urn.kb.se/resolve?urn=urn:nbn:se:bth-12932).
We ran PaySim several times using random seeds for 744 steps, representing each hour of one month of real time, which matches the original logs. Each run took around 45 minutes on an Intel i7 processor with 16GB of RAM. The final result of a run contains approximately 24 million financial records divided into 5 categories: CASH-IN, CASH-OUT, DEBIT, PAYMENT and TRANSFER.
This work is part of the research project "Scalable resource-efficient systems for big data analytics" funded by the Knowledge Foundation (grant: 20140032) in Sweden.
Please refer to this dataset using the following citations:
The first paper on the PaySim simulator:
E. A. Lopez-Rojas, A. Elmir, and S. Axelsson. "PaySim: A financial mobile money simulator for fraud detection". In: The 28th European Modeling and Simulation Symposium-EMSS, Larnaca, Cyprus. 2016
https://dataintelo.com/privacy-and-policy
According to our latest research, the global graph databases for fraud detection market size reached USD 2.34 billion in 2024, reflecting the rapid adoption of advanced analytics in combating increasingly sophisticated fraud schemes. The market is set to expand at a robust CAGR of 22.1% from 2025 to 2033, propelled by heightened demand for real-time fraud detection and the proliferation of complex digital transactions. By 2033, the market is forecasted to reach USD 16.91 billion, underlining the critical role of graph database technologies in modern fraud prevention strategies. This growth trajectory is primarily driven by the escalating need for advanced, scalable, and intuitive data models that can efficiently uncover hidden relationships and patterns indicative of fraudulent activities.
A significant growth factor for the graph databases for fraud detection market is the exponential rise in digital transactions and the parallel increase in cybercrime sophistication. As organizations across sectors such as banking, retail, and healthcare accelerate their digital transformation, they face unprecedented challenges in identifying and mitigating fraud that exploits complex, multi-layered networks of transactions and identities. Traditional relational databases often fall short in tracking and analyzing these intricate relationships in real time. In contrast, graph databases, with their inherent capability to map and analyze interconnected data points, offer a powerful solution for detecting anomalies and suspicious patterns indicative of fraud. The surge in online payments, peer-to-peer transactions, and digital onboarding further amplifies the need for robust fraud detection systems, positioning graph databases as a cornerstone technology for security-conscious enterprises.
Another pivotal driver of market growth is the increasing regulatory scrutiny and compliance requirements imposed by governments and industry bodies worldwide. Financial institutions, in particular, are under mounting pressure to implement stringent anti-fraud measures and demonstrate proactive risk management practices. Graph databases facilitate compliance by enabling organizations to conduct comprehensive network analyses, trace the flow of funds, and generate detailed audit trails for regulatory reporting. The ability to seamlessly integrate with existing anti-money laundering (AML) and know-your-customer (KYC) frameworks enhances their value proposition. Additionally, the shift toward open banking and the proliferation of fintech innovations create new vectors for fraud, necessitating the adoption of advanced graph-based analytics to safeguard consumer trust and organizational reputations.
The market is further buoyed by advancements in artificial intelligence (AI) and machine learning (ML), which, when combined with graph database architectures, significantly enhance the accuracy and efficiency of fraud detection mechanisms. AI-driven graph analytics can automatically learn from vast datasets, adapt to emerging fraud tactics, and provide actionable insights in real time. This synergy enables organizations to move beyond rule-based detection toward predictive and prescriptive analytics, reducing false positives and operational overhead. The growing availability of cloud-based graph database solutions also lowers entry barriers, allowing small and medium-sized enterprises (SMEs) to leverage cutting-edge fraud detection capabilities without substantial upfront investments in infrastructure.
From a regional perspective, North America continues to dominate the graph databases for fraud detection market, accounting for the largest share in 2024. This leadership is attributed to the region's advanced digital infrastructure, high adoption rates of graph technology among financial institutions, and stringent regulatory frameworks. Europe follows closely, driven by robust data protection laws and a proactive stance on cybersecurity. The Asia Pacific region is emerging as the fastest-growing market, fueled by rapid digitalization, expanding e-commerce ecosystems, and increasing awareness of fraud risks. Latin America and the Middle East & Africa, while smaller in market share, are witnessing steady growth as organizations in these regions ramp up investments in fraud prevention and digital security.
The graph databases for fraud detection market is segmented by component into softwar
The Corporate Financial Fraud project is a study of company and top-executive characteristics of firms that ultimately violated Securities and Exchange Commission (SEC) financial accounting and securities fraud provisions compared to a sample of public companies that did not. The fraud firm sample was identified through systematic review of SEC accounting enforcement releases from 2005-2010, which included administrative and civil actions, and referrals for criminal prosecution that were identified through mentions in enforcement releases, indictments, and news searches. The non-fraud firms were randomly selected from among nearly 10,000 US public companies censused and active during at least one year between 2005-2010 in Standard and Poor's Compustat data. The Company and Top-Executive (CEO) databases combine information from numerous publicly available sources, many in raw form that were hand-coded (e.g., for fraud firms: Accounting and Auditing Enforcement Releases (AAER), investigation summaries, SEC-filed complaints, litigation proceedings and case outcomes). Financial and structural information on companies for the year leading up to the financial fraud (or around year 2000 for non-fraud firms) was collected from Compustat financial statement data on Form 10-Ks, and supplemented by hand-collected data from original company 10-Ks, proxy statements, or other financial reports accessed via Electronic Data Gathering, Analysis, and Retrieval (EDGAR), SEC's data-gathering search tool. For CEOs, data on personal background characteristics were collected from Execucomp and BoardEx databases, supplemented by hand-collection from proxy-statement biographies.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Roulette has been a cornerstone in the study of randomness and statistics since its invention, influencing not only physical casinos but also online platforms. I have created a unique dataset that simulates a roulette wheel, not only to explore the random generation of numbers but also to illustrate how certain techniques can be easily employed by online casinos for fraudulent activities.
-Temporal and Climatic Variables: Each spin is precisely recorded, integrating sports results and weather conditions that influence fraud techniques.
-Dynamic Fraud Techniques: I have created 53 different fraud techniques, including 5 advanced hybrid techniques that combine various manipulation methods. I select and change fraud techniques daily, adjusting them according to the 'peak hours' of casino traffic to reflect realistic manipulation methods.
-Influence of Historical Results: I use spin histories to determine 'hot' (more frequent) and 'cold' (less frequent) numbers, which are key to deciding the fraud techniques at any given moment.
-Distributions and Biases: The distributions of resulting numbers are adjusted based on these analyses, showing how historical information can be used to manipulate future results.
-Majority of Legitimate Spins: Almost 95% of the spins in this dataset are completely legitimate, without any manipulation, reflecting the normal operation of a roulette wheel.
-Fraud Concentrated During Peak Hours, Weeks, Months, and Days: The remaining 5% corresponds to fraudulent spins, strategically distributed during peak hours, weeks, months, and days, covering a period of one year. This proportion highlights the importance of thoroughly auditing these high-activity periods.
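The idea of 'hot' and 'cold' numbers from spin histories can be illustrated with a small frequency count. This is only a sketch of the general notion, not the dataset author's method: the function name `hot_and_cold`, the made-up spin history, and the assumption of a single-zero wheel with pockets 0 to 36 are all illustrative.

```python
from collections import Counter


def hot_and_cold(history, k=3, pockets=range(37)):
    """Return the k most frequent and k least frequent pockets in a history.

    Assumption: a single-zero wheel with pockets 0..36; pass a different
    `pockets` iterable for other layouts.
    """
    # Start every pocket at zero so pockets that never came up count as cold.
    counts = Counter({n: 0 for n in pockets})
    counts.update(history)
    # Ties are broken by pocket number to keep the result deterministic.
    by_hot = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    by_cold = sorted(counts.items(), key=lambda item: (item[1], item[0]))
    hot = [n for n, _ in by_hot[:k]]
    cold = [n for n, _ in by_cold[:k]]
    return hot, cold


# Made-up history for illustration only; not taken from the dataset.
example_history = [7, 7, 7, 12, 12, 30, 0, 7, 12, 5]
hot, cold = hot_and_cold(example_history, k=2)
print("hot:", hot, "cold:", cold)
```

On a real history the window length matters: a short window reacts quickly but is noisy, while a long window smooths out exactly the short bursts that the description says are concentrated in peak hours.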
I would love to see more studies on this database, so I encourage everyone who reads this post to share the insights you discover.
Here is the list of strategies used in the dataset (some of them are not as intuitive as they might seem by their names):
0 == No Fraud
1. 'number_bias'
2. 'predictable_sequences'
3. 'color_omission'
4. 'low_range_bias'
5. 'sequence_repetition'
6. 'cyclic_alteration'
7. 'day_night_bias'
8. 'altered_zero_frequency'
9. 'random_alterations'
10. 'temporal_bias'
11. 'day_hour_bias'
12. 'day_of_week_bias'
13. 'day_of_month_bias'
14. 'bimodal_distribution'
15. 'fibonacci_bias'
16. 'parity_alteration'
17. 'prime_sequence'
18. 'double_sinusoidal_distribution'
19. 'normal_distribution'
20. 'time_series_patterns'
21. 'adaptive_variation'
22. 'wear_simulation'
23. 'advanced_hybrid_1'
24. 'advanced_hybrid_2'
25. 'advanced_hybrid_3'
26. 'advanced_hybrid_4'
27. 'advanced_hybrid_5'
28. 'previous_result_sum_bias'
29. 'special_dates_bias'
30. 'weighted_global_events_distribution'
31. 'previous_winning_combinations_bias'
32. 'sentiment_analysis_alteration'
33. 'weighted_day_of_month_bias'
34. 'weather_patterns_bias'
35. 'weighted_hour_of_day_distribution'
36. 'sports_events_bias'
37. 'lunar_cycles_modulation'
38. 'high_range_bias'
39. 'inverse_prime_sequence'
40. 'alternate_parity_bias'
41. 'zero_series_frequency'
42. 'game_history_bias'
43. 'gaussian_noise_modulation'
44. 'time_weighted_distribution_bias'
45. 'last_digit_bias'
46. 'cumulative_temporal_bias'
47. 'hidden_previous_results_patterns'
48. 'weighted_hot_cold_oscillation'
49. 'adaptive_hot_cold_sequence'
50. 'cold_number_mirage'
51. 'hot_number_evasion'
52. 'false_cold'
53. 'hot_deviation'
Attached is an example of analysis for a specific hour using a specific strategy, in this case, "double_sinusoidal_distribution":
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F9698182%2Ff536eaa650aeebb5737a9d9a2ec53665%2Foutputexample.png?generation=1720566276284440&alt=media
https://www.icpsr.umich.edu/web/ICPSR/studies/2627/terms
The focus of this project was insider fraud -- crimes committed by the owners and operators of insurance companies that were established for the purposes of defrauding businesses and employees. The quantitative data for this collection were taken from a database maintained by the National Association of Insurance Commissioners (NAIC), an organization that represents state insurance departments collectively and acts as a clearinghouse for information obtained from individual departments. Created in 1988, the Regulatory Information Retrieval System (RIRS) database contains information on actions taken by state insurance departments against individuals and firms, including cease and desist orders, license revocations, fines, and penalties imposed. Data available for this project include a total of 123 actions taken against firms labeled as Multiple Employer Welfare Arrangements or Multiple Employer Trusts (MEWA/MET) in the RIRS database. Variables available in this data collection include the date action was taken, state where action was taken, dollar amount of the penalty imposed in the action, and disposition for action taken.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Discover a dataset of companies and institutions tied to corporate laundering, with names, countries, and AML network risk ratings for compliance insight.
Database of allegations of fraud and dispositions of those allegations that warrant further investigation.
https://www.ibisworld.com/about/termsofuse/
Rising online fraud and data breaches have supported modest but uneven growth in the Identity Theft Protection Services industry. Free protections from banks and credit agencies, subdued lending conditions and intense competition have held back paid fraud-alert plans, standalone card-protection registration and some lower-tier consumer subscriptions. Over the five years through 2025-26, industry revenue is projected to rise at a compound annual rate of 0.8% to reach £378 million, including growth of 2.2% in 2025-26. Cifas reported that identity fraud accounted for 64% of National Fraud Database filings in 2023 and almost 250,000 cases in 2024, confirming that identity misuse remains the UK’s dominant cybercrime. Growing exposure to phishing, deepfakes and account takeovers is encouraging both consumers and businesses to maintain subscriptions for monitoring, breach alerts and recovery. Profit has remained steady as companies upgrade their detection and compliance systems to meet tougher data-handling rules, although profit margins remain below those in broader information services due to high regulatory and labour costs. Policy and technological shifts are reshaping the competitive landscape. In October 2024, new Payment Systems Regulator rules made banks jointly liable for authorised push-payment fraud, spurring demand for behavioural and biometric verification tools that prevent mis-payments. Rising breach notifications have also driven corporate contracts for identity restoration, with 12,412 filed with the Information Commissioner’s Office in 2024-25. Consolidation is accelerating as global data and cybersecurity groups expand through deals, like Entrust’s acquisition of Onfido in April 2024 and Experian’s purchase of KYC360 in October 2025. Over the five years through 2030-31, industry revenue is forecast to grow at a compound annual rate of 4.5% to £471.8 million. 
The rollout of GOV.UK One Login, the Data (Use and Access) Act 2025 and rapid advances in AI-driven fraud detection will embed verified digital identity as a standard security layer across UK finance and public services, supporting stronger, technology-led growth. At the same time, free monitoring from banks and credit agencies, privacy concerns about wider data sharing and persistent labour and compliance costs are expected to cap pricing power and limit profit margins for industry companies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data contains information on food fraud and was used to predict food fraud type using a Bayesian Network model. Food fraud notifications for the period 2000-2014 were downloaded from the Rapid Alert System for Food and Feed (RASFF) database. Each record contains detailed information on the kind of notification and the products and countries involved. Based on the description in each notification we added a variable "food fraud type" (i.e. six different types of food fraud). A set of 749 notifications for the years 2000-2013 was used to train a Bayesian Network model to predict food fraud type. This model was validated using the 88 notifications for the year 2014.
Interpretation of the data and details on the performance of the BN model can be found in the research article titled “Prediction of food fraud type using data from Rapid Alert System for Food and Feed (RASFF) and Bayesian network modelling” https://doi.org/10.1016/j.foodcont.2015.09.026
Column names
year - year notification was made
product - categorization of the different products
notification - categorization of the notifications
notified - country that made the notification
origin - country where the product originated from
fraud - classification of fraud type
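The year-based split described above (2000-2013 for training, 2014 for validation) could be reproduced along these lines. This is a hedged sketch: `split_by_year` is a hypothetical helper, and the three records are made-up placeholders that only mirror the column names listed above, not actual RASFF notifications.

```python
def split_by_year(records, first_train_year=2000, last_train_year=2013,
                  validation_year=2014):
    """Split notification records into training and validation sets by year."""
    train = [
        r for r in records
        if first_train_year <= r["year"] <= last_train_year
    ]
    validation = [r for r in records if r["year"] == validation_year]
    return train, validation


# Placeholder records for illustration only; the values are invented and
# only the keys follow the column names of the dataset.
EXAMPLE_RECORDS = [
    {"year": 2005, "product": "product_a", "notification": "type_x",
     "notified": "country_1", "origin": "country_2", "fraud": "fraud_type_1"},
    {"year": 2013, "product": "product_b", "notification": "type_y",
     "notified": "country_3", "origin": "country_4", "fraud": "fraud_type_2"},
    {"year": 2014, "product": "product_a", "notification": "type_x",
     "notified": "country_1", "origin": "country_5", "fraud": "fraud_type_1"},
]

train, validation = split_by_year(EXAMPLE_RECORDS)
print(len(train), "training records,", len(validation), "validation records")
```

Splitting by year rather than at random keeps the validation set strictly later than the training data, which matches how the description says the model was validated.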
https://creativecommons.org/publicdomain/zero/1.0/
Project Objectives
Provider fraud is one of the biggest problems facing Medicare. According to the government, total Medicare spending has increased exponentially due to fraud in Medicare claims. Healthcare fraud is an organized crime that involves peers of providers, physicians, and beneficiaries acting together to make fraudulent claims.
Rigorous analysis of Medicare data has identified many physicians who engage in fraud. They exploit ambiguous diagnosis codes to bill for the costliest procedures and drugs. Insurance companies are the institutions most vulnerable to these bad practices. For this reason, insurance companies have increased their premiums, and as a result healthcare is becoming more costly day by day.
Healthcare fraud and abuse take many forms. Some of the most common types of fraud by providers are:
a) Billing for services that were not provided.
b) Duplicate submission of a claim for the same service.
c) Misrepresenting the service provided.
d) Charging for a more complex or expensive service than was actually provided.
e) Billing for a covered service when the service actually provided was not covered.
Problem Statement
The goal of this project is to "predict the potentially fraudulent providers" based on the claims they file. Along with this, we will also discover important variables that help detect the behaviour of potentially fraudulent providers. Further, we will study fraudulent patterns in the providers' claims to understand the future behaviour of providers.
Introduction to the Dataset
For the purpose of this project, we are considering Inpatient claims, Outpatient claims and Beneficiary details of each provider. Let's see their details:
A) Inpatient Data
This data provides insights about the claims filed for patients who are admitted to hospitals. It also provides additional details such as their admission and discharge dates and admit diagnosis code.
B) Outpatient Data
This data provides details about the claims filed for patients who visit hospitals but are not admitted.
C) Beneficiary Details Data
This data contains beneficiary KYC details such as health conditions, the region they belong to, etc.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset was created by Faraz Ahmad
Released under Database: Open Database, Contents: Database Contents
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Hungary Fraud Attempts: Volume data was reported at 32.000 Unit in Dec 2019. This records an increase from the previous number of 12.000 Unit for Sep 2019. Hungary Fraud Attempts: Volume data is updated quarterly, averaging 68.000 Unit from Mar 2010 (Median) to Dec 2019, with 40 observations. The data reached an all-time high of 243.000 Unit in Mar 2013 and a record low of 12.000 Unit in Sep 2019. Hungary Fraud Attempts: Volume data remains active status in CEIC and is reported by National Bank of Hungary. The data is categorized under Global Database’s Hungary – Table HU.KA013: Card and Electronic Payment Frauds.
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset was created by PANKAJ BHOSALE
Released under Database: Open Database, Contents: Database Contents
https://www.datainsightsmarket.com/privacy-policy
The global credit risk database market is experiencing robust growth, driven by the increasing need for accurate and timely credit risk assessment across diverse sectors. The market's expansion is fueled by several key factors, including the rising adoption of digital technologies in lending and credit underwriting, the growing complexity of financial regulations demanding more sophisticated risk management strategies, and the increasing prevalence of fraud and credit defaults. This necessitates comprehensive credit risk databases offering detailed information on individuals and businesses, enabling financial institutions and other stakeholders to make informed decisions, minimize losses, and optimize their credit portfolios. The market is segmented by database type (consumer, commercial, etc.), deployment model (cloud-based, on-premise), and end-user (banks, insurance companies, etc.). Key players are actively investing in advanced analytics, machine learning, and data enrichment capabilities to enhance the accuracy and predictive power of their credit risk databases, further driving market expansion. Competition is intensifying, with companies focusing on strategic partnerships, acquisitions, and technological innovation to maintain a competitive edge. The forecast period (2025-2033) anticipates continued growth, fueled by burgeoning adoption of sophisticated credit scoring models and the expansion of fintech companies leveraging these databases for lending and other financial services. Regulatory changes impacting credit reporting and data privacy are likely to shape the market landscape, necessitating compliance and adaptation among database providers. 
While challenges such as data security concerns and the cost of data acquisition and maintenance persist, the overall market outlook remains positive, with substantial growth potential across various geographic regions, particularly in emerging economies experiencing rapid economic development and financial sector expansion. Estimating a reasonable market size requires making assumptions about the provided CAGR and the market's initial value. Let's assume a base year (2025) market size of $5 billion and a CAGR of 12% for illustration purposes. This would indicate significant growth over the forecast period.
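The illustrative assumption above (a 2025 base of $5 billion growing at a 12% CAGR) can be turned into a figure with the standard compound-growth formula. This is a sketch under those stated assumptions only; the function name `project_market_size` is hypothetical, and treating 2033 as the end of eight annual compounding periods is one interpretation of the forecast window.

```python
def project_market_size(base_value, cagr, base_year, target_year):
    """Compound a base value forward at a constant annual growth rate."""
    years = target_year - base_year
    if years < 0:
        raise ValueError("target_year must not precede base_year")
    return base_value * (1.0 + cagr) ** years


# Illustrative inputs from the text: $5 billion in 2025 at a 12% CAGR.
for year in (2025, 2029, 2033):
    size = project_market_size(5.0, 0.12, 2025, year)
    print(f"{year}: ${size:.2f} billion")
```

Under these assumptions the market would be roughly $12.4 billion in 2033, about two and a half times the base-year value.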
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
🇬🇧 United Kingdom
https://www.zionmarketresearch.com/privacy-policy
Global GPU Database Market size is set to expand from $ 509.36 Million in 2023 to $ 2368.35 Million by 2032, with an anticipated CAGR of around 18.62% from 2024 to 2032.
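As a quick sanity check, the growth rate implied by the two quoted market sizes can be recomputed from the usual CAGR definition. This is a minimal sketch; the function name `implied_cagr` is hypothetical, and counting nine annual periods between the 2023 and 2032 figures is an assumption about how the report measures the interval.

```python
def implied_cagr(start_value, end_value, periods):
    """Constant annual rate that grows start_value into end_value."""
    if periods <= 0 or start_value <= 0:
        raise ValueError("periods and start_value must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


# Figures quoted above, in millions of US dollars.
rate = implied_cagr(509.36, 2368.35, periods=2032 - 2023)
print(f"implied CAGR: {rate:.2%}")
```

With nine periods the result is close to the quoted figure of around 18.62%, so the stated start value, end value and growth rate are mutually consistent under that assumption.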