View Data Breach Notification Reports, which include how many breaches are reported each year and the number of affected residents.

The largest reported data leak as of January 2025 was the Cam4 data breach in March 2020, which exposed more than 10 billion data records. The second-largest data breach in history so far, the Yahoo data breach, occurred in 2013. The company initially reported about one billion exposed data records, but after an investigation it revised the number, revealing that three billion accounts were affected. The National Public Data breach was announced in August 2024. The incident became public when personally identifiable information of individuals was offered for sale on the dark web. Overall, security professionals estimate that nearly three billion personal records were leaked. The next most significant data leak was the March 2018 security breach of India's national ID database, Aadhaar, with over 1.1 billion records exposed. The exposed data included identification numbers and biometric information such as fingerprint scans, which could be used to open bank accounts and receive financial aid, among other government services.

Cybercrime - the dark side of digitalization
As the world continues its journey into the digital age, corporations and governments across the globe have been increasing their reliance on technology to collect, analyze and store personal data. This, in turn, has led to a rise in the number of cyber crimes, ranging from minor breaches to global-scale attacks impacting billions of users, as in the case of Yahoo. Within the U.S. alone, 1,802 cases of data compromise were reported in 2022. This was a marked increase from the 447 cases reported a decade prior.

The high price of data protection
As of 2022, the average cost of a single data breach across all industries worldwide stood at around 4.35 million U.S. dollars. Breaches were most costly in the healthcare sector, where each leak cost the affected party an average of 10.1 million U.S. dollars. The financial segment ranked second. Here, each breach resulted in a loss of approximately 6 million U.S. dollars, about 1.5 million more than the global average.

https://creativecommons.org/publicdomain/zero/1.0/
This is a dataset containing all the major data breaches in the world from 2004 to 2021.
As we know, data privacy is a major issue. Many major companies around the world still face it every day, and even with strong security teams, many still suffer breaches. To tackle this, the issue needs to be studied in depth, so I pulled this data from Wikipedia to conduct data analysis. I would encourage others to take a look as well and find as many insights as possible.
This data contains 5 columns:
1. Entity: The name of the company, organization or institute
2. Year: The year in which the data breach took place
3. Records: How many records were compromised (can include information like emails, passwords etc.)
4. Organization type: Which sector the organization belongs to
5. Method: Was it hacked? Were the files lost? Was it an inside job?
Here is the source for the dataset: https://en.wikipedia.org/wiki/List_of_data_breaches
Here is the GitHub link for a guide on how it was scraped: https://github.com/hishaamarmghan/Data-Breaches-Scraping-Cleaning
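A minimal sketch of how the five columns described above could be summarized, assuming the data has been exported to CSV with exactly those header names. The sample rows, the `SAMPLE_CSV` constant, and the `summarize` helper are hypothetical and only illustrate the shape of the data; they are not taken from the dataset or the linked scraping guide.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical placeholder rows; only the header names come from the
# dataset description (Entity, Year, Records, Organization type, Method).
SAMPLE_CSV = """Entity,Year,Records,Organization type,Method
Example Org A,2019,1000000,web,hacked
Example Org B,2019,250000,healthcare,lost / stolen media
Example Org C,2020,4000000,web,hacked
Example Org D,2020,unknown,financial,inside job
"""


def summarize(csv_text):
    """Return (records per year, incident count per method)."""
    records_per_year = defaultdict(int)
    incidents_per_method = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        incidents_per_method[row["Method"]] += 1
        try:
            records_per_year[int(row["Year"])] += int(row["Records"])
        except ValueError:
            # Scraped counts can be non-numeric (e.g. "unknown"); skip them.
            continue
    return dict(records_per_year), dict(incidents_per_method)


records_per_year, incidents_per_method = summarize(SAMPLE_CSV)
print(records_per_year)
print(incidents_per_method)
```

Rows with a non-numeric record count still count toward the per-method tally but are left out of the yearly totals, which is one reasonable choice among several for scraped data.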

Publicly reported U.S. data compromises tracked by the Identity Theft Resource Center since 2005.

In 2024, the number of data compromises in the United States stood at 3,158 cases. Meanwhile, over 1.35 billion individuals were affected in the same year by data compromises, including data breaches, leakage, and exposure. While these are three different events, they have one thing in common: in all three, sensitive data is accessed by an unauthorized threat actor.

Industries most vulnerable to data breaches
Some industry sectors usually see more significant cases of private data violations than others. This is determined by the type and volume of the personal information that organizations in these sectors store. In 2024, financial services, healthcare, and professional services were the three industry sectors that recorded the most data breaches. Overall, the number of data breaches in some industry sectors in the United States has gradually increased within the past few years, while other sectors saw a decrease.

Largest data exposures worldwide
In 2020, an adult streaming website, CAM4, experienced a leakage of nearly 11 billion records. This is by far the most extensive reported data leakage. The case is unusual, though, because cyber security researchers found the vulnerability before cyber criminals did. The second-largest data breach is the Yahoo data breach, dating back to 2013. The company first reported about one billion exposed records and later, in 2017, updated the number of leaked records to three billion. In March 2018, the third-biggest data breach happened, involving India's national identification database Aadhaar. As a result of this incident, over 1.1 billion records were exposed.

Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Each row in the dataset represents a specific data breach incident. Here's an explanation of the columns in the dataset:
Number: An identifier for each data breach incident.
Name_of_Covered_Entity: The name of the organization or entity that experienced the data breach.
Business_Associate_Involved: Information about whether a business associate was involved in the breach.
Total_Individuals: The total number of individuals affected by the breach.
Individuals_Affected: The number of individuals whose information was compromised.
Type_of_Breach: The method or nature of the data breach (e.g., theft, loss, hacking/IT incident, unauthorized access/disclosure).
Location_of_Breached_Information: The location or type of device where the breached information was stored (e.g., laptop, desktop computer, network server).
Breach_Start: The start date of the data breach.
Breach_End: The end date of the data breach.
Branch: A categorical identifier, possibly indicating a specific branch or division of the organization.
Department: A categorical identifier, possibly indicating a specific department within the organization.
CountryBranch: The country associated with the branch.
Employee(who find out breach): The employee who discovered the breach.
Employee URL: A URL link associated with the employee who discovered the breach.
Estimate Stole Data(GB): An estimate of the amount of data stolen in gigabytes.
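
As an illustration of how the `Breach_Start` and `Breach_End` columns could be combined, here is a small sketch that derives a breach duration in days. It assumes both columns hold ISO-formatted dates (`YYYY-MM-DD`), which the description above does not confirm; the `breach_duration_days` helper and the sample row are hypothetical.

```python
from datetime import date


def breach_duration_days(breach_start, breach_end):
    """Days between Breach_Start and Breach_End (both assumed 'YYYY-MM-DD')."""
    start = date.fromisoformat(breach_start)
    end = date.fromisoformat(breach_end)
    if end < start:
        raise ValueError("Breach_End is earlier than Breach_Start")
    return (end - start).days


# Hypothetical row using column names from the description above.
row = {
    "Name_of_Covered_Entity": "Example Clinic",
    "Breach_Start": "2021-03-01",
    "Breach_End": "2021-03-15",
}
duration = breach_duration_days(row["Breach_Start"], row["Breach_End"])
print(duration)
```

If the real file stores dates in another format, `datetime.strptime` with a matching format string would be the natural substitute for `date.fromisoformat`.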

Between 2004 and October 2024, the United States recorded the highest number of data points leaked online. Overall, more than 17 billion data points were leaked in the country during the measured period. Russia ranked second, with more than four billion leaked data points.

As of 2025, the mean time to identify a data breach was *** days, six days faster than in the previous year. The mean time companies needed to contain a breach in the measured year was ** days. In comparison, in 2021, it took organizations *** days to identify and ** days to address data breaches.

As of May 2024, 33 percent of adults in the United States reported having had their personal data breached, a 13 percent increase compared to the previous year. Approximately 42 percent of respondents stated that they do not believe their data was exposed, while 25 percent said they do not know whether their data was breached.

Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Number of breach applications for social work orders by order type.

As of 2024, the average cost of a data breach in the United States amounted to **** million U.S. dollars, down from **** million U.S. dollars in the previous year. The global average cost per data breach was **** million U.S. dollars in 2024.

Cost of a data breach in different countries worldwide
Data breaches pose a major threat to organizations globally. The monetary damage caused by data breaches has increased in many markets in the past decade. In 2023, Canada ranked behind the U.S. in data breach costs, with an average of **** million U.S. dollars. Since 2019, the average monetary damage caused by the loss of sensitive information in Canada has increased notably. In the United Kingdom, the average cost of a data breach in 2024 amounted to around **** million U.S. dollars, while in Germany it stood at **** million U.S. dollars.

The cost of data breach by industry and segment
Data breach costs vary depending on the industry and segment. For the fourth consecutive year, the global healthcare sector registered the highest data breach costs, which in 2024 amounted to about **** million U.S. dollars. Financial institutions ranked second, with an average cost of *** million U.S. dollars per data breach. Detection and escalation was the costliest segment in data breaches worldwide, with **** U.S. dollars on average. The cost of lost business ranked second, while post-breach response was the third-costliest segment.

https://creativecommons.org/publicdomain/zero/1.0/
The original RockYou.txt dataset was uploaded by @wjburns 5 years ago, with 95K downloads and 640 upvotes, which means Kaggle allows this type of data for research and educational purposes.
I separated the single 160GB txt file into smaller files with filenames based on first character to make it easier to utilize for those with less powerful machines.
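The split described above can be sketched roughly as follows. This is not the author's script: the bucket naming (hex code of the first character, with all non-ASCII first characters grouped into `other.txt`) and the helper names `bucket_name` and `split_by_first_char` are assumptions made for illustration. The file is streamed line by line so that memory use stays small even for a very large wordlist.

```python
import os
import tempfile


def bucket_name(line):
    """File name for a line, based on its first character.

    ASCII characters map to their two-digit hex code (so characters such as
    '/' are safe in file names); everything else goes to 'other.txt', which
    keeps the number of simultaneously open files small.
    """
    first = line[0]
    return f"{ord(first):02x}.txt" if ord(first) < 128 else "other.txt"


def split_by_first_char(source_path, out_dir):
    """Stream source_path and append each non-empty line to its bucket file."""
    os.makedirs(out_dir, exist_ok=True)
    handles = {}
    try:
        with open(source_path, "r", encoding="utf-8", errors="replace") as src:
            for raw in src:
                line = raw.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                name = bucket_name(line)
                if name not in handles:
                    path = os.path.join(out_dir, name)
                    handles[name] = open(path, "a", encoding="utf-8")
                handles[name].write(line + "\n")
    finally:
        for handle in handles.values():
            handle.close()
    return sorted(handles)


# Tiny demonstration on a throwaway file with made-up entries.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "wordlist.txt")
    with open(source, "w", encoding="utf-8") as f:
        f.write("apple\nadmin\nbanana\nñandú\n")
    buckets = split_by_first_char(source, os.path.join(tmp, "parts"))
    print(buckets)
```

Naming buckets by hex code rather than by the raw character avoids characters that are invalid or awkward in file names on some systems.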
Everyone involved with Capture The Flag (CTF) has used the infamous rockyou.txt wordlist at least once, mainly for password cracking. The file is a list of 14 million unique passwords originating from the 2009 RockYou hack, which made it a piece of computer security history. The "rockyou lineage" has evolved over the years.
https://www.youtube.com/watch?v=0_mQACSn6XM
The 2021 version already reached high numbers, but the newest release is (apparently) the ultimate amalgamation. RockYou2024 was released by the user "ObamaCare". This new version added 1.5 billion records to the 2021 version, reaching 10 billion records. A wordlist can be used for a multitude of tasks, and having this number of records in a single file, especially in 2024 with increasingly aggressive data breaches, is a dream come true for attackers. The user has not specified the nature of the additional records but states that the new data comes from recently leaked databases.
From "The New RockYou2024 Collection has been published!"
I got it from https://github.com/hkphh/rockyou2024.txt, but it was originally shared by a user known as ObamaCare, with whom I have no affiliation or association.
In case you'd like to process the RockYou2024.txt yourself, you can find it here ❗Original RockYou2024.txt zip file
In case you'd like to see only the "Strong Passwords", you can find it here ❗180 Million "Strong Passwords" in RockYou2024.txt
Generated with Bing Image Generator

Between the third quarter of 2024 and the second quarter of 2025, the number of records exposed in data breaches in the United States decreased significantly. In the most recent measured period, over 16.9 million records were reported as leaked, down from around 494.17 million in the third quarter of 2024.

https://creativecommons.org/publicdomain/zero/1.0/
The Global Cybersecurity Threats Dataset (2015-2024) provides extensive data on cyberattacks, malware types, targeted industries, and affected countries. It is designed for threat intelligence analysis, cybersecurity trend forecasting, and machine learning model development to enhance global digital security.
| Column Name | Description |
|---|---|
| Country | Country where the attack occurred |
| Year | Year of the incident |
| Threat Type | Type of cybersecurity threat (e.g., Malware, DDoS) |
| Attack Vector | Method of attack (e.g., Phishing, SQL Injection) |
| Affected Industry | Industry targeted (e.g., Finance, Healthcare) |
| Data Breached (GB) | Volume of data compromised |
| Financial Impact ($M) | Estimated financial loss in millions |
| Severity Level | Low, Medium, High, Critical |
| Response Time (Hours) | Time taken to mitigate the attack |
| Mitigation Strategy | Countermeasures taken |
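
As one possible starting point for the trend analysis mentioned above, the sketch below averages `Financial Impact ($M)` per `Affected Industry`. The three rows and the `mean_impact_by_industry` helper are hypothetical; only the dictionary keys are taken from the column table.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; only the keys come from the column table above.
incidents = [
    {"Affected Industry": "Finance", "Financial Impact ($M)": 10.0, "Severity Level": "High"},
    {"Affected Industry": "Finance", "Financial Impact ($M)": 30.0, "Severity Level": "Critical"},
    {"Affected Industry": "Healthcare", "Financial Impact ($M)": 5.0, "Severity Level": "Low"},
]


def mean_impact_by_industry(rows):
    """Average financial impact (in $M) for each affected industry."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Affected Industry"]].append(float(row["Financial Impact ($M)"]))
    return {industry: mean(values) for industry, values in grouped.items()}


impact = mean_impact_by_industry(incidents)
print(impact)
```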