12 datasets found

"Pwned Passwords" Dataset
academictorrents.com
bittorrent
Updated Aug 3, 2018
Cite
haveibeenpwned.com (2018). "Pwned Passwords" Dataset [Dataset]. https://academictorrents.com/details/53555c69e3799d876159d7290ea60e56b35e36a9
Available download formats: bittorrent (11101449979)
Dataset updated
Aug 3, 2018
Dataset provided by
Have I Been Pwned? (http://haveibeenpwned.com/)
License
https://academictorrents.com/nolicensespecified
Description
Version 3 with 517M hashes and counts of password usage, ordered from most to least prevalent. Pwned Passwords are 517,238,891 real-world passwords previously exposed in data breaches. This exposure makes them unsuitable for ongoing use, as they're at much greater risk of being used to take over other accounts. They're searchable online below as well as being downloadable for use in other online systems. The entire set of passwords is downloadable for free below, with each password represented as a SHA-1 hash to protect the original value (some passwords contain personally identifiable information), followed by a count of how many times that password had been seen in the source data breaches. The list may be integrated into other systems and used to verify whether a password has previously appeared in a data breach, after which a system may warn the user or even block the password outright.
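The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the provider's own tooling: it assumes each line of the download has the form `HASH:COUNT` with an uppercase hexadecimal SHA-1 digest, and the helper names `sha1_hex` and `breach_count` as well as the sample counts are hypothetical. Because this version of the list is ordered by prevalence rather than by hash, the sketch scans linearly.

```python
import hashlib
from typing import Iterable


def sha1_hex(password: str) -> str:
    """Return the uppercase hex SHA-1 digest of a password (UTF-8 encoded)."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def breach_count(password: str, lines: Iterable[str]) -> int:
    """Scan 'HASH:COUNT' lines (assumed format) and return the count, or 0 if absent."""
    target = sha1_hex(password)
    for line in lines:
        digest, _, count = line.strip().partition(":")
        if digest.upper() == target:
            return int(count)
    return 0


# Made-up sample data standing in for the real file; the counts are illustrative only.
sample = [
    f"{sha1_hex('password')}:3",
    f"{sha1_hex('letmein')}:2",
]

if breach_count("password", sample) > 0:
    print("This password has appeared in a breach; warn or block.")
```

With the real download, `lines` could be an open file handle, so the file is streamed rather than loaded into memory.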
All-time biggest online data breaches 2025
statista.com
Updated May 26, 2025
Cite
Statista (2025). All-time biggest online data breaches 2025 [Dataset]. https://www.statista.com/statistics/290525/cyber-crime-biggest-online-data-breaches-worldwide/
Dataset updated
May 26, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2025
Area covered
Worldwide
Description
The largest reported data leakage as of January 2025 was the Cam4 data breach in March 2020, which exposed more than 10 billion data records. The second-largest data breach in history so far, the Yahoo data breach, occurred in 2013. The company initially reported about one billion exposed data records, but after an investigation, the company updated the number, revealing that three billion accounts were affected. The National Public Data Breach was announced in August 2024. The incident became public when personally identifiable information of individuals became available for sale on the dark web. Overall, the security professionals estimate the leakage of nearly three billion personal records. The next significant data leakage was the March 2018 security breach of India's national ID database, Aadhaar, with over 1.1 billion records exposed. This included biometric information such as identification numbers and fingerprint scans, which could be used to open bank accounts and receive financial aid, among other government services.

Cybercrime - the dark side of digitalization
As the world continues its journey into the digital age, corporations and governments across the globe have been increasing their reliance on technology to collect, analyze and store personal data. This, in turn, has led to a rise in the number of cyber crimes, ranging from minor breaches to global-scale attacks impacting billions of users – such as in the case of Yahoo. Within the U.S. alone, 1,802 cases of data compromise were reported in 2022. This was a marked increase from the 447 cases reported a decade prior.
The high price of data protection
As of 2022, the average cost of a single data breach across all industries worldwide stood at around 4.35 million U.S. dollars. This was found to be most costly in the healthcare sector, with each leak reported to have cost the affected party a hefty 10.1 million U.S. dollars. The financial segment followed closely behind. Here, each breach resulted in a loss of approximately 6 million U.S. dollars - 1.5 million more than the global average.
CrackStation's Password Cracking Dictionary (Human Passwords Only)
academictorrents.com
bittorrent
Updated Aug 10, 2014
+ more versions
Cite
Defuse Security (2014). CrackStation's Password Cracking Dictionary (Human Passwords Only) [Dataset]. https://academictorrents.com/details/7ae809ccd7f0778328ab4b357e777040248b8c7f
Available download formats: bittorrent (257973006)
Dataset updated
Aug 10, 2014
Dataset authored and provided by
Defuse Security
License
https://academictorrents.com/nolicensespecified
Description
The list contains every wordlist, dictionary, and password database leak that I could find on the internet (and I spent a LOT of time looking). It also contains every word in the Wikipedia databases (pages-articles, retrieved 2010, all languages) as well as lots of books from Project Gutenberg. It also includes the passwords from some low-profile database breaches that were being sold in the underground years ago. The format of the list is a standard text file sorted in non-case-sensitive alphabetical order. Lines are separated with a newline "\n" character. You can test the list without downloading it by giving SHA256 hashes to the free hash cracker or to @PlzCrack on Twitter. Here's a tool for computing hashes easily. Here are the results of cracking LinkedIn's and eHarmony's password hash leaks with the list. The list is responsible for cracking about 30% of all hashes given to CrackStation's free hash cracker, but that figure should be taken with a grain of salt because s
Number of data compromises and impacted individuals in U.S. 2005-2024
statista.com
Updated Jul 14, 2025
Cite
Statista (2025). Number of data compromises and impacted individuals in U.S. 2005-2024 [Dataset]. https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed/
Dataset updated
Jul 14, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
United States
Description
In 2024, the number of data compromises in the United States stood at 3,158 cases. Meanwhile, over 1.35 billion individuals were affected in the same year by data compromises, including data breaches, leakage, and exposure. While these are three different events, they have one thing in common: in all three, sensitive data is accessed by an unauthorized threat actor.
Industries most vulnerable to data breaches
Some industry sectors usually see more significant cases of private data violations than others. This is determined by the type and volume of the personal information that organizations in these sectors store. In 2024, financial services, healthcare, and professional services were the three industry sectors that recorded the most data breaches. Overall, the number of healthcare data breaches in some industry sectors in the United States has gradually increased within the past few years. However, some sectors saw a decrease.
Largest data exposures worldwide
In 2020, an adult streaming website, CAM4, experienced a leakage of nearly 11 billion records. This is by far the most extensive reported data leakage. This case, though, is unique because cyber security researchers found the vulnerability before the cyber criminals did. The second-largest data breach is the Yahoo data breach, dating back to 2013. The company first reported about one billion exposed records, then later, in 2017, updated the number of leaked records to three billion. In March 2018, the third-biggest data breach happened, involving India’s national identification database Aadhaar. As a result of this incident, over 1.1 billion records were exposed.
Data from: Rockyou
ieee-dataport.org
Updated Apr 27, 2021
Cite
Zeeshan Shaikh (2021). Rockyou [Dataset]. https://ieee-dataport.org/documents/rockyou
Dataset updated
Apr 27, 2021
Authors
Zeeshan Shaikh
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Passwords that were leaked or stolen from sites. The Rockyou Dataset is about 14 million passwords.
Italy number of data sets affected in data breaches Q1 2020-Q2 2024
statista.com
Updated Dec 19, 2024
Cite
Statista (2024). Italy number of data sets affected in data breaches Q1 2020-Q2 2024 [Dataset]. https://www.statista.com/statistics/1453453/number-of-records-exposed-italy/
Dataset updated
Dec 19, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Italy
Description
Between the first quarter of 2020 and the second quarter of 2024, the number of records exposed in data breaches in Italy experienced a significant decrease. In the most recent measured period, approximately one million records were reported as leaked, down from around 10.2 million data sets affected in the first quarter of 2021.
Eximpedia Export Import Trade
eximpedia.app
Updated Feb 18, 2025
+ more versions
Cite
Seair Exim (2025). Eximpedia Export Import Trade [Dataset]. https://www.eximpedia.app/
Available download formats: .bin, .xml, .csv, .xls
Dataset updated
Feb 18, 2025
Dataset provided by
Eximpedia PTE LTD
Eximpedia Export Import Trade Data
Authors
Seair Exim
Area covered
China, Indonesia, Barbados, Tanzania, Mauritania, Cambodia, Mozambique, American Samoa, Christmas Island, Ghana
Description
Access Leak Detector import-export data for countries worldwide, with importers' and exporters' details, shipment date, price, HS code, ports, quantity, etc.
Number of accounts affected in data breaches Thailand Q2 2022-Q3 2024
statista.com
Updated Jul 7, 2025
Cite
Statista (2025). Number of accounts affected in data breaches Thailand Q2 2022-Q3 2024 [Dataset]. https://www.statista.com/statistics/1404553/thailand-number-of-account-breaches-exposed/
Dataset updated
Jul 7, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Thailand
Description
Between the second quarter of 2022 and the third quarter of 2024, the number of records exposed in account breaches in Thailand fluctuated significantly. Over ******* datasets were reported as having been leaked in the third quarter of 2024, compared to around ******* during the same quarter of the previous year.
Leaked U.S. Department of Defense DMED (Defense Medical Epidemiology...
data.niaid.nih.gov
zenodo.org
Updated Jul 17, 2024
Cite
Dr. Samuel Sigoloff (2024). Leaked U.S. Department of Defense DMED (Defense Medical Epidemiology Database) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5985601
Dataset updated
Jul 17, 2024
Dataset provided by
United States Department of Defense (http://www.defense.gov/)
Thomas Renz
Lt. Col. Peter Chambers
Leigh Dundas
Dr. Samuel Sigoloff
Lt. Col. Theresa Long
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
United States
Description
This is a database and accompanying report that first appeared on a Google Drive link shared by Attorney Thomas Renz of Renz Law with the Epoch Times, and that was also embedded in Thomas Renz' website.

According to DHA's Armed Forces Surveillance Division, the data in this database is incorrect for the years 2016-2020. (Watson, 2022)

DMED is a web tool to query medical event data contained within the Defense Medical Surveillance System. The AFHSD claims that, due to a serious error in their system, data for the years 2016-2020 has been incredibly under-reported, which has led to the appearance of a significant increase in occurrences of medical diagnoses in 2021. (Watson, 2022)

References:

Renz, T. (2021, October 1). Attorney Tom Renz discovers leaked DOD covid files. Renz Law. Retrieved February 6, 2022, from https://renz-law.com/attorney-tom-renz-discovers-leaked-dod-covid-files/

Watson, S. (2022, February 6). Pentagon responds to DOD whistleblowers' claim of spiking disease rates in the military after Covid Vaccine Mandate. InfoWars. Retrieved February 6, 2022, from https://www.infowars.com/posts/pentagon-responds-to-dod-whistleblowers-claim-of-spiking-disease-rates-in-the-military-after-covid-vaccine-mandate/
Acoustic detection for undersea oil leaks project: programs and algorithms...
search.dataone.org
data.griidc.org
Updated Feb 5, 2025
+ more versions
Cite
Lu, Zhiqu (2025). Acoustic detection for undersea oil leaks project: programs and algorithms dataset [Dataset]. http://doi.org/10.7266/ZP35J344
Unique identifier
https://doi.org/10.7266/ZP35J344
Dataset updated
Feb 5, 2025
Dataset provided by
GRIIDC
Authors
Lu, Zhiqu
Description
The U.S. outer continental shelf is a major source of energy for the United States. The rapid growth of oil and gas production in the Gulf of Mexico increases the risk of underwater oil spills at greater water depths and drilling wells. These hydrocarbon leakages can be caused either by natural events, such as seeping from fissures in the ocean seabed, or by anthropogenic accidents, such as leaking from broken wellheads and pipelines. In order to improve safety and reduce the environmental risks of offshore oil and gas operations, the Bureau of Safety and Environmental Enforcement recommended the use of real-time monitoring. An early warning system for detecting, locating, and characterizing hydrocarbon leakages is essential for preventing the next oil spill as well as for seafloor hydrocarbon seepage detection. Existing monitoring techniques have significant limitations and cannot achieve real-time monitoring. This project launches an effort to develop a functional real-time monitoring system that uses passive acoustic technologies to detect, locate, and characterize undersea hydrocarbon leakages over large areas in a cost-effective manner.

In an oil spill event, the leaked hydrocarbon is injected into seawater with huge amounts of discharge at high speeds. With mixed natural gases and oils, this hydrocarbon leakage creates underwater sound through two major mechanisms: shearing and turbulence by a streaming jet of oil droplets and gas bubbles, and bubble oscillation and collapse. These acoustic emissions can be recorded by hydrophones in the water column at far distances. They will be characterized and differentiated from other underwater noises through their unique frequency spectrum, evolution and transportation processes and leaking positions, and further be utilized to detect and position the leakage locations.

With the objective of leakage detection and localization, our approach consists of recording and modeling the acoustic signals induced by the oil-spill and implementing advanced signal processing and triangulation localization techniques with a hydrophone network.

Tasks of this project are:
1. Conduct a laboratory study to simulate hydrocarbon leakages and their induced sound under controlled conditions, and to establish the correlation between frequency spectra and leakage properties, such as oil-jet intensities and speeds, bubble radii and distributions, and crack sizes.
2. Implement and develop acoustic bubble modeling for estimating features and strength of the oil leakage.
3. Develop a set of advanced signal processing and triangulation algorithms for leakage detection and localization.

The experimental data have been collected in a water tank in the building of the National Center for Physical Acoustics, the University of Mississippi from 2018-2020, including hydrophone recorded underwater sounds generated by oil leakage bubbles under different testing conditions, such as pressures, flow rates, jet velocities, and crack sizes, and movies of oil leakages. Two types of oil leakages (a few bubbles and constant flow bubbles) were tested to simulate oil seepages either from seafloors or from oil well and pipe-line breaches. Two types of gases were investigated (nitrogen and methane). These data were analyzed for acoustic bubble modeling, oil leakage characterization, and localization.

This dataset contains programs and algorithms. The folders of the dataset are described as follows:
• The folder “signal processing programs” contains programs (LabVIEW VIs) for instrument control, data acquisition, and signal processing.
• The folder “modeling algorithms” contains algorithms (MATLAB m-files) for acoustic bubble sound modeling.
• The folder “localization algorithms” contains algorithms (MATLAB m-files) for oil leakage source localization.

More details of this dataset can be found in the corresponding ReadMe files in each folder. Associated data may be found in S3.x911.000:0001 (bubble sound characterization and modeling data, doi:10.7266/3REPB7QM); S3.x911.000:0002 (test data, doi: 10.7266/NPYZ3XFV); S3.x911.000:0003 (raw sound data and validation of modeled source positions, doi: 10.7266/4S9EBZKX); S3.x911.000:0005 (imagery of the laboratory experiment, doi: 10.7266/BZY62EK0).
Replication Data and Code for "Incentives and Information in Methane Leak...
search.dataone.org
dataverse.harvard.edu
Updated Sep 24, 2024
Cite
Lewis, Eric (2024). Replication Data and Code for "Incentives and Information in Methane Leak Detection and Repair" [Dataset]. http://doi.org/10.7910/DVN/BAVBGX
Unique identifier
https://doi.org/10.7910/DVN/BAVBGX
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Lewis, Eric
Description
Replication Data and Code for "Incentives and Information in Methane Leak Detection and Repair"
Abstract: Capturing leaked methane can be a win for both firms and the environment. However, leakage volume uncertainty can be a barrier inhibiting leak repair. We study an experiment at oil and gas production sites which randomized whether site operators were informed of methane leakage volumes. At sites with high baseline leakage, we estimate a negative but imprecise effect of information on endline emissions. But at sites with zero measured leakage, giving firms information about methane leakage increased emissions at endline. Our results suggest that giving firms news of low leakage disincentivizes maintenance effort, thereby increasing the likelihood of future leaks.
Package includes data from Wang et al. (2024) RCT as well as IEA data on estimated methane emissions and methane abatement costs. Package also includes code for replication.
The Nauru Files
borealisdata.ca
search.dataone.org
Updated Apr 18, 2024
Cite
The Guardian (2024). The Nauru Files [Dataset]. http://doi.org/10.5683/SP3/JWHSU9
Unique identifier
https://doi.org/10.5683/SP3/JWHSU9
Dataset updated
Apr 18, 2024
Dataset provided by
Borealis
Authors
The Guardian
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
May 12, 2013 - Oct 29, 2015
Area covered
Australia, Nauru
Description
About
The Nauru Files contain the largest set of documents published from inside Australia's immigration detention system. Leaked to The Guardian in 2016, they include nearly 2,000 incident reports from the Nauru detention centre, which were written by guards, caseworkers and teachers on the remote Pacific island.
Summary
Examples of events include assaults, injuries, abuse and other forms of violence reported at the detention centre between 2013 and 2015. As noted by The Guardian, as well as academic research, Australia has privatised its immigration detention centres and exported detention of asylum seekers offshore to places such as Nauru and Manus Island in Papua New Guinea. This strategy is part of a wider "Pacific Solution" implemented by the Government of Australia since the early 2000s as a hardline deterrent to "stop the boats." Effectively, asylum seekers intercepted and detained on Nauru are removed from access to Australia's asylum system.
Data Structure
These data are composed of incident reports. An incident report is a short summary of an event in the Nauru detention centre written by staff there. Some of the details found in the files may be triggering; we therefore advise caution with reading and analysing these data. According to The Guardian, these reports form part of the Government of Australia's requirements to document what is happening within its detention system. Each report holds detailed information of the incident at the detention centre along with a "summary log". Working with The Guardian, we have organised these data into two forms: a PDF of each incident report, sorted by name at the time of leak, and a CSV/JSON of all incident reports (see "nauru_files.csv/json"), which structures key details into variables within its columns. Examples of variables include time, incident type, severity and description. Combined, these form a structured database linking each incident report to these variables.
Data Source
The Guardian has modified the original, leaked data to remove any personally-identifying information within them. To achieve this, a stringent approach of redaction has been implemented to remove names of asylum seekers and staff, personal identification numbers of asylum seekers, signatures of detention staff, nationalities within small population groups and residential tent numbers, among other things. There are also a large number of acronyms used in these data. For your convenience, we have provided an RTF document with a listing of these acronyms and their meanings. If you use these data, please cite the original source at The Guardian: The Guardian. (10 August 2016). The Nauru Files: The lives of asylum seekers in detention detailed in a unique database. Retrieved from https://www.theguardian.com/australia-news/ng-interactive/2016/aug/10/the-nauru-files-the-lives-of-asylum-seekers-in-detention-detailed-in-a-unique-database-interactive. Should you have any comments, questions or requested edits or extensions to the Nauru files, please contact Haven at kira.williams@utoronto.ca. For more articles from The Guardian on these data, see: The Nauru files: cache of 2,000 leaked reports reveal scale of abuse of children in Australian offshore detention. A short history of Nauru, Australia’s dumping ground for refugees. ‘I want death’: Nauru files chronicle despair of asylum seeker children.