18 datasets found

All-time biggest online data breaches 2025
statista.com
ai-chatbox.pro
Updated May 26, 2025
Cite
Statista (2025). All-time biggest online data breaches 2025 [Dataset]. https://www.statista.com/statistics/290525/cyber-crime-biggest-online-data-breaches-worldwide/
Explore at:
Dataset updated
May 26, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2025
Area covered
Worldwide
Description
The largest reported data leakage as of January 2025 was the Cam4 data breach in March 2020, which exposed more than 10 billion data records. The second-largest data breach in history so far, the Yahoo data breach, occurred in 2013. The company initially reported about one billion exposed data records, but after an investigation it revised the number, revealing that three billion accounts were affected. The National Public Data breach was announced in August 2024. The incident became public when personally identifiable information of individuals became available for sale on the dark web. Overall, security professionals estimate that nearly three billion personal records were leaked. The next most significant data leakage was the March 2018 security breach of India's national ID database, Aadhaar, with over 1.1 billion records exposed. This included identification numbers and biometric information such as fingerprint scans, which could be used to open bank accounts and receive financial aid, among other government services.

Cybercrime - the dark side of digitalization
As the world continues its journey into the digital age, corporations and governments across the globe have been increasing their reliance on technology to collect, analyze and store personal data. This, in turn, has led to a rise in the number of cyber crimes, ranging from minor breaches to global-scale attacks impacting billions of users, as in the case of Yahoo. Within the U.S. alone, 1,802 cases of data compromise were reported in 2022. This was a marked increase from the 447 cases reported a decade prior.

The high price of data protection
As of 2022, the average cost of a single data breach across all industries worldwide stood at around 4.35 million U.S. dollars. Breaches were most costly in the healthcare sector, where each leak was reported to have cost the affected party 10.1 million U.S. dollars. The financial segment followed closely behind. Here, each breach resulted in a loss of approximately 6 million U.S. dollars - 1.5 million more than the global average.
"Pwned Passwords" Dataset
academictorrents.com
bittorrent
Updated Aug 3, 2018
Cite
haveibeenpwned.com (2018). "Pwned Passwords" Dataset [Dataset]. https://academictorrents.com/details/53555c69e3799d876159d7290ea60e56b35e36a9
Explore at:
Available download formats: bittorrent (11101449979)
Dataset updated
Aug 3, 2018
Dataset provided by
Have I Been Pwned? (http://haveibeenpwned.com/)
License
https://academictorrents.com/nolicensespecified
Description
Version 3 with 517M hashes and counts of password usage, ordered by most to least prevalent.
Pwned Passwords are 517,238,891 real-world passwords previously exposed in data breaches. This exposure makes them unsuitable for ongoing use as they're at much greater risk of being used to take over other accounts. They're searchable online below as well as being downloadable for use in other online systems. The entire set of passwords is downloadable for free below, with each password being represented as a SHA-1 hash to protect the original value (some passwords contain personally identifiable information) followed by a count of how many times that password had been seen in the source data breaches. The list may be integrated into other systems and used to verify whether a password has previously appeared in a data breach, after which a system may warn the user or even block the password outright.
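The check described above (hash a candidate password with SHA-1, then look the hash up in the list) can be sketched in a few lines of Python. This is a minimal illustration rather than the project's own tooling: the `HASH:COUNT` line layout, the helper names, and the sample counts are assumptions made for the example.

```python
import hashlib
import io


def sha1_hex(password: str) -> str:
    """Return the uppercase hexadecimal SHA-1 digest of a UTF-8 password."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest().upper()


def load_counts(lines):
    """Parse lines of the assumed form 'HASH:COUNT' into a dict."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        digest, _, count = line.partition(":")
        counts[digest.upper()] = int(count)
    return counts


def breach_count(password: str, counts: dict) -> int:
    """Return how often the password was seen, or 0 if it is not listed."""
    return counts.get(sha1_hex(password), 0)


# Tiny in-memory stand-in for the downloaded list (hypothetical counts).
sample = io.StringIO(
    f"{sha1_hex('password')}:3\n"
    f"{sha1_hex('123456')}:5\n"
)
counts = load_counts(sample)

print(breach_count("password", counts))
print(breach_count("correct horse battery staple", counts))
```

A dictionary is only practical for a small excerpt. For the full list, a streaming scan or an on-disk index would be a more realistic choice.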
AOL Search Data 20M web queries (2006)
academictorrents.com
bittorrent
Updated Dec 17, 2016
Cite
AOL (2016). AOL Search Data 20M web queries (2006) [Dataset]. https://academictorrents.com/details/cd339bddeae7126bb3b15f3a72c903cb0c401bd1
Explore at:
Available download formats: bittorrent (460409936)
Dataset updated
Dec 17, 2016
Dataset authored and provided by
AOL (http://aol.com/)
License
https://academictorrents.com/nolicensespecified
Description
500k User Session Collection
This collection is distributed for NON-COMMERCIAL RESEARCH USE ONLY. Any application of this collection for commercial purposes is STRICTLY PROHIBITED.
Brief description: This collection consists of ~20M web queries collected from ~650k users over three months. The data is sorted by anonymous user ID and sequentially arranged. The goal of this collection is to provide real query log data that is based on real users. It could be used for personalization, query reformulation or other types of search research. The data set includes AnonID, Query, QueryTime, ItemRank, ClickURL.
AnonID - an anonymous user ID number.
Query - the query issued by the user, case shifted with most punctuation removed.
QueryTime - the time at which the query was submitted for search.
ItemRank - if the user clicked on a search result, the rank of the item on which they clicked is listed.
ClickURL - if the user clicked on a search result, the domain portion of the URL i
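The five fields named in the description (AnonID, Query, QueryTime, ItemRank, ClickURL) can be read with a short parser like the one below. It is a sketch under stated assumptions: the tab delimiter, the presence of a header row, the timestamp format, and the sample rows are all hypothetical and are not taken from the dataset itself.

```python
import csv
import io
from collections import defaultdict


def read_queries(handle):
    """Yield one dict per log row; ItemRank and ClickURL are None without a click."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {
            "AnonID": row["AnonID"],
            "Query": row["Query"],
            "QueryTime": row["QueryTime"],  # kept as text; format is assumed
            "ItemRank": int(row["ItemRank"]) if row.get("ItemRank") else None,
            "ClickURL": row.get("ClickURL") or None,
        }


# Hypothetical rows standing in for the real log.
SAMPLE = (
    "AnonID\tQuery\tQueryTime\tItemRank\tClickURL\n"
    "1\texample query\t2006-03-01 10:00:00\t\t\n"
    "1\texample query\t2006-03-01 10:00:05\t2\thttp://www.example.com\n"
    "2\tanother query\t2006-03-02 08:30:00\t\t\n"
)

rows = list(read_queries(io.StringIO(SAMPLE)))

# Group rows by anonymous user ID, mirroring the per-user ordering of the log.
by_user = defaultdict(list)
for entry in rows:
    by_user[entry["AnonID"]].append(entry)

# Count rows that record a click on a search result.
clicks = sum(1 for entry in rows if entry["ItemRank"] is not None)
print(len(by_user), clicks)
```

Leaving QueryTime as text avoids guessing at a timestamp format the description does not specify.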
CrackStation's Password Cracking Dictionary (Human Passwords Only)
academictorrents.com
bittorrent
Updated Aug 10, 2014
+ more versions
Cite
Defuse Security (2014). CrackStation's Password Cracking Dictionary (Human Passwords Only) [Dataset]. https://academictorrents.com/details/7ae809ccd7f0778328ab4b357e777040248b8c7f
Explore at:
Available download formats: bittorrent (257973006)
Dataset updated
Aug 10, 2014
Dataset authored and provided by
Defuse Security
License
https://academictorrents.com/nolicensespecified
Description
The list contains every wordlist, dictionary, and password database leak that I could find on the internet (and I spent a LOT of time looking). It also contains every word in the Wikipedia databases (pages-articles, retrieved 2010, all languages) as well as lots of books from Project Gutenberg. It also includes the passwords from some low-profile database breaches that were being sold in the underground years ago. The format of the list is a standard text file sorted in non-case-sensitive alphabetical order. Lines are separated with a newline "\n" character. You can test the list without downloading it by giving SHA256 hashes to the free hash cracker or to @PlzCrack on Twitter. Here's a tool for computing hashes easily. Here are the results of cracking LinkedIn's and eHarmony's password hash leaks with the list. The list is responsible for cracking about 30% of all hashes given to CrackStation's free hash cracker, but that figure should be taken with a grain of salt because s
Data from: Rockyou
ieee-dataport.org
Updated Apr 27, 2021
Cite
Zeeshan Shaikh (2021). Rockyou [Dataset]. https://ieee-dataport.org/documents/rockyou
Explore at:
Dataset updated
Apr 27, 2021
Authors
Zeeshan Shaikh
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Passwords that were leaked or stolen from sites. The Rockyou dataset contains about 14 million passwords.
Using Decision Trees to Detect and Isolate Leaks in the J-2X
catalog.data.gov
s.cnmilf.com
+2 more
Updated Apr 11, 2025
+ more versions
Cite
Dashlink (2025). Using Decision Trees to Detect and Isolate Leaks in the J-2X [Dataset]. https://catalog.data.gov/dataset/using-decision-trees-to-detect-and-isolate-leaks-in-the-j-2x
Explore at:
Dataset updated
Apr 11, 2025
Dataset provided by
Dashlink
Description
Full title: Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine
Mark Schwabacher, NASA Ames Research Center
Robert Aguilar, Pratt & Whitney Rocketdyne
Fernando Figueroa, NASA Stennis Space Center

Abstract
The goal of this work was to use data-driven methods to automatically detect and isolate faults in the J-2X rocket engine. It was decided to use decision trees, since they tend to be easier to interpret than other data-driven methods. The decision tree algorithm automatically “learns” a decision tree by performing a search through the space of possible decision trees to find one that fits the training data. The particular decision tree algorithm used is known as C4.5. Simulated J-2X data from a high-fidelity simulator developed at Pratt & Whitney Rocketdyne and known as the Detailed Real-Time Model (DRTM) was used to “train” and test the decision tree. Fifty-six DRTM simulations were performed for this purpose, with different leak sizes, different leak locations, and different times of leak onset. To make the simulations as realistic as possible, they included simulated sensor noise, and included a gradual degradation in both fuel and oxidizer turbine efficiency. A decision tree was trained using 11 of these simulations, and tested using the remaining 45 simulations. In the training phase, the C4.5 algorithm was provided with labeled examples of data from nominal operation and data including leaks in each leak location. From the data, it “learned” a decision tree that can classify unseen data as having no leak or having a leak in one of the five leak locations. In the test phase, the decision tree produced very low false alarm rates and low missed detection rates on the unseen data. It had very good fault isolation rates for three of the five simulated leak locations, but it tended to confuse the remaining two locations, perhaps because a large leak at one of these two locations can look very similar to a small leak at the other location.

Introduction
The J-2X rocket engine will be tested on Test Stand A-1 at NASA Stennis Space Center (SSC) in Mississippi. A team including people from SSC, NASA Ames Research Center (ARC), and Pratt & Whitney Rocketdyne (PWR) is developing a prototype end-to-end integrated systems health management (ISHM) system that will be used to monitor the test stand and the engine while the engine is on the test stand [1]. The prototype will use several different methods for detecting and diagnosing faults in the test stand and the engine, including rule-based, model-based, and data-driven approaches. SSC is currently using the G2 tool (http://www.gensym.com) to develop rule-based and model-based fault detection and diagnosis capabilities for the A-1 test stand. This paper describes preliminary results in applying the data-driven approach to detecting and diagnosing faults in the J-2X engine. The conventional approach to detecting and diagnosing faults in complex engineered systems such as rocket engines and test stands is to use large numbers of human experts. Test controllers watch the data in near-real time during each engine test. Engineers study the data after each test. These experts are aided by limit checks that signal when a particular variable goes outside of a predetermined range. The conventional approach is very labor intensive. Also, humans may not be able to recognize faults that involve the relationships among large numbers of variables. Further, some potential faults could happen too quickly for humans to detect them and react before they become catastrophic. Automated fault detection and diagnosis is therefore needed. One approach to automation is to encode human knowledge into rules or models. Another approach is to use data-driven methods to automatically learn models from historical data or simulated data. Our prototype will combine the data-driven approach with the model-based and rule-based appro
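To make the "learning" step in the abstract more concrete: C4.5 ranks candidate tests by gain ratio, which is information gain divided by split information. The sketch below applies that criterion to threshold splits on a single numeric feature. It is a simplified illustration, not the authors' implementation: the sensor readings and labels are invented for the example, and real C4.5 also handles multiple features, pruning, and missing values.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split 'value <= threshold' versus 'value > threshold'."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(labels)
    weights = (len(left) / total, len(right) / total)
    info_gain = entropy(labels) - weights[0] * entropy(left) - weights[1] * entropy(right)
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


def best_threshold(values, labels):
    """Pick the midpoint between adjacent distinct values with the highest gain ratio."""
    candidates = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(candidates, candidates[1:])]
    return max(midpoints, key=lambda t: gain_ratio(values, labels, t))


# Hypothetical sensor readings: four nominal samples and four with a leak.
readings = [100.0, 101.0, 99.5, 100.5, 92.0, 90.5, 91.0, 93.0]
labels = ["nominal"] * 4 + ["leak"] * 4

threshold = best_threshold(readings, labels)
print(threshold, gain_ratio(readings, labels, threshold))
```

In this toy data the two classes are cleanly separable, so the chosen threshold falls in the gap between the two groups of readings.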
DroidLeaks: A Large Collection of Resource Leak Bugs in Real-World Android...
zenodo.org
explore.openaire.eu
+1 more
zip
Updated Jan 24, 2020
Cite
Yepang Liu (2020). DroidLeaks: A Large Collection of Resource Leak Bugs in Real-World Android Apps [Dataset]. http://doi.org/10.5281/zenodo.2589909
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.2589909
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yepang Liu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
World
Description
DroidLeaks features 292 diverse resource leak bugs in popular and large-scale open-source Android apps. For each bug, DroidLeaks provides links to:
1. the code repository of the app subject
2. the concerned resource class
3. the buggy code revision (and buggy file and method names)
4. the bug-fixing code revision (i.e., link to the patch)
5. the bug report or the corresponding pull request for patches (if located)
Eximpedia Export Import Trade
eximpedia.app
Updated Feb 18, 2025
+ more versions
Cite
Seair Exim (2025). Eximpedia Export Import Trade [Dataset]. https://www.eximpedia.app/
Explore at:
Available download formats: .bin, .xml, .csv, .xls
Dataset updated
Feb 18, 2025
Dataset provided by
Eximpedia Export Import Trade
Eximpedia Export Import Trade Data
Authors
Seair Exim
Area covered
China, Barbados, Tanzania, Cambodia, Indonesia, Mozambique, American Samoa, Christmas Island, Mauritania, Ghana
Description
Access leak detector import-export data for countries worldwide, with importers' and exporters' details, shipment date, price, HS code, ports, quantity, etc.
Global exporters importers-export import data of Helium leak detector
volza.com
csv
Updated Jun 19, 2025
+ more versions
Cite
Volza FZ LLC (2025). Global exporters importers-export import data of Helium leak detector [Dataset]. https://www.volza.com/p/helium-leak-detector/
Explore at:
Available download formats: csv
Dataset updated
Jun 19, 2025
Dataset provided by
Volza
Authors
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of exporters, Count of importers, Count of shipments, Sum of export import value
Description
3,358 global export-import shipment records of helium leak detectors, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Global exporters importers-export import data of Leak detector and Hsn Code...
volza.com
csv
Updated Jun 5, 2025
+ more versions
Cite
Volza FZ LLC (2025). Global exporters importers-export import data of Leak detector and Hsn Code 3403 [Dataset]. https://www.volza.com/p/leak-detector/hsn-code-3403/
Explore at:
Available download formats: csv
Dataset updated
Jun 5, 2025
Dataset authored and provided by
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of exporters, Count of importers, Count of shipments, Sum of export import value
Description
314 global export-import shipment records of leak detectors under HSN code 3403, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Data from: Leak-resilient enzyme-free nucleic acid dynamical systems through...
search.dataone.org
data.niaid.nih.gov
+1 more
Updated Apr 25, 2024
Cite
Rajiv Teja Nagipogu (2024). Leak-resilient enzyme-free nucleic acid dynamical systems through shadow cancellation [Dataset]. http://doi.org/10.5061/dryad.g4f4qrfz7
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.g4f4qrfz7
Dataset updated
Apr 25, 2024
Dataset provided by
Dryad Digital Repository
Authors
Rajiv Teja Nagipogu
Description
DNA strand displacement (DSD) emerged as a prominent reaction motif for engineering nucleic acid-based computational devices with programmable behaviors. However, strand displacement circuits are susceptible to background noise that disrupts the circuit behavior, commonly known as leaks. The side effects of leaks are particularly severe in circuits with complex dynamical elements (e.g., feedback loops), as their leaks amplify nonlinearly, disrupting the circuit function. Shadow cancellation is a dynamic leak-elimination strategy originally proposed to control the leak growth in such circuits. However, the kinetic restrictions of the proposed method introduce a significant design overhead, making it less accessible. In this work, we use domain-level DSD simulations to examine the method's capabilities, the inner workings of its components, and, most importantly, robustness to practical deviations in its design requirements. First, we show that the method could stabilize the dynamics of s...

# Leak-resilient enzyme-free nucleic acid dynamical systems through shadow cancellation

Abbreviations

RPS: Rock-Paper-Scissors oscillator

UNIAMP: Unimolecular autocatalytic amplifier

BIAMP: Bimolecular autocatalytic amplifier

Basic commands

To run the peppercorn command to generate the *_enum.pil file and the corresponding plotting data

$FOLDER - The folder containing the .pil file

$NAME - The name of the .pil file without the file extension

$INTERMEDIATE_PREFIX - Prefix of the intermediates generated

$LABELS - Space separated list of the chemical species that need to be tracked

$TIME - Time to run the simulation for

./sim.sh $FOLDER $TIME $NAME $LABELS $INTERMEDIATE_PREFIX

To run the *_enum.pil file in the folder $FOLDER

$NAME - The name of the .*_enum_pil file without the _enum.pil

./pil.sh $FOLDER $TIME $NAME $LABELS $NAME

Produce-Helper Leak mechanism

`sLeakWaste = hcjr( fcr mcr scr + fcr( hckr...
Replication Data and Code for "Incentives and Information in Methane Leak...
search.dataone.org
Updated Sep 24, 2024
Cite
Lewis, Eric (2024). Replication Data and Code for "Incentives and Information in Methane Leak Detection and Repair" [Dataset]. http://doi.org/10.7910/DVN/BAVBGX
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/BAVBGX
Dataset updated
Sep 24, 2024
Dataset provided by
Harvard Dataverse
Authors
Lewis, Eric
Description
Replication Data and Code for "Incentives and Information in Methane Leak Detection and Repair" Abstract: Capturing leaked methane can be a win for both firms and the environment. However, leakage volume uncertainty can be a barrier inhibiting leak repair. We study an experiment at oil and gas production sites which randomized whether site operators were informed of methane leakage volumes. At sites with high baseline leakage, we estimate a negative but imprecise effect of information on endline emissions. But at sites with zero measured leakage, giving firms information about methane leakage increased emissions at endline. Our results suggest that giving firms news of low leakage disincentivizes maintenance effort, thereby increasing the likelihood of future leaks. Package includes data from Wang et al. (2024) RCT as well as IEA data on estimated methane emissions and methane abatement costs. Package also includes code for replication.
Data from: Using controlled subsurface releases to investigate the effect of...
search-dev.test.dataone.org
search.dataone.org
+3 more
Updated Nov 29, 2023
Cite
Mercy Mbua; Stuart N. Riddick; Shanru Tian; Fancy Cheptonui; Cade Houlihan; Kathleen M. Smits; Daniel Zimmerle (2023). Using controlled subsurface releases to investigate the effect of leak variation on above-ground natural gas detection [Dataset]. http://doi.org/10.5061/dryad.ncjsxkt15
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.ncjsxkt15
Dataset updated
Nov 29, 2023
Dataset provided by
Dryad Digital Repository
Authors
Mercy Mbua; Stuart N. Riddick; Shanru Tian; Fancy Cheptonui; Cade Houlihan; Kathleen M. Smits; Daniel Zimmerle
Time period covered
Jan 1, 2023
Description
Pipelines transport natural gas (NG) in all stages between production and the end user. The NG composition, pipeline depth, and pressure vary significantly between extraction and consumption. As methane (CH4), the primary component of NG, is both explosive and a potent greenhouse gas, NG leaks from underground pipelines pose both a safety and an environmental threat. Leaks are typically found when an observer detects a CH4 enhancement as they pass through the downwind above-ground NG plume. The likelihood of detecting a plume depends, in part, on the size of the plume, which is contingent on both environmental conditions and intrinsic characteristics of the leak. To investigate the effects of leak characteristics, this study uses controlled NG release experiments to observe how the above-ground plume width changes with changes in the gas composition of the NG, leak rate, and depth of the subsurface emission. Results show that plume width generally decreases when heavier hydrocarbons are pr...
Replication Data for: Committee Decision-Making under the Threat of Leaks
dataverse.harvard.edu
Updated Oct 7, 2022
Cite
Sebastian Fehrler; Volker Hahn (2022). Replication Data for: Committee Decision-Making under the Threat of Leaks [Dataset]. http://doi.org/10.7910/DVN/AV3OQM
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/AV3OQM
Dataset updated
Oct 7, 2022
Dataset provided by
Harvard Dataverse
Authors
Sebastian Fehrler; Volker Hahn
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Leaks are pervasive in politics. Hence, many committees that nominally operate under secrecy de facto operate under the threat that information might be passed on to outsiders. We study theoretically and experimentally how this possibility affects the behavior of committee members and the decision-making accuracy. Our theoretical analysis generates two major predictions. First, a committee operating under the threat of leaks is equivalent to a formally transparent committee in terms of the probabilities of supporting the adoption of a new policy. Second, the threat of leaks leads to a status-quo bias. In our experimental analysis of a committee with possible leaks, individual behavior is often less strategic than theoretically predicted, which leads to frequent leaks. However, despite these deviations on the individual level, our experiment confirms the two major theoretical predictions.
Replication Data for: A cooperative model to lower cost and increase the...
search.dataone.org
Updated Nov 8, 2023
Cite
Gao, Mozhou (2023). Replication Data for: A cooperative model to lower cost and increase the efficiency of methane leak inspections at oil and gas sites [Dataset]. http://doi.org/10.7910/DVN/CMCM8P
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/CMCM8P
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Gao, Mozhou
Description
These data files are related to the work titled "A cooperative model to lower cost and increase the efficiency of methane leak inspections at oil and gas sites." The abstract of the work: Methane is a potent greenhouse gas that tends to leak from equipment at oil and gas (O&G) sites. The process of locating and repairing fugitive methane emissions is known as leak detection and repair (LDAR). Conventional LDAR methods are labor intensive and costly because they involve time-consuming close-range, component-level inspections at each site. This has prompted duty holders to examine new methods and strategies that could be more cost-effective. We examined a co-operative model in which multiple duty holders of O&G sites in a region use shared services to complete leak inspections. This approach was hypothesized to be more efficient and cost-effective than independent inspection programs by each duty holder in the region. To test this hypothesis, we developed a geospatial simulation model using empirical data from 11 O&G-producing regions in Canada and the USA. We used the model to compare labor cost, transit time, mileage, vehicle emissions, and driving risk between independent and co-op leak inspection programs. The results indicate that co-op leak inspection programs can generate relative savings in labor costs (1.8–34.2%), transit time (0.6–38.6%), mileage (0.2–43.1%), vehicle emissions (0.01–4.0 tCO2), and driving risk (1.9–31.9%). The largest relative savings and efficiency gains resulting from co-op leak inspection programs were in regions with a high diversity of duty holders, which was confirmed with simulations of artificial O&G sites and road networks spanning diverse conditions. We also found reducing leak inspection time by 75% with streamlined methods can additionally reduce labor cost 8.8–41.1%, transit time 5.6–20.2%, and mileage 2.60–34.3% in co-op leak inspection programs. 
Overall, this study demonstrates that co-op leak inspection programs can be more efficient and cost-effective, particularly in regions with a large diversity of O&G duty holders, and that methods to reduce leak inspection time can create additional savings.
Acoustic detection for undersea oil leaks: Bubble sound characterization and...
search.dataone.org
data.griidc.org
Updated Feb 5, 2025
+ more versions
Cite
Lu, Zhiqu (2025). Acoustic detection for undersea oil leaks: Bubble sound characterization and modeling dataset [Dataset]. http://doi.org/10.7266/3REPB7QM
Explore at:
Unique identifier
https://doi.org/10.7266/3REPB7QM
Dataset updated
Feb 5, 2025
Dataset provided by
GRIIDC
Authors
Lu, Zhiqu
Description
The U.S. outer continental shelf is a major source of energy for the United States. The rapid growth of oil and gas production in the Gulf of Mexico increases the risk of underwater oil spills at greater water depths and drilling wells. These hydrocarbon leakages can be caused by either natural events, such as seeping from fissures in the ocean seabed, or by anthropogenic accidents, such as leaking from broken wellheads and pipelines. In order to improve safety and reduce the environmental risks of offshore oil and gas operations, the Bureau of Safety and Environmental Enforcement recommended the use of real-time monitoring. An early warning system for detecting, locating, and characterizing hydrocarbon leakages is essential for preventing the next oil spill as well as for seafloor hydrocarbon seepage detection. Existing monitoring techniques have significant limitations and cannot achieve real-time monitoring. This project launches an effort to develop a functional real-time monitoring system that uses passive acoustic technologies to detect, locate, and characterize undersea hydrocarbon leakages over large areas in a cost-effective manner. In an oil spill event, the leaked hydrocarbon is injected into seawater with huge amounts of discharge at high speeds. With mixed natural gases and oils, this hydrocarbon leakage creates underwater sound through two major mechanisms: shearing and turbulence by a streaming jet of oil droplets and gas bubbles, and bubble oscillation and collapse. These acoustic emissions can be recorded by hydrophones in the water column at far distances. They will be characterized and differentiated from other underwater noises through their unique frequency spectrum, evolution and transportation processes and leaking positions, and further, be utilized to detect and position the leakage locations.

With the objective of leakage detection and localization, our approach consists of recording and modeling the acoustic signals induced by the oil spill and implementing advanced signal processing and triangulation localization techniques with a hydrophone network. Tasks of this project were:

Conduct a laboratory study to simulate hydrocarbon leakages and their induced sound under controlled conditions, and to establish the correlation between frequency spectra and leakage properties, such as oil-jet intensities and speeds, bubble radii and distributions, and crack sizes.

Implement and develop acoustic bubble modeling for estimating features and strength of the oil leakage.

Develop a set of advanced signal processing and triangulation algorithms for leakage detection and localization.

The experimental data have been collected in a water tank in the building of the National Center for Physical Acoustics, the University of Mississippi from 2018-2020, including hydrophone recorded underwater sounds generated by oil leakage bubbles under different testing conditions, such as pressures, flow rates, jet velocities, and crack sizes, and movies of oil leakages. Two types of oil leakages (a few bubbles and constant flow bubbles) were tested to simulate oil seepages either from seafloors or from oil wells and pipeline breaches. Two types of gases were investigated (nitrogen and methane). These data were analyzed for acoustic bubble modeling, oil leakage characterization, and localization. This dataset contains data for acoustic bubble sound modeling of nitrogen and methane, using a range of jet sizes and flow rates. The data for oil leakage source localization is available under GRIIDC Unique Dataset Identifier (UDI): S3.x911.000:0003 (https://doi.org/10.7266/4S9EBZKX).
QICS Paper: Detection and monitoring of leaked CO2 through sediment, water...
data.europa.eu
metadata.bgs.ac.uk
+2 more
unknown
Updated Sep 17, 2024
+ more versions
Cite
British Geological Survey (BGS) (2024). QICS Paper: Detection and monitoring of leaked CO2 through sediment, water column and atmosphere in a sub-seabed CCS experiment [Dataset]. https://data.europa.eu/data/datasets/qics-paper-detection-and-monitoring-of-leaked-co2-through-sediment-water-column-and-atmosphere-/embed
Explore at:
Available download formats: unknown
Dataset updated
Sep 17, 2024
Dataset provided by
British Geological Survey (https://www.bgs.ac.uk/)
Authors
British Geological Survey (BGS)
Description
Carbon capture and storage in sub-seabed geological formations (sub-seabed CCS) is currently being studied as a realistic option to mitigate the accumulation of anthropogenic CO2 in the atmosphere. In implementing sub-seabed CCS, detecting and monitoring the impact of the sequestered CO2 on the ocean environment is highly important. The first controlled CO2 release experiment, Quantifying and Monitoring Potential Ecosystem Impacts of Geological Carbon Storage (QICS), took place in Ardmucknish Bay, Oban, in May–September 2012. We applied the in situ pH/pCO2 sensor to the QICS experiment for detection and monitoring of leaked CO2, and carried out several observations. The cabled real-time sensor was deployed close to the CO2 leakage (bubbling) area, and the fluctuations of in situ pH and pCO2 above the seafloor were monitored in a land-based container. The long-term sensor was placed on the seafloor in three different observation zones. The sediment pH sensor was inserted into the sediment at a depth of 50 cm beneath the seafloor near the CO2 leakage area. Wide-area mapping surveys of pH and pCO2 in the water column around the CO2 leakage area were carried out using an autonomous underwater vehicle (AUV) fitted with sensors. Atmospheric CO2 above the leakage area was observed using a CO2 analyzer attached to the bow of a ship, 50 cm above the sea surface. The behavior of the leaked CO2 is highly dependent on the tidal periodicity (low tide or high tide) during the CO2 gas release period. At low tide, the pH in sediment and overlying seawater decreased due to strong eruption of CO2 gas bubbles, and the CO2 ascended to the sea surface quickly with little dissolution into seawater and dispersed into the atmosphere. On the other hand, CO2 bubble release was lower at high tide due to higher water pressure, and slightly low pH seawater and high atmospheric CO2 were detected. 
After stopping CO2 gas injection, no remarkable variations of pH in sediment and overlying water column were observed for three months. This is a publication in QICS Special Issue - International Journal of Greenhouse Gas Control, Kiminori Shitashima et. al. Doi: 10.1016/j.ijggc.2014.12.011.
Biggest risks to businesses worldwide 2018-2025
statista.com
ai-chatbox.pro
Updated May 20, 2025
Cite
Statista (2025). Biggest risks to businesses worldwide 2018-2025 [Dataset]. https://www.statista.com/statistics/422171/leading-business-risks-globally/
Dataset updated
May 20, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
Cyber incidents were the leading risk to businesses globally for 2025, according to a survey carried out among risk management experts in late 2024. Cyber incidents here include cyber crime, IT failures or outages, data breaches, and fines and penalties. The global cyber insurance market is forecast to grow consistently in coming years.
What is cyber crime?
Cyber crime refers to any criminal activity carried out through the use of a computer, a digital network, or the internet. As of January 2024, the biggest reported data leak to occur in the past few years was the 2020 hack of the online platform Cam4, which affected more than 10 billion user accounts. In 2020, the Global Cybersecurity Index (GCI) ranked the United States as the country with the highest commitment to cyber security.
Cyber attacks in the U.S.
Instances of cyber crime have been on the rise in recent years, with the annual number of data breaches in the U.S. reaching a total of over 3,200 in 2023. In the same year, about 350 million individuals were seemingly affected by record exposure. In 2023, the most common type of cyber attack experienced by U.S.-based companies was network intrusion, meaning unauthorized access to a corporate network.
Cite

Statista (2025). All-time biggest online data breaches 2025 [Dataset]. https://www.statista.com/statistics/290525/cyber-crime-biggest-online-data-breaches-worldwide/

All-time biggest online data breaches 2025

36 scholarly articles cite this dataset

Dataset updated

May 26, 2025

Dataset authored and provided by

Statista (http://statista.com/)

Time period covered

Jan 2025

Area covered

Worldwide

Description

The largest reported data leak as of January 2025 was the Cam4 data breach in March 2020, which exposed more than 10 billion data records. The second-largest data breach in history so far, the Yahoo data breach, occurred in 2013. The company initially reported about one billion exposed data records, but after an investigation it revised the figure, revealing that three billion accounts were affected. The National Public Data breach was announced in August 2024. The incident became public when personally identifiable information of individuals became available for sale on the dark web. Overall, security professionals estimate that nearly three billion personal records were leaked. The next most significant data leak was the March 2018 security breach of India's national ID database, Aadhaar, with over 1.1 billion records exposed. This included identification numbers and biometric information such as fingerprint scans, which could be used to open bank accounts and receive financial aid, among other government services.

Cybercrime - the dark side of digitalization

As the world continues its journey into the digital age, corporations and governments across the globe have been increasing their reliance on technology to collect, analyze and store personal data. This, in turn, has led to a rise in the number of cyber crimes, ranging from minor breaches to global-scale attacks impacting billions of users, as in the case of Yahoo. Within the U.S. alone, 1,802 cases of data compromise were reported in 2022. This was a marked increase from the 447 cases reported a decade prior.

The high price of data protection

As of 2022, the average cost of a single data breach across all industries worldwide stood at around 4.35 million U.S. dollars. Breaches were most costly in the healthcare sector, where each leak was reported to have cost the affected party 10.1 million U.S. dollars. The financial segment followed. Here, each breach resulted in a loss of approximately 6 million U.S. dollars, 1.5 million more than the global average.

All-time biggest online data breaches 2025

"Pwned Passwords" Dataset

AOL Search Data 20M web queries (2006)

CrackStation's Password Cracking Dictionary (Human Passwords Only)

Data from: Rockyou

Using Decision Trees to Detect and Isolate Leaks in the J-2X

DroidLeaks: A Large Collection of Resource Leak Bugs in Real-World Android...

Eximpedia Export Import Trade

Global exporters importers-export import data of Helium leak detector

Global exporters importers-export import data of Leak detector and Hsn Code...

Data from: Leak-resilient enzyme-free nucleic acid dynamical systems through...

Abbreviations

Basic commands

Produce-Helper Leak mechanism

Replication Data and Code for "Incentives and Information in Methane Leak...

Data from: Using controlled subsurface releases to investigate the effect of...

Replication Data for: Committee Decision-Making under the Threat of Leaks

Replication Data for: A cooperative model to lower cost and increase the...

Acoustic detection for undersea oil leaks: Bubble sound characterization and...

QICS Paper: Detection and monitoring of leaked CO2 through sediment, water...

Biggest risks to businesses worldwide 2018-2025