9 datasets found
  1. Yahoo Password Frequency Corpus

    • kaggle.com
    zip
    Updated Jan 14, 2019
    Cite
    Imani (2019). Yahoo Password Frequency Corpus [Dataset]. https://www.kaggle.com/datasets/drwardog/yahoo-password-frequency-corpus
    Available download formats: zip (0 bytes)
    Dataset updated
    Jan 14, 2019
    Authors
    Imani
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset includes sanitized password frequency lists collected from Yahoo in May 2011.

    For details of the original collection experiment, please see:

    Bonneau, Joseph. "The science of guessing: analyzing an anonymized corpus of 70 million passwords." IEEE Symposium on Security & Privacy, 2012. http://www.jbonneau.com/doc/B12-IEEESP-analyzing_70M_anonymized_passwords.pdf

    This data has been modified to preserve differential privacy. For details of this modification, please see:

    Jeremiah Blocki, Anupam Datta and Joseph Bonneau. "Differentially Private Password Frequency Lists." Network & Distributed Systems Symposium (NDSS), 2016. http://www.jbonneau.com/doc/BDB16-NDSS-pw_list_differential_privacy.pdf

    Each of the 51 .txt files represents one subset of all users' passwords observed during the experiment period. "yahoo-all.txt" includes all users; every other file represents a strict subset of that group.

    Content

    Each file is a series of lines of the format:

    FREQUENCY #OBSERVATIONS ...

    with FREQUENCY in descending order. For example, the file:

    3 1
    2 1
    1 3

    would represent the frequency list (3, 2, 1, 1, 1), that is, one password observed 3 times, one observed twice, and three separate passwords observed once each.
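
    The format described above can be read with a few lines of Python. This is a minimal sketch, not code shipped with the dataset: the function names are hypothetical, and it assumes that the first two whitespace-separated fields on each line are FREQUENCY and #OBSERVATIONS.

```python
from typing import Iterable, List, Tuple


def parse_frequency_lines(lines: Iterable[str]) -> List[Tuple[int, int]]:
    """Parse 'FREQUENCY #OBSERVATIONS' lines into (frequency, observations) pairs."""
    pairs = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumption: only the first two fields matter for this sketch.
        pairs.append((int(fields[0]), int(fields[1])))
    return pairs


def expand(pairs: List[Tuple[int, int]]) -> List[int]:
    """Expand the pairs into the full frequency list, one entry per distinct password."""
    result = []
    for frequency, observations in pairs:
        result.extend([frequency] * observations)
    return result


# The small example from the description, read as one pair per line.
example = ["3 1", "2 1", "1 3"]
pairs = parse_frequency_lines(example)
frequency_list = expand(pairs)
distinct_passwords = sum(n for _, n in pairs)
total_observations = sum(f * n for f, n in pairs)
print(frequency_list, distinct_passwords, total_observations)
```

    For a real file, the same function could be fed an open file handle, for example `parse_frequency_lines(open("yahoo-all.txt"))`.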

  2. Password Management Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Nov 22, 2024
    Cite
    Pro Market Reports (2024). Password Management Market Report [Dataset]. https://www.promarketreports.com/reports/password-management-market-7993
    Available download formats: doc, ppt, pdf
    Dataset updated
    Nov 22, 2024
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Product insights in the report cover various aspects of password management solutions, including:

    • Types: self-service password management, privileged user password management, cloud-based password management, on-premises password management
    • Features: password storage, password generation, password autofill, multi-factor authentication, single sign-on
    • Deployment models: cloud, on-premises, hybrid

    Recent developments include:

    • July 2022: Google updated its password managers by integrating various highly requested features to help consumers, like auto-login, credential saving, and password generation. This led to enhanced market growth owing to the higher utilization of the Google Chrome browser for web surfing and remote working.
    • June 2022: Lookout Inc. acquired SaferPass, which offers simple and secure password managers for enterprises and individuals. The acquisition helps in delivering proactive security platforms to safeguard user data and privacy while broadening the business footprint.
    • January 2022: Keeper Security launched Secrets Manager, which secures infrastructure credentials like API keys, certificates, access keys, and database passwords. The solution includes cloud-based integration with a zero-knowledge security model similar to the company's enterprise password management platform.

  3. Global Database Security Market Report 2025 Edition, Market Size, Share,...

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Jan 15, 2025
    Cite
    Cognitive Market Research (2025). Global Database Security Market Report 2025 Edition, Market Size, Share, CAGR, Forecast, Revenue [Dataset]. https://www.cognitivemarketresearch.com/database-security-market-report
    Available download formats: pdf, excel, csv, ppt
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    Market Summary of Database Security Market:

    • The Global Database Security market size in 2023 was XX Million. The Database Security Industry's compound annual growth rate (CAGR) will be XX% from 2024 to 2031.
    • The database security industry is expected to expand at a faster rate due to strict regulatory frameworks. The increase in advanced technology for better data protection is also driving the growth of the database security market.
    • The dominant segment is software. It includes encryption, auditing, tokenization, data masking, and access control management.
    • Due to the increase in internet users, remote working demand, and risk of data breaches, the COVID-19 pandemic has had a beneficial effect on the market for data security solutions.
    • North America dominates the database security market in terms of both revenue and market share. This can be attributed to the region's concentration of significant industry participants and increasing technical advancements in their product lines.

    Market Dynamics of Database Security Market:

    Key Drivers of Database Security Market:

    An increase in advanced technology for better protection of data is driving the growth of the Database security market
    

    Retail, banking, healthcare, and government are just a few of the industries where a strong data security plan could help companies stay compliant and lower their exposure to threats. When data is handled according to the principles of availability, confidentiality, and integrity, it becomes the most precious resource, aiding decision-making, the execution of strategic endeavors, and the development of closer relationships between companies and their clients.

    For instance, the supermassive "Mother of all Breaches" (MOAB) consists of records assembled and reindexed from thousands of leaks, breaches, and privately sold databases. The release includes information from multiple earlier breaches, totaling 12 terabytes of data covering 26 billion records. The leak is most likely the biggest found to date and includes user data from Tencent, Weibo, LinkedIn, Twitter, and other networks. (Source: https://cybernews.com/security/billions-passwords-credentials-leaked-mother-of-all-breaches/)

    Hence, the protection of data is of utmost importance in almost all sectors. Examples of data security technologies include hardware-based security, data backup and resilience, data erasure, data masking, encryption, firewalls, and authentication and authorization. Data security is essential to corporate development, operations, and financing. By securing their data, companies can better comply with regulatory standards and avoid data breaches and reputational harm.

    Modern encryption methods lock data with a key, making it accessible only to the key holder. Many databases encrypt data using AES-compliant standards. Encryption is especially robust against hardware loss, for example through theft: without the correct encryption key, the data remains protected.

    For instance, security company Baffle has released a method for protecting personal information used with generative artificial intelligence. Baffle Data Protection for AI integrates with existing data pipelines to help businesses expedite generative AI initiatives while assuring that their regulated data is compliant and cryptographically safe. According to Baffle, the method encrypts sensitive data using the Advanced Encryption Standard (AES) algorithm so that outside parties cannot view private information in plaintext. (Source: https://baffle.io/news/baffle-releases-encryption-solution-to-secure-data-for-generative-ai/) Hence, technology is playing an important role in reducing data breaches and protecting data, which in turn is expanding the market for database security as many companies require data protection.

    The Database Security Market is driven by the strict regulatory framework to address information security
    

    Regulatory frameworks can establish standards that developers and users must follow to guarantee a secure database. The market is growing as a result of increasingly stringent regulations enforced globally to protect sensitive data by governments and other relevant authorities in numerous nations. ...

  4. Data from: Malware Finances and Operations: a Data-Driven Study of the Value...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 20, 2023
    Cite
    Brumley, Billy (2023). Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8047204
    Dataset updated
    Jun 20, 2023
    Dataset provided by
    Brumley, Billy
    Nurmi, Juha
    Niemelä, Mikko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets demonstrate the malware economy and the value chain described in our paper, Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access, presented at the 12th International Workshop on Cyber Crime (IWCC 2023), part of the ARES Conference, and published in the ACM International Conference Proceedings Series (ICPS).

    Using the well-documented scripts, it is straightforward to reproduce our findings. It takes an estimated 1 hour of human time and 3 hours of computing time to duplicate our key findings from MalwareInfectionSet; around one hour with VictimAccessSet; and minutes to replicate the price calculations using AccountAccessSet. See the included README.md files and Python scripts.

    We choose to represent each victim by a single JavaScript Object Notation (JSON) data file. Data sources provide sets of victim JSON data files from which we've extracted the essential information and omitted Personally Identifiable Information (PII). We collected, curated, and modelled three datasets, which we publish under the Creative Commons Attribution 4.0 International License.
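
    Because each victim is a single JSON file, a first look at such a dataset can start by tallying which top-level keys occur across the files. The sketch below is an assumption-laden illustration, not one of the included scripts: the function name, the file names, and the example fields (`country`, `accounts`) are made up, and the real schema is documented in the dataset's README.md files.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path


def summarize_victim_files(directory: Path) -> Counter:
    """Count how often each top-level key appears across all *.json files."""
    key_counts = Counter()
    for path in sorted(directory.rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            record = json.load(handle)
        if isinstance(record, dict):
            key_counts.update(record.keys())
    return key_counts


# Self-contained demo using two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "victim_0001.json").write_text(
        json.dumps({"country": "FI", "accounts": 3}), encoding="utf-8"
    )
    (root / "victim_0002.json").write_text(
        json.dumps({"country": "SE"}), encoding="utf-8"
    )
    demo_counts = summarize_victim_files(root)

print(demo_counts)
```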

    1. MalwareInfectionSet: We discover (and, to the best of our knowledge, document scientifically for the first time) that malware networks appear to dump their data collections online. We collected these infostealer malware logs available for free. We utilise 245 malware log dumps from 2019 and 2020 originating from 14 malware networks. The dataset contains 1.8 million victim files, with a dataset size of 15 GB.

    2. VictimAccessSet: We demonstrate how infostealer malware networks sell access to infected victims. Genesis Market focuses on user-friendliness and continuous supply of compromised data. Marketplace listings include everything necessary to gain access to the victim's online accounts, including passwords and usernames, but also a detailed collection of information which provides a clone of the victim's browser session. Indeed, Genesis Market simplifies the import of compromised victim authentication data into a web browser session. We measure the prices on Genesis Market and how compromised device prices are determined. We crawled the website between April 2019 and May 2022, collecting the web pages offering the resources for sale. The dataset contains 0.5 million victim files, with a dataset size of 3.5 GB.

    3. AccountAccessSet: The Database marketplace operates inside the anonymous Tor network. Vendors offer their goods for sale, and customers can purchase them with Bitcoins. The marketplace sells online accounts, such as PayPal and Spotify, as well as private datasets, such as driver's licence photographs and tax forms. We then collect data from Database Market, where vendors sell online credentials, and investigate similarly. To build our dataset, we crawled the website between November 2021 and June 2022, collecting the web pages offering the credentials for sale. The dataset contains 33,896 victim files, with a dataset size of 400 MB.

    Credits

    Authors

    Billy Bob Brumley (Tampere University, Tampere, Finland)

    Juha Nurmi (Tampere University, Tampere, Finland)

    Mikko Niemelä (Cyber Intelligence House, Singapore)

    Funding

    This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under project numbers 804476 (SCARE) and 952622 (SPIRS).

    Alternative links to download: AccountAccessSet, MalwareInfectionSet, and VictimAccessSet.

  5. SQL Injection Attack Netflow

    • zenodo.org
    • data.niaid.nih.gov
    Updated Sep 28, 2022
    Cite
    Ignacio Crespo; Adrián Campazas (2022). SQL Injection Attack Netflow [Dataset]. http://doi.org/10.5281/zenodo.6907252
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ignacio Crespo; Adrián Campazas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union query SQL injection and Blind SQL injection. The SQLMAP tool was used to perform the attacks.

    NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for collecting and monitoring network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

    Datasets

    The first dataset (D1) was collected to train the detection models. The second (D2) was collected using different attacks than those used in training, to test the models and ensure their generalization.

    The datasets contain both benign and malicious traffic. All collected datasets are balanced.

    The version of NetFlow used to build the datasets is 5.

    Dataset | Aim | Samples | Benign-malicious traffic ratio
    D1 | Training | 400,003 | 50%
    D2 | Test | 57,239 | 50%

    Infrastructure and implementation

    Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.

    DOROTHEA is configured to use NetFlow v5 and to export a flow after it has been inactive for 15 seconds or active for 1800 seconds (30 minutes).
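
    The export rule described above can be illustrated with a small function. This is only a sketch of the two-timeout logic under the stated values; the names are hypothetical and it is not the actual ipt_netflow or DOROTHEA implementation.

```python
INACTIVE_TIMEOUT_S = 15    # export once no packet has been seen for this long
ACTIVE_TIMEOUT_S = 1800    # export once the flow has lasted this long (30 minutes)


def should_export(first_packet_s: float, last_packet_s: float, now_s: float) -> bool:
    """Return True when a flow should be exported under either timeout."""
    idle_for = now_s - last_packet_s      # seconds since the last packet
    active_for = now_s - first_packet_s   # seconds since the flow started
    return idle_for >= INACTIVE_TIMEOUT_S or active_for >= ACTIVE_TIMEOUT_S


# A flow that started at t=0 and last saw a packet at t=10:
still_open = should_export(0.0, 10.0, 20.0)    # idle 10 s, active 20 s
idle_expired = should_export(0.0, 10.0, 25.0)  # idle 15 s
# A long-lived flow that is still receiving packets:
active_expired = should_export(0.0, 1795.0, 1800.0)  # idle 5 s, active 1800 s
print(still_open, idle_expired, active_expired)
```

    One consequence of the active timeout is that a single long-lived connection can appear as several flow records in the exported data.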

    Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it forwards them to a NetFlow data generation node (packets received from the Internet are handled in the same way).

    The malicious traffic collected (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.

    The attacks were executed from 16 nodes, each launching SQLMAP with the parameters in the following table.

    Parameters | Description
    '--banner', '--current-user', '--current-db', '--hostname', '--is-dba', '--users', '--passwords', '--privileges', '--roles', '--dbs', '--tables', '--columns', '--schema', '--count', '--dump', '--comments' | Enumerate users, password hashes, privileges, roles, databases, tables and columns
    --level=5 | Increase the probability of a false positive identification
    --risk=3 | Increase the probability of extracting data
    --random-agent | Select the User-Agent randomly
    --batch | Never ask for user input, use the default behavior
    --answers="follow=Y" | Predefined answers to yes

    Every node executed SQLIA against 200 victim nodes. Each victim node deployed a web form vulnerable to Union-type injection attacks, connected to either a MySQL or a SQL Server database engine (50% of the victim nodes deployed MySQL and the other 50% deployed SQL Server).

    The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24.
    The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases.

    However, for D2, BlindSQL SQLIAs were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic generating nodes and 140.30.20.1/24 for victim nodes.
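
    The address spaces listed above can be used to label the endpoints of a flow record. The sketch below uses Python's standard `ipaddress` module; the dictionary layout, the role names, and the `classify` helper are assumptions made for illustration. Several of the networks are written in the text with host bits set (for example 182.168.1.1/24), so they are parsed with `strict=False`.

```python
import ipaddress

# Address spaces as stated in the dataset description.
ADDRESS_SPACES = {
    "D1": {
        "generators": ipaddress.ip_network("182.168.1.1/24", strict=False),
        "victims": ipaddress.ip_network("126.52.30.0/24"),
    },
    "D2": {
        "generators": ipaddress.ip_network("152.148.48.1/24", strict=False),
        "victims": ipaddress.ip_network("140.30.20.1/24", strict=False),
    },
}


def classify(dataset: str, address: str) -> str:
    """Label an IP address as a traffic generator, a victim, or external."""
    ip = ipaddress.ip_address(address)
    for role, network in ADDRESS_SPACES[dataset].items():
        if ip in network:
            return role
    return "external"


print(classify("D1", "182.168.1.77"))
print(classify("D1", "126.52.30.9"))
print(classify("D2", "8.8.8.8"))
```

    Note that benign and malicious generators share one address space in each dataset, so the subnet alone cannot separate benign from malicious flows.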

    MariaDB version 10.4.12 served as the MySQL server. Microsoft SQL Server 2017 Express and PostgreSQL version 13 were used for the other database engines.

  6. Identity as Service Market By Component Type (Provisioning, Directory...

    • zionmarketresearch.com
    pdf
    Updated Mar 17, 2025
    Cite
    Zion Market Research (2025). Identity as Service Market By Component Type (Provisioning, Directory services, Password management, Single sign-on, Advanced authentication and Others), By End Use (BFSI, IT & Telecom, Public, Healthcare, Retail, Education, and Manufacturing): Global Industry Perspective, Comprehensive Analysis, and Forecast, 2024 - 2032 [Dataset]. https://www.zionmarketresearch.com/report/identity-as-a-service-market
    Available download formats: pdf
    Dataset updated
    Mar 17, 2025
    Dataset provided by
    Authors
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The Identity as Service market is set to expand from $6.53 billion in 2023 to $57.73 billion by 2032, a CAGR of around 27.4% from 2024 to 2032.
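
    The quoted growth rate can be sanity-checked with the standard compound annual growth rate formula, (end / start) ** (1 / years) - 1. The sketch below assumes nine compounding years (2023 to 2032); the function name is illustrative, and the same check applies to the other market figures in this listing.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the description, in billions of US dollars.
rate = cagr(6.53, 57.73, 2032 - 2023)
print(f"{rate:.1%}")
```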

  7. Identity and Access Management Market By Component (Single Sign-On,...

    • zionmarketresearch.com
    pdf
    Updated Mar 17, 2025
    Cite
    Zion Market Research (2025). Identity and Access Management Market By Component (Single Sign-On, Directory Services, Advanced Authentication, Audit, Compliance, and Governance, and Password Management), By Deployment (On-Premises and Cloud), and By Vertical (BFSI, Telecom and IT, Retail and CPG, Public Sector and Utilities, Energy, Education, Manufacturing, Healthcare and Life Sciences, and Others): Global Industry Perspective, Comprehensive Analysis, and Forecast, 2024-2032. [Dataset]. https://www.zionmarketresearch.com/report/identity-access-management-market
    Available download formats: pdf
    Dataset updated
    Mar 17, 2025
    Dataset provided by
    Authors
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The global Identity and Access Management market was valued at US$ 18.30 billion in 2023 and is set to reach US$ 67.04 billion by 2032, at a CAGR of about 15.52% from 2024 to 2032.

  8. Blockchain Identity Management Market by Deployment (Public, Private, and...

    • zionmarketresearch.com
    pdf
    Updated Mar 17, 2025
    Cite
    Zion Market Research (2025). Blockchain Identity Management Market by Deployment (Public, Private, and Hybrid Cloud), by Application (Multi-Factor Authentication, Password Management, Access Management, Directory Services, and Others), and by Vertical (BFSI, Retail, Logistics, Healthcare, Government, IT and Telecom, Media and Entertainment, Travel and Hospitality, and Others): Global Industry Perspective, Comprehensive Analysis, and Forecast, 2017-2024 [Dataset]. https://www.zionmarketresearch.com/report/blockchain-identity-management-market
    Available download formats: pdf
    Dataset updated
    Mar 17, 2025
    Dataset provided by
    Authors
    Zion Market Research
    License

    https://www.zionmarketresearch.com/privacy-policy

    Time period covered
    2022 - 2030
    Area covered
    Global
    Description

    The global Blockchain Identity Management market was valued at around $1.38 billion in 2023 and is expected to reach $495.46 billion by 2032, with a projected CAGR of 92.30%.

  9. Penalties issued to Meta for EU GDPR violations 2024

    • statista.com
    Updated Nov 15, 2024
    Cite
    Statista (2024). Penalties issued to Meta for EU GDPR violations 2024 [Dataset]. https://www.statista.com/statistics/1192794/meta-fines-from-eu-and-dpc/
    Dataset updated
    Nov 15, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Mar 2022 - Sep 2024
    Area covered
    Europe
    Description

    In September 2024, the Irish Data Protection Commission fined Meta Ireland 91 million euros after passwords of social media users were stored in 'plaintext' on Meta's internal systems rather than with cryptographic protection or encryption. In May 2023, the EU fined Meta 1.2 billion euros for violating laws on digital privacy and putting the data of EU citizens at risk through Facebook's EU-U.S. data transfers. European privacy legislation is seen as being far stricter than American privacy law, and the sending of EU citizens’ data to the United States resulted in the record-breaking penalty being issued to the tech giant. In January 2023, after it was discovered that Meta Platforms had improperly required that users of Facebook, Instagram, and WhatsApp accept personalized adverts to use the platforms, the company was issued a 390 million euro fine by the European Commission. EU regulators claim that the social media giant broke the General Data Protection Regulation (GDPR) by including the demand in its terms of service. In addition, Meta was fined 405 million euros by the Irish Data Protection Commission (DPC) in September 2022 for violating Instagram's children's privacy settings. In November 2022, the DPC fined Meta a further 265 million euros for failing to protect their users from data scraping.

    GDPR violations in 2022

    Social media sites and companies are not the only types of online services upon which users' data can potentially be compromised. In 2022, the online service with the biggest fine for violating GDPR was e-commerce and digital powerhouse Amazon, which was issued a 746 million euro fine. Furthermore, in December 2021, Google was penalized 90 million euros for GDPR violations.

    What are the most common GDPR violations?

    Since GDPR went into effect in May 2018, fines have been imposed for a variety of reasons. As of June 2022, companies' non-compliance with general data processing principles accounted for the largest share of fines, resulting in over 845 million euros worth of penalties. Insufficient legal basis for data processing was the second most common violation, amounting to 447 million euros in fines.

