100+ datasets found
  1. Global number of breached user accounts Q1 2020-Q3 2025

    • statista.com
    Updated Oct 14, 2025
    Cite
    Statista (2025). Global number of breached user accounts Q1 2020-Q3 2025 [Dataset]. https://www.statista.com/statistics/1307426/number-of-data-breaches-worldwide/
    Dataset updated
    Oct 14, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    During the third quarter of 2025, data breaches exposed more than ** million records worldwide. Since the first quarter of 2020, the highest number of data records was exposed in the third quarter of ****, at more than **** billion data sets. Data breaches remain among the biggest concerns of company leaders worldwide. The most common causes of sensitive information loss were operating system vulnerabilities on endpoint devices.

    Which industries see the most data breaches?

    Certain conditions make some industry sectors more prone to data breaches than others. According to the latest observations, public administration experienced the highest number of data breaches between 2021 and 2022, with *** reported data breach incidents with confirmed data loss. Financial institutions came second, with *** data breach cases, followed by healthcare providers.

    Data breach cost

    Data breach incidents have various consequences, the most common being financial losses and business disruptions. As of 2023, the average data breach cost across businesses worldwide was **** million U.S. dollars, while a leaked data record cost about *** U.S. dollars. The United States saw the highest average breach cost globally, at **** million U.S. dollars.

  2. Hacking Statistics By Cost, Email, Social Media Hacking and Key Hacking...

    • sci-tech-today.com
    Updated Nov 20, 2025
    Cite
    Sci-Tech Today (2025). Hacking Statistics By Cost, Email, Social Media Hacking and Key Hacking Prevention [Dataset]. https://www.sci-tech-today.com/stats/hacking-statistics/
    Dataset updated
    Nov 20, 2025
    Dataset authored and provided by
    Sci-Tech Today
    License

    https://www.sci-tech-today.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Introduction

    Hacking Statistics: In 2024, cybercrime continues to be a growing concern globally, with hacking as one of the most prevalent forms of cyber threats. Hackers have become increasingly sophisticated, targeting both individuals and organisations. The rise in digital activities has led to an increase in hacking incidents, affecting individuals, businesses, and governments worldwide.

    Recent statistics reveal that hacking is responsible for a significant percentage of data breaches, which cause billions of dollars in damages. Understanding the latest hacking trends is crucial for implementing effective security measures to safeguard personal and organisational data.

  3. Global cyber incidents 2024, by type

    • statista.com
    Updated May 30, 2025
    Cite
    Statista (2025). Global cyber incidents 2024, by type [Dataset]. https://www.statista.com/statistics/1483769/global-cyber-incidents-by-type/
    Dataset updated
    May 30, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Oct 2023 - Sep 2024
    Area covered
    Worldwide
    Description

    As of September 2024, almost 30 percent of cyber incidents detected in the past 12 months were hacking incidents. A further 28.7 percent were incidents of misuse, and 15.2 percent of detections revealed malware attacks.

  4. 🌐 Global Cybersecurity Threats (2015-2024)

    • kaggle.com
    zip
    Updated Mar 16, 2025
    Cite
    Atharva Soundankar (2025). 🌐 Global Cybersecurity Threats (2015-2024) [Dataset]. https://www.kaggle.com/datasets/atharvasoundankar/global-cybersecurity-threats-2015-2024
    Available download formats: zip (48178 bytes)
    Dataset updated
    Mar 16, 2025
    Authors
    Atharva Soundankar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📂

    The Global Cybersecurity Threats Dataset (2015-2024) provides extensive data on cyberattacks, malware types, targeted industries, and affected countries. It is designed for threat intelligence analysis, cybersecurity trend forecasting, and machine learning model development to enhance global digital security.

    📊 Column Descriptions

    Column Name: Description
    • Country: Country where the attack occurred
    • Year: Year of the incident
    • Threat Type: Type of cybersecurity threat (e.g., Malware, DDoS)
    • Attack Vector: Method of attack (e.g., Phishing, SQL Injection)
    • Affected Industry: Industry targeted (e.g., Finance, Healthcare)
    • Data Breached (GB): Volume of data compromised
    • Financial Impact ($M): Estimated financial loss in millions
    • Severity Level: Low, Medium, High, Critical
    • Response Time (Hours): Time taken to mitigate the attack
    • Mitigation Strategy: Countermeasures taken
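
The columns described above lend themselves to simple aggregation. The following is a minimal sketch, assuming the CSV header matches the column names exactly as listed in this description (the actual file may name them differently). The sample rows, country names, and the helper `impact_by_threat_type` are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. Only the header names come from the column
# descriptions above; the values are invented for illustration.
SAMPLE_CSV = """Country,Year,Threat Type,Attack Vector,Affected Industry,Data Breached (GB),Financial Impact ($M),Severity Level,Response Time (Hours),Mitigation Strategy
Exampleland,2021,Malware,Phishing,Finance,12.5,3.2,High,18,Patching
Exampleland,2022,DDoS,SQL Injection,Healthcare,0.0,1.1,Medium,6,Firewall
Otherland,2022,Malware,Phishing,Finance,40.0,7.8,Critical,30,Containment
"""


def impact_by_threat_type(csv_text):
    """Sum the 'Financial Impact ($M)' column for each 'Threat Type'."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Threat Type"]] += float(row["Financial Impact ($M)"])
    return dict(totals)


totals = impact_by_threat_type(SAMPLE_CSV)
for threat, impact in sorted(totals.items()):
    print(f"{threat}: {impact:.1f} $M")
```

To run this against the downloaded file, replace `io.StringIO(csv_text)` with an opened file handle and adjust the header names to whatever the file actually contains.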
  5. Number of data compromises and impacted individuals in U.S. 2005-2024

    • statista.com
    Cite
    Statista, Number of data compromises and impacted individuals in U.S. 2005-2024 [Dataset]. https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    In 2024, the number of data compromises in the United States stood at 3,158 cases. Meanwhile, over 1.35 billion individuals were affected in the same year by data compromises, including data breaches, leakage, and exposure. While these are three different events, they have one thing in common: in all three, sensitive data is accessed by an unauthorized threat actor.

    Industries most vulnerable to data breaches

    Some industry sectors usually see more significant cases of private data violations than others. This is determined by the type and volume of the personal information that organizations in these sectors store. In 2024, financial services, healthcare, and professional services were the three industry sectors that recorded the most data breaches. Overall, the number of healthcare data breaches in some industry sectors in the United States has gradually increased within the past few years. However, some sectors saw a decrease.

    Largest data exposures worldwide

    In 2020, an adult streaming website, CAM4, experienced a leakage of nearly 11 billion records. This is by far the most extensive reported data leakage. The case is unique, though, because cyber security researchers found the vulnerability before cyber criminals did. The second-largest data breach is the Yahoo data breach, dating back to 2013. The company first reported about one billion exposed records, then in 2017 updated the number of leaked records to three billion. In March 2018, the third-biggest data breach happened, involving India’s national identification database Aadhaar. As a result of this incident, over 1.1 billion records were exposed.

  6. Cyber Security

    • kaggle.com
    zip
    Updated Jan 29, 2024
    Cite
    Rishi Kumar (2024). Cyber Security [Dataset]. https://www.kaggle.com/datasets/rishikumarrajvansh/cyber-security
    Available download formats: zip (8913512 bytes)
    Dataset updated
    Jan 29, 2024
    Authors
    Rishi Kumar
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Business Context: We are in a time where businesses are more digitally advanced than ever, and as technology improves, organizations’ security postures must be enhanced as well. Failure to do so could result in a costly data breach, as we’ve seen happen with many businesses. The cybercrime landscape has evolved, and threat actors are going after any type of organization, so in order to protect your business’s data, money and reputation, it is critical that you invest in an advanced security system. Cyber security can be described as the collective methods, technologies, and processes that help protect the confidentiality, integrity, and availability of computer systems, networks and data against cyber-attacks or unauthorized access.

    Information Security vs. Cyber Security vs. Network Security

    Information security (also known as InfoSec) ensures that both physical and digital data is protected from unauthorized access, use, disclosure, disruption, modification, inspection, recording or destruction. Information security differs from cyber security in that InfoSec aims to keep data in any form secure, whereas cyber security protects only digital data.

    Cyber security, a subset of information security, is the practice of defending your organization’s networks, computers and data from unauthorized digital access, attack or damage by implementing various processes, technologies and practices. With countless sophisticated threat actors targeting all types of organizations, it is critical that your IT infrastructure is secured at all times to prevent a full-scale attack on your network, which would risk exposing your company’s data and reputation.

    Network security, a subset of cyber security, aims to protect any data that is being sent through devices in your network to ensure that the information is not changed or intercepted. The role of network security is to protect the organization’s IT infrastructure from all types of cyber threats, including:

    • Viruses, worms and Trojan horses
    • Zero-day attacks
    • Hacker attacks
    • Denial of service attacks
    • Spyware and adware

    Your network security team implements the hardware and software necessary to guard your security architecture. With the proper network security in place, your system can detect emerging threats before they infiltrate your network and compromise your data. There are many components to a network security system that work together to improve your security posture. The most common network security components include:

    • Firewalls
    • Anti-virus software
    • Intrusion detection and prevention systems (IDS/IPS)
    • Virtual private networks (VPN)

    Network Intrusions vs. Computer Intrusions vs. Cyber Attacks

    1. Computer Intrusions: Computer intrusions occur when someone tries to gain access to any part of your computer system. Computer intruders or hackers typically use automated computer programs when they try to compromise a computer’s security. There are several ways an intruder can try to gain access to your computer. They can:

    • Access your computer to view, change, or delete information on it
    • Crash or slow down your computer
    • Access your private data by examining the files on your system
    • Use your computer to access other computers on the Internet

    2. Network Intrusions: A network intrusion refers to any unauthorized activity on a digital network. Network intrusions often involve stealing valuable network resources and almost always jeopardize the security of networks and/or their data. In order to proactively detect and respond to network intrusions, organizations and their cyber security teams need a thorough understanding of how network intrusions work, and need to implement network intrusion detection and response systems that are designed with attack techniques and cover-up methods in mind.

    Network Intrusion Attack Techniques: Given the amount of normal activity constantly taking place on digital networks, it can be very difficult to pinpoint anomalies that could indicate a network intrusion has occurred. Below are some of the most common network intrusion attack techniques that organizations should continually look for:

    • Living Off the Land: Attackers increasingly use existing tools and processes and stolen credentials when compromising networks. These tools, such as operating system utilities, business productivity software and scripting languages, are clearly not malware and have legitimate uses as well. In fact, in most cases the vast majority of their usage is business justified, allowing an attacker to blend in.
    • Multi-Routing: If a network allows for asymmetric routing, attackers will often leverage multiple routes to access the targeted device or network. This allows them to avoid detection by having a large portion of suspicious packets bypass certain network segments and any relevant network intrusion systems.
    • Buffer Overwrit...

  7. Global Password Hacking Software Market Risk Analysis 2025-2032

    • statsndata.org
    excel, pdf
    Updated Oct 2025
    Cite
    Stats N Data (2025). Global Password Hacking Software Market Risk Analysis 2025-2032 [Dataset]. https://www.statsndata.org/report/password-hacking-software-market-59196
    Available download formats: pdf, excel
    Dataset updated
    Oct 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Password Hacking Software market has evolved significantly in recent years as both a response to and a driver of increasing cybersecurity threats. This software is primarily utilized by security professionals and ethical hackers to assess the robustness of password security systems, identify vulnerabilities, and

  8. U.S. health data breaches caused by hacking 2014 - H1 2024

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). U.S. health data breaches caused by hacking 2014 - H1 2024 [Dataset]. https://www.statista.com/statistics/972228/health-data-breaches-caused-by-hacking-us/
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    In the first half of 2024, the share of health-related U.S. data breaches caused by hacking was ** percent, which marked a *** percent increase from 2023, reaching its highest rate since 2014.

  9. Data Breaches Statistics and Facts

    • market.biz
    Updated Nov 20, 2025
    Cite
    Market.biz (2025). Data Breaches Statistics and Facts [Dataset]. https://market.biz/data-breaches-statistics/
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    Market.biz
    License

    https://market.biz/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Europe, North America, South America, Australia, Africa, Asia
    Description

    Introduction

    Data Breaches Statistics: In recent years, data breaches have emerged as a major threat to both businesses and individuals. As the digital world grows, the frequency, scale, and impact of these breaches have surged, resulting in significant financial, reputational, and legal repercussions for organizations. The number of data breaches hit record highs, compromising millions of sensitive records.

    This increase can be attributed to several factors, including rising cybercrime, inadequate data security practices, and the growing sophistication of hacking techniques. Data from cybersecurity experts reveal a notable rise in breaches within the healthcare, financial, and retail industries.

    As the fallout from these breaches intensifies, it has become increasingly important for both businesses and consumers to understand the trends and scope of these incidents. This introduction will explore the latest statistics and developments, offering insights into the evolving landscape of data breaches and their wide-reaching effects.

  10. Multi-Step Cyber-Attack Dataset (MSCAD)

    • kaggle.com
    Updated Jun 19, 2022
    Cite
    Dr. Jamail Al-Sawwa (2022). Multi-Step Cyber-Attack Dataset (MSCAD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/3830715
    Dataset updated
    Jun 19, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dr. Jamail Al-Sawwa
    Description

    There are seven files in this dataset: MSCAD.xlsx, N-0, Scan-1, App-01, App-02, W-B-01, W-B-02:

    • MSCAD.xlsx: the labeled version of the dataset. The six PCAP files were processed using Wireshark. During processing, we analyzed the timestamps of the network traffic (malicious and normal) in order to label it. The resulting dataset (MSCAD) contains 77 features (network parameters) with labels.

    • N-0: normal traffic.

    • Scan-1: port scan traffic (Full, SYN, FIN, and UDP Scan).

    • App-01: app-based DDoS (HTTP Slowloris DDoS).

    • App-02: volume-based DDoS (ICMP Flood).

    • W-B-01: web crawling.

    • W-B-02: password cracking (Brute Force).

    The MSCAD includes two multi-step cyber-attack scenarios, performed as follows:

    • Multi-step Attack Scenario A: In this scenario, an attacker aims to perform a password cracking attack (Brute Force) on any host within the victim network. The attacker executes the attack in three sequential steps. First, the port scans were executed simultaneously. Second, the HTTrack Website Copier was used as a website crawler tool to take an offline copy of the web application pages. Finally, the brute-force script was executed; a password list of 47 entries and a user list of 10 entries resulted in 470 attempts to crack the password.

    • Multi-step Attack Scenario B: In this scenario, the attacker aims to execute a volume-based DDoS on any host within the victim network, again in three sequential steps. The first step is to execute the port scan attacks (Full, SYN, FIN, and UDP Scan) simultaneously. The next step is to launch the app-based DDoS attack using HTTP Slowloris. Finally, the volume-based DDoS attack is executed using the Radware tool. This scenario took an hour, and three hosts (192.168.159.131, 192.168.159.14, and 192.168.159.16) were infected by the volume-based DDoS attack.
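
The figure of 470 attempts in Scenario A is consistent with trying every user against every password (10 × 47). A minimal sketch of that arithmetic follows, assuming an exhaustive pairing. The credential strings are placeholders; only the list sizes come from the description.

```python
from itertools import product

# Placeholder credentials: only the list sizes (10 users, 47 passwords)
# come from the dataset description; the strings are invented.
users = [f"user{i}" for i in range(10)]
passwords = [f"pass{i}" for i in range(47)]

# Every (user, password) pair is one login attempt in an exhaustive run.
attempts = list(product(users, passwords))
print(len(attempts))  # 470
```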

    The MSCAD dataset is publicly available for researchers. If you are using our dataset, you should cite our related research paper that outlines the details of the dataset and its underlying principles:

    Link to Paper: Generating a Benchmark Cyber Multi-Step Attacks Dataset for Intrusion Detection

    Citation:

    1) Almseidin, Mohammad, Al-Sawwa, Jamil, and Alkasassbeh, Mouhammd. ‘Generating a Benchmark Cyber Multi-step Attacks Dataset for Intrusion Detection’. 1 Jan. 2022 : 1 – 15.

    2) Dr. Jamil Al-Sawwa, Dr. Mohammad Almseidin, & Dr. Mouhammd Alkasassbeh. (2022). Multi-Step Cyber-Attack Dataset (MSCAD) [Data set]. Kaggle. https://doi.org/10.34740/KAGGLE/DSV/3830715

  11. Data from: Health IT, hacking, and cybersecurity: national trends in data...

    • datadryad.org
    • search.dataone.org
    zip
    Updated May 25, 2019
    Cite
    Jay G. Ronquillo; J. Erik Winterholler; Kamil Cwikla; Raphael Szymanski; Christopher Levy (2019). Health IT, hacking, and cybersecurity: national trends in data breaches of protected health information [Dataset]. http://doi.org/10.5061/dryad.24275c6
    Available download formats: zip
    Dataset updated
    May 25, 2019
    Dataset provided by
    Dryad
    Authors
    Jay G. Ronquillo; J. Erik Winterholler; Kamil Cwikla; Raphael Szymanski; Christopher Levy
    Time period covered
    May 24, 2018
    Description

    Objective: The rapid adoption of health information technology (IT), coupled with growing reports of ransomware and hacking, has made cybersecurity a priority in health care. This study leverages federal data in order to better understand current cybersecurity threats in the context of health IT.

    Materials and Methods: Retrospective observational study of all available reported data breaches in the United States from 2013 to 2017, downloaded from a publicly available federal regulatory database.

    Results: There were 1512 data breaches affecting 154 415 257 patient records from a heterogeneous distribution of covered entities (P < .001). There were 128 electronic medical record-related breaches of 4 867 920 patient records, while 363 hacking incidents affected 130 702 378 records.

    Discussion and Conclusion: Despite making up less than 25% of all breaches, hacking was responsible for nearly 85% of all affected patient records. As medicine becomes increasingly interconnected and inform...
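
The two percentages in the conclusion can be checked directly from the counts reported in the Results paragraph. This is only a worked arithmetic check; the variable names are illustrative.

```python
# Counts taken from the Results paragraph above.
total_breaches = 1512
total_records = 154_415_257
hacking_breaches = 363
hacking_records = 130_702_378

share_of_breaches = hacking_breaches / total_breaches
share_of_records = hacking_records / total_records

# Prints: 24.0% of breaches, 84.6% of records
print(f"{share_of_breaches:.1%} of breaches, {share_of_records:.1%} of records")
```

The result matches the text: hacking accounts for just under a quarter of breaches but roughly 85% of affected records.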

  12. U.S. number of data sets affected in data breaches Q1 2020-Q2 2025

    • statista.com
    Updated Mar 27, 2025
    Cite
    Ani Petrosyan (2025). U.S. number of data sets affected in data breaches Q1 2020-Q2 2025 [Dataset]. https://www.statista.com/topics/3387/us-government-and-cyber-crime/
    Dataset updated
    Mar 27, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Ani Petrosyan
    Area covered
    United States
    Description

    Between the third quarter of 2024 and the second quarter of 2025, the number of records exposed in data breaches in the United States decreased significantly. In the most recent measured period, over 16.9 million records were reported as leaked, down from around 494.17 million in the third quarter of 2024.
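
To put "decreased significantly" in numbers, the relative change can be computed from the two figures quoted above. Both are approximate in the source ("over 16.9 million", "around 494.17 million"), so the result is an estimate.

```python
# Approximate figures from the description, in millions of records.
records_q3_2024 = 494.17
records_q2_2025 = 16.9

# Relative decrease between the two quarters.
decrease = (records_q3_2024 - records_q2_2025) / records_q3_2024
print(f"Decrease: {decrease:.1%}")
```

On these figures the drop is roughly 97 percent.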

  13. Data from: Cyber Attack Evaluation Dataset for Deep Packet Inspection and...

    • data.mendeley.com
    Updated Oct 18, 2022
    Cite
    Shishir Kumar Shandilya (2022). Cyber Attack Evaluation Dataset for Deep Packet Inspection and Analysis [Dataset]. http://doi.org/10.17632/3szjvt3w78.1
    Dataset updated
    Oct 18, 2022
    Authors
    Shishir Kumar Shandilya
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    To determine the effectiveness of any defense mechanism, there is a need for comprehensive real-time network data that solely references various attack scenarios based on older software versions, unprotected ports, and so on. The presented dataset contains the entire network data captured at the time of several cyber attacks, to enable experimentation on the challenges of implementing defense mechanisms on a larger scale. To collect the data, we captured the network traffic of configured virtual machines using Wireshark and tcpdump.

    To analyze the impact of several cyber attack scenarios, the dataset presents a set of ten computers connected to Router1 on VLAN1 in a Docker Bridge network that try to exploit each other. The activity includes browsing the web and downloading foreign packages, including malicious ones. Services such as FTP and SSH were also exploited using several attack mechanisms. By running attack tactics against older versions of packages as well as newer, updated ones, the dataset shows the importance of updating and patching systems.

    The dataset also includes an Apache Server hosted on a different subset on VLAN2, which is connected to VLAN1 to demonstrate isolation and cross-VLAN communication. The services on this web server were also exploited by the ten computers described above. The attack types include:

    • Distributed Denial of Service
    • SQL Injection
    • Account Takeover
    • Service Exploitation (SSH, FTP)
    • DNS and ARP Spoofing
    • Scanning and Firewall Searching and Indexing (using Nmap)
    • Hammering the services to brute-force passwords and usernames
    • Malware attack
    • Spoofing and Man-in-the-Middle Attack

    The attack scenarios also show various scanning mechanisms and the impact of Insider Threats on the entire network.

  14. Cybersecurity 🪪 Intrusion 🦠 Detection Dataset

    • kaggle.com
    Updated Feb 10, 2025
    Cite
    Dinesh Naveen Kumar Samudrala (2025). Cybersecurity 🪪 Intrusion 🦠 Detection Dataset [Dataset]. https://www.kaggle.com/datasets/dnkumars/cybersecurity-intrusion-detection-dataset
    Dataset updated
    Feb 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dinesh Naveen Kumar Samudrala
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This Cybersecurity Intrusion Detection Dataset is designed for detecting cyber intrusions based on network traffic and user behavior. Below, I’ll explain each aspect in detail, including the dataset structure, feature importance, possible analysis approaches, and how it can be used for machine learning.

    1. Understanding the Features

    The dataset consists of network-based and user behavior-based features. Each feature provides valuable information about potential cyber threats.

    A. Network-Based Features

    These features describe network-level information such as packet size, protocol type, and encryption methods.

    1. network_packet_size (Packet Size in Bytes)

      • Represents the size of network packets, ranging from 64 to 1500 bytes.
      • Packets on the lower end (~64 bytes) may indicate control messages, while larger packets (~1500 bytes) often carry bulk data.
      • Attackers may use abnormally small or large packets for reconnaissance or exploitation attempts.
    2. protocol_type (Communication Protocol)

      • The protocol used in the session: TCP, UDP, or ICMP.
      • TCP (Transmission Control Protocol): Reliable, connection-oriented (common for HTTP, HTTPS, SSH).
      • UDP (User Datagram Protocol): Faster but less reliable (used for VoIP, streaming).
      • ICMP (Internet Control Message Protocol): Used for network diagnostics (ping); often abused in Denial-of-Service (DoS) attacks.
    3. encryption_used (Encryption Protocol)

      • Values: AES, DES, None.
      • AES (Advanced Encryption Standard): Strong encryption, commonly used.
      • DES (Data Encryption Standard): Older encryption, weaker security.
      • None: Indicates unencrypted communication, which can be risky.
      • Attackers might use no encryption to avoid detection or weak encryption to exploit vulnerabilities.

    B. User Behavior-Based Features

    These features track user activities, such as login attempts and session duration.

    1. login_attempts (Number of Logins)

      • High values might indicate brute-force attacks (repeated login attempts).
      • Typical users have 1–3 login attempts, while an attack may have hundreds or thousands.
    2. session_duration (Session Length in Seconds)

      • A very long session might indicate unauthorized access or persistence by an attacker.
      • Attackers may try to stay connected to maintain access.
    3. failed_logins (Failed Login Attempts)

      • High failed login counts indicate credential stuffing or dictionary attacks.
      • Many failed attempts followed by a successful login could suggest an account was compromised.
    4. unusual_time_access (Login Time Anomaly)

      • A binary flag (0 or 1) indicating whether access happened at an unusual time.
      • Attackers often operate outside normal business hours to evade detection.
    5. ip_reputation_score (Trustworthiness of IP Address)

      • A score from 0 to 1, where higher values indicate suspicious activity.
      • IP addresses associated with botnets, spam, or previous attacks tend to have higher scores.
    6. browser_type (User’s Browser)

      • Common browsers: Chrome, Firefox, Edge, Safari.
      • Unknown: Could be an indicator of automated scripts or bots.

    2. Target Variable (attack_detected)

    • Binary classification: 1 means an attack was detected, 0 means normal activity.
    • The dataset is useful for supervised machine learning, where a model learns from labeled attack patterns.

    3. Possible Use Cases

    This dataset can be used for intrusion detection systems (IDS) and cybersecurity research. Some key applications include:

    A. Machine Learning-Based Intrusion Detection

    1. Supervised Learning Approaches

      • Classification Models (Logistic Regression, Decision Trees, Random Forest, XGBoost, SVM)
      • Train the model using labeled data (attack_detected as the target).
      • Evaluate using accuracy, precision, recall, F1-score.
    2. Deep Learning Approaches

      • Use Neural Networks (DNN, LSTM, CNN) for pattern recognition.
      • LSTMs work well for time-series-based network traffic analysis.
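
The evaluation metrics named above (accuracy, precision, recall, F1-score) can be computed from a confusion matrix without any external library. The following is a minimal sketch, assuming binary labels where 1 means an attack was detected, as in the `attack_detected` target. The helper `classification_metrics` and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, left alone
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed attacks

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels for illustration: one missed attack, one false alarm.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

For intrusion detection, recall (the share of attacks caught) and precision (the share of alerts that are real) usually matter more than raw accuracy, because attack and normal classes are often imbalanced.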

    B. Anomaly Detection (Unsupervised Learning)

    If attack labels are missing, anomaly detection can be used:

    • Autoencoders: Learn normal traffic and flag anomalies.
    • Isolation Forest: Detects outliers based on feature isolation.
    • One-Class SVM: Learns normal behavior and detects deviations.

    C. Rule-Based Detection

    • If certain thresholds are met (e.g., failed_logins > 10 & ip_reputation_score > 0.8), an alert is triggered.
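
The threshold rule above translates almost directly into code. A minimal sketch, using the feature names `failed_logins` and `ip_reputation_score` from the dataset description and the example thresholds (10 and 0.8) from the rule. The function name `rule_based_alert` and the sample sessions are hypothetical.

```python
def rule_based_alert(session, max_failed_logins=10, max_ip_score=0.8):
    """Return True when both thresholds from the example rule are exceeded."""
    return (session["failed_logins"] > max_failed_logins
            and session["ip_reputation_score"] > max_ip_score)


# Invented sessions for illustration.
sessions = [
    {"failed_logins": 2, "ip_reputation_score": 0.10},   # ordinary user
    {"failed_logins": 25, "ip_reputation_score": 0.95},  # both thresholds exceeded
    {"failed_logins": 25, "ip_reputation_score": 0.30},  # many failures, low-risk IP score
]

alerts = [rule_based_alert(s) for s in sessions]
print(alerts)  # [False, True, False]
```

Because the rule uses a logical AND, the third session does not trigger an alert despite its many failed logins; whether that is desirable depends on how much weight the reputation score deserves.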

    4. Challenges & Considerations

    • Adversarial Attacks: Attackers may modify traffic to evade detection.
    • Concept Drift: Cyber threats...
  15. Cyber Crimes Dataset

    • kaggle.com
    zip
    Updated Sep 6, 2024
    Cite
    Shakirul_09 (2024). Cyber Crimes Dataset [Dataset]. https://www.kaggle.com/datasets/shakirul09/cyber-crimes-dataset
    Available download formats: zip (3997242 bytes)
    Dataset updated
    Sep 6, 2024
    Authors
    Shakirul_09
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The dataset contains the following columns , each described below :

    Attack Type: Randomly selected from a broad set of attack types (e.g., phishing, DDoS, malware, etc.). Target System: Corporate IT systems such as servers, databases, user accounts, APIs, and more. Outcome: Whether the attack succeeded or failed. Timestamp: Time of the attack, randomly distributed over the past year. Attacker IP Address: Simulated attacker IP addresses. Target IP Address: Random IP addresses representing internal or external targets. Data Compromised: Amount of data compromised (in gigabytes) if the attack succeeded. Attack Duration: Time the attack lasted (in minutes). Security Tools Used: Various defense mechanisms like firewalls, IDS, antivirus, etc. User Role: The role of the user impacted by the attack (admin, employee, or external user). Location: Country or region where the attack originated or targeted. Attack Severity: Numerical indicator of the severity level (e.g., scale from 1-10). Industry: Type of industry targeted, such as healthcare, finance, government, etc. Response Time: Time taken by the security team to respond (in minutes). Mitigation Method: Steps taken to mitigate the attack (patching, containment, etc.)

    Acknowledgement

    This dataset is a synthetic creation, generated using ChatGPT to simulate realistic cybersecurity incidents. It is designed to serve as a learning tool for beginners and data enthusiasts, offering a platform for practice and exploration in cybersecurity data analysis. By reflecting real-world cybercrime scenarios, this dataset encourages experimentation and deeper insights into various attack vectors, system vulnerabilities, and defense mechanisms. Its purpose is to promote hands-on learning in a controlled environment, enabling users to enhance their understanding of cybersecurity threats, analysis, and mitigation strategies.

  16. Grant Giving Statistics for Hacking the Workforce

    • instrumentl.com
    Updated Dec 22, 2024
    Cite
    (2024). Grant Giving Statistics for Hacking the Workforce [Dataset]. https://www.instrumentl.com/990-report/hacking-the-workforce
    Dataset updated
    Dec 22, 2024
    Description

    Financial overview and grant giving statistics of Hacking the Workforce

  17. AI Cyber Attacks Statistics 2025: How Attacks, Deepfakes & Ransomware Have...

    • sqmagazine.co.uk
    Updated Oct 7, 2025
    Cite
    SQ Magazine (2025). AI Cyber Attacks Statistics 2025: How Attacks, Deepfakes & Ransomware Have Escalated [Dataset]. https://sqmagazine.co.uk/ai-cyber-attacks-statistics/
    Dataset updated
    Oct 7, 2025
    Dataset authored and provided by
    SQ Magazine
    License

    https://sqmagazine.co.uk/privacy-policy/

    Time period covered
    Jan 1, 2024 - Dec 31, 2025
    Area covered
    Global
    Description

    In January 2025, a small fintech startup in Austin discovered it had fallen victim to a cyberattack. At first glance, the breach looked like a typical case of credential stuffing. But it wasn’t. The attacker had used an AI-driven system that mimicked the behavioral patterns of employees, learning login habits,...

  18. Global data breaches caused by hacking 2023-2024, by industry

    • statista.com
    Updated Sep 18, 2025
    Cite
    Statista (2025). Global data breaches caused by hacking 2023-2024, by industry [Dataset]. https://www.statista.com/statistics/1419277/data-breaches-hacking-by-industry/
    Dataset updated
    Sep 18, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 1, 2023 - Oct 31, 2024
    Area covered
    Worldwide
    Description

    Between November 2023 and October 2024, organizations in the manufacturing sector worldwide saw around 818 incidents of data breaches caused by hacking. The healthcare industry ranked second, with 745 data breaches in the measured period. Furthermore, hacking caused 564 data breach incidents in the professional sector.

  19. Average cost per data breach in the United States 2006-2024

    • statista.com
    Updated Jun 23, 2025
    Cite
    Statista (2025). Average cost per data breach in the United States 2006-2024 [Dataset]. https://www.statista.com/statistics/273575/us-average-cost-incurred-by-a-data-breach/
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    As of 2024, the average cost of a data breach in the United States amounted to **** million U.S. dollars, down from **** million U.S. dollars in the previous year. The global average cost per data breach was **** million U.S. dollars in 2024.

    Cost of a data breach in different countries worldwide

    Data breaches pose a major threat to organizations globally. The monetary damage caused by data breaches has increased in many markets in the past decade. In 2023, Canada ranked second after the U.S. in data breach costs, with an average of **** million U.S. dollars. Since 2019, the average monetary damage caused by loss of sensitive information in Canada has increased notably. In the United Kingdom, the average cost of a data breach in 2024 amounted to around **** million U.S. dollars, while in Germany it stood at **** million U.S. dollars.

    The cost of a data breach by industry and segment

    Data breach costs vary depending on the industry and segment. For the fourth consecutive year, the global healthcare sector registered the highest data breach costs, which in 2024 amounted to about **** million U.S. dollars. Financial institutions ranked second, with an average cost of *** million U.S. dollars per data breach. Detection and escalation was the costliest segment in data breaches worldwide, with **** U.S. dollars on average. The cost of lost business ranked second, while post-breach response was the third-costliest segment.

  20. Healthcare Ransomware Dataset

    • kaggle.com
    zip
    Updated Feb 21, 2025
    Cite
    River | Datasets for SQL Practice (2025). Healthcare Ransomware Dataset [Dataset]. https://www.kaggle.com/datasets/rivalytics/healthcare-ransomware-dataset
    Available download formats: zip (221852 bytes)
    Dataset updated
    Feb 21, 2025
    Authors
    River | Datasets for SQL Practice
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    📌 Context of the Dataset

    The Healthcare Ransomware Dataset was created to simulate real-world cyberattacks in the healthcare industry. Hospitals, clinics, and research labs have become prime targets for ransomware due to their reliance on real-time patient data and legacy IT infrastructure. This dataset provides insight into attack patterns, recovery times, and cybersecurity practices across different healthcare organizations.

    Why is this important?

    Ransomware attacks on healthcare organizations can shut down entire hospitals, delay treatments, and put lives at risk. Understanding how different healthcare organizations respond to attacks can help develop better security strategies. The dataset allows cybersecurity analysts, data scientists, and researchers to study patterns in ransomware incidents and explore predictive modeling for risk mitigation.

    📌 Sources and Research Inspiration

    This simulated dataset was inspired by real-world cybersecurity reports and built using insights from official sources, including:

    1️⃣ IBM Cost of a Data Breach Report (2024)

    The healthcare sector had the highest average cost of data breaches ($10.93 million per incident). On average, organizations recovered only 64.8% of their data after paying ransom. Healthcare breaches took 277 days on average to detect and contain.

    2️⃣ Sophos State of Ransomware in Healthcare (2024)

    67% of healthcare organizations were hit by ransomware in 2024, an increase from 60% in 2023. 66% of backup compromise attempts succeeded, making data recovery significantly more difficult. The most common attack vectors included exploited vulnerabilities (34%) and compromised credentials (34%).

    3️⃣ Health & Human Services (HHS) Cybersecurity Reports

    Ransomware incidents in healthcare have doubled since 2016. Organizations that fail to monitor threats frequently experience higher infection rates.

    4️⃣ Cybersecurity & Infrastructure Security Agency (CISA) Alerts

    Identified phishing, unpatched software, and exposed RDP ports as top ransomware entry points. Only 13% of healthcare organizations monitor cyber threats more than once per day, increasing the risk of undetected attacks.

    5️⃣ Emsisoft 2020 Report on Ransomware in Healthcare

    The number of ransomware attacks in healthcare increased by 278% between 2018 and 2023. 560 healthcare facilities were affected in a single year, disrupting patient care and emergency services.

    📌 Why is This a Simulated Dataset?

    This dataset does not contain real patient data or actual ransomware cases. Instead, it was built using probabilistic modeling and structured randomness based on industry benchmarks and cybersecurity reports.

    How It Was Created:

    1️⃣ Defining the Dataset Structure

    The dataset was designed to simulate realistic attack patterns in healthcare, using actual ransomware case studies as inspiration.

    Columns were selected based on what real-world cybersecurity teams track, such as:

    • Attack methods (phishing, RDP exploits, credential theft).
    • Infection rates, recovery time, and backup compromise rates.
    • Organization type (hospitals, clinics, research labs) and monitoring frequency.

    2️⃣ Generating Realistic Data Using ChatGPT & Python

    ChatGPT assisted in defining relationships between attack factors, ensuring that key cybersecurity concepts were accurately reflected. Python’s NumPy and Pandas libraries were used to introduce randomized attack simulations based on real-world statistics. Data was validated against industry research to ensure it aligns with actual ransomware attack trends.

    3️⃣ Ensuring Logical Relationships Between Data Points

    Hospitals take longer to recover due to larger infrastructure and compliance requirements. Organizations that track more cyber threats recover faster because they detect attacks earlier. Backup security significantly impacts recovery time, reflecting the real-world risk of backup encryption attacks.
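
    The description does not include the generation code, so the following is only a minimal sketch of how the three relationships above might be encoded. It uses Python's standard `random` module rather than the NumPy and Pandas libraries the author mentions, and every baseline, multiplier, probability, and field name in it is an illustrative assumption, not a value from the dataset.

```python
import random
from statistics import mean

ORG_TYPES = ["hospital", "clinic", "research lab"]

# Assumed baseline recovery times in days; hospitals are given the longest.
BASE_RECOVERY_DAYS = {"hospital": 20.0, "clinic": 10.0, "research lab": 12.0}
# Assumed probability that backups are compromised (illustrative only).
BACKUP_COMPROMISE_PROB = 0.66


def simulate_incident(rng: random.Random) -> dict:
    """Draw one synthetic incident with the three relationships built in."""
    org_type = rng.choice(ORG_TYPES)
    checks_per_day = rng.randint(0, 4)  # hypothetical monitoring frequency
    backup_compromised = rng.random() < BACKUP_COMPROMISE_PROB

    recovery = BASE_RECOVERY_DAYS[org_type]
    recovery *= 1.0 - 0.1 * checks_per_day  # more monitoring, faster recovery
    if backup_compromised:
        recovery *= 1.5                     # compromised backups slow recovery
    recovery *= rng.uniform(0.9, 1.1)       # small random noise

    return {
        "org_type": org_type,
        "checks_per_day": checks_per_day,
        "backup_compromised": backup_compromised,
        "recovery_days": recovery,
    }


def mean_recovery(incidents, field, value):
    """Average recovery time over incidents where incident[field] == value."""
    return mean(i["recovery_days"] for i in incidents if i[field] == value)


rng = random.Random(42)  # fixed seed so the run is repeatable
incidents = [simulate_incident(rng) for _ in range(2000)]

for org in ORG_TYPES:
    print(org, round(mean_recovery(incidents, "org_type", org), 1))
```

    Comparing group means, as the final loop does, is one simple way to check that generated data reflects the intended relationships.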

Statista (2025). Global number of breached user accounts Q1 2020-Q3 2025 [Dataset]. https://www.statista.com/statistics/1307426/number-of-data-breaches-worldwide/

Global number of breached user accounts Q1 2020-Q3 2025

20 scholarly articles cite this dataset.
Dataset updated
Oct 14, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description

During the third quarter of 2025, data breaches exposed more than ** million records worldwide. Since the first quarter of 2020, the highest number of data records was exposed in the third quarter of ****, more than **** billion data sets. Data breaches remain among the biggest concerns of company leaders worldwide. The most common causes of sensitive information loss were operating system vulnerabilities on endpoint devices.

Which industries see the most data breaches?

Certain conditions make some industry sectors more prone to data breaches than others. According to the latest observations, the public administration sector experienced the highest number of data breaches between 2021 and 2022, with *** reported incidents involving confirmed data loss. Financial institutions ranked second, with *** data breach cases, followed by healthcare providers.

Data breach cost

Data breach incidents have various consequences, the most common being financial losses and business disruptions. As of 2023, the average data breach cost across businesses worldwide was **** million U.S. dollars. Meanwhile, a leaked data record cost about *** U.S. dollars. The United States saw the highest average breach cost globally, at **** million U.S. dollars.
