10 datasets found
  1. Healthcare Ransomware Dataset

    • kaggle.com
    Updated Feb 21, 2025
    Cite
    Rivalytics (2025). Healthcare Ransomware Dataset [Dataset]. https://www.kaggle.com/datasets/rivalytics/healthcare-ransomware-dataset
    Dataset updated
    Feb 21, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rivalytics
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    📌 Context of the Dataset

    The Healthcare Ransomware Dataset was created to simulate real-world cyberattacks in the healthcare industry. Hospitals, clinics, and research labs have become prime targets for ransomware due to their reliance on real-time patient data and legacy IT infrastructure. This dataset provides insight into attack patterns, recovery times, and cybersecurity practices across different healthcare organizations.

    Why is this important?

    Ransomware attacks on healthcare organizations can shut down entire hospitals, delay treatments, and put lives at risk. Understanding how different healthcare organizations respond to attacks can help develop better security strategies. The dataset allows cybersecurity analysts, data scientists, and researchers to study patterns in ransomware incidents and explore predictive modeling for risk mitigation.

    📌 Sources and Research Inspiration

    This simulated dataset was inspired by real-world cybersecurity reports and built using insights from official sources, including:

    1️⃣ IBM Cost of a Data Breach Report (2024)

    The healthcare sector had the highest average cost of data breaches ($10.93 million per incident). On average, organizations recovered only 64.8% of their data after paying ransom. Healthcare breaches took 277 days on average to detect and contain.

    2️⃣ Sophos State of Ransomware in Healthcare (2024)

    67% of healthcare organizations were hit by ransomware in 2024, an increase from 60% in 2023. 66% of backup compromise attempts succeeded, making data recovery significantly more difficult. The most common attack vectors included exploited vulnerabilities (34%) and compromised credentials (34%).

    3️⃣ Health & Human Services (HHS) Cybersecurity Reports

    Ransomware incidents in healthcare have doubled since 2016. Organizations that fail to monitor threats frequently experience higher infection rates.

    4️⃣ Cybersecurity & Infrastructure Security Agency (CISA) Alerts

    Identified phishing, unpatched software, and exposed RDP ports as top ransomware entry points. Only 13% of healthcare organizations monitor cyber threats more than once per day, increasing the risk of undetected attacks.

    5️⃣ Emsisoft 2020 Report on Ransomware in Healthcare

    The number of ransomware attacks in healthcare increased by 278% between 2018 and 2023. 560 healthcare facilities were affected in a single year, disrupting patient care and emergency services.

    📌 Why is This a Simulated Dataset?

    This dataset does not contain real patient data or actual ransomware cases. Instead, it was built using probabilistic modeling and structured randomness based on industry benchmarks and cybersecurity reports.

    How It Was Created:

    1️⃣ Defining the Dataset Structure

    The dataset was designed to simulate realistic attack patterns in healthcare, using actual ransomware case studies as inspiration.

    Columns were selected based on what real-world cybersecurity teams track, such as:

    • Attack methods (phishing, RDP exploits, credential theft).
    • Infection rates, recovery time, and backup compromise rates.
    • Organization type (hospitals, clinics, research labs) and monitoring frequency.

    2️⃣ Generating Realistic Data Using ChatGPT & Python

    ChatGPT assisted in defining relationships between attack factors, ensuring that key cybersecurity concepts were accurately reflected. Python’s NumPy and Pandas libraries were used to introduce randomized attack simulations based on real-world statistics. Data was validated against industry research to ensure it aligns with actual ransomware attack trends.

    3️⃣ Ensuring Logical Relationships Between Data Points

    Hospitals take longer to recover due to larger infrastructure and compliance requirements. Organizations that track more cyber threats recover faster because they detect attacks earlier. Backup security significantly impacts recovery time, reflecting the real-world risk of backup encryption attacks.
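
The three relationships above can be illustrated with a small simulation. This is a minimal sketch, not the authors' generator: it uses Python's standard `random` module instead of NumPy and Pandas so that it has no dependencies, and every column name, baseline, and multiplier is a hypothetical value chosen for illustration.

```python
import random

# Hypothetical baselines (days); none of these values come from the dataset.
BASE_RECOVERY_DAYS = {"hospital": 21.0, "clinic": 10.0, "research_lab": 14.0}
ATTACK_METHODS = ["phishing", "rdp_exploit", "credential_theft"]
MONITORING_CHECKS_PER_DAY = [0.25, 1.0, 4.0]
BACKUP_COMPROMISE_PROB = 0.66  # assumption, loosely borrowed from the Sophos figure


def simulate_incident(rng):
    """Draw one synthetic incident that follows the three stated relationships."""
    org_type = rng.choice(sorted(BASE_RECOVERY_DAYS))
    checks = rng.choice(MONITORING_CHECKS_PER_DAY)
    backup_compromised = rng.random() < BACKUP_COMPROMISE_PROB

    days = BASE_RECOVERY_DAYS[org_type]  # hospitals start from a higher baseline
    days /= 1.0 + 0.1 * checks           # more frequent monitoring shortens recovery
    if backup_compromised:
        days *= 1.5                      # compromised backups lengthen recovery
    days *= rng.uniform(0.8, 1.2)        # random noise around the expected value

    return {
        "org_type": org_type,
        "attack_method": rng.choice(ATTACK_METHODS),
        "monitoring_checks_per_day": checks,
        "backup_compromised": backup_compromised,
        "recovery_days": round(days, 1),
    }


rng = random.Random(42)  # fixed seed so the sample is reproducible
incidents = [simulate_incident(rng) for _ in range(1000)]
```

Encoding each relationship as an explicit multiplier keeps the assumptions visible and easy to check against published benchmarks.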

  2. PhiUSIIL Phishing URL Dataset

    • data.mendeley.com
    Updated Nov 15, 2023
    + more versions
    Cite
    Arvind Prasad (2023). PhiUSIIL Phishing URL Dataset [Dataset]. http://doi.org/10.17632/shwpxscxy2.2
    Dataset updated
    Nov 15, 2023
    Authors
    Arvind Prasad
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PhiUSIIL Phishing URL Dataset is a substantial dataset comprising 134,850 legitimate and 100,945 phishing URLs. Most of the URLs analyzed while constructing the dataset were recent at the time of collection. Features are extracted from the URL and from the source code of the webpage. Features such as CharContinuationRate, URLTitleMatchScore, URLCharProb, and TLDLegitimateProb are derived from existing features.

    Citation: Prasad, A., & Chandra, S. (2023). PhiUSIIL: A diverse security profile empowered phishing URL detection framework based on similarity index and incremental learning. Computers & Security, 103545. doi: https://doi.org/10.1016/j.cose.2023.103545

  3. Phishing and Benign Domain Dataset (DNS, IP, WHOIS/RDAP, TLS, GeoIP)

    • zenodo.org
    json
    Updated Apr 28, 2024
    Cite
    Radek Hranický; Adam Horák; Jan Polišenský; Petr Pouč; Ondřej Ondryáš (2024). Phishing and Benign Domain Dataset (DNS, IP, WHOIS/RDAP, TLS, GeoIP) [Dataset]. http://doi.org/10.5281/zenodo.8364668
    Available download formats: json
    Dataset updated
    Apr 28, 2024
    Dataset provided by
    Zenodo
    Authors
    Radek Hranický; Adam Horák; Jan Polišenský; Petr Pouč; Ondřej Ondryáš
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains DNS records, IP-related features, WHOIS/RDAP information, information from TLS certificate fields, and GeoIP information for 432,572 verified benign domains from Cisco Umbrella and 36,993 verified phishing domains from the PhishTank and OpenPhish services. The dataset is useful for statistical analysis of domain data or for feature extraction when training machine-learning classifiers, e.g. for phishing detection. The data was collected between March and July 2023. The final assessment of the data was conducted in July 2023 (this is why the names are suffixed with _2307).

    The upload contains: a) data files, b) the description of the data structure, and c) the feature vector we used for ML-based phishing domain detection.

    Data Files

    The data is located in two individual files:

    • benign_2307.json - data about 432,572 benign domains, and
    • phishing_2307.json - data about 36,993 phishing domains.

    Data Structure

    Both files are in the JSON Array format. The structure is as follows:

    [
    {
      "_id" : "A unique ID of the data record",
      "domain_name" : "Name of the domain (e.g., zenodo.com)",
      "dns" : { "//": "Data obtained from DNS records" },
      "evaluated_on" : "// ISO Timestamp of data collection ",
      "ip_data" : [  "// Data for each related IP address ",
        {
          "//": "IP-related data, including RTT from ICMP echo attempts (from Brno, Czechia)",
          "//": "WHOIS/RDAP data for the given IP address",
          "//": "GeoIP data for the given IP address",
          "//": "NERD system reputation score (if available)",
          "//": "ASN info",
          "//": "remarks: ISO timestamps of collection of the individual data pieces"
        },
      ],
      "label" : "benign_2307 for benign OR misp_2307 for phishing",
      "rdap" : { "//": "WHOIS/RDAP information for the domain name" },
      "remarks" : {
        "dns_evaluated_on" : "ISO Timestamp of DNS data collection",
        "rdap_evaluated_on" : "ISO Timestamp of WHOIS/RDAP data collection",
        "tls_evaluated_on" : "ISO Timestamp of TLS certificate information collection",
        "dns_had_no_ips" : "true if no IPs were found in DNS records"
      },
      "sourced_on" : "ISO Timestamp of the moment the domain was found",
      "tls" : {
        "cipher" : "Identifier of the TLS cipher suite",
        "count" : "Number of certificates in chain",
        "protocol" : "Version of the TLS protocol",
        "certificates" : [
          "//": "Information from TLS certificate fields: issuer, extensions, etc."
        ]
      },
      "category" : "Category of the record (could be ignored)",
      "source" : "Name of the file that we used to save the domain list"
    }
    ]
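
Records in this layout can be read with Python's standard `json` module. The sketch below parses a tiny inline sample instead of the real files. The sample values, the `"ip"` key inside `ip_data`, the possibility of a null `tls` entry, and the `summarize` helper are all assumptions made for illustration. Only the top-level key names and the two label values come from the description above.

```python
import json

# Tiny stand-in for benign_2307.json / phishing_2307.json (values are invented).
SAMPLE = """
[
  {"domain_name": "example.com", "label": "benign_2307",
   "ip_data": [{"ip": "192.0.2.1"}],
   "remarks": {"dns_had_no_ips": false},
   "tls": {"count": 2, "protocol": "TLSv1.3"}},
  {"domain_name": "login-example.test", "label": "misp_2307",
   "ip_data": [],
   "remarks": {"dns_had_no_ips": true},
   "tls": null}
]
"""


def summarize(record):
    """Reduce one record to a few fields that are convenient for analysis."""
    tls = record.get("tls") or {}  # tolerate a missing or null TLS section
    return {
        "domain": record["domain_name"],
        "is_phishing": record["label"] == "misp_2307",
        "ip_count": len(record.get("ip_data") or []),
        "tls_chain_len": tls.get("count", 0),
    }


records = json.loads(SAMPLE)
summaries = [summarize(r) for r in records]
```

Because both files are single JSON arrays, `json.load` reads a whole file into memory at once; for the larger benign file an incremental parser may be preferable.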

    Feature Vector

    This section describes the feature vector used in the "Unmasking the Phishermen: Phishing Domain Detection with Machine Learning and Multi-Source Intelligence" paper that was accepted to the IEEE NOMS 2024 conference.

    Lexical Features

    The following features were extracted from the sole domain name:

    • lex_name_len - length of the domain name,
    • lex_begins_with_digit - true if the domain name begins with a digit,
    • lex_www_flag - true if the domain name begins with "www.",
    • lex_phishing_keyword_count - occurrence count of 47 phishing-related keywords,
    • lex_consecutive_chars - length of the longest consecutive character sequence,
    • lex_tld_len - length of the top-level domain (TLD),
    • lex_tld_hash - hash of the TLD,
    • lex_sld_len - length of the second-level domain (SLD),
    • lex_sld_norm_entropy - normalized entropy of the SLD,
    • lex_stld_unique_char_count - number of unique characters in the TLD and the SLD,
    • lex_sub_count - number of subdomains,
    • lex_sub_digit_ratio - ratio of digits in subdomains,
    • lex_sub_hex_ratio - ratio of hex symbols in subdomains,
    • lex_sub_non_alpanum_ratio - ratio of non-alphanumeric symbols in subdomains,
    • lex_sub_vowel_ratio - ratio of vowels in subdomains,
    • lex_sub_consonant_ratio - ratio of consonants in subdomains,
    • lex_sub_max_consonant_len - length of the longest consonant sequence in subdomains,
    • lex_sub_norm_entropy - normalized entropy of a string made from all subdomains,
    • lex_phishing_bigram_matches - occurrence count of the top 300 phishing domain bigrams,
    • lex_phishing_trigram_matches - occurrence count of the top 2000 phishing domain trigrams,
    • lex_phishing_tetragram_matches - occurrence count of the top 5000 phishing domain tetragrams,
    • lex_phishing_pentagram_matches - occurrence count of the top 10000 phishing domain pentagrams.
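
A few of the lexical features listed above can be sketched as follows. This is an illustrative approximation, not the paper's extraction code. It assumes that "normalized entropy" means Shannon entropy divided by log2 of the string length, that `lex_consecutive_chars` is the longest run of one repeated character, and that the TLD and SLD are simply the last two dot-separated labels (so multi-part suffixes such as `co.uk` are not handled).

```python
import math
from collections import Counter


def normalized_entropy(text):
    """Shannon entropy of the characters, scaled to [0, 1] (assumed definition)."""
    n = len(text)
    if n < 2:
        return 0.0
    counts = Counter(text)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return entropy / math.log2(n)


def longest_run(text):
    """Length of the longest run of one repeated character."""
    best = run = 0
    prev = None
    for ch in text:
        run = run + 1 if ch == prev else 1
        best = max(best, run)
        prev = ch
    return best


def lexical_features(domain):
    """Compute a subset of the lex_* features from the bare domain name."""
    labels = domain.lower().strip(".").split(".")
    tld = labels[-1]
    sld = labels[-2] if len(labels) > 1 else ""
    subdomains = labels[:-2]          # naive split: everything left of the SLD
    sub_text = "".join(subdomains)
    digits = sum(ch.isdigit() for ch in sub_text)
    return {
        "lex_name_len": len(domain),
        "lex_begins_with_digit": domain[:1].isdigit(),
        "lex_www_flag": domain.lower().startswith("www."),
        "lex_consecutive_chars": longest_run(domain),
        "lex_tld_len": len(tld),
        "lex_sld_len": len(sld),
        "lex_sld_norm_entropy": normalized_entropy(sld),
        "lex_sub_count": len(subdomains),
        "lex_sub_digit_ratio": digits / len(sub_text) if sub_text else 0.0,
    }


features = lexical_features("www.paypa1-login.example.com")
```

The keyword and n-gram counts are omitted because their reference lists are not part of this description.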

    DNS-based Features

    The following features were extracted from DNS responses when querying about the domain:

    • dns_A_count - number of A records for the domain,
    • dns_AAAA_count - number of AAAA records for the domain,
    • dns_CNAME_count - number of CNAME records for the domain,
    • dns_MX_count - number of MX records for the domain,
    • dns_NS_count - number of nameserver (NS) records for the domain,
    • dns_TXT_count - number of TXT records for the domain,
    • dns_soa_primary_ns_len - number of characters in the primary NS's domain name,
    • dns_soa_primary_ns_level - number of subdomains in the primary NS's domain name,
    • dns_soa_primary_ns_digit_count - number of digits in the primary NS's domain name,
    • dns_soa_primary_ns_entropy - normalized entropy of the primary NS's domain name,
    • dns_soa_email_len - number of characters in the admin's email domain name part,
    • dns_soa_email_level - number of subdomains in the admin's email domain name part,
    • dns_soa_email_digit_count - number of digits in the admin's email domain name part,
    • dns_soa_email_entropy - normalized entropy of the admin's email domain name part,
    • dns_soa_refresh - SOA refresh parameter,
    • dns_soa_retry - SOA retry parameter,
    • dns_soa_expire - SOA expire parameter,
    • dns_mx_avg_len - average number of characters of the domain names in MX records,
    • dns_mx_avg_entropy - average normalized entropy of the domain names in MX records,
    • dns_domain_name_in_mx - true if the domain name is contained in the MX record's domains,
    • dns_txt_spf_exists - true if an SPF record is in the TXT RRs,
    • dns_txt_avg_entropy - average normalized entropy of the TXT records,
    • dns_ttl_low - number of RRsets with TTL in [0,100],
    • dns_ttl_mid - number of RRsets with TTL in [101,500],
    • dns_zone_entropy - normalized entropy of the zone's domain name.
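
The record counts, TTL buckets, and SPF flag from the list above can be sketched like this. The input shape (a list of dictionaries with `type`, `ttl`, and `values` keys) is a hypothetical simplification and not the layout of the `dns` field in the dataset. Detecting SPF by the `v=spf1` prefix is likewise an assumption about how `dns_txt_spf_exists` was computed.

```python
RECORD_TYPES = ("A", "AAAA", "CNAME", "MX", "NS", "TXT")


def dns_features(rrsets):
    """Derive count, TTL-bucket, and SPF features from simplified RRsets."""
    counts = {rtype: 0 for rtype in RECORD_TYPES}
    ttl_low = ttl_mid = 0
    spf_exists = False

    for rrset in rrsets:
        rtype, ttl, values = rrset["type"], rrset["ttl"], rrset["values"]
        if rtype in counts:
            counts[rtype] += len(values)       # one count per individual record
        if 0 <= ttl <= 100:
            ttl_low += 1                       # RRsets with TTL in [0, 100]
        elif 101 <= ttl <= 500:
            ttl_mid += 1                       # RRsets with TTL in [101, 500]
        if rtype == "TXT" and any(v.lower().startswith("v=spf1") for v in values):
            spf_exists = True

    features = {f"dns_{rtype}_count": n for rtype, n in counts.items()}
    features["dns_ttl_low"] = ttl_low
    features["dns_ttl_mid"] = ttl_mid
    features["dns_txt_spf_exists"] = spf_exists
    return features


example = dns_features([
    {"type": "A", "ttl": 60, "values": ["192.0.2.1", "192.0.2.2"]},
    {"type": "MX", "ttl": 300, "values": ["mail.example.com"]},
    {"type": "TXT", "ttl": 3600, "values": ["v=spf1 include:_spf.example.com ~all"]},
    {"type": "NS", "ttl": 86400, "values": ["ns1.example.com", "ns2.example.com"]},
])
```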

    IP-based Features

    These features were derived from IP addresses and ICMP echo replies:

    • ip_mean_average_rtt - average RTT of all ICMP echo attempts,
    • ip_entropy - total entropy of all /16 (/64 for v6) IP prefixes,
    • ip_count - total number of IP addresses for the domain,
    • ip_v4_count - total number of IPv4 addresses for the domain,
    • ip_v6_count - total number of IPv6 addresses for the domain.
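
The address counts and prefix entropy can be sketched with the standard `ipaddress` module. The interpretation of `ip_entropy` as the Shannon entropy of the distribution of /16 (IPv4) and /64 (IPv6) prefixes is an assumption, since the description does not give the formula. The RTT feature is left out because it requires live ICMP measurements.

```python
import ipaddress
import math
from collections import Counter


def ip_features(addresses):
    """Count addresses by version and measure how spread out their prefixes are."""
    parsed = [ipaddress.ip_address(a) for a in addresses]

    prefixes = []
    for ip in parsed:
        prefix_len = 16 if ip.version == 4 else 64
        # strict=False masks the host bits, yielding the enclosing network
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        prefixes.append(str(network))

    total = len(prefixes)
    entropy = 0.0
    if total:
        counts = Counter(prefixes)
        entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())

    return {
        "ip_count": total,
        "ip_v4_count": sum(1 for ip in parsed if ip.version == 4),
        "ip_v6_count": sum(1 for ip in parsed if ip.version == 6),
        "ip_entropy": entropy,
    }


example_ips = ip_features(["192.0.2.1", "192.0.3.7", "198.51.100.4", "2001:db8::1"])
```

In this example two addresses share the prefix 192.0.0.0/16, so the prefix distribution is (0.5, 0.25, 0.25) and the entropy is 1.5 bits.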

    TLS-based Features

    The following features were extracted from TLS certificate chains and TLS handshakes:

    • tls_chain_len - length of the TLS certificate chain,
    • tls_broken_chain - true if there is a certificate that has never been valid,
    • tls_expired_chain - true if there is an expired certificate in the chain,
    • tls_total_extension_count - total extensions in all certificates in the chain,
    • tls_critical_extensions - total extensions flagged as "critical" in all certificates,
    • tls_with_policies_crt_count - number of certificates that include the "policies" extension,
    • tls_percentage_crt_with_policies - percentage of certificates that include the "policies" extension,
    • tls_x509_anypolicy_crt_count - number of certificates not enforcing any security policy,
    • tls_iso_policy_crt_count - total discovered policies from the 1.* OID space,
    • tls_joint_isoitu_policy_crt_count - total discovered policies from the 2.* OID space,
    • tls_subject_count - number of subject alternative names (SANs) in the leaf certificate,
    • tls_server_auth_crt_count - number of certificates with the "Web Server Authentication",
    • tls_client_auth_crt_count - number of certificates with the "Web Client Authentication",
    • tls_CA_certs_in_chain_ratio - ratio of CA certificates in the chain,
    • tls_unique_SLD_count - number of unique second-level domains.
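
Some of the chain-level TLS features can be sketched from already-parsed certificates. The dictionary layout used here (`not_after`, `is_ca`, and an `extensions` list with a `critical` flag) is hypothetical and does not mirror the `tls.certificates` entries in the dataset. Passing the evaluation time explicitly keeps the expiry check reproducible.

```python
from datetime import datetime, timezone


def tls_features(chain, now):
    """Compute a subset of the tls_* features for one certificate chain."""
    total_ext = sum(len(cert["extensions"]) for cert in chain)
    critical_ext = sum(
        1 for cert in chain for ext in cert["extensions"] if ext["critical"]
    )
    ca_count = sum(1 for cert in chain if cert["is_ca"])
    return {
        "tls_chain_len": len(chain),
        "tls_expired_chain": any(cert["not_after"] < now for cert in chain),
        "tls_total_extension_count": total_ext,
        "tls_critical_extensions": critical_ext,
        "tls_CA_certs_in_chain_ratio": ca_count / len(chain) if chain else 0.0,
    }


# Invented two-certificate chain: an expired leaf and a still-valid CA.
chain = [
    {
        "not_after": datetime(2023, 1, 1, tzinfo=timezone.utc),
        "is_ca": False,
        "extensions": [
            {"name": "subjectAltName", "critical": False},
            {"name": "keyUsage", "critical": True},
        ],
    },
    {
        "not_after": datetime(2030, 1, 1, tzinfo=timezone.utc),
        "is_ca": True,
        "extensions": [{"name": "basicConstraints", "critical": True}],
    },
]
tls_example = tls_features(chain, now=datetime(2024, 1, 1, tzinfo=timezone.utc))
```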

  4. Fraud Detection And Prevention Market Analysis North America, Europe, APAC,...

    • technavio.com
    Updated Aug 15, 2024
    Cite
    Technavio (2024). Fraud Detection And Prevention Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, UK, Germany, China, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/fraud-detection-and-prevention-market-analysis
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    United States, Global
    Description


    Fraud Detection And Prevention Market Size 2024-2028

    The fraud detection and prevention market size is forecast to increase by USD 86.68 billion at a CAGR of 27.17% between 2023 and 2028.

    In the current business landscape, the market is experiencing significant growth due to several key factors. The increasing adoption of cloud infrastructure services, such as cloud computing and big data, is driving market expansion. These technologies enable organizations to store and process large volumes of data, which is essential for advanced fraud detection techniques like anomaly detection. Moreover, the healthcare services sector is increasingly relying on fraud detection solutions to safeguard sensitive patient data. In addition, the rise of business intelligence (BI) and machine-to-machine (M2M) services is leading to an increased need for robust fraud prevention measures. Phone-based authentication solutions are also gaining popularity as an effective method for securing user identities and preventing fraud. The technological advancement in fraud detection and prevention solutions and services, coupled with the complexity of IT infrastructure, is further fueling market growth.
    

    What will be the Size of the Fraud Detection And Prevention Market During the Forecast Period?


    The market encompasses a range of solutions designed to safeguard businesses and organizations from various types of financial and data breaches. Key end-use industries, including healthcare, manufacturing, governments, IT and telecom, and business intelligence, among others, increasingly rely on advanced technologies to mitigate risks. Market dynamics are driven by the growing adoption of cloud-based solutions, big data analytics, and blockchain technology. These innovations enable real-time fraud detection, enhancing the ability to prevent incidents such as payment fraud, identity theft, phishing scams, and money laundering.
    SMEs and large enterprises across sectors like travel and transportation, energy and utilities, media and entertainment, professional services, and insurance claims face similar challenges, making the market expansive and diverse. Authentication solutions, real-time fraud detection, and managed services are integral components of the market, catering to the evolving needs of businesses in an increasingly digital world.
    

    How is this Fraud Detection And Prevention Industry segmented and which is the largest segment?

    The fraud detection and prevention industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Component
      Solutions
      Services

    End-user
      Large enterprise
      SMEs

    Geography
      North America
        US
        Canada
      Europe
        Germany
        Spain
        UK
      APAC
        China
        Japan
        India
      South America
        South Africa
      Middle East and Africa

    By Component Insights

    The solutions segment is estimated to witness significant growth during the forecast period.
    

    The market is experiencing significant growth due to escalating cyber threats and the increasing need for robust security measures. Key drivers include the rising number of fraudulent activities such as identity theft, money laundering, and phishing scams, as well as economic uncertainty and the pandemic. In the solutions segment, authentication solutions have emerged as a major revenue generator. However, the high cost of biometric technology may hinder growth in this area. SMEs, healthcare, manufacturing, end-use enterprises, governments, IT and telecom, travel and transportation, energy and utilities, media and entertainment, and financial institutions are among the key industries investing in fraud detection and prevention. Digital technologies, including cloud-based solutions, Big Data, artificial intelligence, and machine learning, are increasingly being adopted for real-time fraud detection. Fraud complexity and online data transactions pose significant challenges, necessitating proactive measures and trained cybersecurity professionals.


    The Solutions segment was valued at USD 11.84 billion in 2018 and showed a gradual increase during the forecast period.

    Regional Analysis

    North America is estimated to contribute 40% to the growth of the global market during the forecast period.
    

    Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period.


    The North American market is projected to expand substantially due to the increasing prevalence of cyber threats in sectors like healthcare

  5. Cloud Database Security Market Report | Global Forecast From 2025 To 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    + more versions
    Cite
    Dataintelo (2025). Cloud Database Security Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/cloud-database-security-market
    Available download formats: csv, pptx, pdf
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Cloud Database Security Market Outlook



    The global cloud database security market size was valued at around USD 8.5 billion in 2023, and it is expected to reach approximately USD 24.6 billion by 2032, growing at a CAGR of 12.5% during the forecast period. The primary growth factors driving this market include the increasing adoption of cloud services, the rising number of cyber threats, and stringent regulatory requirements for data security. The market size growth is further fueled by the rapid digital transformation across various industry verticals, necessitating robust security measures to safeguard sensitive data hosted on cloud platforms.
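
The quoted figures can be cross-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below assumes nine compounding periods (2023 to 2032); the report does not state its base year convention, so that count is an assumption.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def project(start_value, rate, years):
    """Value after compounding at a fixed annual rate."""
    return start_value * (1.0 + rate) ** years


YEARS = 2032 - 2023  # assumed number of compounding periods

implied_rate = cagr(8.5, 24.6, YEARS)          # rate implied by the two market sizes
projected_2032 = project(8.5, 0.125, YEARS)    # size implied by the stated 12.5% rate

print(f"implied CAGR: {implied_rate:.2%}")
print(f"projected 2032 size: USD {projected_2032:.1f} billion")
```

Under that assumption the implied rate is close to the stated 12.5%, so the start value, end value, and growth rate are mutually consistent to within rounding.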



    One of the most significant growth factors of the cloud database security market is the proliferation of cloud computing services. As organizations migrate their data to cloud environments to leverage scalability, cost-efficiency, and flexibility, the need for secure database solutions becomes paramount. The increasing complexity of cyber-attacks, including phishing, ransomware, and advanced persistent threats (APTs), has heightened awareness about the importance of cloud database security. Consequently, businesses are investing heavily in advanced security solutions to protect their critical data and maintain compliance with industry standards and regulations.



    Another factor contributing to the market's growth is the stringent regulatory landscape. Governments and regulatory bodies worldwide have established rigorous guidelines and standards to ensure data privacy and security. Regulations such as the General Data Protection Regulation (GDPR) in Europe, the California Consumer Privacy Act (CCPA) in the United States, and the Asia-Pacific Economic Cooperation (APEC) Privacy Framework require organizations to implement robust security measures to protect personal and sensitive data. Compliance with these regulations drives the demand for comprehensive cloud database security solutions, as non-compliance can result in significant fines and reputational damage.



    The increasing adoption of technologies such as artificial intelligence (AI) and machine learning (ML) also plays a crucial role in boosting the cloud database security market. AI and ML are being integrated into security solutions to enhance threat detection and response capabilities. These technologies analyze vast amounts of data to identify patterns and anomalies, thereby enabling proactive measures to prevent potential security breaches. As cyber threats become more sophisticated, the application of AI and ML in cloud database security is expected to grow, driving market expansion further.



    As the cloud database security market evolves, the role of Cloud Data Security Software becomes increasingly critical. These software solutions are designed to protect data stored in cloud environments from unauthorized access and breaches. They offer features such as encryption, identity management, and real-time threat detection, which are essential for maintaining data integrity and privacy. With the growing reliance on cloud services, organizations are prioritizing the implementation of robust Cloud Data Security Software to safeguard their sensitive information. This trend is driven by the need to comply with stringent data protection regulations and to mitigate the risks associated with cyber threats. As a result, the demand for advanced security software that can seamlessly integrate with existing cloud infrastructures is on the rise, further propelling the growth of the cloud database security market.



    From a regional perspective, North America holds a significant share of the cloud database security market, primarily due to the presence of major cloud service providers and technology companies in the region. The United States, in particular, is a hub for technological innovation and has a highly developed IT infrastructure, making it a major market for cloud database security solutions. Additionally, stringent data protection regulations and high adoption rates of cloud services contribute to the market's growth in North America. The Asia Pacific region is also expected to witness substantial growth during the forecast period, driven by the rapid digital transformation in emerging economies such as China and India, increasing cyber threats, and government initiatives to enhance cybersecurity.



    Component Analysis



    The cloud database security market by component is segmented into software and services. The software segment encompasses

  6. Data Encryption Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 19, 2025
    Cite
    Pro Market Reports (2025). Data Encryption Market Report [Dataset]. https://www.promarketreports.com/reports/data-encryption-market-9193
    Available download formats: doc, ppt, pdf
    Dataset updated
    Feb 19, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Data Encryption Market Overview

    The global data encryption market is projected to register significant growth, with a market size of USD 14.5 billion in 2025 and a CAGR of 16% over the forecast period of 2025-2033. The increasing adoption of cloud computing and digital transformation initiatives is driving the demand for data encryption solutions to protect sensitive data from cyber threats. Additionally, regulations such as GDPR and CCPA are mandating that organizations implement data encryption measures, further fueling market growth.

    Market Drivers, Restraints, and Trends

    Key market drivers include rising cybersecurity threats, increasing data breaches, and the growing need for data privacy. The increasing adoption of IoT and mobile computing is also contributing to the need for data encryption. However, the high cost of implementation and the lack of skilled professionals can pose challenges to market growth. Notable market trends include the emergence of advanced encryption algorithms, such as quantum-safe cryptography, and the integration of encryption with AI and machine learning technologies. Regional factors, such as government regulations and technology adoption rates, also influence the market's growth trajectory.

    Recent developments include:

    • Apr. 11, 2023: Menlo Security, a leading provider of browser security solutions, published the results of the 10th Annual Cyberthreat Defense Report (CDR) by the CyberEdge Group. The report, partially sponsored by Menlo Security, highlights the growing importance of browser isolation technologies to combat ransomware and other malicious threats. The research revealed that most ransomware attacks include threats beyond data encryption. According to the report, around 51% of respondents confirmed that they have been using at least one type of browser or Internet isolation to protect their organizational data, while another 40% are about to deploy data encryption technology. Furthermore, around 33% of respondents noted that browser isolation is a key cybersecurity strategy to protect against sophisticated attacks, including ransomware, phishing, and zero-day attacks.
    • Feb. 14, 2023: EnterpriseDB, a relational database provider, announced the addition of Transparent Data Encryption (TDE) based on open-source PostgreSQL to its databases. The new TDE feature will be shipped along with the firm's enterprise version of its database. TDE is a method of encrypting database files to ensure data security while at rest and in motion. Most enterprises use TDE for compliance purposes, since it helps ensure that data on the hard drive and in backup files is encrypted. Before the development of built-in TDE, enterprises relied on either full-disk encryption or stackable cryptographic file system encryption.
    • Jan. 25, 2023: Researchers from the Tokyo University of Science, Japan, announced the development of a faster and cheaper method for handling encrypted data while improving security. The new method combines the best of homomorphic encryption and secret sharing. Both are key methods for computing on sensitive data while preserving privacy. Homomorphic encryption is computationally intensive and involves performing computations on encrypted data on a single server, while secret sharing is fast and computationally efficient. In this method, the encrypted data/secret input is divided and distributed across multiple servers, each performing a computation, such as multiplication, on its data. The results of the computations are then used to reconstruct the original data.
    • September 2022: Convergence Technology Solutions Corp., a supplier of software-enabled IT and cloud solutions, declared that it has obtained certification in Canada to sell and deploy IBM zSystems and LinuxONE.
    • November 2019: Penta Security Systems announced that it has been selected as a finalist for the 2020 SC Magazine Awards, which are given by SC Media and celebrated in the United States. As a result, MyDiamo from Penta Security has been named the Best Database Security Solution of 2020. Additionally, this will result in the expansion of common-level encryption and improve the open-source DBMS installation procedure.

    Potential restraints include:

    • Issues regarding security and data breaches.
    • High implementation costs and complexity.
    • Issues with data consistency and interoperability across different edge platforms.

  7. Most common scams in Singapore 2023

    • statista.com
    Updated Aug 9, 2024
    Cite
    Statista (2024). Most common scams in Singapore 2023 [Dataset]. https://www.statista.com/statistics/981340/leading-types-of-scams-singapore/
    Dataset updated
    Aug 9, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    Singapore
    Description

    In 2023, job scams were the most common type of scam in Singapore, with 9,914 cases reported. E-commerce scams were also prevalent, with over 9,700 cases reported.

    Phishing threat in Singapore

    In Singapore, around 42 thousand different phishing URLs with a .SG domain were detected in 2022. The highest number of phishing URLs was recorded the previous year, with around 55 thousand. Phishing attacks can take many forms, such as corporate e-mail compromise (CEC), mass phishing, or smishing. These phishing e-mails represent a crucial risk for businesses. They can also lead to ransomware infections, which have also increased in recent years.

    Data breaches

    Companies and governments are increasingly relying on technology to collect, analyze, and store personal data. This can lead to potential risks when such data is affected by cyber incidents. In Singapore, the number of exposed data points per thousand people reached 26 in 2022. Over the same period, around 154 thousand data sets were reported as leaked in the country.

  8. Webpage capture on the news article of year in review: human trafficking,...

    • gimi9.com
    Updated Mar 23, 2025
    Cite
    (2025). Webpage capture on the news article of year in review: human trafficking, cyber scams and the government’s response [Dataset]. https://gimi9.com/dataset/mekong_20d4a8da52ce0ef91d4c3e0c004eff54a0c80bf5
    Dataset updated
    Mar 23, 2025
    Description

    This website archive of the article presents events from July 2022 to June 2023 relating to online scam operations in Cambodia. They include the December 29, 2022 renewal of the casino license of the Eternal Diamond Casino, which is owned by tycoon Try Pheap and situated in the MDS Thmor Da special economic zone, where VOD and Al Jazeera reported alleged human trafficking and scam operations.

  9. Most frequent consumer fraud schemes Philippines Q4 2024

    • statista.com
    • ai-chatbox.pro
    Updated Feb 14, 2025
    Cite
    Statista (2025). Most frequent consumer fraud schemes Philippines Q4 2024 [Dataset]. https://www.statista.com/statistics/1271755/philippines-most-frequent-consumer-fraud-schemes/
    Dataset updated
    Feb 14, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Sep 25, 2024 - Oct 17, 2024
    Area covered
    Philippines
    Description

    According to a survey on personal finance conducted during the fourth quarter of 2024 in the Philippines, 45 percent of respondents who had experienced digital fraud attempts were targeted with phishing attacks. In addition, 41 percent of respondents were targeted with smishing, that is, phishing via text messages.

  10. Number of cyber threat incidents reported to CyberSecurity Malaysia 2024, by...

    • statista.com
    Updated Jan 2, 2025
    Cite
    Statista (2025). Number of cyber threat incidents reported to CyberSecurity Malaysia 2024, by type [Dataset]. https://www.statista.com/statistics/1043272/malaysia-cyber-crime-incidents/
    Dataset updated
    Jan 2, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2024
    Area covered
    Malaysia
    Description

    In 2024, online fraud was the most reported type of cyber threat incident announced by CyberSecurity Malaysia, with more than 3,800 reports. This was followed by content-related cybercrime, with 533 cases. CyberSecurity Malaysia is a government agency that deals with internet safety and operates under the Malaysian Ministry of Science, Technology and Innovation. Risks of scams in e-commerce leading internet activities. Meanwhile, the Malaysian internet users have experienced cybercrime, only 18.9 percent


Cite
Rivalytics (2025). Healthcare Ransomware Dataset [Dataset]. https://www.kaggle.com/datasets/rivalytics/healthcare-ransomware-dataset

Healthcare Ransomware Dataset

Analyze attacks, strengthen security, and improve recovery in healthcare

Explore at:
199 scholarly articles cite this dataset (View in Google Scholar)
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 21, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Rivalytics
License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

📌 Context of the Dataset

The Healthcare Ransomware Dataset was created to simulate real-world cyberattacks in the healthcare industry. Hospitals, clinics, and research labs have become prime targets for ransomware due to their reliance on real-time patient data and legacy IT infrastructure. This dataset provides insight into attack patterns, recovery times, and cybersecurity practices across different healthcare organizations.

Why is this important?

Ransomware attacks on healthcare organizations can shut down entire hospitals, delay treatments, and put lives at risk. Understanding how different healthcare organizations respond to attacks can help develop better security strategies. The dataset allows cybersecurity analysts, data scientists, and researchers to study patterns in ransomware incidents and explore predictive modeling for risk mitigation.

📌 Sources and Research Inspiration

This simulated dataset was inspired by real-world cybersecurity reports and built using insights from official sources, including:

1️⃣ IBM Cost of a Data Breach Report (2024)

The healthcare sector had the highest average cost of data breaches ($10.93 million per incident). On average, organizations recovered only 64.8% of their data after paying ransom. Healthcare breaches took 277 days on average to detect and contain.

2️⃣ Sophos State of Ransomware in Healthcare (2024)

67% of healthcare organizations were hit by ransomware in 2024, an increase from 60% in 2023. 66% of backup compromise attempts succeeded, making data recovery significantly more difficult. The most common attack vectors included exploited vulnerabilities (34%) and compromised credentials (34%).

3️⃣ Health & Human Services (HHS) Cybersecurity Reports

Ransomware incidents in healthcare have doubled since 2016. Organizations that fail to monitor threats frequently experience higher infection rates.

4️⃣ Cybersecurity & Infrastructure Security Agency (CISA) Alerts

Identified phishing, unpatched software, and exposed RDP ports as top ransomware entry points. Only 13% of healthcare organizations monitor cyber threats more than once per day, increasing the risk of undetected attacks.

5️⃣ Emsisoft 2020 Report on Ransomware in Healthcare

The number of ransomware attacks in healthcare increased by 278% between 2018 and 2023. 560 healthcare facilities were affected in a single year, disrupting patient care and emergency services.

📌 Why is This a Simulated Dataset?

This dataset does not contain real patient data or actual ransomware cases. Instead, it was built using probabilistic modeling and structured randomness based on industry benchmarks and cybersecurity reports.

How It Was Created:

1️⃣ Defining the Dataset Structure

The dataset was designed to simulate realistic attack patterns in healthcare, using actual ransomware case studies as inspiration.

Columns were selected based on what real-world cybersecurity teams track, such as attack methods (phishing, RDP exploits, credential theft); infection rates, recovery time, and backup compromise rates; and organization type (hospitals, clinics, research labs) and monitoring frequency.

2️⃣ Generating Realistic Data Using ChatGPT & Python

ChatGPT assisted in defining relationships between attack factors, ensuring that key cybersecurity concepts were accurately reflected. Python’s NumPy and Pandas libraries were used to introduce randomized attack simulations based on real-world statistics. Data was validated against industry research to ensure it aligns with actual ransomware attack trends.
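
As a rough illustration of what "randomized attack simulations based on real-world statistics" might look like with NumPy and Pandas, the sketch below samples a few categorical columns from fixed probabilities. It is not the authors' generation script. The column names, the organization-type mix, and the row count are hypothetical; the 34%/34% attack-vector weights and the 66% backup-compromise rate are borrowed from the Sophos figures quoted above, with the remaining 32% lumped into an assumed "phishing_or_other" category.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed so the sketch is repeatable
n_rows = 1000                    # hypothetical dataset size

ATTACK_VECTORS = ["exploited_vulnerability", "compromised_credentials", "phishing_or_other"]

df = pd.DataFrame({
    # Assumed mix of organization types (not taken from the dataset).
    "org_type": rng.choice(["hospital", "clinic", "research_lab"],
                           size=n_rows, p=[0.5, 0.3, 0.2]),
    # 34% / 34% follow the Sophos figures; the 32% remainder is an assumption.
    "attack_vector": rng.choice(ATTACK_VECTORS, size=n_rows, p=[0.34, 0.34, 0.32]),
    # Each incident's backups are marked compromised with probability 0.66.
    "backup_compromised": rng.random(n_rows) < 0.66,
})

# Compare the sampled shares against the target probabilities.
print(df["attack_vector"].value_counts(normalize=True))
print(df["backup_compromised"].mean())
```

Comparing the sampled shares with the published percentages is one simple way to check such data against industry research.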

3️⃣ Ensuring Logical Relationships Between Data Points

Hospitals take longer to recover due to larger infrastructure and compliance requirements. Organizations that track more cyber threats recover faster because they detect attacks earlier. Backup security significantly impacts recovery time, reflecting the real-world risk of backup encryption attacks.
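
The three relationships above can be encoded as a small rule that sets an expected recovery time, with random noise added on top. The sketch below is one hypothetical way to do this, not the dataset's actual logic. The baseline days per organization type, the 1.5x backup penalty, the monitoring discount, and the noise level are all assumed values chosen only to show the direction of each effect.

```python
import numpy as np
import pandas as pd

# Assumed baseline recovery times in days; hospitals get the longest baseline.
BASE_DAYS = {"hospital": 14.0, "clinic": 7.0, "research_lab": 10.0}


def expected_recovery_days(org_type, checks_per_day, backup_compromised):
    """Expected recovery time before noise, under the assumed rules."""
    days = BASE_DAYS[org_type]
    days /= 1.0 + 0.25 * checks_per_day   # more frequent monitoring, faster recovery
    if backup_compromised:
        days *= 1.5                       # compromised backups slow recovery
    return days


rng = np.random.default_rng(0)
rows = []
for _ in range(500):
    org = str(rng.choice(list(BASE_DAYS)))
    checks = int(rng.integers(0, 5))            # 0 to 4 threat checks per day
    compromised = bool(rng.random() < 0.66)
    mean_days = expected_recovery_days(org, checks, compromised)
    rows.append({
        "org_type": org,
        "checks_per_day": checks,
        "backup_compromised": compromised,
        # Multiplicative log-normal noise keeps recovery times positive.
        "recovery_days": mean_days * rng.lognormal(0.0, 0.2),
    })

df = pd.DataFrame(rows)
print(df.groupby("org_type")["recovery_days"].mean())
```

Keeping the rule in a separate deterministic function makes the intended relationships easy to check independently of the random noise.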
