13 datasets found
  1. Analysis of zero-day attacks and ransomware

    • researchdata.up.ac.za
    txt
    Updated Feb 22, 2024
    Cite
    Mike Wa Nkongolo (2024). Analysis of zero-day attacks and ransomware [Dataset]. http://doi.org/10.25403/UPresearchdata.25215530.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Feb 22, 2024
    Dataset provided by
    University of Pretoria
    Authors
    Mike Wa Nkongolo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cybersecurity faces challenges in identifying and mitigating undefined network vulnerabilities, which is critical for preventing zero-day attacks. The absence of comprehensive datasets for contrasting normal and abnormal network behaviour hinders the development of proactive detection and prevention strategies; a dataset enabling such contrasts would significantly expedite threat anomaly mitigation. The thesis "Ensemble learning and genetic algorithm for the detection of novel network threat anomaly using the UGRansome Dataset" introduces UGRansome, a dataset for anomaly detection in network traffic. This dataset comprises a comprehensive set of malware features designed for detecting and quantifying zero-day attacks. It was created by integrating similar attributes from both the UGR'16 and ransomware datasets, following a process of development and validation. Malicious behavior is categorized into normal and abnormal patterns, further characterized through supervised learning techniques, which include anomaly, signature, and synthetic signature stratifications. Despite significant advancements in intrusion detection and prevention systems, the need for detecting and quantifying zero-day attacks, including ransomware, persists. Therefore, the development of a specialized analytical approach tailored for quantifying zero-day attacks within cybersecurity datasets is crucial to effectively address the evolving threat landscape posed by advanced persistent threats.

  2. Businesses worldwide affected by ransomware 2018-2023

    • statista.com
    Updated Nov 9, 2024
    Cite
    Statista (2024). Businesses worldwide affected by ransomware 2018-2023 [Dataset]. https://www.statista.com/statistics/204457/businesses-ransomware-attack-rate/
    Explore at:
    Dataset updated
    Nov 9, 2024
    Dataset authored and provided by
    Statista http://statista.com/
    Area covered
    Worldwide
    Description

    As of 2023, over 72 percent of businesses worldwide were affected by ransomware attacks. This figure represents an increase on the previous five years and was by far the highest figure reported. Overall, since 2018, more than half of the total survey respondents each year stated that their organizations had been victimized by ransomware.

    Most targeted industries

    In 2023, the healthcare industry in the United States was once again the most targeted by ransomware attacks. This industry also suffers the most data breaches as a consequence of cyberattacks. The critical manufacturing industry ranked second by the number of ransomware attacks, followed by the government facilities industry.

    Ransomware in the manufacturing industry

    The manufacturing industry, along with its subindustries, is constantly targeted by ransomware attacks, causing data loss, business disruptions, and reputational damage. Often, such cyberattacks are international and have a political intent. In 2023, compromised credentials were the leading cause of ransomware attacks in the manufacturing industry.

  3. Data from: Cybersecurity Threat Detection Dataset

    • paperswithcode.com
    Updated Mar 7, 2025
    Cite
    (2025). Cybersecurity Threat Detection Dataset [Dataset]. https://paperswithcode.com/dataset/cybersecurity-threat-detection
    Explore at:
    Dataset updated
    Mar 7, 2025
    Description

    Problem Statement

    Organizations face an increasing number of sophisticated cybersecurity threats, including malware, phishing attacks, and unauthorized access. A financial institution experienced frequent attempts to breach its network, risking sensitive data and regulatory compliance. Traditional security measures were reactive and failed to detect threats in real time. The institution sought a proactive AI-driven solution to identify and prevent cybersecurity threats effectively.

    Challenge

    Developing an advanced threat detection system required addressing several challenges:

    Processing and analyzing large volumes of network traffic and user activity data in real time.

    Identifying new and evolving threats, such as zero-day vulnerabilities, with high accuracy.

    Minimizing false positives to ensure security teams could focus on genuine threats.

    Solution Provided

    An AI-powered threat detection system was developed using machine learning algorithms and advanced analytics. The solution was designed to:

    Continuously monitor network activity and user behavior to identify suspicious patterns.

    Detect and neutralize cybersecurity threats in real time, including malware and phishing attempts.

    Provide actionable insights to security teams for faster and more effective threat response.

    Development Steps

    Data Collection

    Collected network traffic logs, endpoint activity, and historical threat data to train machine learning models.

    Preprocessing

    Cleaned and standardized data, ensuring compatibility across diverse sources, and filtered out noise for accurate analysis.

    Model Development

    Developed machine learning algorithms for anomaly detection, behavioral analysis, and threat classification. Trained models on labeled datasets to recognize known threats and identify emerging attack patterns.

    Validation

    Tested the system against simulated and real-world threat scenarios to evaluate detection accuracy, response times, and reliability.

    Deployment

    Integrated the threat detection system into the institution’s existing cybersecurity infrastructure, including firewalls, SIEM (Security Information and Event Management) tools, and endpoint protection.

    Continuous Monitoring & Improvement

    Established a feedback loop to refine models using new threat data and adapt to evolving attack strategies.
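
    The anomaly-detection step described above can be illustrated with a deliberately small sketch. It is not the case study's system: it assumes a single numeric feature (bytes per second), a baseline learned from benign traffic only, and a z-score threshold of 3.0. The function names and the sample readings are hypothetical.

```python
from statistics import mean, stdev

def fit_baseline(benign_values):
    """Estimate mean and sample standard deviation from benign-only readings."""
    return mean(benign_values), stdev(benign_values)

def flag_anomalies(values, baseline, threshold=3.0):
    """Return the indices of readings whose z-score exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # Degenerate baseline: anything different from the mean is unusual.
        return [i for i, v in enumerate(values) if v != mu]
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical bytes-per-second readings; not taken from the case study.
benign = [980, 1010, 1005, 995, 1020, 990, 1000, 1015]
observed = [1002, 998, 5400, 1011, 12]

baseline = fit_baseline(benign)
suspicious = flag_anomalies(observed, baseline)
print(suspicious)  # indices of readings far from the benign baseline
```

    Raising the threshold trades missed detections for fewer false positives, which is the tuning concern the case study names under "Minimized False Positives".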

    Results

    Enhanced Security Posture

    The system improved the institution’s ability to detect and prevent cybersecurity threats proactively, strengthening its overall security framework.

    Reduced Incidence of Cyber Attacks

    Real-time detection and response significantly reduced the frequency and impact of successful cyber attacks.

    Improved Threat Response Times

    Automated threat identification and prioritization enabled security teams to respond faster and more effectively to potential breaches.

    Minimized False Positives

    Advanced algorithms reduced false alarms, allowing security teams to focus on genuine threats and improve efficiency.

    Scalable and Adaptive Solution

    The system adapted to new threats and scaled effortlessly to protect growing organizational networks and data.

  4. Data Encryption Market Report

    • promarketreports.com
    doc, pdf, ppt
    Updated Feb 19, 2025
    Cite
    Pro Market Reports (2025). Data Encryption Market Report [Dataset]. https://www.promarketreports.com/reports/data-encryption-market-9193
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    Feb 19, 2025
    Dataset authored and provided by
    Pro Market Reports
    License

    https://www.promarketreports.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Data Encryption Market Overview

    The global data encryption market is projected to register significant growth, with a market size of USD 14.5 billion in 2025 and a CAGR of 16% over the forecast period of 2025-2033. The increasing adoption of cloud computing and digital transformation initiatives is driving the demand for data encryption solutions to protect sensitive data from cyber threats. Additionally, regulations such as GDPR and CCPA require organizations to implement data encryption measures, further fueling market growth.

    Market Drivers, Restraints, and Trends

    Key market drivers include rising cybersecurity threats, increasing data breaches, and the growing need for data privacy. The increasing adoption of IoT and mobile computing is also contributing to the need for data encryption. However, the high cost of implementation and the lack of skilled professionals can pose challenges to market growth. Notable market trends include the emergence of advanced encryption algorithms, such as quantum-safe cryptography, and the integration of encryption with AI and machine learning technologies. Regional factors, such as government regulations and technology adoption rates, also influence the market's growth trajectory.

    Recent developments include:

    • On Apr. 11, 2023, Menlo Security, a leading provider of browser security solutions, published the results of the 10th Annual Cyberthreat Defense Report (CDR) by the CyberEdge Group. The report, partially sponsored by Menlo Security, highlights the growing importance of browser isolation technologies to combat ransomware and other malicious threats. The research revealed that most ransomware attacks include threats beyond data encryption. According to the report, around 51% of respondents confirmed that they have been using at least one type of browser or Internet isolation to protect their organizational data, while another 40% are about to deploy data encryption technology. Furthermore, around 33% of respondents noted that browser isolation is a key cybersecurity strategy to protect against sophisticated attacks, including ransomware, phishing, and zero-day attacks.
    • On Feb. 14, 2023, EnterpriseDB, a relational database provider, announced the addition of Transparent Data Encryption (TDE) based on open-source PostgreSQL to its databases. The new TDE feature will be shipped along with the firm's enterprise version of its database. TDE is a method of encrypting database files to ensure data security while at rest and in motion. The company added that most enterprises use TDE for compliance reasons and that it helps ensure encryption of data on the hard drive and of files on a backup. Before the development of built-in TDE, enterprises relied on either full-disk encryption or stackable cryptographic file system encryption.
    • On Jan. 25, 2023, researchers from the Tokyo University of Science, Japan, announced the development of a faster and cheaper method for handling encrypted data while improving security. The new method combines the best of homomorphic encryption and secret sharing. Both are key methods for computing on sensitive data while preserving privacy. Homomorphic encryption is computationally intensive and involves performing computations on encrypted data on a single server, while secret sharing is fast and computationally efficient. In this method, the encrypted data/secret input is divided and distributed across multiple servers, each performing a computation, such as multiplication, on its data. The results of the computations are then used to reconstruct the original data.
    • September 2022: Convergence Technology Solutions Corp., a supplier of software-enabled IT and cloud solutions, declared that it has obtained certification in Canada to sell and deploy IBM zSystems and LinuxONE.
    • November 2019: Penta Security Systems announced that it has been selected as a finalist for the 2020 SC Magazine Awards, which are given by SC Media and celebrated in the United States. As a result, MyDiamo from Penta Security has been named the Best Database Security Solution of 2020. Additionally, this will result in the expansion of common-level encryption and improve the open-source DBMS installation procedure.

    Potential restraints include: issues regarding security and data breaches; high implementation costs and complexity; issues with respect to data consistency and interoperability across different edge platforms.
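
    The secret-sharing idea mentioned in the Jan. 25, 2023 item (split an input across servers, let each compute on its own piece, then combine the results) can be sketched with plain additive secret sharing. This is a generic textbook construction, not the Tokyo University of Science method and not homomorphic encryption; the modulus, the function names, and the choice of multiplying by a public constant are assumptions made for illustration.

```python
import secrets

PRIME = 2**61 - 1  # modulus for share arithmetic (an assumption of this sketch)

def split_secret(secret, n_servers):
    """Split an integer into n additive shares that sum to the secret mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_servers - 1)]
    last = (secret - sum(shares)) % PRIME
    return shares + [last]

def scale_share(share, public_factor):
    """Local computation one server performs on its own share only."""
    return (share * public_factor) % PRIME

def reconstruct(shares):
    """Combine all shares to recover the shared value."""
    return sum(shares) % PRIME

secret = 1234  # hypothetical sensitive input
shares = split_secret(secret, n_servers=3)

# Each server multiplies its share by the same public constant.
scaled = [scale_share(s, 5) for s in shares]
result = reconstruct(scaled)
print(result)  # 6170, i.e. 5 * 1234, computed without any server seeing 1234
```

    Because the operation is linear, the scaled shares still sum to the scaled secret. Multiplying two secret values together needs additional protocol steps that this sketch does not cover.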

  5. Business or organization reporting of ransomware attack to insurance company...

    • datasets.ai
    • www150.statcan.gc.ca
    • +2more
    21, 55, 8
    Updated Aug 26, 2024
    Cite
    Statistics Canada | Statistique Canada (2024). Business or organization reporting of ransomware attack to insurance company in the last 12 months, fourth quarter of 2021 [Dataset]. https://datasets.ai/datasets/88d59d3b-b313-43bf-97a4-237e617882d9
    Explore at:
    55, 8, 21Available download formats
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    Statistics Canada https://statcan.gc.ca/en
    Authors
    Statistics Canada | Statistique Canada
    Description

    Business or organization reporting of ransomware attack to insurance company in the last 12 months, by North American Industry Classification System (NAICS), business employment size, type of business, business activity and majority ownership, fourth quarter of 2021.

  6. Drone-Based Malware Detection (DBMD)

    • kaggle.com
    Updated Jul 27, 2024
    Cite
    DatasetEngineer (2024). Drone-Based Malware Detection (DBMD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/9045375
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 27, 2024
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    DatasetEngineer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Welcome to the Drone-Based Malware Detection dataset! This dataset is designed to aid researchers and practitioners in exploring innovative cybersecurity solutions using drone-collected data. The dataset contains detailed information on network traffic, drone sensor readings, malware detection indicators, and environmental conditions. It offers a unique perspective by integrating data from drones with traditional network security metrics to enhance malware detection capabilities.

    Dataset Overview

    The dataset comprises four main categories:

    • Network Traffic Data: Captures network traffic attributes including IP addresses, ports, protocols, packet sizes, and various derived metrics.
    • Drone Sensor Data: Includes GPS coordinates, altitude, speed, heading, battery level, and other sensor readings from drones.
    • Malware Detection Data: Contains indicators and scores relevant to detecting malware, such as anomaly scores, suspicious IP counts, reputation scores, and attack types.
    • Environmental Data: Provides context through environmental conditions like location type, noise level, weather conditions, and more.

    Files and Features

    The dataset is divided into four separate CSV files:

    • network_traffic_data.csv
      • timestamp: Date and time of the traffic event.
      • source_ip: Source IP address.
      • destination_ip: Destination IP address.
      • source_port: Source port number.
      • destination_port: Destination port number.
      • protocol: Network protocol (TCP, UDP, ICMP).
      • packet_length: Length of the network packet.
      • payload_data: Content of the packet payload.
      • flag: Network flag (SYN, ACK, FIN, RST).
      • traffic_volume: Volume of traffic in bytes.
      • flow_duration: Duration of the network flow.
      • flow_bytes_per_s: Bytes per second for the flow.
      • flow_packets_per_s: Packets per second for the flow.
      • packet_count: Number of packets in the flow.
      • average_packet_size: Average size of packets.
      • min_packet_size: Minimum packet size.
      • max_packet_size: Maximum packet size.
      • packet_size_variance: Variance in packet sizes.
      • header_length: Length of the packet header.
      • payload_length: Length of the packet payload.
      • ip_ttl: Time to live for the IP packet.
      • tcp_window_size: TCP window size.
      • icmp_type: ICMP type (echo_request, echo_reply, destination_unreachable).
      • dns_query_count: Number of DNS queries.
      • dns_response_count: Number of DNS responses.
      • http_method: HTTP method (GET, POST, PUT, DELETE).
      • http_status_code: HTTP status code (200, 404, 500, 301).
      • content_type: Content type (text/html, application/json, image/png).
      • ssl_tls_version: SSL/TLS version.
      • ssl_tls_cipher_suite: SSL/TLS cipher suite.
    • drone_data.csv
      • latitude: Latitude of the drone.
      • longitude: Longitude of the drone.
      • altitude: Altitude of the drone.
      • speed: Speed of the drone.
      • heading: Heading of the drone.
      • battery_level: Battery level of the drone.
      • drone_id: Unique identifier for the drone.
      • flight_time: Total flight time.
      • signal_strength: Strength of the drone's signal.
      • temperature: Temperature at the drone's location.
      • humidity: Humidity at the drone's location.
      • pressure: Atmospheric pressure at the drone's location.
      • wind_speed: Wind speed at the drone's location.
      • wind_direction: Wind direction at the drone's location.
      • gps_accuracy: Accuracy of the GPS signal.
    • malware_detection_data.csv
      • anomaly_score: Score indicating the level of anomaly detected.
      • suspicious_ip_count: Number of suspicious IP addresses detected.
      • malicious_payload_indicator: Indicator for malicious payload (0 or 1).
      • reputation_score: Reputation score for the network entity.
      • behavioral_score: Behavioral score indicating potential malicious activity.
      • attack_type: Type of attack (DDoS, phishing, malware).
      • signature_match: Indicator for signature match (0 or 1).
      • sandbox_result: Result from sandbox analysis (clean, infected).
      • heuristic_score: Heuristic score for potential threats.
      • traffic_pattern: Pattern of the traffic (burst, steady).
    • environmental_data.csv
      • location_type: Type of location (urban, rural).
      • nearby_devices: Number of nearby devices.
      • signal_interference: Level of signal interference.
      • noise_level: Noise level in the environment.
      • time_of_day: Time of day (morning, afternoon, evening, night).
      • day_of_week: Day of the week.
      • weather_conditions: Weather conditions (sunny, rainy, cloudy, stormy).

    Usage and Applications

    This dataset can be used for:

    • Cybersecurity Research: Developing and testing algorithms for malware detection using drone data.
    • Machine Learning: Training models to identify malicious activity based on network traffic and drone sensor readings.
    • Data Analysis: Exploring the relationships between environmental conditions, drone sensor data, and network traffic anomalies.
    • Educational Purposes: Teaching data science, machine learning, and cybersecurity concepts using a comprehensive and multi-faceted dataset.

    Acknowledgements This dataset is based on real-world data collected from drone sensors and network traffic monitoring s...
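
    As a rough illustration of how the per-flow rate columns in network_traffic_data.csv could relate to the raw ones, the sketch below recomputes them from traffic_volume, flow_duration, and packet_count. The column names come from the feature list; the sample rows, the assumption that flow_duration is in seconds, and the formulas themselves are assumptions, since the dataset description does not say how its derived metrics were produced.

```python
import csv
import io

# Hypothetical rows using column names from the feature list; values are made up.
SAMPLE = """timestamp,source_ip,destination_ip,protocol,traffic_volume,flow_duration,packet_count
2024-07-01T10:00:00,10.0.0.5,10.0.0.9,TCP,12000,4.0,24
2024-07-01T10:00:05,10.0.0.7,10.0.0.9,UDP,500,0.0,2
"""

def derive_flow_metrics(row):
    """Recompute rate-style metrics for one flow; None where division is undefined."""
    volume = float(row["traffic_volume"])    # bytes, per the feature list
    duration = float(row["flow_duration"])   # assumed to be seconds
    packets = int(row["packet_count"])
    return {
        "flow_bytes_per_s": volume / duration if duration > 0 else None,
        "flow_packets_per_s": packets / duration if duration > 0 else None,
        "average_packet_size": volume / packets if packets > 0 else None,
    }

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
metrics = [derive_flow_metrics(r) for r in rows]
for m in metrics:
    print(m)
```

    Comparing recomputed values with the shipped columns is one quick way to check whether the assumed units and formulas match the actual files.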

  7. Dataset of Publication "Malware Communication in Smart Factories: A Network...

    • researchdata.tuwien.at
    csv, txt, zip
    Updated Mar 31, 2025
    Cite
    Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby (2025). Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set" [Dataset]. http://doi.org/10.48436/ghdc6-45k78
    Explore at:
    Available download formats: csv, zip, txt
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    TU Wien
    Authors
    Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 11, 2024
    Description

    Note: If you use this dataset, please cite the following paper:

    Brenner, B., Fabini, J., Offermanns, M., Semper, S., & Zseby, T. (2024). Malware communication in smart factories: A network traffic data set. Computer Networks, 255, 110804.

    or in BibTeX:

    @article{brenner2024malware,
    title={Malware communication in smart factories: A network traffic data set},
    author={Brenner, Bernhard and Fabini, Joachim and Offermanns, Magnus and Semper, Sabrina and Zseby, Tanja},
    journal={Computer Networks},
    volume={255},
    pages={110804},
    year={2024},
    publisher={Elsevier}
    }

    Context and methodology

    Machine learning-based intrusion detection requires suitable and realistic data sets for training and testing. However, data sets that originate from real networks are rare. Network data is considered privacy-sensitive, and the purposeful introduction of malicious traffic is usually not possible.

    In this paper, we introduce a labeled data set captured at a smart factory located in Vienna, Austria, during normal operation and during penetration tests with different attack types. The data set contains 173 GB of PCAP files, representing 16 days (395 hours) of factory operation. It includes MQTT, OPC UA, and Modbus/TCP traffic.

    The captured malicious traffic originated from a professional penetration tester who performed two types of attacks:
    (a) Aggressive attacks that are easier to detect.
    (b) Stealthy attacks that are harder to detect.

    Our data set includes the raw PCAP files and extracted flow data. Labels for packets and flows indicate whether they originated from a specific attack or from benign communication.

    We describe the methodology for creating the dataset, conduct an analysis of the data, and provide detailed information about the recorded traffic itself. The dataset is freely available to support reproducible research and the comparability of results in the area of intrusion detection in industrial networks.

    Technical details

    • readme.txt:
      • Information about the data collection, format, necessary software and versions to access it.
    • license.txt:
      • Licensing information.
    • a_day1, a_day2, s_day1, s_day2, tf_a, and tf_s:
      • Main dataset, where files starting with "tf" are training files containing only benign,
        operational data. All other files are attack files containing both operational data and
        attack data.
    • images.zip:
      • Contains descriptive images about the data.
    • extractions.zip:
      • Contains extracted packets and flows in both labeled and unlabeled form.
    • a_day_tuesday_dos.zip:
      • An additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.
    • list_of_extracted_features:
      • A complete list of features we extracted from the PCAP Files. All flow files contain these features.
    • list_of_identified_protocols.csv:
      • A complete list of all protocols that we could identify within the PCAP files provided.
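
    The naming rule in the file list (files starting with "tf" hold benign-only training traffic, all others mix operational and attack traffic) lends itself to a small helper. The sketch below only partitions the names given in the description; the function name is hypothetical and nothing here reads the PCAP files themselves.

```python
def split_capture_files(names):
    """Partition capture names into benign-only training files and attack files."""
    training = [n for n in names if n.startswith("tf")]
    attack = [n for n in names if not n.startswith("tf")]
    return training, attack

# Names as listed in the dataset description.
capture_files = ["a_day1", "a_day2", "s_day1", "s_day2", "tf_a", "tf_s"]

training_files, attack_files = split_capture_files(capture_files)
print(training_files)  # benign-only captures
print(attack_files)    # captures containing operational and attack traffic
```

    A split like this suits one-class or anomaly-detection setups, where a model is fitted on the benign training captures and evaluated on the labeled attack days.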

  8. Dataset of Publication "Malware Communication in Smart Factories: A Network...

    • researchdata.tuwien.ac.at
    zip
    Updated Oct 18, 2024
    Cite
    Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby (2024). Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set" [Dataset]. http://doi.org/10.48436/vs6hv-1vs74
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    TU Wien
    Authors
    Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 11, 2024
    Description

    Machine learning-based intrusion detection requires suitable and realistic
    data sets for training and testing. However, data sets that originate from
    real networks are rare. Network data is considered privacy sensitive and the
    purposeful introduction of malicious traffic is usually not possible. In this
    paper we introduce a labeled data set captured at a smart factory located
    in Vienna, Austria during normal operation and during penetration tests with different
    attack types. The data set contains 173 GB of PCAP files, which represent 16 days (395 hours) of factory operation. It includes MQTT, OPC UA, and Modbus/TCP traffic. The captured malicious traffic originated
    from a professional penetration tester who performed two types of attacks: (a)
    aggressive attacks that are easier to detect and (b) stealthy attacks that are
    harder to detect. Our data set includes the raw PCAP files and extracted
    flow data. Labels for packets and flows indicate whether packets (or flows)
    originated from a specific attack or from benign communication. We describe
    the methodology for creating the data set, conduct an analysis of the data
    and provide detailed information about the recorded traffic itself. The data
    set is freely available to support reproducible research and the comparability
    of results in the area of intrusion detection in industrial networks.

    File description:

    a_day1, a_day2, s_day1, s_day2, tf_a and tf_s: Main data set, where files starting with "tf" are training files containing only benign, operational data and all other files are attack files containing both operational data and attack data.

    images.zip: Contains descriptive images about the data.

    extractions.zip: Contains extracted packets and flows in both labeled and unlabeled form.

    a_day_tuesday_dos.zip: An additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.

  9. Global cyberattack distribution 2023, by type

    • statista.com
    Updated Nov 14, 2024
    Cite
    Statista (2024). Global cyberattack distribution 2023, by type [Dataset]. https://www.statista.com/statistics/1382266/cyber-attacks-worldwide-by-type/
    Explore at:
    Dataset updated
    Nov 14, 2024
    Dataset authored and provided by
    Statista http://statista.com/
    Time period covered
    2023
    Area covered
    Worldwide
    Description

    In 2023, ransomware was the most frequently detected cyberattack worldwide, accounting for around 70 percent of all detected cyberattacks. Network breaches ranked second, with almost 19 percent of the detections. Data exfiltration was also among the detected cyberattacks, although less frequently.

  10. Trace-Share Dataset for Evaluation of Statistical Characteristics...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Madeja, Tomas (2020). Trace-Share Dataset for Evaluation of Statistical Characteristics Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3553062
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Cermak, Milan
    Madeja, Tomas
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains all data used during the evaluation of statistical characteristics preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

    For more information, see the project repository at https://github.com/Trace-Share.

    Selected Attack Traces

    We selected 72 different traces of network attacks obtained from various internet databases. File names refer to common names of contained vulnerabilities, malware, or attack tools.

    Background Traffic Data

    The publicly available dataset CSE-CIC-IDS-2018 was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used, to reduce less significant traffic.
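
    Restricting background traffic to the victim range 172.31.69.0/24 amounts to a destination-address membership test. The sketch below shows one way to express it with Python's standard ipaddress module; the flow records and the "src"/"dst" field names are hypothetical, and this is not the tooling the dataset authors used.

```python
import ipaddress

# Victim range stated in the dataset description.
VICTIM_RANGE = ipaddress.ip_network("172.31.69.0/24")

def is_victim_bound(dst_ip):
    """True when the destination address lies inside the victim range."""
    return ipaddress.ip_address(dst_ip) in VICTIM_RANGE

# Hypothetical flow records for illustration only.
flows = [
    {"src": "18.219.5.43", "dst": "172.31.69.25"},
    {"src": "172.31.69.25", "dst": "8.8.8.8"},
    {"src": "10.0.0.1", "dst": "172.31.70.4"},
]

kept = [f for f in flows if is_victim_bound(f["dst"])]
print(kept)  # only the flow aimed at 172.31.69.25 remains
```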

    Evaluation Results and Dataset Structure

    Traces variants (traces-normalized.zip, traces-adjusted.zip)

    ./traces-normalized/ — normalized PCAP files and details in YAML format;

    ./traces-adjusted/ — configuration files for traces combination in YAML format.

    Computed statistics (statistics.zip)

    ./statistics-background/ — background traffic statistics computed by ID2T;

    ./statistics-combination/ — combined traces statistics computed by ID2T for all adjust options (selected only combinations where ID2T provided all statistics files);

    ./statistics-difference/ — computed mean and median differences of background and combined traffic traces.
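
    The description does not define how the mean and median differences were calculated, so the following is only one plausible reading: compute a summary statistic for the background trace and for the combined trace, then subtract. The packet sizes and the function name are hypothetical, and ID2T is not involved in this sketch.

```python
from statistics import mean, median

def summary_differences(background, combined):
    """Difference of mean and of median between combined and background values."""
    return {
        "mean_diff": mean(combined) - mean(background),
        "median_diff": median(combined) - median(background),
    }

# Hypothetical packet sizes in bytes; the combined trace adds two attack packets.
background_sizes = [60, 60, 1500, 400, 80]
combined_sizes = background_sizes + [1500, 1500]

diff = summary_differences(background_sizes, combined_sizes)
print(diff)
```

    Small differences would suggest that injecting an attack trace leaves the statistical profile of the background traffic largely intact, which is the property this evaluation is concerned with.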

    Evaluation results

    statistics-difference.ipynb — file containing visualization of statistics differences.

  11. SEPTA Ridership Statistics

    • catalog.data.gov
    Updated Mar 31, 2025
    + more versions
    Cite
    SEPTA (2025). SEPTA Ridership Statistics [Dataset]. https://catalog.data.gov/dataset/septa-ridership-statistics
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    SEPTA
    Description

    Stop summary files represent average daily ridership at the stop level over the course of the relevant period. Trolley ridership data was generated using automatic passenger counters (APCs). Bus data is calculated from a variety of sources depending on the route and year. The bus data files represent average daily fall ridership from 2014 – present. Accurate weekend bus data was not available until 2017 at which point SEPTA had more widespread APC coverage. No bus data is available for Fall 2020 due to a malware attack. APC bus data was also not available for articulated vehicles and the Boulevard Direct from August 2020 through February 2022 due to the malware attack.

  12. Trace-Share Dataset for Evaluation of Trace Meaning Preservation

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 7, 2020
    Cite
    Madeja, Tomas (2020). Trace-Share Dataset for Evaluation of Trace Meaning Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3547527
    Dataset updated
    May 7, 2020
    Dataset provided by
    Cermak, Milan
    Madeja, Tomas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains all data used during the evaluation of trace meaning preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

    For more information, see the project repository at https://github.com/Trace-Share.

    Selected Attack Traces

    The following list contains trace datasets used for evaluation. Each attack was chosen to have not only a different meaning but also different statistical properties.

    dos_http_flood — the capture of GET and POST requests sent to one server by one attacker (HTTP traffic);

    ftp_bruteforce — short and unsuccessful attempt to guess a user’s password for FTP service (FTP traffic);

    ponyloader_botnet — Pony Loader botnet used to steal credentials from 3 target devices that report to a single IP through a large number of intermediate addresses (DNS and HTTP traffic);

    scan — the capture of the nmap tool scanning a given subnet using ICMP echo and TCP SYN requests (consists of ARP, ICMP, and TCP traffic);

    wannacry_ransomware — the capture of WannaCry ransomware that spreads in a domain with three workstations, a domain controller, and a file-sharing server (SMB and SMBv2 traffic).

    Background Traffic Data

    The publicly available CSE-CIC-IDS-2018 dataset was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used, which filters out less significant traffic.
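
    The victim-range filter described above can be sketched with Python's standard ipaddress module. This is a minimal illustration, not the tooling used to build the dataset: the function names and the sample address pairs are hypothetical, and only the 172.31.69.0/24 range comes from the description.

```python
import ipaddress

# Victim range quoted in the dataset description.
VICTIM_NET = ipaddress.ip_network("172.31.69.0/24")


def targets_victim(dst_ip: str) -> bool:
    """Return True when the destination address lies in the victim range."""
    return ipaddress.ip_address(dst_ip) in VICTIM_NET


def keep_victim_traffic(packets):
    """Keep only (src, dst) pairs whose destination is a victim machine."""
    return [(src, dst) for src, dst in packets if targets_victim(dst)]


# Hypothetical (source, destination) pairs, not taken from the dataset.
sample = [
    ("10.0.0.5", "172.31.69.25"),
    ("10.0.0.5", "172.31.70.1"),
    ("172.31.69.25", "8.8.8.8"),
]
filtered = keep_victim_traffic(sample)
print(filtered)
```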

    Evaluation Results and Dataset Structure

    Traces variants (traces.zip)

    ./traces-original/ — trace PCAP files and crawled details in YAML format;

    ./traces-normalized/ — normalized PCAP files and details in YAML format;

    ./traces-adjusted/ — adjusted PCAP files using various timestamp generation settings, combination configuration in YAML format, and labels provided by ID2T in XML format.

    Extracted alerts (alerts.zip)

    ./alerts-original/ — extracted Suricata alerts, Suricata log, and full Suricata output for all original trace files;

    ./alerts-normalized/ — extracted Suricata alerts, Suricata log, and full Suricata output for all normalized trace files;

    ./alerts-adjusted/ — extracted Suricata alerts, Suricata log, and full Suricata output for all adjusted trace files.

    Evaluation results

    *.csv files in the root directory — extracted alert signatures and their counts for each trace variant.
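
    A count of alert signatures per trace variant, as summarized in the CSV files, could be produced along the following lines. This is a sketch under assumptions: the input is taken to be (variant, signature) pairs already extracted from the Suricata output, and the signature names are placeholders rather than real rule names.

```python
from collections import Counter


def count_signatures(alerts):
    """Count how often each signature appears in each trace variant."""
    return Counter((variant, signature) for variant, signature in alerts)


# Hypothetical extracted alerts as (trace variant, signature) pairs.
alerts = [
    ("original", "signature A"),
    ("original", "signature A"),
    ("original", "signature B"),
    ("normalized", "signature A"),
    ("adjusted", "signature B"),
]

counts = count_signatures(alerts)
for (variant, signature), count in sorted(counts.items()):
    print(f"{variant},{signature},{count}")
```

    Comparing the counts across the original, normalized, and adjusted variants shows whether a transformation changed which alerts a trace triggers.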

  13. Share of cyberattacks in Italy 2024, by reason

    • statista.com
    • ai-chatbox.pro
    Updated Jul 11, 2025
    Statista (2025). Share of cyberattacks in Italy 2024, by reason [Dataset]. https://www.statista.com/statistics/649358/share-cyber-attacks-in-italy-by-reason/
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista, http://statista.com/
    Area covered
    Italy
    Description

    During the first half of 2024, around ** percent of cyberattacks carried out in Italy had cybercrime as a purpose. Cyber espionage was another motivation, representing the main reason behind roughly **** percent of attacks. By contrast, information warfare only accounted for *** percent of the cyberattacks in the country in the last examined period.

    Data breaches in Italy

    In 2023, over half of the Italian digital population was alerted that their personal data had been breached, and **** percent of the alerted users had their data compromised on the dark web. Despite a decrease in the number of data sets affected in data breaches between 2020 and 2023, Italy recorded almost *** million exposed data sets at the beginning of 2023. Meanwhile, the average cost of data breaches for both Italian companies and targeted users kept growing, reaching **** million U.S. dollars in 2024, up from the **** million U.S. dollars recorded in the previous year.

    The Italian privacy landscape: GDPR effects

    As a member state of the European Union, Italy is covered by the General Data Protection Regulation (GDPR). Since 2018, the GDPR has regulated online data privacy and represented consumers’ interests within the digital and tech landscape of the Union. As of 2023, approximately *** fines were issued in Italy due to violations of the GDPR, making Italy the European country with the second-highest number of GDPR sanctions against tech companies. The highest GDPR fine ever issued in Italy was imposed on Telecom Italia (TIM), one of the largest Italian telecommunications companies. TIM was fined approximately **** million euros in January 2020. In Italy, the GDPR is enforced by the Garante della Privacy, the national institution overseeing Italian users’ online rights, cybersecurity, and digital privacy.
