13 datasets found

Analysis of zero-day attacks and ransomware
researchdata.up.ac.za
txt
Updated Feb 22, 2024
Cite
Mike Wa Nkongolo (2024). Analysis of zero-day attacks and ransomware [Dataset]. http://doi.org/10.25403/UPresearchdata.25215530.v1
Available download formats: txt
Unique identifier
https://doi.org/10.25403/UPresearchdata.25215530.v1
Dataset updated
Feb 22, 2024
Dataset provided by
University of Pretoria
Authors
Mike Wa Nkongolo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cybersecurity faces challenges in identifying and mitigating undefined network vulnerabilities, which is critical for preventing zero-day attacks. The absence of comprehensive datasets for contrasting normal and abnormal network behaviour hinders the development of proactive detection and prevention strategies; a dataset enabling such contrasts would significantly expedite threat anomaly mitigation. The thesis "Ensemble learning and genetic algorithm for the detection of novel network threat anomaly using the UGRansome Dataset" introduces UGRansome, a dataset for anomaly detection in network traffic. This dataset comprises a comprehensive set of malware features designed for detecting and quantifying zero-day attacks. It was created by integrating similar attributes from both the UGR'16 and ransomware datasets, following a process of development and validation. Malicious behavior is categorized into normal and abnormal patterns, further characterized through supervised learning techniques, which include anomaly, signature, and synthetic signature stratifications. Despite significant advancements in intrusion detection and prevention systems, the need for detecting and quantifying zero-day attacks, including ransomware, persists. Therefore, a specialized analytical approach tailored for quantifying zero-day attacks within cybersecurity datasets is crucial to address the evolving threat landscape posed by advanced persistent threats.
Businesses worldwide affected by ransomware 2018-2023
statista.com
Updated Nov 9, 2024
Cite
Statista (2024). Businesses worldwide affected by ransomware 2018-2023 [Dataset]. https://www.statista.com/statistics/204457/businesses-ransomware-attack-rate/
Dataset updated
Nov 9, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Worldwide
Description
As of 2023, over 72 percent of businesses worldwide were affected by ransomware attacks. This figure represents an increase on the previous five years and was by far the highest figure reported. Overall, since 2018, more than half of the total survey respondents each year stated that their organizations had been victimized by ransomware.
Most targeted industries
In 2023, the healthcare industry in the United States was once again the industry most targeted by ransomware attacks. This industry also suffers the most data breaches as a consequence of cyberattacks. The critical manufacturing industry ranked second by the number of ransomware attacks, followed by the government facilities industry.
Ransomware in the manufacturing industry
The manufacturing industry, along with its subindustries, is constantly targeted by ransomware attacks, causing data loss, business disruptions, and reputational damage. Often, such cyberattacks are international and have a political intent. In 2023, compromised credentials were the leading cause of ransomware attacks in the manufacturing industry.
Data from: Cybersecurity Threat Detection Dataset
paperswithcode.com
Updated Mar 7, 2025
Cite
(2025). Cybersecurity Threat Detection Dataset [Dataset]. https://paperswithcode.com/dataset/cybersecurity-threat-detection
Dataset updated
Mar 7, 2025
Description
Problem Statement

Organizations face an increasing number of sophisticated cybersecurity threats, including malware, phishing attacks, and unauthorized access. A financial institution experienced frequent attempts to breach its network, risking sensitive data and regulatory compliance. Traditional security measures were reactive and failed to detect threats in real time. The institution sought a proactive AI-driven solution to identify and prevent cybersecurity threats effectively.

Challenge

Developing an advanced threat detection system required addressing several challenges:

Processing and analyzing large volumes of network traffic and user activity data in real time.

Identifying new and evolving threats, such as zero-day vulnerabilities, with high accuracy.

Minimizing false positives to ensure security teams could focus on genuine threats.

Solution Provided

An AI-powered threat detection system was developed using machine learning algorithms and advanced analytics. The solution was designed to:

Continuously monitor network activity and user behavior to identify suspicious patterns.

Detect and neutralize cybersecurity threats in real time, including malware and phishing attempts.

Provide actionable insights to security teams for faster and more effective threat response.

Development Steps

Data Collection

Collected network traffic logs, endpoint activity, and historical threat data to train machine learning models.

Preprocessing

Cleaned and standardized data, ensuring compatibility across diverse sources, and filtered out noise for accurate analysis.

Model Development

Developed machine learning algorithms for anomaly detection, behavioral analysis, and threat classification. Trained models on labeled datasets to recognize known threats and identify emerging attack patterns.

Validation

Tested the system against simulated and real-world threat scenarios to evaluate detection accuracy, response times, and reliability.

Deployment

Integrated the threat detection system into the institution’s existing cybersecurity infrastructure, including firewalls, SIEM (Security Information and Event Management) tools, and endpoint protection.

Continuous Monitoring & Improvement

Established a feedback loop to refine models using new threat data and adapt to evolving attack strategies.
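
The development steps above do not say which algorithms were used. As a purely illustrative stand-in, the sketch below flags outliers in a traffic series with a z-score rule and then measures the false-positive rate that the challenge list asks to minimize. The function names, the threshold, and the sample values are all assumptions made for this example.

```python
from statistics import mean, pstdev


def zscore_anomalies(values, threshold=3.0):
    """Return indices whose distance from the mean exceeds `threshold` standard deviations."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # a constant series has no outliers under this rule
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def false_positive_rate(flagged, true_anomalies, total):
    """Share of benign samples that were flagged anyway."""
    flagged, true_anomalies = set(flagged), set(true_anomalies)
    negatives = total - len(true_anomalies)
    false_positives = len(flagged - true_anomalies)
    return false_positives / negatives if negatives else 0.0


# Hypothetical bytes-per-minute readings with one obvious spike at index 6.
bytes_per_minute = [510, 495, 502, 488, 505, 499, 9000, 501, 497, 503]

# With only 10 samples a single outlier cannot exceed a z-score of 3,
# so a lower threshold is used for this small example.
flagged = zscore_anomalies(bytes_per_minute, threshold=2.5)
fpr = false_positive_rate(flagged, true_anomalies=[6], total=len(bytes_per_minute))
print(flagged, fpr)
```

A production system would more likely use learned models, as the steps describe; the point here is only how detections and false positives can be scored against labeled data.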

Results

Enhanced Security Posture

The system improved the institution’s ability to detect and prevent cybersecurity threats proactively, strengthening its overall security framework.

Reduced Incidence of Cyber Attacks

Real-time detection and response significantly reduced the frequency and impact of successful cyber attacks.

Improved Threat Response Times

Automated threat identification and prioritization enabled security teams to respond faster and more effectively to potential breaches.

Minimized False Positives

Advanced algorithms reduced false alarms, allowing security teams to focus on genuine threats and improve efficiency.

Scalable and Adaptive Solution

The system adapted to new threats and scaled effortlessly to protect growing organizational networks and data.
Data Encryption Market Report
promarketreports.com
doc, pdf, ppt
Updated Feb 19, 2025
Cite
Pro Market Reports (2025). Data Encryption Market Report [Dataset]. https://www.promarketreports.com/reports/data-encryption-market-9193
Available download formats: doc, ppt, pdf
Dataset updated
Feb 19, 2025
Dataset authored and provided by
Pro Market Reports
License
https://www.promarketreports.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
Data Encryption Market Overview

The global data encryption market is projected to register significant growth, with a market size of USD 14.5 billion in 2025 and a CAGR of 16% over the forecast period of 2025-2033. The increasing adoption of cloud computing and digital transformation initiatives is driving the demand for data encryption solutions to protect sensitive data from cyber threats. Additionally, industry regulations, such as GDPR and CCPA, require organizations to implement data encryption measures, further fueling market growth.

Market Drivers, Restraints, and Trends

Key market drivers include rising cybersecurity threats, increasing data breaches, and the growing need for data privacy. The increasing adoption of IoT and mobile computing is also contributing to the need for data encryption. However, the high cost of implementation and the lack of skilled professionals can pose challenges to market growth. Notable market trends include the emergence of advanced encryption algorithms, such as quantum-safe cryptography, and the integration of encryption with AI and machine learning technologies. Regional factors, such as government regulations and technology adoption rates, also influence the market's growth trajectory.

Recent developments include:

On Apr. 11, 2023, Menlo Security, a leading provider of browser security solutions, published the results of the 10th Annual Cyberthreat Defense Report (CDR) by the CyberEdge Group. The report, partially sponsored by Menlo Security, highlights the growing importance of browser isolation technologies to combat ransomware and other malicious threats. The research revealed that most ransomware attacks include threats beyond data encryption. According to the report, around 51% of respondents confirmed that they have been using at least one type of browser or Internet isolation to protect their organizational data, while another 40% are about to deploy data encryption technology. Furthermore, around 33% of respondents noted that browser isolation is a key cybersecurity strategy to protect against sophisticated attacks, including ransomware, phishing, and zero-day attacks.

On Feb. 14, 2023, EnterpriseDB, a relational database provider, announced the addition of Transparent Data Encryption (TDE) based on open-source PostgreSQL to its databases. The new TDE feature will be shipped along with the firm's enterprise version of its database. TDE is a method of encrypting database files to ensure data security while at rest and in motion. Most enterprises use TDE for compliance reasons, since it helps ensure that data on the hard drive and in backup files is encrypted. Before the development of built-in TDE, enterprises relied on either full-disk encryption or stackable cryptographic file system encryption.

On Jan. 25, 2023, researchers from the Tokyo University of Science, Japan, announced the development of a faster and cheaper method for handling encrypted data while improving security. The new data encryption method combines the best of homomorphic encryption and secret sharing to handle encrypted data. Homomorphic encryption and secret sharing are key methods to compute sensitive data while preserving privacy. Homomorphic encryption is computationally intensive and involves performing computational data encryption on a single server, while secret sharing is fast and computationally efficient. In this method, the encrypted data/secret input is divided and distributed across multiple servers, each performing a computation, such as multiplication, on its data. The results of the computations are then used to reconstruct the original data.

September 2022: Convergence Technology Solutions Corp., a supplier of software-enabled IT and cloud solutions, declared that it has obtained certification in Canada to sell and deploy IBM zSystems and LinuxONE.

November 2019: Penta Security Systems announced that it has been selected as a finalist for the 2020 SC Magazine Awards, which are given by SC Media and celebrated in the United States. As a result, MyDiamo from Penta Security has been named the Best Database Security Solution of 2020. Additionally, this will result in the expansion of common-level encryption and improve the open-source DBMS installation procedure.

Potential restraints include: issues regarding security and data breaches; high implementation costs and complexity; issues with respect to data consistency and interoperability across different edge platforms.
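
The overview quotes a 2025 market size of USD 14.5 billion and a 16% CAGR through 2033 but, in this excerpt, not the implied end value. The short sketch below works it out under the assumption that the rate compounds annually over the eight years from 2025 to 2033; the function name is made up for illustration.

```python
def project_market_size(base_value, cagr, years):
    """Compound `base_value` at the annual rate `cagr` for `years` periods."""
    return base_value * (1 + cagr) ** years


# Figures from the report summary: USD 14.5 billion in 2025, 16% CAGR.
# Assumption: eight annual compounding periods (2025 -> 2033).
size_2033 = project_market_size(14.5, 0.16, 2033 - 2025)
print(f"Implied 2033 market size: USD {size_2033:.1f} billion")
```

If the report counts the forecast period differently (for example, nine periods), the result changes accordingly, so the figure is an estimate rather than a number taken from the report.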
Business or organization reporting of ransomware attack to insurance company...
datasets.ai
www150.statcan.gc.ca
21, 55, 8
Updated Aug 26, 2024
Cite
Statistics Canada | Statistique Canada (2024). Business or organization reporting of ransomware attack to insurance company in the last 12 months, fourth quarter of 2021 [Dataset]. https://datasets.ai/datasets/88d59d3b-b313-43bf-97a4-237e617882d9
Available download formats: 55, 8, 21
Dataset updated
Aug 26, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Authors
Statistics Canada | Statistique Canada
Description
Business or organization reporting of ransomware attack to insurance company in the last 12 months, by North American Industry Classification System (NAICS), business employment size, type of business, business activity and majority ownership, fourth quarter of 2021.
Drone-Based Malware Detection (DBMD)
kaggle.com
Updated Jul 27, 2024
Cite
DatasetEngineer (2024). Drone-Based Malware Detection (DBMD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/9045375
Unique identifier
https://doi.org/10.34740/kaggle/dsv/9045375
Dataset updated
Jul 27, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
DatasetEngineer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Welcome to the Drone-Based Malware Detection dataset! This dataset is designed to aid researchers and practitioners in exploring innovative cybersecurity solutions using drone-collected data. The dataset contains detailed information on network traffic, drone sensor readings, malware detection indicators, and environmental conditions. It offers a unique perspective by integrating data from drones with traditional network security metrics to enhance malware detection capabilities.

Dataset Overview

The dataset comprises four main categories:

Network Traffic Data: Captures network traffic attributes including IP addresses, ports, protocols, packet sizes, and various derived metrics.
Drone Sensor Data: Includes GPS coordinates, altitude, speed, heading, battery level, and other sensor readings from drones.
Malware Detection Data: Contains indicators and scores relevant to detecting malware, such as anomaly scores, suspicious IP counts, reputation scores, and attack types.
Environmental Data: Provides context through environmental conditions like location type, noise level, weather conditions, and more.

Files and Features

The dataset is divided into four separate CSV files:

network_traffic_data.csv

timestamp: Date and time of the traffic event.
source_ip: Source IP address.
destination_ip: Destination IP address.
source_port: Source port number.
destination_port: Destination port number.
protocol: Network protocol (TCP, UDP, ICMP).
packet_length: Length of the network packet.
payload_data: Content of the packet payload.
flag: Network flag (SYN, ACK, FIN, RST).
traffic_volume: Volume of traffic in bytes.
flow_duration: Duration of the network flow.
flow_bytes_per_s: Bytes per second for the flow.
flow_packets_per_s: Packets per second for the flow.
packet_count: Number of packets in the flow.
average_packet_size: Average size of packets.
min_packet_size: Minimum packet size.
max_packet_size: Maximum packet size.
packet_size_variance: Variance in packet sizes.
header_length: Length of the packet header.
payload_length: Length of the packet payload.
ip_ttl: Time to live for the IP packet.
tcp_window_size: TCP window size.
icmp_type: ICMP type (echo_request, echo_reply, destination_unreachable).
dns_query_count: Number of DNS queries.
dns_response_count: Number of DNS responses.
http_method: HTTP method (GET, POST, PUT, DELETE).
http_status_code: HTTP status code (200, 404, 500, 301).
content_type: Content type (text/html, application/json, image/png).
ssl_tls_version: SSL/TLS version.
ssl_tls_cipher_suite: SSL/TLS cipher suite.

drone_data.csv

latitude: Latitude of the drone.
longitude: Longitude of the drone.
altitude: Altitude of the drone.
speed: Speed of the drone.
heading: Heading of the drone.
battery_level: Battery level of the drone.
drone_id: Unique identifier for the drone.
flight_time: Total flight time.
signal_strength: Strength of the drone's signal.
temperature: Temperature at the drone's location.
humidity: Humidity at the drone's location.
pressure: Atmospheric pressure at the drone's location.
wind_speed: Wind speed at the drone's location.
wind_direction: Wind direction at the drone's location.
gps_accuracy: Accuracy of the GPS signal.

malware_detection_data.csv

anomaly_score: Score indicating the level of anomaly detected.
suspicious_ip_count: Number of suspicious IP addresses detected.
malicious_payload_indicator: Indicator for malicious payload (0 or 1).
reputation_score: Reputation score for the network entity.
behavioral_score: Behavioral score indicating potential malicious activity.
attack_type: Type of attack (DDoS, phishing, malware).
signature_match: Indicator for signature match (0 or 1).
sandbox_result: Result from sandbox analysis (clean, infected).
heuristic_score: Heuristic score for potential threats.
traffic_pattern: Pattern of the traffic (burst, steady).

environmental_data.csv

location_type: Type of location (urban, rural).
nearby_devices: Number of nearby devices.
signal_interference: Level of signal interference.
noise_level: Noise level in the environment.
time_of_day: Time of day (morning, afternoon, evening, night).
day_of_week: Day of the week.
weather_conditions: Weather conditions (sunny, rainy, cloudy, stormy).

Usage and Applications

This dataset can be used for:

Cybersecurity Research: Developing and testing algorithms for malware detection using drone data.
Machine Learning: Training models to identify malicious activity based on network traffic and drone sensor readings.
Data Analysis: Exploring the relationships between environmental conditions, drone sensor data, and network traffic anomalies.
Educational Purposes: Teaching data science, machine learning, and cybersecurity concepts using a comprehensive and multi-faceted dataset.
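
Because the feature list names the expected values for several categorical columns, a first sanity check after download could compare the file contents against that list. The sketch below does this with Python's standard csv module; the inline sample rows are invented stand-ins for network_traffic_data.csv, and the description may not enumerate every value that actually occurs.

```python
import csv
import io

# Allowed values copied from the feature list; the list may not be exhaustive.
DOCUMENTED_VALUES = {
    "protocol": {"TCP", "UDP", "ICMP"},
    "flag": {"SYN", "ACK", "FIN", "RST"},
    "http_method": {"GET", "POST", "PUT", "DELETE"},
}


def find_undocumented(rows, documented=DOCUMENTED_VALUES):
    """Return {column: set of values that the description does not list}."""
    unexpected = {}
    for row in rows:
        for column, allowed in documented.items():
            value = row.get(column)
            if value and value not in allowed:
                unexpected.setdefault(column, set()).add(value)
    return unexpected


# Hypothetical two-row sample standing in for network_traffic_data.csv.
SAMPLE = """timestamp,protocol,flag,http_method
2024-07-01 10:00:00,TCP,SYN,GET
2024-07-01 10:00:01,GRE,ACK,PATCH
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
unexpected = find_undocumented(rows)
print(unexpected)
```

To run the same check on the real file, replace the `io.StringIO(SAMPLE)` source with `open("network_traffic_data.csv", newline="")`.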

Acknowledgements This dataset is based on real-world data collected from drone sensors and network traffic monitoring s...
Dataset of Publication "Malware Communication in Smart Factories: A Network...
researchdata.tuwien.at
csv, txt, zip
Updated Mar 31, 2025
Cite
Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby (2025). Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set" [Dataset]. http://doi.org/10.48436/ghdc6-45k78
Available download formats: csv, zip, txt
Unique identifier
https://doi.org/10.48436/ghdc6-45k78
Dataset updated
Mar 31, 2025
Dataset provided by
TU Wien
Authors
Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Aug 11, 2024
Description
Note: If you use this dataset, please cite the following paper:

Brenner, B., Fabini, J., Offermanns, M., Semper, S., & Zseby, T. (2024). Malware communication in smart factories: A network traffic data set. Computer Networks, 255, 110804.

or in BibTeX:

@article{brenner2024malware,
title={Malware communication in smart factories: A network traffic data set},
author={Brenner, Bernhard and Fabini, Joachim and Offermanns, Magnus and Semper, Sabrina and Zseby, Tanja},
journal={Computer Networks},
volume={255},
pages={110804},
year={2024},
publisher={Elsevier}
}

Context and methodology

Machine learning-based intrusion detection requires suitable and realistic data sets for training and testing. However, data sets that originate from real networks are rare. Network data is considered privacy-sensitive, and the purposeful introduction of malicious traffic is usually not possible.

In this paper, we introduce a labeled data set captured at a smart factory located in Vienna, Austria, during normal operation and during penetration tests with different attack types. The data set contains 173 GB of PCAP files, representing 16 days (395 hours) of factory operation. It includes MQTT, OPC UA, and Modbus/TCP traffic.

The captured malicious traffic originated from a professional penetration tester who performed two types of attacks:
(a) Aggressive attacks that are easier to detect.
(b) Stealthy attacks that are harder to detect.

Our data set includes the raw PCAP files and extracted flow data. Labels for packets and flows indicate whether they originated from a specific attack or from benign communication.

We describe the methodology for creating the dataset, conduct an analysis of the data, and provide detailed information about the recorded traffic itself. The dataset is freely available to support reproducible research and the comparability of results in the area of intrusion detection in industrial networks.

Technical details

readme.txt

Information about the data collection, format, necessary software and versions to access it.

license.txt:

Licensing information.

a_day1, a_day2, s_day1, s_day2, tf_a, and tf_s:

Main dataset, where files starting with "tf" are training files containing only benign,
operational data. All other files are attack files containing both operational data and
attack data.

images.zip:

Contains descriptive images about the data.

extractions.zip:

Contains extracted packets and flows in both labeled and unlabeled form.

a_day_tuesday_dos.zip:

An additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.

list_of_extracted_features:

A complete list of features we extracted from the PCAP Files. All flow files contain these features.

list_of_identified_protocols.csv:

A complete list of all protocols that we could identify within the PCAP files provided.
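
The file listing above states that files starting with "tf" contain only benign training data and that all other main files contain attack traffic as well. A minimal sketch of that naming rule follows; the helper name is invented for this example, and the meaning of the "a_" and "s_" prefixes is deliberately left uninterpreted because the listing does not define it.

```python
def split_capture_files(names):
    """Separate benign-only training files ("tf" prefix) from attack files."""
    training = [name for name in names if name.startswith("tf")]
    attack = [name for name in names if not name.startswith("tf")]
    return training, attack


# Main dataset entries as named in the technical details.
MAIN_FILES = ["a_day1", "a_day2", "s_day1", "s_day2", "tf_a", "tf_s"]

training, attack = split_capture_files(MAIN_FILES)
print("training:", training)
print("attack:", attack)
```

Note that a_day_tuesday_dos.zip would also land in the attack group under this rule, but the listing says that day is not labeled, so it needs separate handling in supervised experiments.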
Dataset of Publication "Malware Communication in Smart Factories: A Network...
researchdata.tuwien.ac.at
zip
Updated Oct 18, 2024
Cite
Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby (2024). Dataset of Publication "Malware Communication in Smart Factories: A Network Traffic Data Set" [Dataset]. http://doi.org/10.48436/vs6hv-1vs74
Available download formats: zip
Unique identifier
https://doi.org/10.48436/vs6hv-1vs74
Dataset updated
Oct 18, 2024
Dataset provided by
TU Wien
Authors
Bernhard Brenner; Joachim Fabini; Magnus Offermanns; Sabrina Semper; Tanja Zseby
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Aug 11, 2024
Description
Machine learning-based intrusion detection requires suitable and realistic data sets for training and testing. However, data sets that originate from real networks are rare. Network data is considered privacy-sensitive, and the purposeful introduction of malicious traffic is usually not possible. In this paper, we introduce a labeled data set captured at a smart factory located in Vienna, Austria, during normal operation and during penetration tests with different attack types. The data set contains 173 GB of PCAP files, which represent 16 days (395 hours) of factory operation. It includes MQTT, OPC UA, and Modbus/TCP traffic. The captured malicious traffic originated from a professional penetration tester who performed two types of attacks: (a) aggressive attacks that are easier to detect and (b) stealthy attacks that are harder to detect. Our data set includes the raw PCAP files and extracted flow data. Labels for packets and flows indicate whether packets (or flows) originated from a specific attack or from benign communication. We describe the methodology for creating the data set, conduct an analysis of the data, and provide detailed information about the recorded traffic itself. The data set is freely available to support reproducible research and the comparability of results in the area of intrusion detection in industrial networks.

File description:

a_day1, a_day2, s_day1, s_day2, tf_a and tf_s: Main data set, where files starting with "tf" are training files containing only benign, operational data and all other files are attack files containing both operational data and attack data.

images.zip: Contains descriptive images about the data.

extractions.zip: Contains extracted packets and flows in both labeled and unlabeled form.

a_day_tuesday_dos.zip: An additional day of attack traffic containing benign and attack data, including a DoS attack. This day is not labeled.
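
To put the stated capture size in perspective, the 173 GB and 395 hours quoted in the description can be turned into an average capture rate. The calculation below assumes decimal units (1 GB = 10^9 bytes), which the description does not specify, and it ignores PCAP file overhead.

```python
# Figures from the description; the unit convention is an assumption.
CAPTURE_BYTES = 173 * 10**9   # 173 GB, taking 1 GB = 10^9 bytes
CAPTURE_SECONDS = 395 * 3600  # 395 hours of factory operation

avg_bytes_per_s = CAPTURE_BYTES / CAPTURE_SECONDS
avg_mbit_per_s = avg_bytes_per_s * 8 / 10**6

print(f"Average capture rate: {avg_mbit_per_s:.2f} Mbit/s")
```

The result is just under 1 Mbit/s on average, which says nothing about peaks during the penetration tests.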
Global cyberattack distribution 2023, by type
statista.com
Updated Nov 14, 2024
Cite
Statista (2024). Global cyberattack distribution 2023, by type [Dataset]. https://www.statista.com/statistics/1382266/cyber-attacks-worldwide-by-type/
Dataset updated
Nov 14, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Worldwide
Description
In 2023, ransomware was the most frequently detected cyberattack worldwide, accounting for around 70 percent of all detected cyberattacks. Network breaches ranked second, with almost 19 percent of the detections. Although less frequent, data exfiltration was also among the detected cyberattacks.
Trace-Share Dataset for Evaluation of Statistical Characteristics...
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Madeja, Tomas (2020). Trace-Share Dataset for Evaluation of Statistical Characteristics Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3553062
Dataset updated
Jan 24, 2020
Dataset provided by
Cermak, Milan
Madeja, Tomas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains all data used during the evaluation of statistical characteristics preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

For more information, see the project repository at https://github.com/Trace-Share.
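
Since the archives are password-protected, unpacking them in a script needs the password passed explicitly. A minimal sketch with Python's standard zipfile module is shown below; the function name and paths are placeholders. It assumes the archives use the legacy ZIP encryption that zipfile can decrypt; if they use AES, a different tool would be needed.

```python
import zipfile
from pathlib import Path

# Password quoted in the dataset description.
ARCHIVE_PASSWORD = b"trace-share"


def extract_archive(archive_path, target_dir):
    """Extract a (possibly password-protected) ZIP archive and return its member names."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as archive:
        # The password is ignored for members that are not encrypted.
        archive.extractall(path=target, pwd=ARCHIVE_PASSWORD)
        return archive.namelist()


# Example call (placeholder paths):
# extract_archive("statistics.zip", "statistics")
```

Extract into an isolated directory, because the traces contain real attack payloads that antivirus software may flag, as the description itself warns.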

Selected Attack Traces

We selected 72 different traces of network attacks obtained from various internet databases. File names refer to common names of contained vulnerabilities, malware, or attack tools.

Background Traffic Data

The publicly available dataset CSE-CIC-IDS-2018 was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used to reduce less significant traffic.
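
The restriction to traffic aimed at the victim range can be expressed as a destination-address filter. The sketch below uses Python's standard ipaddress module on a few invented (source, destination) pairs; it illustrates the filter rule only and is not the authors' actual preprocessing.

```python
import ipaddress

# Victim range quoted in the description.
VICTIM_NETWORK = ipaddress.ip_network("172.31.69.0/24")


def is_victim_traffic(dst_ip, network=VICTIM_NETWORK):
    """True when the destination address lies inside the victim network."""
    return ipaddress.ip_address(dst_ip) in network


# Hypothetical (source, destination) pairs.
flows = [
    ("10.0.0.5", "172.31.69.25"),  # aimed at a victim machine: kept
    ("172.31.69.25", "8.8.8.8"),   # leaves the victim range: dropped
    ("10.0.0.7", "172.31.70.1"),   # neighbouring subnet: dropped
]

kept = [flow for flow in flows if is_victim_traffic(flow[1])]
print(kept)
```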

Evaluation Results and Dataset Structure

Traces variants (traces-normalized.zip, traces-adjusted.zip)

./traces-normalized/ — normalized PCAP files and details in YAML format;

./traces-adjusted/ — configuration files for traces combination in YAML format.

Computed statistics (statistics.zip)

./statistics-background/ — background traffic statistics computed by ID2T;

./statistics-combination/ — combined traces statistics computed by ID2T for all adjust options (selected only combinations where ID2T provided all statistics files);

./statistics-difference/ — computed mean and median differences of background and combined traffic traces.

Evaluation results

statistics-difference.ipynb — file containing visualization of statistics differences.
SEPTA Ridership Statistics
catalog.data.gov
Updated Mar 31, 2025
Cite
SEPTA (2025). SEPTA Ridership Statistics [Dataset]. https://catalog.data.gov/dataset/septa-ridership-statistics
Dataset updated
Mar 31, 2025
Dataset provided by
SEPTA
Description
Stop summary files represent average daily ridership at the stop level over the course of the relevant period. Trolley ridership data was generated using automatic passenger counters (APCs). Bus data is calculated from a variety of sources depending on the route and year. The bus data files represent average daily fall ridership from 2014 – present. Accurate weekend bus data was not available until 2017 at which point SEPTA had more widespread APC coverage. No bus data is available for Fall 2020 due to a malware attack. APC bus data was also not available for articulated vehicles and the Boulevard Direct from August 2020 through February 2022 due to the malware attack.
Trace-Share Dataset for Evaluation of Trace Meaning Preservation
data.niaid.nih.gov
zenodo.org
Updated May 7, 2020
Cite
Madeja, Tomas (2020). Trace-Share Dataset for Evaluation of Trace Meaning Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3547527
Dataset updated
May 7, 2020
Dataset provided by
Cermak, Milan
Madeja, Tomas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains all data used during the evaluation of trace meaning preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

For more information, see the project repository at https://github.com/Trace-Share.

Selected Attack Traces

The following list contains trace datasets used for evaluation. Each attack was chosen to have not only a different meaning but also different statistical properties.

dos_http_flood — the capture of GET and POST requests sent to one server by one attacker (HTTP traffic);

ftp_bruteforce — short and unsuccessful attempt to guess a user’s password for FTP service (FTP traffic);

ponyloader_botnet — Pony Loader botnet used for stealing of credentials from 3 target devices reporting to single IP with a large number of intermediate addresses (DNS and HTTP traffic);

scan — the capture of the nmap tool scanning a given subnet using ICMP echo and TCP SYN requests (consists of ARP, ICMP, and TCP traffic);

wannacry_ransomware — the capture of WannaCry ransomware that spreads in a domain with three workstations, a domain controller, and a file-sharing server (SMB and SMBv2 traffic).

Background Traffic Data

The publicly available dataset CSE-CIC-IDS-2018 was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used to reduce less significant traffic.

Evaluation Results and Dataset Structure

Trace variants (traces.zip)

./traces-original/ — trace PCAP files and crawled details in YAML format;

./traces-normalized — normalized PCAP files and details in YAML format;

./traces-adjusted — adjusted PCAP files using various timestamp generation settings, combination configuration in YAML format, and labels provided by ID2T in XML format.

Extracted alerts (alerts.zip)

./alerts-original/ — extracted Suricata alerts, Suricata log, and full Suricata output for all original trace files;

./alerts-normalized/ — extracted Suricata alerts, Suricata log, and full Suricata output for all normalized trace files;

./alerts-adjusted/ — extracted Suricata alerts, Suricata log, and full Suricata output for all adjusted trace files.

Evaluation results

*.csv files in the root directory — contain the extracted alert signatures and their counts for each trace variant.
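
A per-variant signature count of the kind described above could be tallied as in the following sketch. The input shape, a sequence of `(variant, signature)` pairs, and the function name are assumptions for illustration; the actual column layout of the CSV files is not specified in this description.

```python
from collections import Counter


def count_signatures(alerts):
    """Count how often each alert signature occurs per trace variant.

    `alerts` is an iterable of hypothetical (variant, signature) pairs.
    Returns a dict mapping each variant to a Counter of signatures.
    """
    counts = {}
    for variant, signature in alerts:
        counts.setdefault(variant, Counter())[signature] += 1
    return counts
```

Comparing the counters of the original, normalized, and adjusted variants of one trace would show whether the same signatures still fire after each transformation.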
Share of cyberattacks in Italy 2024, by reason
statista.com
ai-chatbox.pro
Updated Jul 11, 2025
Cite
Statista (2025). Share of cyberattacks in Italy 2024, by reason [Dataset]. https://www.statista.com/statistics/649358/share-cyber-attacks-in-italy-by-reason/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Italy
Description
During the first half of 2024, around ** percent of cyberattacks carried out in Italy had cybercrime as their purpose. Cyber espionage was another motivation, accounting for roughly **** percent of attacks. By contrast, information warfare accounted for only *** percent of the cyberattacks in the country in the last examined period.

Data breaches in Italy

In 2023, over half of the Italian digital population was alerted that their personal data had been breached, and **** percent of the alerted users were affected by data compromise on the dark web. Despite a decrease in the number of data sets affected in data breaches between 2020 and 2023, Italy recorded almost *** million exposed data sets at the beginning of 2023. Meanwhile, the average cost of data breaches for both Italian companies and targeted users kept growing, reaching **** million U.S. dollars in 2024, up from the **** million U.S. dollars recorded in the previous year.

The Italian privacy landscape: GDPR effects

As a member state of the European Union, Italy is covered by the General Data Protection Regulation (GDPR). Since 2018, the GDPR has regulated online data privacy and represented consumers’ interests within the digital and tech landscape of the Union. As of 2023, approximately *** fines had been issued in Italy for violations of the GDPR, making Italy the European country with the second-highest number of fines issued to tech companies. The highest GDPR fine ever issued in Italy was imposed on Telecom Italia (TIM), one of the largest Italian telecommunications companies. TIM was fined approximately **** million euros in January 2020. In Italy, the GDPR is enforced with the support of the Garante della Privacy, the national institution overseeing Italian users’ online rights, cybersecurity, and digital privacy.