18 datasets found

Cyber security breaches survey 2023
gov.uk
beta.ukdataservice.ac.uk
Updated Apr 19, 2023
Cite
Department for Science, Innovation and Technology (2023). Cyber security breaches survey 2023 [Dataset]. https://www.gov.uk/government/statistics/cyber-security-breaches-survey-2023
Dataset updated
Apr 19, 2023
Dataset provided by
GOV.UK (http://gov.uk/)
Authors
Department for Science, Innovation and Technology
Description
The government has surveyed UK businesses, charities and educational institutions to find out how they approach cyber security and gain insight into the cyber security issues they face. The research informs government policy on cyber security and how government works with industry to build a prosperous and resilient digital UK.

Published

19 April 2023

Period covered

Respondents were asked about their approach to cyber security and any breaches or attacks over the 12 months before the interview. Main survey interviews took place between October 2022 and January 2023. Qualitative follow-up interviews took place in December 2022 and January 2023.

Geographic coverage

UK

Further Information

The survey is part of the government’s National Cyber Strategy 2022.

There is a wide range of free government cyber security guidance and information for businesses, including details of free online training and support.

The survey was carried out by Ipsos UK. The report has been produced by Ipsos on behalf of the Department for Science, Innovation and Technology.

The UK Statistics Authority

This release is published in accordance with the Code of Practice for Statistics (2018), as produced by the UK Statistics Authority. The UKSA has the overall objective of promoting and safeguarding the production and publication of official statistics that serve the public good. It monitors and reports on all official statistics, and promotes good practice in this area.

Pre-release access

The document above contains a list of ministers and officials who have received privileged early access to this release. In line with best practice, the list has been kept to a minimum and those given access for briefing purposes had a maximum of 24 hours.

Contact information

The Lead Analyst for this release is Emma Johns. For any queries please contact cybersurveys@dsit.gov.uk.

For media enquiries only, please contact the press office on 020 7215 1000.
Countries Most Affected By Ransomware Attacks
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Countries Most Affected By Ransomware Attacks [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
On average, 37% of organisations globally were victims of a ransomware attack between January and February 2021. The top 15 countries that were affected the most were...
DNP3 Intrusion Detection Dataset
data.niaid.nih.gov
zenodo.org
Updated Jul 15, 2024
Cite
Vasiliki (2024). DNP3 Intrusion Detection Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7348493
Dataset updated
Jul 15, 2024
Dataset provided by
Vasiliki
Vasileios
Panagiotis
Thomas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
1.Introduction

In the digital era of the Industrial Internet of Things (IIoT), the conventional Critical Infrastructures (CIs) are transformed into smart environments with multiple benefits, such as pervasive control, self-monitoring and self-healing. However, this evolution is characterised by several cyberthreats due to the necessary presence of insecure technologies. DNP3 is an industrial communication protocol which is widely adopted in the CIs of the US. In particular, DNP3 allows the remote communication between Industrial Control Systems (ICS) and Supervisory Control and Data Acquisition (SCADA). It can support various topologies, such as Master-Slave, Multi-Drop, Hierarchical and Multiple-Server. Originally, the architectural model of DNP3 consisted of three layers: (a) Application Layer, (b) Transport Layer and (c) Data Link Layer. DNP3 can now also be incorporated into the Transmission Control Protocol/Internet Protocol (TCP/IP) stack as an application-layer protocol. Like other industrial protocols (e.g., Modbus and IEC 60870-5-104), DNP3 is characterised by severe security issues since it does not include any authentication or authorisation mechanisms. More information about the DNP3 security issues is provided in [1-3].

This dataset contains labelled Transmission Control Protocol (TCP)/Internet Protocol (IP) network flow statistics (Comma-Separated Values - CSV format) and DNP3 flow statistics (CSV format) related to 9 DNP3 cyberattacks. These cyberattacks are focused on DNP3 unauthorised commands and Denial of Service (DoS). The network traffic data are provided through Packet Capture (PCAP) files. Consequently, this dataset can be used to implement Artificial Intelligence (AI)-powered Intrusion Detection and Prevention Systems (IDPS) that rely on Machine Learning (ML) and Deep Learning (DL) techniques.

2.Instructions

This DNP3 Intrusion Detection Dataset was implemented following the methodological frameworks of A. Gharib et al. in [4] and S. Dadkhah et al. in [5], including eleven features: (a) Complete Network Configuration, (b) Complete Traffic, (c) Labelled Dataset, (d) Complete Interaction, (e) Complete Capture, (f) Available Protocols, (g) Attack Diversity, (h) Heterogeneity, (i) Feature Set and (j) Metadata.

A network topology consisting of (a) eight industrial entities, (b) one Human Machine Interface (HMI) and (c) three cyberattackers was used to implement this DNP3 Intrusion Detection Dataset. In particular, the following cyberattacks were implemented.

On Thursday, May 14, 2020, the DNP3 Disable Unsolicited Messages Attack was executed for 4 hours.

On Friday, May 15, 2020, the DNP3 Cold Restart Message Attack was executed for 4 hours.

On Friday, May 15, 2020, the DNP3 Warm Restart Message Attack was executed for 4 hours.

On Saturday, May 16, 2020, the DNP3 Enumerate Attack was executed for 4 hours.

On Saturday, May 16, 2020, the DNP3 Info Attack was executed for 4 hours.

On Monday, May 18, 2020, the DNP3 Initialisation Attack was executed for 4 hours.

On Monday, May 18, 2020, the Man In The Middle (MITM)-DoS Attack was executed for 4 hours.

On Monday, May 18, 2020, the DNP3 Replay Attack was executed for 4 hours.

On Tuesday, May 19, 2020, the DNP3 Stop Application Attack was executed for 4 hours.

The aforementioned DNP3 cyberattacks were executed using penetration testing tools, such as Nmap and Scapy. For each attack, a relevant folder is provided, including the network traffic and the network flow statistics for each entity. In particular, for each cyberattack, a folder is given, providing (a) the pcap files for each entity, (b) the Transmission Control Protocol (TCP)/Internet Protocol (IP) network flow statistics for a flow timeout of 120 seconds in CSV format and (c) the DNP3 flow statistics for each entity, using different timeout values (such as 45, 60, 75, 90, 120 and 240 seconds). The TCP/IP network flow statistics were produced by CICFlowMeter, while the DNP3 flow statistics were generated by a Custom DNP3 Python Parser, taking full advantage of Scapy.
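To make the role of the flow timeout concrete, the following is a minimal sketch of how a timeout value changes the number of flows extracted from the same traffic. It assumes a simple idle-timeout rule (a new flow starts when the gap since the previous packet with the same key exceeds the timeout); this is an illustrative assumption and not necessarily the rule applied by CICFlowMeter or the Custom DNP3 Python Parser. The packet list and the flow keys "A" and "B" are hypothetical.

```python
from collections import defaultdict


def split_into_flows(packets, timeout):
    """Group (timestamp, flow_key) packets into flows.

    Assumption: a new flow starts for a key when the gap since that key's
    previous packet exceeds `timeout` seconds (idle-timeout rule).
    """
    flows = defaultdict(list)  # flow_key -> list of flows (lists of timestamps)
    for ts, key in sorted(packets):
        key_flows = flows[key]
        if key_flows and ts - key_flows[-1][-1] <= timeout:
            key_flows[-1].append(ts)  # still within the timeout: same flow
        else:
            key_flows.append([ts])    # gap too large (or first packet): new flow
    return dict(flows)


# Hypothetical packet arrival times (seconds) for two flow keys.
packets = [(0.0, "A"), (30.0, "A"), (100.0, "A"), (250.0, "A"), (10.0, "B")]

for timeout in (45, 120, 240):
    flows = split_into_flows(packets, timeout)
    print(timeout, {key: len(value) for key, value in flows.items()})
```

With a larger timeout, packets that are further apart are merged into the same flow, so the per-flow statistics in the CSV files are expected to differ between timeout values.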

3.Dataset Structure

The dataset consists of the following folders:

20200514_DNP3_Disable_Unsolicited_Messages_Attack: It includes the pcap and CSV files related to the DNP3 Disable Unsolicited Message attack.

20200515_DNP3_Cold_Restart_Attack: It includes the pcap and CSV files related to the DNP3 Cold Restart attack.

20200515_DNP3_Warm_Restart_Attack: It includes the pcap and CSV files related to the DNP3 Warm Restart attack.

20200516_DNP3_Enumerate: It includes the pcap and CSV files related to the DNP3 Enumerate attack.

20200516_DNP3_Ιnfo: It includes the pcap and CSV files related to the DNP3 Info attack.

20200518_DNP3_Initialize_Data_Attack: It includes the pcap and CSV files related to the DNP3 Data Initialisation attack.

20200518_DNP3_MITM_DoS: It includes the pcap and CSV files related to the DNP3 MITM-DoS attack.

20200518_DNP3_Replay_Attack: It includes the pcap and CSV files related to the DNP3 replay attack.

20200519_DNP3_Stop_Application_Attack: It includes the pcap and CSV files related to the DNP3 Stop Application attack.

Training_Testing_Balanced_CSV_Files: It includes balanced CSV files from CICFlowMeter and the Custom DNP3 Python Parser that could be utilised for training ML and DL methods. Each folder includes different sub-folders for the corresponding flow timeout values used by the Custom DNP3 Python Parser. For CICFlowMeter, only the timeout value of 120 seconds was used.

Each folder includes respective subfolders related to the entities/devices (described in the following section) participating in each attack. In particular, for each entity/device, there is a folder including (a) the DNP3 network traffic (pcap file) related to this entity/device during each attack, (b) the TCP/IP network flow statistics (CSV file) generated by CICFlowMeter for the timeout value of 120 seconds and finally (c) the DNP3 flow statistics (CSV file) from the Custom DNP3 Python Parser. Finally, it is noteworthy that the network flows from both CICFlowMeter and Custom DNP3 Python Parser in each CSV file are labelled based on the DNP3 cyberattacks executed for the generation of this dataset. The description of these attacks is provided in the following section, while the various features from CICFlowMeter and Custom DNP3 Python Parser are presented in Section 5.
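The folder names listed above follow a date-prefixed pattern, so the capture day can be recovered from the name alone. The sketch below assumes only that pattern (eight digits, an underscore, then a label); the helper name `parse_attack_folder` is hypothetical, and folders without a date prefix are returned with no date.

```python
from datetime import date, datetime


def parse_attack_folder(name):
    """Split 'YYYYMMDD_<label>' into (capture date, label).

    Names without a valid date prefix are returned as (None, name).
    """
    stamp, _, label = name.partition("_")
    try:
        day = datetime.strptime(stamp, "%Y%m%d").date()
    except ValueError:
        return None, name
    return day, label


folders = [
    "20200514_DNP3_Disable_Unsolicited_Messages_Attack",
    "20200515_DNP3_Cold_Restart_Attack",
    "20200515_DNP3_Warm_Restart_Attack",
    "Training_Testing_Balanced_CSV_Files",
]

# Group the attack labels by capture day.
by_day = {}
for folder in folders:
    day, label = parse_attack_folder(folder)
    by_day.setdefault(day, []).append(label)

for day, labels in by_day.items():
    print(day, labels)
```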

4.Testbed & DNP3 Attacks

The following figure shows the testbed utilised for the generation of this dataset. It is composed of eight industrial entities that play the role of the DNP3 outstations/slaves, such as Remote Terminal Units (RTUs) and Intelligent Electronic Devices (IEDs). Moreover, there is another workstation which plays the role of the master station, like a Master Terminal Unit (MTU). For the communication between the DNP3 outstations/slaves and the master station, opendnp3 was used.

Table 1: DNP3 Attacks Description

DNP3 Attack | Description | Dataset Folder
DNP3 Disable Unsolicited Message Attack | This attack targets a DNP3 outstation/slave, establishing a connection with it, while acting as a master station. The false master then transmits a packet with the DNP3 Function Code 21, which requests to disable all the unsolicited messages on the target. | 20200514_DNP3_Disable_Unsolicited_Messages_Attack
DNP3 Cold Restart Attack | The malicious entity acts as a master station and sends a DNP3 packet that includes the “Cold Restart” function code. When the target receives this message, it initiates a complete restart and sends back a reply with the time window before the restart process. | 20200515_DNP3_Cold_Restart_Attack
DNP3 Warm Restart Attack | This attack is quite similar to the “Cold Restart Message”, but aims to trigger a partial restart, re-initiating a DNP3 service on the target outstation. | 20200515_DNP3_Warm_Restart_Attack
DNP3 Enumerate Attack | This reconnaissance attack aims to discover which DNP3 services and function codes are used by the target system. | 20200516_DNP3_Enumerate
DNP3 Info Attack | This attack constitutes another reconnaissance attempt, aggregating various DNP3 diagnostic information related to the DNP3 usage. | 20200516_DNP3_Ιnfo
Data Initialisation Attack | This cyberattack is related to Function Code 15 (Initialize Data). It is an unauthorised access attack, which demands that the slave re-initialise possible configurations to their initial values, thus changing potential values defined by legitimate masters. | 20200518_Initialize_Data_Attack
MITM-DoS Attack | In this cyberattack, the cyberattacker is placed between a DNP3 master and a DNP3 slave device, dropping all the messages coming from the DNP3 master or the DNP3 slave. | 20200518_MITM_DoS
DNP3 Replay Attack | This cyberattack replays DNP3 packets coming from a legitimate DNP3 master or DNP3 slave. | 20200518_DNP3_Replay_Attack
DNP3 Stop Application Attack | This attack is related to Function Code 18 (Stop Application) and demands that the slave stop its function so that the slave cannot receive messages from the master. | 20200519_DNP3_Stop_Application_Attack
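Several of the attacks in Table 1 are tied to a single DNP3 function code, which suggests a simple rule-based check as a baseline before any ML model is trained. The sketch below is illustrative only and is not the dataset authors' detection method. Function codes 21, 15 and 18 are taken from the table; codes 13 (Cold Restart) and 14 (Warm Restart) are the values defined in the DNP3 specification. The source addresses and the notion of a trusted-master list are hypothetical.

```python
# Function codes 21, 15 and 18 are stated in Table 1; 13 (Cold Restart) and
# 14 (Warm Restart) are the values defined by the DNP3 specification.
SUSPICIOUS_FUNCTION_CODES = {
    13: "DNP3 Cold Restart Attack",
    14: "DNP3 Warm Restart Attack",
    15: "Data Initialisation Attack",
    18: "DNP3 Stop Application Attack",
    21: "DNP3 Disable Unsolicited Message Attack",
}


def flag_requests(requests, trusted_masters):
    """Return (source, attack label) for risky requests from untrusted sources.

    `requests` is an iterable of (source address, function code) pairs.
    """
    alerts = []
    for source, function_code in requests:
        label = SUSPICIOUS_FUNCTION_CODES.get(function_code)
        if label is not None and source not in trusted_masters:
            alerts.append((source, label))
    return alerts


# Hypothetical traffic: 10.0.0.5 is the legitimate master, 10.0.0.99 is not.
requests = [
    ("10.0.0.5", 1),    # READ from the legitimate master
    ("10.0.0.99", 21),  # Disable Unsolicited Messages from an unknown source
    ("10.0.0.5", 13),   # Cold Restart from the legitimate master
    ("10.0.0.99", 18),  # Stop Application from an unknown source
]
alerts = flag_requests(requests, trusted_masters={"10.0.0.5"})
for source, label in alerts:
    print(source, label)
```

A rule like this cannot cover the reconnaissance, replay or MITM-DoS attacks, which is one reason flow-level statistics are useful for the remaining classes.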

5.Features

The TCP/IP network flow statistics generated by CICFlowMeter are summarised below. The TCP/IP network flows and their statistics generated by CICFlowMeter are labelled based on the DNP3 attacks described above, thus allowing the training of ML/DL models. Finally, it is worth mentioning that these statistics are generated when the flow timeout value is equal to 120 seconds.

Number of cyber crimes reported in India 2022, by leading state
statista.com
Updated Dec 4, 2023
Cite
Statista (2025). Number of cyber crimes reported in India 2022, by leading state [Dataset]. https://www.statista.com/statistics/1097071/india-number-of-cyber-crimes-by-leading-state/
Dataset updated
Dec 4, 2023
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2022
Area covered
India
Description
In 2022, the state of Telangana in India had the highest number of reported cybercrimes compared to the rest of the country, with over ****** cases registered with the authorities. The country recorded over ****** cases of cybercrime that year, marking a significant increase compared to about ****** cases in 2016.

Cybercrime in India

The growing digital economy has created new opportunities for cybercriminals by introducing higher complexity or widening the scope of digital aspects in our daily lives. India is no exception: for example, the number of people arrested and charged for cybercrime across India in 2021 showed a wide spectrum of criminal charges including but not limited to blackmailing, forgery, sexual exploitation, or counterfeiting. Studies also indicated small businesses to be likely targets of such crimes.

Combating cybercrime

The country led in the encounter rate of cybercrimes, with **************** internet users reporting having experienced a cybercrime, compared to the world average of about four out of ten internet users in 2022. As the government pushes for a digital India, cybersecurity has become the need of the hour. Special initiatives such as the Indian Cyber Crime Coordination Centre, which helps to coordinate the efforts in combating cybercrime, as well as initiatives to raise public awareness and build institutional capacity to cope with it, have been funded by the government.
Most Targeted Sectors By Malware and Ransomware
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Most Targeted Sectors By Malware and Ransomware [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
While every industry is affected by ransomware attacks, the truth is that some industries are more susceptible than others. This is the full breakdown of the top 15 sectors most targeted by malware.
Hour-Long Wget Attack Dataset (Base Graph)
dataverse.harvard.edu
Updated Oct 1, 2018
Cite
Xueyuan Han (2018). Hour-Long Wget Attack Dataset (Base Graph) [Dataset]. http://doi.org/10.7910/DVN/IWFWSP
Unique identifier
https://doi.org/10.7910/DVN/IWFWSP
Dataset updated
Oct 1, 2018
Dataset provided by
Harvard Dataverse
Authors
Xueyuan Han
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Wget provenance data in edge-list format parsed from CamFlow provenance data. This dataset contains attack wget base graph data. Experiments were run for over an hour, with recurrent wget commands issued throughout the experiments (one every 120 seconds). Background activities were also captured, as CamFlow whole-system provenance was turned on. Several malicious URLs were run during each experimental session. Five attack experiments were recorded, each with a different mixture of normal benign wget operations. Provenance data was in JSON format and was converted into edge-list format for the Unicorn IDS research project. Conversion time was Sept. 26th, 2018. Each experiment consists of a base and a streaming graph component.
IEC 60870-5-104 Intrusion Detection Dataset
data.niaid.nih.gov
zenodo.org
Updated Jul 16, 2024
Cite
Vasileios (2024). IEC 60870-5-104 Intrusion Detection Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7108614
Dataset updated
Jul 16, 2024
Dataset provided by
Konstantinos
Vasileios
Panagiotis
Thomas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
IEC 60870-5-104

Intrusion Detection Dataset

Readme File

ITHACA – University of Western Macedonia - https://ithaca.ece.uowm.gr/

Authors: Panagiotis Radoglou-Grammatikis, Thomas Lagkas, Vasileios Argyriou, Panagiotis Sarigiannidis

Publication Date: September 23, 2022

1.Introduction

The evolution of the Industrial Internet of Things (IIoT) introduces several benefits, such as real-time monitoring, pervasive control and self-healing. However, despite the valuable services, security and privacy issues still remain given the presence of legacy and insecure communication protocols like IEC 60870-5-104. IEC 60870-5-104 is an industrial protocol widely applied in critical infrastructures, such as the smart electrical grid and industrial healthcare systems. The IEC 60870-5-104 Intrusion Detection Dataset was implemented in the context of the research paper entitled "Modeling, Detecting, and Mitigating Threats Against Industrial Healthcare Systems: A Combined Software Defined Networking and Reinforcement Learning Approach" [1], in the context of two H2020 projects: ELECTRON: rEsilient and seLf-healed EleCTRical pOwer Nanogrid (101021936) and SDN-microSENSE: SDN - microgrid reSilient Electrical eNergy SystEm (833955). This dataset includes labelled Transmission Control Protocol (TCP)/Internet Protocol (IP) network flow statistics (Comma-Separated Values (CSV) format) and IEC 60870-5-104 flow statistics (CSV format) related to twelve IEC 60870-5-104 cyberattacks. In particular, the cyberattacks are related to unauthorised commands and Denial of Service (DoS) activities against IEC 60870-5-104. Moreover, the relevant Packet Capture (PCAP) files are available. The dataset can be utilised for Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS), taking full advantage of Machine Learning (ML) and Deep Learning (DL).

2.Instructions

The IEC 60870-5-104 dataset was implemented following the methodology of A. Gharib et al. in [2], including eleven features: (a) Complete Network Configuration, (b) Complete Traffic, (c) Labelled Dataset, (d) Complete Interaction, (e) Complete Capture, (f) Available Protocols, (g) Attack Diversity, (h) Heterogeneity, (i) Feature Set and (j) Metadata.

A network topology consisting of (a) seven industrial entities, (b) one Human Machine Interface (HMI) and (c) three cyberattackers was used to construct the IEC 60870-5-104 Intrusion Detection Dataset. The industrial entities use IEC TestServer[1], while the HMI uses QTester104[2]. On the other hand, the cyberattackers use Kali Linux[3] equipped with Metasploit[4], OpenMUC j60870[5] and Ettercap[6]. The cyberattacks were performed during the following days.

On Saturday, April 25, 2020, a DoS cyberattack (M_SP_NA_1_DoS) was executed for 2 hours, using the M_SP_NA_1 command.

On Sunday, April 26, 2020, two cyberattacks were executed, namely (a) DoS (C_CI_NA_1_DoS) and (b) unauthorised injection (C_CI_NA_1), using the C_CI_NA_1 command for 2 hours.

On Monday, April 27, 2020, one unauthorised injection attack (C_SE_NA_1) was executed for 4 hours, using the C_SE_NA_1 command.

On Tuesday, April 28, 2020, two cyberattacks were executed, namely (a) unauthorised injection (C_SC_NA_1) and (b) DoS (C_SE_NA_1_DoS), using the C_SC_NA_1 and C_SE_NA_1 commands for 2 hours and 4 hours, respectively.

On Wednesday, April 29, 2020, one DoS (C_SC_NA_1_DoS) cyberattack was performed for 2 hours, using the C_SC_NA_1 command.

On Friday, June 05, 2020, two cyberattacks were executed, namely (a) DoS (C_RD_NA_1_DoS) and (b) unauthorised injection (C_RD_NA_1), using the C_RD_NA_1 command for 2 and 4 hours, respectively.

On Saturday, June 06, 2020, two cyberattacks were executed, namely (a) DoS (C_RP_NA_1_DoS) and (b) unauthorised injection (C_RP_NA_1), using the C_RP_NA_1 command for 2 and 4 hours, respectively.

On Monday, June 08, 2020, a Man In The Middle (MITM) cyberattack was executed for 2 hours, filtering and dropping the IEC 60870-5-104 packets.

For each attack, a 7zip file is provided, including the network traffic and the network flow statistics for each entity. Moreover, a relevant diagram is provided, illustrating the corresponding cyberattack. In particular, for each entity, a folder is given, including (a) the relevant pcap file, (b) Transmission Control Protocol (TCP)/Internet Protocol (IP) network flow statistics in Comma-Separated Values (CSV) format and (c) IEC 60870-5-104 flow statistics in CSV format. The TCP/IP network flow statistics were generated by CICFlowMeter[7], while the IEC 60870-5-104 flow statistics were generated by a Custom IEC 60870-5-104 Python Parser[8], taking full advantage of Scapy[9].

3.Dataset Structure

The dataset consists of the following files:

20200425_UOWM_IEC104_Dataset_m_sp_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the M_SP_NA_1_DoS attack.

20200426_UOWM_IEC104_Dataset_c_ci_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_CI_NA_1_DoS attack.

20200426_UOWM_IEC104_Dataset_c_ci_na_1.7z: A 7zip file including the pcap and CSV files related to the C_CI_NA_1 attack.

20200427_UOWM_IEC104_Dataset_c_se_na_1.7z: A 7zip file including the pcap and CSV files related to the C_SE_NA_1 attack.

20200428_UOWM_IEC104_Dataset_c_sc_na_1.7z: A 7zip file including the pcap and CSV files related to the C_SC_NA_1 attack.

20200428_UOWM_IEC104_Dataset_c_se_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_SE_NA_1_DoS attack.

20200429_UOWM_IEC104_Dataset_c_sc_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_SC_NA_1_DoS attack.

20200605_UOWM_IEC104_Dataset_c_rd_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_RD_NA_1_DoS attack.

20200605_UOWM_IEC104_Dataset_c_rd_na_1.7z: A 7zip file including the pcap and CSV files related to the C_RD_NA_1 attack.

20200606_UOWM_IEC104_Dataset_c_rp_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_RP_NA_1_DoS attack.

20200606_UOWM_IEC104_Dataset_c_rp_na_1.7z: A 7zip file including the pcap and CSV files related to the C_RP_NA_1 attack.

20200608_UOWM_IEC104_Dataset_mitm_drop.7z: A 7zip file including the pcap and CSV files related to the MITM attack.

Balanced_IEC104_Train_Test_CSV_Files.zip: This zip file includes balanced CSV files from CICFlowMeter and the Custom IEC 60870-5-104 Python Parser that could be utilised for training ML and DL methods. The zip file includes different folders for the corresponding flow timeout values used for CICFlowMeter and IEC 60870-5-104 Python Parser, respectively.

Each 7zip file includes respective folders related to the entities/devices (described in the following section) participating in each attack. In particular, for each entity/device, there is a folder including (a) the overall network traffic (pcap file) related to this entity/device during each attack, (b) the TCP/IP network flow statistics (CSV file) from CICFlowMeter for the overall network traffic, (c) the IEC 60870-5-104 network traffic (pcap file) related to this entity/device during each attack, (d) the TCP/IP network flow statistics (CSV file) from CICFlowMeter for the IEC 60870-5-104 network traffic, (e) the IEC 60870-5-104 flow statistics (CSV file) from the Custom IEC 60870-5-104 Python Parser for the IEC 60870-5-104 network traffic and finally, (f) an image showing how the attack was executed. Finally, it is noteworthy that the network flows from both CICFlowMeter and the Custom IEC 60870-5-104 Python Parser in each CSV file are labelled based on the IEC 60870-5-104 cyberattacks executed for the generation of this dataset. The description of these attacks is given in the following section, while the various features from CICFlowMeter and the Custom IEC 60870-5-104 Python Parser are presented in Section 5.
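The archive names listed in this section encode the capture date, the IEC 60870-5-104 command and whether the attack was a DoS variant. The following sketch extracts those three fields from a file name. The regular expression is inferred from the names listed above and is an assumption; the helper name `parse_archive_name` is hypothetical.

```python
import re
from datetime import date, datetime

# Pattern inferred from the archive names listed above (an assumption).
ARCHIVE_PATTERN = re.compile(
    r"^(?P<stamp>\d{8})_UOWM_IEC104_Dataset_(?P<label>.+)\.7z$"
)


def parse_archive_name(filename):
    """Return the date, command and DoS flag encoded in an archive name.

    Returns None when the name does not follow the attack-archive pattern.
    """
    match = ARCHIVE_PATTERN.match(filename)
    if match is None:
        return None
    label = match.group("label")
    is_dos = label.endswith("_DoS")
    command = label[: -len("_DoS")] if is_dos else label
    return {
        "date": datetime.strptime(match.group("stamp"), "%Y%m%d").date(),
        "command": command.upper(),
        "dos": is_dos,
    }


for name in (
    "20200426_UOWM_IEC104_Dataset_c_ci_na_1_DoS.7z",
    "20200427_UOWM_IEC104_Dataset_c_se_na_1.7z",
    "20200608_UOWM_IEC104_Dataset_mitm_drop.7z",
    "Balanced_IEC104_Train_Test_CSV_Files.zip",
):
    print(name, parse_archive_name(name))
```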

4.Testbed & IEC 60870-5-104 Attacks

The testbed created for generating this dataset is composed of five virtual RTU devices emulated by IEC TestServer and two real RTU devices. Moreover, there is another workstation which plays the role of Master Terminal Unit (MTU) and HMI, sending legitimate IEC 60870-5-104 commands to the corresponding RTUs. For this purpose, the workstation uses QTester104. In addition, there are three attackers that act as malicious insiders executing the following cyberattacks against the aforementioned RTUs. Finally, the network traffic data of each entity/device was captured through tshark.

Table 1: IEC 60870-5-104 Cyberattacks Description

IEC 60870-5-104 Cyberattack | Description | Dataset Files
MITM Drop | During this attack, the cyberattacker is placed between two endpoints, thus monitoring and dropping the network traffic exchanged. | 20200608_UOWM_IEC104_Dataset_mitm_drop.7z
C_CI_NA_1 | The C_CI_NA_1 is a Counter Interrogation command in the control direction. This cyberattack sends unauthorised IEC 60870-5-104 C_CI_NA_1 packets to the target system. | 20200426_UOWM_IEC104_Dataset_c_ci_na_1.7z
C_SC_NA_1 | The C_SC_NA_1 command is a single command. This cyberattack sends unauthorised IEC 60870-5-104 C_SC_NA_1 packets to the target system. | 20200428_UOWM_IEC104_Dataset_c_sc_na_1.7z
C_SE_NA_1 | The C_SE_NA_1 command is a set-point command with normalised values. This cyberattack sends unauthorised IEC 60870-5-104 C_SE_NA_1 packets to the target system. | 20200427_UOWM_IEC104_Dataset_c_se_na_1.7z
C_RD_NA_1 | The C_RD_NA_1 command is a read command. This cyberattack sends unauthorised IEC 60870-5-104 C_RD_NA_1 packets to the target
Data from: Malware Finances and Operations: a Data-Driven Study of the Value...
zenodo.org
data.niaid.nih.gov
zip
Updated Jun 20, 2023
Cite
Juha Nurmi; Mikko Niemelä; Billy Brumley (2023). Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access [Dataset]. http://doi.org/10.5281/zenodo.8047205
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.8047205
Dataset updated
Jun 20, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Juha Nurmi; Mikko Niemelä; Billy Brumley
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description

The datasets demonstrate the malware economy and the value chain published in our paper, Malware Finances and Operations: a Data-Driven Study of the Value Chain for Infections and Compromised Access, at the 12th International Workshop on Cyber Crime (IWCC 2023), part of the ARES Conference, published by the International Conference Proceedings Series of the ACM ICPS.

Using the well-documented scripts, it is straightforward to reproduce our findings. It takes an estimated 1 hour of human time and 3 hours of computing time to duplicate our key findings from MalwareInfectionSet; around one hour with VictimAccessSet; and minutes to replicate the price calculations using AccountAccessSet. See the included README.md files and Python scripts.

We choose to represent each victim by a single JavaScript Object Notation (JSON) data file. Data sources provide sets of victim JSON data files from which we've extracted the essential information and omitted Personally Identifiable Information (PII). We collected, curated, and modelled three datasets, which we publish under the Creative Commons Attribution 4.0 International License.

1. MalwareInfectionSet
We discover (and, to the best of our knowledge, document scientifically for the first time) that malware networks appear to dump their data collections online. We collected these infostealer malware logs available for free. We utilise 245 malware log dumps from 2019 and 2020 originating from 14 malware networks. The dataset contains 1.8 million victim files, with a dataset size of 15 GB.

2. VictimAccessSet
We demonstrate how Infostealer malware networks sell access to infected victims. Genesis Market focuses on user-friendliness and continuous supply of compromised data. Marketplace listings include everything necessary to gain access to the victim's online accounts, including passwords and usernames, but also detailed collection of information which provides a clone of the victim's browser session. Indeed, Genesis Market simplifies the import of compromised victim authentication data into a web browser session. We measure the prices on Genesis Market and how compromised device prices are determined. We crawled the website between April 2019 and May 2022, collecting the web pages offering the resources for sale. The dataset contains 0.5 million victim files, with a dataset size of 3.5 GB.

3. AccountAccessSet
The Database marketplace operates inside the anonymous Tor network. Vendors offer their goods for sale, and customers can purchase them with Bitcoins. The marketplace sells online accounts, such as PayPal and Spotify, as well as private datasets, such as driver's licence photographs and tax forms. We then collect data from Database Market, where vendors sell online credentials, and investigate similarly. To build our dataset, we crawled the website between November 2021 and June 2022, collecting the web pages offering the credentials for sale. The dataset contains 33,896 victim files, with a dataset size of 400 MB.

Credits

Authors

Billy Bob Brumley (Tampere University, Tampere, Finland)

Juha Nurmi (Tampere University, Tampere, Finland)

Mikko Niemelä (Cyber Intelligence House, Singapore)

Funding

This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme under project numbers 804476 (SCARE) and 952622 (SPIRS).

Alternative links to download: AccountAccessSet, MalwareInfectionSet, and VictimAccessSet.
MAD (MAlicious Traffic Dataset) in home and commercial environments -...
explore.openaire.eu
data.niaid.nih.gov
Updated Apr 4, 2021
Cite
Carlos Alberto Martins De Sousa Teles; Felipe Da R. Henriques (2021). MAD (MAlicious Traffic Dataset) in home and commercial environments - Environment with scalability [Dataset]. http://doi.org/10.5281/zenodo.5112290
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.5112290
Dataset updated
Apr 4, 2021
Authors
Carlos Alberto Martins De Sousa Teles; Felipe Da R. Henriques
Description
We used the Internet environment: one switch, one IP camera, one monitoring server, one honeypot server and no firewall. This environment is connected directly to the Internet. We installed a server to act as the Monitoring Environment, and network traffic was captured via port mirroring on the switch to that server.

We added eight virtual machines and ran the following denial-of-service (DoS) attack tests:
1 virtual machine from 04:00 pm to 23:55 on 2019-12-04, at 1-hour intervals;
2 virtual machines from 23:55 on 2019-12-04 to 08:50 am on 2019-12-05, at 1-hour intervals;
4 virtual machines from 08:55 am on 2019-12-05 to 05:25 pm on 2019-12-06, at 5-minute intervals;
8 virtual machines from 05:30 pm on 2019-12-06 to 23:59 on 2019-12-06, at 5-minute intervals;
end of tests, with the virtual machines shut down at 23:59 on 2019-12-06.

The results were obtained from Suricata and from Telegraf collections (TICK stack). All evidence was gathered through queries via EveBox, which received data from Suricata, and through Grafana graphs, using information extracted from the InfluxDB (Grafana) and PostgreSQL (EveBox) databases.

events.csv.gz - Suricata / EveBox collections
net.csv.gz - Telegraf collections from the TICK stack
netstat.csv.gz - Telegraf collections from the TICK stack

For correlation purposes, use the events.csv.gz file as the basis. The correlation key is the 'timestamp' column of events.csv.gz, matched against the 'time' column of the net.csv.gz and netstat.csv.gz files. The collections were taken, non-consecutively, between 2019-12-04 and 2019-12-06.
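The timestamp correlation described above could be sketched as follows, using only the Python standard library. This is a minimal illustration, not the authors' procedure: it assumes both columns hold ISO-like strings whose first 19 characters identify the second, and the `alert` and `bytes_recv` columns in the demo data are hypothetical. Only the file names and the `timestamp` and `time` column names come from the description.

```python
import csv
import gzip
import io
from collections import defaultdict


def read_rows(source):
    """Read CSV rows as dicts from a .gz path or an open text stream."""
    if isinstance(source, str):
        with gzip.open(source, mode="rt", newline="") as handle:
            return list(csv.DictReader(handle))
    return list(csv.DictReader(source))


def second_key(value):
    # Assumption: ISO-like strings, so the first 19 characters
    # ("YYYY-MM-DDTHH:MM:SS") identify the second of the observation.
    return value[:19]


def correlate(events, metrics, event_col="timestamp", metric_col="time"):
    """Pair each event with the metric rows recorded in the same second."""
    by_second = defaultdict(list)
    for row in metrics:
        by_second[second_key(row[metric_col])].append(row)
    return [(event, by_second.get(second_key(event[event_col]), []))
            for event in events]


# Tiny in-memory demo with hypothetical columns; for the real files, call
# read_rows("events.csv.gz") and read_rows("net.csv.gz") instead.
events = read_rows(io.StringIO(
    "timestamp,alert\n2019-12-04T16:00:01.123,dos\n"))
net = read_rows(io.StringIO(
    "time,bytes_recv\n2019-12-04T16:00:01Z,1024\n"))
pairs = correlate(events, net)
```

If the Telegraf samples are spaced more widely than the Suricata events, an exact same-second match will often find nothing; in that case a nearest-earlier-sample lookup may be a better fit than the exact join shown here.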
Ransomware Statistics
searchlogistics.com
Updated Apr 1, 2025
Cite
Search Logistics (2025). Ransomware Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Explore at:
Dataset updated
Apr 1, 2025
Dataset authored and provided by
Search Logistics
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These latest ransomware statistics show how much damage is caused by attacks and the emerging trends you need to be aware of.

HAI Security Dataset

kaggle.com

zip

Updated Apr 27, 2022

Cite

ICS Security Dataset (2022). HAI Security Dataset [Dataset]. https://www.kaggle.com/icsdataset/hai-security-dataset

Explore at:

zip (487855254 bytes), available download format

Dataset updated

Apr 27, 2022

Authors

ICS Security Dataset

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

HIL-based Augmented ICS (HAI) Security Dataset

The HAI dataset was collected from a realistic industrial control system (ICS) testbed augmented with a Hardware-In-the-Loop (HIL) simulator that emulates steam-turbine power generation and pumped-storage hydropower generation.

Click here to find out more about the HAI dataset.

Please e-mail us here if you have any questions about the dataset.

Background

In 2017, three laboratory-scale CPS testbeds were initially launched, namely GE’s turbine testbed, Emerson’s boiler testbed, and FESTO’s modular production system (MPS) water-treatment testbed. These testbeds are related to relatively simple processes, and were operated independently of each other.
In 2018, a complex process system was built to combine the three systems using a HIL simulator, where generation of thermal power and pumped-storage hydropower was simulated. This ensured that the variables were highly coupled and correlated for a richer dataset. In addition, an Open Platform Communications Unified Architecture (OPC-UA) gateway was installed to facilitate data collection from heterogeneous devices.
The first version of HAI dataset, HAI 1.0, was made available on GitHub and Kaggle in February 2020. This dataset included ICS operational data from normal and anomalous situations for 38 attacks. Subsequently, a debugged version of HAI 1.0, namely HAI 20.07, was released for the HAICon 2020 competition in August 2020.
HAI 21.03 was released in 2021, and was based on a more tightly coupled HIL simulator to produce clearer attack effects with additional attacks. This version provides more quantitative information and covers a variety of operational situations, and provides better insights into the dynamic changes of the physical system.
HAI 22.04 contained more sophisticated attacks that are significantly more difficult to detect than those in the previous versions. Comparing only the baseline TaPRs of HAICon 2020 and HAICon 2021, detection difficulty in HAI 22.04 is approximately four times higher than HAI 21.03.

HAI Testbed

The testbed consists of four different processes: boiler process, turbine process, water-treatment process and HIL simulation:

Boiler Process (P1): This includes water-to-water heat transfer at a low pressure and a moderate temperature. This process is controlled using Emerson Ovation DCS.
Turbine Process (P2): A rotor kit process that closely simulates the behavior of an actual rotating machine. It is controlled by GE's Mark VIe DCS.
Water-treatment Process (P3): This process includes pumping water to the upper reservoir and releasing it back into the lower reservoir. It is controlled by Siemens's S7-300 PLC.
HIL Simulation (P4): Both the boiler and turbine processes are interconnected to synchronize with the rotating speed of the virtual steam-turbine power generation model. The pump and valve in the water-treatment process are controlled by the pumped-storage hydropower generation model. The dSPACE SCALEXIO system is used for the HIL simulations and is interconnected with the real-world processes through a Siemens S7-1500 PLC and ET200 remote IO devices, with data acquisition based on the OPC gateway.

HAI Datasets

Two major versions of HAI datasets have been released thus far. Each dataset consists of several CSV files, and each file satisfies time continuity. The quantitative summary of each version is as follows:

Note: The version numbering follows a date-based scheme, where the version number indicates the released date of the HAI dataset. HAI 20.07 is the bug-fixed version of HAI v1.0 released in February 2020.

Version	Data points (points/sec)	Normal dataset files (interval, size)	Attack dataset files (interval, size, attack count)
HAI 22.04	86	train1.csv (26 hours, 51 MB); train2.csv (56 hours, 109 MB); train3.csv (35 hours, 67 MB); train4.csv (24 hours, 46 MB); train5.csv (66 hours, 125 MB); train6.csv (72 hours, 137 MB)	test1.csv (24 hours, 48 MB, 07 attacks); test2.csv (23 hours, 45 MB, 17 attacks); test3.csv (17 hours, 33 MB, 10 attacks); test4.csv (36 hours, 70 MB, 24 attacks)
HAI 21.03	78	train1.csv (60 hours, 100 MB); train2.csv (63 hours, 116 MB); train3.csv (229 hours, 246 MB)	test1.csv (12 hours, 22 MB, 05 attacks); test2.csv (33 hours, 62 MB, 20 attacks); test3.csv (30 hours, 56 MB, 08 attacks); test4.csv (11 hours, 20 MB, 05 attacks); test5.csv (26 hours, 48 MB, 12 attacks)
HAI 20.07 (HAI 1.0)	59	train1.csv (86 hours, 127 MB); train2.csv (91 hours, 98 MB)	test1.csv (81 hours, 119 MB); test2.csv (42 hours, 62 MB)

Data fields

The time-series data in each CSV file satisfies time continuity. The first column represents the observed time as “yyyy-MM-dd hh:mm:ss,” while the remaining columns provide the recorded SCADA data points. The last four columns provide data labels indicating whether an attack occurred: the attack column applies to all processes, and the other three columns apply to the corresponding control processes.

Refer to the latest technical manual for details of each column.

time	P1_B2004	P2_B2016	...	attack	attack_P1	...	attack_P3
20190926 13:00:00	0.09830	1.07370	...	0	0	...	0
20190926 13:00:01	0.09830	1.07410	...	1	0	...	1
20190926 13:00:02	0.09830	1.07380	...	1	0	...	1
20190926 13:00:03	0.09830	1.07360	...	1	1	...	1
20190926 13:00:04	0.09830	1.07430	...	1	1	...	1
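As a rough illustration of how the label columns might be used, the sketch below counts attack-labelled rows per label column and extracts contiguous attack intervals. It uses only the columns visible in the sample above (the elided "..." columns are left out), and the helper names `label_counts` and `attack_intervals` are hypothetical, not part of the dataset's tooling.

```python
import csv
import io

# Sample rows copied from the table above, limited to the columns it shows.
SAMPLE = """time,P1_B2004,P2_B2016,attack,attack_P1,attack_P3
20190926 13:00:00,0.09830,1.07370,0,0,0
20190926 13:00:01,0.09830,1.07410,1,0,1
20190926 13:00:02,0.09830,1.07380,1,0,1
20190926 13:00:03,0.09830,1.07360,1,1,1
20190926 13:00:04,0.09830,1.07430,1,1,1
"""


def label_counts(rows):
    """Count rows flagged with 1 in every column whose name starts with 'attack'."""
    label_cols = [name for name in rows[0] if name.startswith("attack")]
    return {name: sum(row[name] == "1" for row in rows) for name in label_cols}


def attack_intervals(rows, label="attack"):
    """Return (first_time, last_time) for each contiguous run of label == 1."""
    intervals, start, previous = [], None, None
    for row in rows:
        if row[label] == "1" and start is None:
            start = row["time"]          # a new attack run begins
        elif row[label] != "1" and start is not None:
            intervals.append((start, previous))  # run ended on the prior row
            start = None
        previous = row["time"]
    if start is not None:                # run still open at end of file
        intervals.append((start, previous))
    return intervals


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
counts = label_counts(rows)
intervals = attack_intervals(rows)
```

Treating attacks as intervals rather than isolated rows is the view that range-based metrics for time series work from, so a helper like `attack_intervals` is a natural first step before evaluation.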

Getting the dataset

Run git clone with the URL below:
$ git clone https://github.com/icsdataset/hai
To unzip multiple gzip files, you can use:
$ gunzip *.gz
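Where gunzip is not available (for example on a machine without a Unix shell), the same decompression step can be approximated in Python. This is a sketch under stated assumptions: `gunzip_all` is a hypothetical helper, and unlike gunzip it keeps the original .gz files in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_all(directory):
    """Decompress every *.gz file in `directory`, keeping the originals."""
    extracted = []
    for archive in sorted(Path(directory).glob("*.gz")):
        target = archive.with_suffix("")  # "train1.csv.gz" -> "train1.csv"
        with gzip.open(archive, "rb") as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # stream, so large files fit in memory
        extracted.append(target)
    return extracted
```

Calling `gunzip_all(".")` from inside the directory that holds the .gz files would decompress each archive next to its source.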

Performance Evaluation

We strongly recommend the eTaPR (Enhanced Time-series Aware Precision and Recall) metric for evaluating your anomaly detection model, because it makes performance comparisons with other studies fair. Got something to suggest? Let us know!

Projects using the dataset

Here are some projects and experiments that are using or featuring the dataset in interesting ways. Got something to add? Let us know!

The related projects so far are as follows.

Anomaly Detection

Year 2022

Year 2020

Testbed/Dataset

Year 2021

Probabilistic attack sequence generation and execution based on MITRE ATT&CK for ICS datasets

Year 2020

Expansion of ICS testbed for security validation based on MITRE ATT&CK techniques
Expanding a programmable CPS testbed for network attack analysis
Co-occurrence based security event analysis and visualization for cyber physical systems

Ransomware Statistics Overview
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Ransomware Statistics Overview [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here are the most important ransomware statistics you need to know about the attacks, demands, payments and consequences that can occur.
Is Paying The Ransom A Good Idea?
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Is Paying The Ransom A Good Idea? [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The main goal of any ransomware attacker is to hold people to ransom by not releasing their data until they get paid. But is it actually a good idea to pay the ransom? Here’s what the ransomware statistics tell us about organisations that paid up.
Which Strains Of Ransomware Are Most Common?
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Which Strains Of Ransomware Are Most Common? [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Different types of ransomware are more common than others and more likely to affect your cybersecurity. The top 5 most common types of ransomware strains are...
What Can Cause A Ransomware Infection?
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). What Can Cause A Ransomware Infection? [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Here are the leading causes of ransomware attacks today.
Who Are The Victims Of Ransomware?
searchlogistics.com
Updated Apr 1, 2025
Cite
(2025). Who Are The Victims Of Ransomware? [Dataset]. https://www.searchlogistics.com/learn/statistics/ransomware-statistics/
Explore at:
Dataset updated
Apr 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The following ransomware statistics detail which industries get attacked the most and which countries are most likely to be targeted.
Data Masking Report
marketresearchforecast.com
doc, pdf, ppt
Updated May 4, 2025
Cite
Market Research Forecast (2025). Data Masking Report [Dataset]. https://www.marketresearchforecast.com/reports/data-masking-547254
Explore at:
ppt, pdf, doc (available download formats)
Dataset updated
May 4, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The data masking market, valued at $397.4 million in 2025, is experiencing robust growth, projected to expand at a compound annual growth rate (CAGR) of 8.9% from 2025 to 2033. This significant expansion is driven by increasing concerns surrounding data privacy regulations like GDPR and CCPA, coupled with the rising adoption of cloud computing and the burgeoning need for secure data sharing across diverse organizational functions. The dynamic nature of data masking solutions, offering real-time protection and adaptability to evolving security threats, further fuels market growth. Key segments contributing to this expansion include the finance sector, heavily regulated and requiring stringent data protection, and the human resources (HR) sector, where sensitive employee information demands robust security measures. The market's growth trajectory is also influenced by the increasing sophistication of cyber threats and the escalating costs associated with data breaches, prompting organizations to invest proactively in data masking technologies. Further fueling market growth is the increasing adoption of data masking across various applications beyond traditional finance and HR. Operations, legal, and even support and R&D departments are increasingly recognizing the value of data masking in protecting sensitive business information and maintaining compliance. While the market faces certain restraints, such as the complexity of implementing data masking solutions and the potential for high initial investment costs, the long-term benefits of enhanced data security and regulatory compliance significantly outweigh these challenges. Leading players like IBM, Informatica, and Oracle are continuously innovating their offerings, incorporating advanced techniques such as tokenization and pseudonymization, driving market consolidation and further stimulating growth within the data masking landscape. 
The geographical distribution of the market reflects a strong presence in North America, driven by stringent regulations and advanced technological adoption, with Europe and Asia-Pacific also exhibiting considerable growth potential.
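To make the growth figures concrete, the compound annual growth rate can be applied to the 2025 valuation. The short sketch below shows only the arithmetic implied by the stated numbers ($397.4 million, 8.9% CAGR, 2025 to 2033); the resulting 2033 figure is a derived illustration, not a value given in the report, and `project_value` is a hypothetical helper name.

```python
def project_value(base_value, annual_rate, years):
    """Compound `base_value` at `annual_rate` for `years` whole years."""
    return base_value * (1 + annual_rate) ** years


# Inputs taken from the description: USD 397.4 million in 2025, 8.9% CAGR.
projected_2033 = project_value(397.4, 0.089, 2033 - 2025)
print(f"Implied 2033 market size: about ${projected_2033:.0f} million")
```

Under these assumptions the market would roughly double over the eight-year period, to somewhere near $786 million.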
ENISA threat taxonomy
data.europa.eu
excel xls
Cite
European Union Agency for Cybersecurity, ENISA threat taxonomy [Dataset]. https://data.europa.eu/data/datasets/enisa-threat-taxonomy-1?locale=hr
Explore at:
excel xls (available download formats)
Dataset provided by
ENISA (http://www.enisa.europa.eu/)
Authors
European Union Agency for Cybersecurity
License
http://data.europa.eu/eli/dec/2011/833/oj
Description
The current threat taxonomy is an initial version developed from available ENISA material. This material was used to support ENISA's internal structuring of information collection and threat consolidation. It was produced in the period 2012 to 2015. The consolidated threat taxonomy is an initial version: in 2016 ENISA plans to update and expand it with additional details, such as definitions of the various threats listed.

Cite

Department for Science, Innovation and Technology (2023). Cyber security breaches survey 2023 [Dataset]. https://www.gov.uk/government/statistics/cyber-security-breaches-survey-2023

Cyber security breaches survey 2023

Explore at:

47 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Apr 19, 2023

Dataset provided by

GOV.UK (http://gov.uk/)

Authors

Department for Science, Innovation and Technology

Description

The government has surveyed UK businesses, charities and educational institutions to find out how they approach cyber security and gain insight into the cyber security issues they face. The research informs government policy on cyber security and how government works with industry to build a prosperous and resilient digital UK.

Published

19 April 2023

Period covered

Respondents were asked about their approach to cyber security and any breaches or attacks over the 12 months before the interview. Main survey interviews took place between October 2022 and January 2023. Qualitative follow up interviews took place in December 2022 and January 2023.

Geographic coverage

UK

Further Information

The survey is part of the government’s National Cyber Strategy 2022.

There is a wide range of free government cyber security guidance and information for businesses, including details of free online training and support.

The survey was carried out by Ipsos UK. The report has been produced by Ipsos on behalf of the Department for Science, Innovation and Technology.

The UK Statistics Authority

This release is published in accordance with the Code of Practice for Statistics (2018), as produced by the UK Statistics Authority. The UKSA has the overall objective of promoting and safeguarding the production and publication of official statistics that serve the public good. It monitors and reports on all official statistics, and promotes good practice in this area.

Pre-release access

The document above contains a list of ministers and officials who have received privileged early access to this release. In line with best practice, the list has been kept to a minimum and those given access for briefing purposes had a maximum of 24 hours.

Contact information

The Lead Analyst for this release is Emma Johns. For any queries please contact cybersurveys@dsit.gov.uk.

For media enquiries only, please contact the press office on 020 7215 1000.
