47 datasets found

IoT Intrusion Detection
kaggle.com
Updated Jul 16, 2023
Cite
Cyber Cop (2023). IoT Intrusion Detection [Dataset]. http://doi.org/10.34740/kaggle/dsv/6142327
Unique identifier
https://doi.org/10.34740/kaggle/dsv/6142327
Dataset updated
Jul 16, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Cyber Cop
License
http://www.gnu.org/licenses/lgpl-3.0.html
Description
The dataset was introduced by the following researchers: E. C. P. Neto, S. Dadkhah, R. Ferreira, A. Zohourian, R. Lu, A. A. Ghorbani. "CICIoT2023: A real-time dataset and benchmark for large-scale attacks in IoT environment," Sensor (2023) (submitted to Journal of Sensors). The data contain different kinds of IoT intrusions, in the following categories: DDoS, Brute Force, Spoofing, DoS, Recon, Web-based, Mirai.

Each intrusion category has several subcategories. The dataset contains 1191264 network instances, each with 47 features. It can be used to build predictive models that detect different kinds of intrusive attacks, and it is also suitable for designing an intrusion detection system (IDS).
NSL-KDD
kaggle.com
Updated Mar 16, 2020
+ more versions
Cite
Kiran (2020). NSL-KDD [Dataset]. https://www.kaggle.com/datasets/kiranmahesh/nslkdd
Dataset updated
Mar 16, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kiran
Description
Dataset

This dataset was created by Kiran

Contents
Intrusion Detection Dataset
kaggle.com
Updated Apr 2, 2024
Cite
sid007 (2024). Intrusion Detection Dataset [Dataset]. https://www.kaggle.com/datasets/sadaqatrehman/intrusion-detection-dataset/suggestions?status=pending&yourSuggestions=true
Dataset updated
Apr 2, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
sid007
Description
Dataset

This dataset was created by sid007

Contents
EDGE-IIOTSET Dataset
paperswithcode.com
Updated Oct 16, 2023
Cite
(2023). EDGE-IIOTSET Dataset [Dataset]. https://paperswithcode.com/dataset/edge-iiotset
Dataset updated
Oct 16, 2023
Description
ABSTRACT: In this project, we propose a new comprehensive realistic cyber security dataset of IoT and IIoT applications, called Edge-IIoTset, which can be used by machine learning-based intrusion detection systems in two different modes, namely centralized and federated learning. Specifically, the proposed testbed is organized into seven layers: Cloud Computing Layer, Network Functions Virtualization Layer, Blockchain Network Layer, Fog Computing Layer, Software-Defined Networking Layer, Edge Computing Layer, and IoT and IIoT Perception Layer. In each layer, we propose new emerging technologies that satisfy the key requirements of IoT and IIoT applications, such as the ThingsBoard IoT platform, OPNFV platform, Hyperledger Sawtooth, Digital twin, ONOS SDN controller, Mosquitto MQTT brokers, Modbus TCP/IP, etc. The IoT data are generated from various IoT devices (more than 10 types), such as low-cost digital sensors for sensing temperature and humidity, Ultrasonic sensor, Water level detection sensor, pH Sensor Meter, Soil Moisture sensor, Heart Rate Sensor, Flame Sensor, etc. We identify and analyze fourteen attacks related to IoT and IIoT connectivity protocols, which are categorized into five threats: DoS/DDoS attacks, Information gathering, Man in the middle attacks, Injection attacks, and Malware attacks. In addition, we extract features obtained from different sources, including alerts, system resources, logs, and network traffic, and propose 61 new features with high correlations from 1176 found features. After processing and analyzing the proposed realistic cyber security dataset, we provide a primary exploratory data analysis and evaluate the performance of machine learning approaches (i.e., traditional machine learning as well as deep learning) in both centralized and federated learning modes.

Instructions:

Great news! The Edge-IIoT dataset has been featured as a "Document in the top 1% of Web of Science." This indicates that it is ranked within the top 1% of all publications indexed by the Web of Science (WoS) in terms of citations and impact.

Please visit the Kaggle link for updates: https://www.kaggle.com/datasets/mohamedamineferrag/edgeiiotset-cyber-sec...

Free use of the Edge-IIoTset dataset for academic research purposes is hereby granted in perpetuity. Use for commercial purposes is allowable after asking the lead author, Dr Mohamed Amine Ferrag, who has asserted his right under the Copyright.

The details of the Edge-IIoT dataset were published in the following paper. For academic/public use of these datasets, authors must cite the following paper:

Mohamed Amine Ferrag, Othmane Friha, Djallel Hamouda, Leandros Maglaras, Helge Janicke, "Edge-IIoTset: A New Comprehensive Realistic Cyber Security Dataset of IoT and IIoT Applications for Centralized and Federated Learning", IEEE Access, April 2022 (IF: 3.37), DOI: 10.1109/ACCESS.2022.3165809

Link to paper : https://ieeexplore.ieee.org/document/9751703

The directories of the Edge-IIoTset dataset include the following:

•File 1 (Normal traffic)

-File 1.1 (Distance): This file includes two documents, namely, Distance.csv and Distance.pcap. The IoT sensor (Ultrasonic sensor) is used to capture the IoT data.

-File 1.2 (Flame_Sensor): This file includes two documents, namely, Flame_Sensor.csv and Flame_Sensor.pcap. The IoT sensor (Flame Sensor) is used to capture the IoT data.

-File 1.3 (Heart_Rate): This file includes two documents, namely, Heart_Rate.csv and Heart_Rate.pcap. The IoT sensor (Heart Rate Sensor) is used to capture the IoT data.

-File 1.4 (IR_Receiver): This file includes two documents, namely, IR_Receiver.csv and IR_Receiver.pcap. The IoT sensor (IR (Infrared) Receiver Sensor) is used to capture the IoT data.

-File 1.5 (Modbus): This file includes two documents, namely, Modbus.csv and Modbus.pcap. The IoT sensor (Modbus Sensor) is used to capture the IoT data.

-File 1.6 (phValue): This file includes two documents, namely, phValue.csv and phValue.pcap. The IoT sensor (pH-sensor PH-4502C) is used to capture the IoT data.

-File 1.7 (Soil_Moisture): This file includes two documents, namely, Soil_Moisture.csv and Soil_Moisture.pcap. The IoT sensor (Soil Moisture Sensor v1.2) is used to capture the IoT data.

-File 1.8 (Sound_Sensor): This file includes two documents, namely, Sound_Sensor.csv and Sound_Sensor.pcap. The IoT sensor (LM393 Sound Detection Sensor) is used to capture the IoT data.

-File 1.9 (Temperature_and_Humidity): This file includes two documents, namely, Temperature_and_Humidity.csv and Temperature_and_Humidity.pcap. The IoT sensor (DHT11 Sensor) is used to capture the IoT data.

-File 1.10 (Water_Level): This file includes two documents, namely, Water_Level.csv and Water_Level.pcap. The IoT sensor (Water sensor) is used to capture the IoT data.

•File 2 (Attack traffic):

-File 2.1 (Attack traffic (CSV files)): This file includes 14 documents, namely, Backdoor_attack.csv, DDoS_HTTP_Flood_attack.csv, DDoS_ICMP_Flood_attack.csv, DDoS_TCP_SYN_Flood_attack.csv, DDoS_UDP_Flood_attack.csv, MITM_attack.csv, OS_Fingerprinting_attack.csv, Password_attack.csv, Port_Scanning_attack.csv, Ransomware_attack.csv, SQL_injection_attack.csv, Uploading_attack.csv, Vulnerability_scanner_attack.csv, XSS_attack.csv. Each document is specific to one attack.

-File 2.2 (Attack traffic (PCAP files)): This file includes 14 documents, namely, Backdoor_attack.pcap, DDoS_HTTP_Flood_attack.pcap, DDoS_ICMP_Flood_attack.pcap, DDoS_TCP_SYN_Flood_attack.pcap, DDoS_UDP_Flood_attack.pcap, MITM_attack.pcap, OS_Fingerprinting_attack.pcap, Password_attack.pcap, Port_Scanning_attack.pcap, Ransomware_attack.pcap, SQL_injection_attack.pcap, Uploading_attack.pcap, Vulnerability_scanner_attack.pcap, XSS_attack.pcap. Each document is specific to one attack.

•File 3 (Selected dataset for ML and DL):

-File 3.1 (DNN-EdgeIIoT-dataset): This file contains a selected dataset for the use of evaluating deep learning-based intrusion detection systems.

-File 3.2 (ML-EdgeIIoT-dataset): This file contains a selected dataset for the use of evaluating traditional machine learning-based intrusion detection systems.
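
The per-attack file names in File 2 follow one pattern, so they can be generated rather than typed by hand. The following is a minimal sketch: the attack names are copied from the listing above, while the helper name `attack_files` is hypothetical and not part of the dataset.

```python
# Attack names copied from the File 2.1 / File 2.2 listing above.
ATTACKS = [
    "Backdoor", "DDoS_HTTP_Flood", "DDoS_ICMP_Flood", "DDoS_TCP_SYN_Flood",
    "DDoS_UDP_Flood", "MITM", "OS_Fingerprinting", "Password",
    "Port_Scanning", "Ransomware", "SQL_injection", "Uploading",
    "Vulnerability_scanner", "XSS",
]


def attack_files(extension):
    """Return the per-attack file names for 'csv' or 'pcap' (hypothetical helper)."""
    if extension not in ("csv", "pcap"):
        raise ValueError(f"unexpected extension: {extension}")
    return [f"{name}_attack.{extension}" for name in ATTACKS]


csv_files = attack_files("csv")
pcap_files = attack_files("pcap")
print(len(csv_files), csv_files[0], pcap_files[-1])
```

Counting the generated names is a quick way to check a local download against the listing.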

Step 1: Downloading the Edge-IIoTset dataset from the Kaggle platform

from google.colab import files

!pip install -q kaggle

files.upload()  # upload your kaggle.json API token when prompted

!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

!kaggle datasets download -d mohamedamineferrag/edgeiiotset-cyber-security-dataset-of-iot-iiot -f "Edge-IIoTset dataset/Selected dataset for ML and DL/DNN-EdgeIIoT-dataset.csv"

!unzip DNN-EdgeIIoT-dataset.csv.zip
!rm DNN-EdgeIIoT-dataset.csv.zip
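
Outside Colab, the credential setup in Step 1 (`mkdir`, `cp`, `chmod 600`) can be done in plain Python. The following is a minimal sketch, not part of the original instructions: the function name `install_kaggle_token` and its `home` parameter are hypothetical, and the demo writes a fake token into a temporary directory so that no real credentials are touched.

```python
import os
import shutil
import tempfile
from pathlib import Path


def install_kaggle_token(token_path, home=None):
    """Copy kaggle.json into <home>/.kaggle and restrict it to the owner."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    target_dir.mkdir(parents=True, exist_ok=True)  # like `mkdir -p ~/.kaggle`
    target = target_dir / "kaggle.json"
    shutil.copyfile(token_path, target)            # like `cp kaggle.json ~/.kaggle/`
    os.chmod(target, 0o600)                        # like `chmod 600`
    return target


# Demo with a fake token inside a throwaway directory (illustration only).
with tempfile.TemporaryDirectory() as tmp:
    fake_token = Path(tmp) / "kaggle.json"
    fake_token.write_text('{"username": "demo", "key": "demo"}')
    installed = install_kaggle_token(fake_token, home=Path(tmp) / "home")
    mode = installed.stat().st_mode & 0o777
    print(installed.name, oct(mode))
```

On POSIX systems the reported mode should show owner-only access. Permission bits behave differently on Windows.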

Step 2: Reading the dataset's CSV file into a Pandas DataFrame

import pandas as pd
import numpy as np

df = pd.read_csv('DNN-EdgeIIoT-dataset.csv', low_memory=False)

Step 3: Exploring some of the DataFrame's contents

df.head(5)
print(df['Attack_type'].value_counts())

Step 4: Dropping data (columns, duplicated rows, NaN, null)

from sklearn.utils import shuffle

# Columns removed before training (timestamps, addresses, ports, payloads)
drop_columns = ["frame.time", "ip.src_host", "ip.dst_host", "arp.src.proto_ipv4", "arp.dst.proto_ipv4",
                "http.file_data", "http.request.full_uri", "icmp.transmit_timestamp",
                "http.request.uri.query", "tcp.options", "tcp.payload", "tcp.srcport",
                "tcp.dstport", "udp.port", "mqtt.msg"]

df.drop(drop_columns, axis=1, inplace=True)
df.dropna(axis=0, how='any', inplace=True)
df.drop_duplicates(subset=None, keep="first", inplace=True)
df = shuffle(df)

df.isna().sum()
print(df['Attack_type'].value_counts())

Step 5: Categorical data encoding (dummy encoding)

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import preprocessing

def encode_text_dummy(df, name):
    # Replace column `name` with one indicator column per distinct value
    dummies = pd.get_dummies(df[name])
    for x in dummies.columns:
        dummy_name = f"{name}-{x}"
        df[dummy_name] = dummies[x]
    df.drop(name, axis=1, inplace=True)

encode_text_dummy(df, 'http.request.method')
encode_text_dummy(df, 'http.referer')
encode_text_dummy(df, "http.request.version")
encode_text_dummy(df, "dns.qry.name.len")
encode_text_dummy(df, "mqtt.conack.flags")
encode_text_dummy(df, "mqtt.protoname")
encode_text_dummy(df, "mqtt.topic")
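
To see what the dummy encoding in Step 5 does to a column, it helps to run it on a tiny frame first. The following is a minimal sketch assuming pandas is installed. The toy values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd


def encode_text_dummy(df, name):
    """Replace column `name` with one indicator column per distinct value."""
    dummies = pd.get_dummies(df[name])
    for value in dummies.columns:
        df[f"{name}-{value}"] = dummies[value]
    df.drop(name, axis=1, inplace=True)


# Toy frame: invented values, for illustration only.
toy = pd.DataFrame({
    "http.request.method": ["GET", "POST", "GET", "0"],
    "tcp.len": [120, 300, 64, 0],
})
encode_text_dummy(toy, "http.request.method")
print(list(toy.columns))
```

The original column disappears and one `name-value` column per distinct value is added, so high-cardinality columns can widen the frame considerably.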

Step 6: Creation of the preprocessed dataset

df.to_csv('preprocessed_DNN.csv', encoding='utf-8')

For more information about the dataset, please contact the lead author of this project, Dr Mohamed Amine Ferrag, on his email: mohamed.amine.ferrag@gmail.com

More information about Dr. Mohamed Amine Ferrag is available at:

https://www.linkedin.com/in/Mohamed-Amine-Ferrag

https://dblp.uni-trier.de/pid/142/9937.html

https://www.researchgate.net/profile/Mohamed_Amine_Ferrag

https://scholar.google.fr/citations?user=IkPeqxMAAAAJ&hl=fr&oi=ao

https://www.scopus.com/authid/detail.uri?authorId=56115001200

https://publons.com/researcher/1322865/mohamed-amine-ferrag/

https://orcid.org/0000-0002-0632-3172

Last Updated: 27 Mar. 2023
CIC-DDoS2019 Dataset
data.mendeley.com
Updated Mar 3, 2023
Cite
Md Alamin Talukder (2023). CIC-DDoS2019 Dataset [Dataset]. http://doi.org/10.17632/ssnc74xm6r.1
Unique identifier
https://doi.org/10.17632/ssnc74xm6r.1
Dataset updated
Mar 3, 2023
Authors
Md Alamin Talukder
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Distributed Denial of Service (DDoS) attack is a menace to network security that aims at exhausting the target networks with malicious traffic. Although many statistical methods have been designed for DDoS attack detection, designing a real-time detector with low computational overhead is still one of the main concerns. On the other hand, the evaluation of new detection algorithms and techniques heavily relies on the existence of well-designed datasets. In this paper, first, we review the existing datasets comprehensively and propose a new taxonomy for DDoS attacks. Secondly, we generate a new dataset, namely CICDDoS2019, which remedies all current shortcomings. Thirdly, using the generated dataset, we propose a new detection and family classification approach based on a set of network flow features. Finally, we provide the most important feature sets to detect different types of DDoS attacks with their corresponding weights.

The dataset offers an extended set of Distributed Denial of Service attacks, most of which employ some form of amplification through reflection. The dataset shares its feature set with the other CIC NIDS datasets: IDS2017, IDS2018 and DoS2017.

Original paper link: https://ieeexplore.ieee.org/abstract/document/8888419
Kaggle dataset link: https://www.kaggle.com/datasets/dhoogla/cicddos2019
Network Intrusion Detection Dataset (UNR-IDD)
kaggle.com
Updated Feb 27, 2023
Cite
Mostafa Nofal (2023). Network Intrusion Detection Dataset (UNR-IDD) [Dataset]. https://www.kaggle.com/mostafanofal/network-intrusion-detection-dataset-unr-idd/discussion
Dataset updated
Feb 27, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mostafa Nofal
Description
Dataset Source

The dataset is from the University of Nevada, Reno.
Link: https://www.tapadhirdas.com/unr-idd-dataset

Multi-class Classification

The goal of multi-class classification is to differentiate the intrusions not only from normal working conditions but also from each other. Multi-class classification helps us to learn about the root causes of network intrusions. The labels for multi-class classification in UNR-IDD are illustrated in the accompanying table.

Label	Description
Normal	Network Functionality.
TCP-SYN	TCP-SYN Flood.
PortScan	Port Scanning.
Overflow	Flow Table Overflow.
Blackhole	Blackhole Attack.
Diversion	Traffic Diversion Attack.
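
The label table above can be turned into a lookup for both the multi-class and the binary setting. The following is a minimal sketch: the label strings are copied from the table, but whether the CSV files use exactly these spellings is an assumption, and the names `LABEL_TO_ID` and `to_binary` are hypothetical.

```python
# Labels and descriptions copied from the UNR-IDD table above.
UNR_IDD_LABELS = {
    "Normal": "Network Functionality",
    "TCP-SYN": "TCP-SYN Flood",
    "PortScan": "Port Scanning",
    "Overflow": "Flow Table Overflow",
    "Blackhole": "Blackhole Attack",
    "Diversion": "Traffic Diversion Attack",
}

# Stable integer ids for multi-class training, in table order.
LABEL_TO_ID = {label: i for i, label in enumerate(UNR_IDD_LABELS)}


def to_binary(label):
    """Collapse a multi-class label to 0 (normal) or 1 (any intrusion)."""
    if label not in UNR_IDD_LABELS:
        raise KeyError(f"unknown label: {label}")
    return 0 if label == "Normal" else 1


sample = ["Normal", "PortScan", "Blackhole", "Normal"]
multi_ids = [LABEL_TO_ID[s] for s in sample]
binary_ids = [to_binary(s) for s in sample]
print(multi_ids, binary_ids)
```

Keeping the two mappings side by side makes it easy to train a binary detector and a multi-class classifier on the same rows.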
UNSW-NB15 Dataset
paperswithcode.com
library.toponeai.link
Updated Jul 22, 2018
Cite
Nour Moustafa; Jill Slay (2018). UNSW-NB15 Dataset [Dataset]. https://paperswithcode.com/dataset/unsw-nb15
Dataset updated
Jul 22, 2018
Authors
Nour Moustafa; Jill Slay
Description
UNSW-NB15 is a network intrusion dataset. It contains nine different attack types, including DoS, Worms, Backdoors, and Fuzzers. The dataset contains raw network packets. The training set has 175,341 records and the testing set has 82,332 records, covering both attack and normal traffic.
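
The two record counts above imply the size of the predefined split. The following short calculation uses only the figures quoted in the description.

```python
# Record counts quoted in the description above.
train_records = 175_341
test_records = 82_332

total = train_records + test_records
train_share = train_records / total

print(total, f"{train_share:.1%}")
```

The predefined split therefore puts roughly two thirds of the records in the training set.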

Paper: UNSW-NB15: a comprehensive data set for network intrusion detection systems
Unified Multimodal Network Intrusion Detection Systems Dataset
ieee-dataport.org
Updated Oct 19, 2024
Cite
Syed Wali Rizvi (2024). Unified Multimodal Network Intrusion Detection Systems Dataset [Dataset]. https://ieee-dataport.org/documents/unified-multimodal-network-intrusion-detection-systems-dataset
Dataset updated
Oct 19, 2024
Authors
Syed Wali Rizvi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
and contextual features
Smart Grid Intrusion Detection Dataset
kaggle.com
Updated Oct 10, 2024
Cite
Hussain Afzaal 03 (2024). Smart Grid Intrusion Detection Dataset [Dataset]. https://www.kaggle.com/datasets/hussainsheikh03/smart-grid-intrusion-detection-dataset/suggestions
Dataset updated
Oct 10, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Hussain Afzaal 03
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Hussain Afzaal 03

Released under CC0: Public Domain

Contents
Kitsune Network Attack Dataset Dataset
paperswithcode.com
Updated Oct 16, 2023
Cite
Yisroel Mirsky; Tomer Doitshman; Yuval Elovici; Asaf Shabtai (2023). Kitsune Network Attack Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/kitsune-network-attack-dataset
Dataset updated
Oct 16, 2023
Authors
Yisroel Mirsky; Tomer Doitshman; Yuval Elovici; Asaf Shabtai
Description
Kitsune Network Attack Dataset

This is a collection of nine network attack datasets captured from either an IP-based commercial surveillance system or a network full of IoT devices. Each dataset contains millions of network packets and a different cyber attack within it.

For each attack, you are supplied with:

- A preprocessed dataset in csv format (ready for machine learning)
- The corresponding label vector in csv format
- The original network capture in pcap format (in case you want to engineer your own features)

We will now describe in detail what's in these datasets and how they were collected.

The Network Attacks

We have collected a wide variety of attacks which you would find in a real network intrusion. The following is a list of the cyber attack datasets available:

[Image: table of the available attack datasets] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F827271%2F79e305668553e521b0709a2413323c45%2Fkaggle_dataset_table.png?generation=1598461684070844&alt=media

For more details on the attacks themselves, please refer to our NDSS paper (citation below).

The Data Collection

The following figure presents the network topologies which we used to collect the data, and the corresponding attack vectors at which the attacks were performed. The network capture took place at point 1 and point X at the router (where a network intrusion detection system could feasibly be placed). For each dataset, clean network traffic was captured for the first 1 million packets, then the cyber attack was performed.

The Dataset Format

Each preprocessed dataset csv has m rows (packets) and 115 columns (features) with no header. The 115 features were extracted using our AfterImage feature extractor, described in our NDSS paper (see below) and available in Python here. In summary, the 115 features provide a statistical snapshot of the network (hosts and behaviors) in the context of the current packet traversing the network. The AfterImage feature extractor is unique in that it can efficiently process millions of streams (network channels) in real-time, incrementally, making it suitable for handling network traffic.
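
Because the preprocessed files have no header and the clean traffic comes first, a loader usually needs to do two things: check the column count, and split off the clean prefix for training. The following is a minimal sketch using only the standard library. The function names and the in-memory stand-in file are hypothetical, and a real run would use the first 1,000,000 packets as the clean prefix, as the description states.

```python
import csv
import io

N_FEATURES = 115  # column count stated in the description above


def read_features(handle, n_features=N_FEATURES):
    """Parse a headerless CSV into rows of floats, checking the column count."""
    rows = []
    for line_no, row in enumerate(csv.reader(handle), start=1):
        if len(row) != n_features:
            raise ValueError(
                f"row {line_no}: expected {n_features} columns, got {len(row)}"
            )
        rows.append([float(value) for value in row])
    return rows


def split_clean_prefix(rows, n_clean):
    """Split rows into (clean prefix, remainder) at packet index n_clean."""
    return rows[:n_clean], rows[n_clean:]


# Tiny stand-in for a real file: 5 packets with 115 made-up features each.
fake_csv = io.StringIO(
    "\n".join(",".join(["0.5"] * N_FEATURES) for _ in range(5))
)
rows = read_features(fake_csv)
clean, rest = split_clean_prefix(rows, n_clean=3)  # real data: 1_000_000
print(len(rows), len(rows[0]), len(clean), len(rest))
```

For files with millions of rows, a streaming or array-based reader would be more practical than a list of lists. This version only illustrates the format.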

Citation

If you use these datasets, please cite:

@inproceedings{mirsky2018kitsune,
  title={Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection},
  author={Mirsky, Yisroel and Doitshman, Tomer and Elovici, Yuval and Shabtai, Asaf},
  booktitle={The Network and Distributed System Security Symposium (NDSS) 2018},
  year={2018}
}
CLOUD ATTACK DATASET
ieee-dataport.org
Updated Nov 30, 2021
Cite
Laxmi Pranitha Rachamalla (2021). CLOUD ATTACK DATASET [Dataset]. https://ieee-dataport.org/documents/cloud-attack-dataset
Dataset updated
Nov 30, 2021
Authors
Laxmi Pranitha Rachamalla
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With the modern day technological advancements and the evolution of Industry 4.0
UNSW-NB15
huggingface.co
Updated Mar 19, 2023
Cite
Witold Wydmański (2023). UNSW-NB15 [Dataset]. https://huggingface.co/datasets/wwydmanski/UNSW-NB15
Dataset updated
Mar 19, 2023
Authors
Witold Wydmański
Description
Source

https://www.kaggle.com/datasets/dhoogla/unswnb15?resource=download

Dataset

This is an academic intrusion detection dataset. All the credit goes to the original authors: dr. Nour Moustafa and dr. Jill Slay. Please cite their original paper and all other appropriate articles listed on the UNSW-NB15 page. The full dataset also offers the pcap, BRO and Argus files along with additional documentation. The modifications to the predesignated train-test sets are minimal… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/UNSW-NB15.
Data from: Dataset for IDS testing
data.niaid.nih.gov
zenodo.org
Updated Jun 14, 2020
Cite
Lukaseder, Thomas (2020). Dataset for IDS testing [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3892998
Dataset updated
Jun 14, 2020
Dataset provided by
Wagner, Mathias
Lukaseder, Thomas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset constructed to trigger IDS rules based on the community data set of the Snort Intrusion Detection System


IEC 60870-5-104 Intrusion Detection Dataset

zenodo.org

bin, pdf

Updated Jul 16, 2024

Cite

Panagiotis; Panagiotis; Konstantinos; Thomas; Thomas; Vasileios; Vasileios; Panagiotis; Panagiotis; Konstantinos (2024). IEC 60870-5-104 Intrusion Detection Dataset [Dataset]. http://doi.org/10.21227/fj7s-f281

Available download formats: bin, pdf

Unique identifier

https://doi.org/10.21227/fj7s-f281

Dataset updated

Jul 16, 2024

Dataset provided by

Zenodo

Authors

Panagiotis; Panagiotis; Konstantinos; Thomas; Thomas; Vasileios; Vasileios; Panagiotis; Panagiotis; Konstantinos

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

IEC 60870-5-104

Intrusion Detection Dataset

Readme File

ITHACA – University of Western Macedonia - https://ithaca.ece.uowm.gr/

Authors: Panagiotis Radoglou-Grammatikis, Thomas Lagkas, Vasileios Argyriou, Panagiotis Sarigiannidis

Publication Date: September 23, 2022

1.Introduction

The evolution of the Industrial Internet of Things (IIoT) introduces several benefits, such as real-time monitoring, pervasive control and self-healing. However, despite the valuable services, security and privacy issues still remain given the presence of legacy and insecure communication protocols like IEC 60870-5-104. IEC 60870-5-104 is an industrial protocol widely applied in critical infrastructures, such as the smart electrical grid and industrial healthcare systems. The IEC 60870-5-104 Intrusion Detection Dataset was implemented in the context of the research paper entitled "Modeling, Detecting, and Mitigating Threats Against Industrial Healthcare Systems: A Combined Software Defined Networking and Reinforcement Learning Approach" [1], in the context of two H2020 projects: ELECTRON: rEsilient and seLf-healed EleCTRical pOwer Nanogrid (101021936) and SDN-microSENSE: SDN - microgrid reSilient Electrical eNergy SystEm (833955). This dataset includes labelled Transmission Control Protocol (TCP)/Internet Protocol (IP) network flow statistics (Comma-Separated Values (CSV) format) and IEC 60870-5-104 flow statistics (CSV format) related to twelve IEC 60870-5-104 cyberattacks. In particular, the cyberattacks are related to unauthorised commands and Denial of Service (DoS) activities against IEC 60870-5-104. Moreover, the relevant Packet Capture (PCAP) files are available. The dataset can be utilised for Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS), taking full advantage of Machine Learning (ML) and Deep Learning (DL).

2.Instructions

The IEC 60870-5-104 dataset was implemented following the methodology of A. Gharib et al. in [2], including eleven features: (a) Complete Network Configuration, (b) Complete Traffic, (c) Labelled Dataset, (d) Complete Interaction, (e) Complete Capture, (f) Available Protocols, (g) Attack Diversity, (h) Heterogeneity, (i) Feature Set and (j) Metadata.

A network topology consisting of (a) seven industrial entities, (b) one Human Machine Interface (HMI) and (c) three cyberattackers was used to construct the IEC 60870-5-104 Intrusion Detection Dataset. The industrial entities use IEC TestServer[1], while the HMI uses Qtester104[2]. On the other hand, the cyberattackers use Kali Linux[3] equipped with Metasploit[4], OpenMUC j60870[5] and Ettercap[6]. The cyberattacks were performed on the following days.

On Saturday, April 25, 2020, a DoS cyberattack (M_SP_NA_1_DoS) was executed for 2 hours, using the M_SP_NA_1 command.
On Sunday, April 26, 2020, two cyberattacks were executed, namely (a) DoS (C_CI_NA_1_DoS) and (b) unauthorised injection (C_CI_NA_1), using the C_CI_NA_1 command for 2 hours.
On Monday, April 27, 2020, one unauthorised injection attack (C_SE_NA_1) was executed for 4 hours, using the C_SE_NA_1 command.
On Tuesday, April 28, 2020, two cyberattacks were executed, namely (a) unauthorised injection (C_SC_NA_1) and (b) DoS (C_SE_NA_1_DoS), using the C_SC_NA_1 and C_SE_NA_1 commands for 2 hours and 4 hours, respectively.
On Wednesday, April 29, 2020, one DoS (C_SC_NA_1_DoS) cyberattack was performed for 2 hours, using the C_SC_NA_1 command.
On Friday, June 05, 2020, two cyberattacks were executed, namely (a) DoS (C_RD_NA_1_DoS) and (b) unauthorised injection (C_RD_NA_1), using the C_RD_NA_1 command for 2 and 4 hours, respectively.
On Saturday, June 06, 2020, two cyberattacks were executed, namely (a) DoS (C_RP_NA_1_DoS) and (b) unauthorised injection (C_RP_NA_1), using the C_RP_NA_1 command for 2 and 4 hours, respectively.
On Monday, June 08, 2020, a Man In The Middle (MITM) cyberattack was executed for 2 hours, filtering and dropping the IEC 60870-5-104 packets.

For each attack, a 7zip file is provided, including the network traffic and the network flow statistics for each entity. Moreover, a relevant diagram is provided, illustrating the corresponding cyberattack. In particular, for each entity, a folder is given, including (a) the relevant pcap file, (b) Transmission Control Protocol (TCP) / Internet Protocol (IP) network flow statistics in a Comma-Separated Values (CSV) format and (c) IEC 60870-5-104 flow statistics in a CSV format. The TCP/IP network flow statistics were generated by CICFlowMeter[7], while the IEC 60870-5-104 flow statistics were generated based on a Custom IEC 60870-5-104 Python Parser[8], taking full advantage of Scapy[9].

3.Dataset Structure

The dataset consists of the following files:

20200425_UOWM_IEC104_Dataset_m_sp_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the M_SP_NA_1_DoS attack.
20200426_UOWM_IEC104_Dataset_c_ci_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_CI_NA_1_DoS attack.
20200426_UOWM_IEC104_Dataset_c_ci_na_1.7z: A 7zip file including the pcap and CSV files related to the C_CI_NA_1 attack.
20200427_UOWM_IEC104_Dataset_c_se_na_1.7z: A 7zip file including the pcap and CSV files related to the C_SE_NA_1 attack.
20200428_UOWM_IEC104_Dataset_c_sc_na_1.7z: A 7zip file including the pcap and CSV files related to the C_SC_NA_1 attack.
20200428_UOWM_IEC104_Dataset_c_se_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_SE_NA_1_DoS attack.
20200429_UOWM_IEC104_Dataset_c_sc_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_SC_NA_1_DoS attack.
20200605_UOWM_IEC104_Dataset_c_rd_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_RD_NA_1_DoS attack.
20200605_UOWM_IEC104_Dataset_c_rd_na_1.7z: A 7zip file including the pcap and CSV files related to the C_RD_NA_1 attack.
20200606_UOWM_IEC104_Dataset_c_rp_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_RP_NA_1_DoS attack.
20200606_UOWM_IEC104_Dataset_c_rp_na_1.7z: A 7zip file including the pcap and CSV files related to the C_RP_NA_1 attack.
20200608_UOWM_IEC104_Dataset_mitm_drop.7z: A 7zip file including the pcap and CSV files related to the MITM attack.
Balanced_IEC104_Train_Test_CSV_Files.zip: This zip file includes balanced CSV files from CICFlowMeter and the Custom IEC 60870-5-104 Python Parser that could be utilised for training ML and DL methods. The zip file includes different folders for the corresponding flow timeout values used for CICFlowMeter and IEC 60870-5-104 Python Parser, respectively.
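
The attack archive names above encode both the capture date and the attack, so both can be recovered from the file name. The following is a minimal sketch: the regular expression is inferred from the names listed in this section, and the function name `parse_archive_name` and the upper-casing convention are assumptions made for illustration.

```python
import re
from datetime import date

# Pattern inferred from the archive names listed above.
ARCHIVE_RE = re.compile(
    r"^(?P<ymd>\d{8})_UOWM_IEC104_Dataset_(?P<attack>.+)\.7z$"
)


def parse_archive_name(filename):
    """Return (capture date, attack label) for an attack archive name."""
    match = ARCHIVE_RE.match(filename)
    if match is None:
        raise ValueError(f"not an attack archive: {filename}")
    ymd = match.group("ymd")
    day = date(int(ymd[:4]), int(ymd[4:6]), int(ymd[6:]))
    # Upper-case the command part; keep a trailing "_DoS" suffix as written.
    attack = match.group("attack")
    if attack.endswith("_DoS"):
        label = attack[:-4].upper() + "_DoS"
    else:
        label = attack.upper()
    return day, label


first = parse_archive_name("20200425_UOWM_IEC104_Dataset_m_sp_na_1_DoS.7z")
last = parse_archive_name("20200608_UOWM_IEC104_Dataset_mitm_drop.7z")
print(first, last)
```

The balanced train/test zip file does not follow this pattern, so the function rejects it with a `ValueError` instead of guessing a date.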

Each 7zip file includes respective folders related to the entities/devices (described in the following section) participating in each attack. In particular, for each entity/device, there is a folder including (a) the overall network traffic (pcap file) related to this entity/device during each attack, (b) the TCP/IP network flow statistics (CSV file) from CICFlowMeter for the overall network traffic, (c) the IEC 60870-5-104 network traffic (pcap file) related to this entity/device during each attack, (d) the TCP/IP network flow statistics (CSV file) from CICFlowMeter for the IEC 60870-5-104 network traffic, (e) the IEC 60870-5-104 flow statistics (CSV file) from the Custom IEC 60870-5-104 Python Parser for the IEC 60870-5-104 network traffic and finally, (f) an image showing how the attack was executed. Finally, it is noteworthy that the network flows from both CICFlowMeter and the Custom IEC 60870-5-104 Python Parser in each CSV file are labelled based on the IEC 60870-5-104 cyberattacks executed for the generation of this dataset. The description of these attacks is given in the following section, while the various features from CICFlowMeter and the Custom IEC 60870-5-104 Python Parser are presented in Section 5.

4.Testbed & IEC 60870-5-104 Attacks

The testbed created for generating this dataset is composed of five virtual RTU devices emulated by IEC TestServer and two real RTU devices. Moreover, there is another workstation which plays the role of Master Terminal Unit (MTU) and HMI, sending legitimate IEC 60870-5-104 commands to the corresponding RTUs. For this purpose, the workstation uses QTester104. In addition, there are three attackers that act as malicious insiders executing the following cyberattacks against the aforementioned RTUs. Finally, the network traffic data of each entity/device was captured through tshark.

Table 1: IEC 60870-5-104 Cyberattacks Description

IEC 60870-5-104 Cyberattack Description	Description	Dataset Files
MITM Drop	During this attack, the cyberattacker is placed between two endpoints, thus monitoring and dropping the network traffic

Intrusion Detection System
kaggle.com
Updated Dec 7, 2022
Cite
Sabarishwaran G (2022). Intrusion Detection System [Dataset]. https://www.kaggle.com/datasets/sabarish2611/intrusion-detection-system
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 7, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sabarishwaran G
Description
Dataset

This dataset was created by Sabarishwaran G

Contents
MQTT-IoT-IDS2020 Dataset
paperswithcode.com
opendatalab.com
Cite
MQTT-IoT-IDS2020 Dataset [Dataset]. https://paperswithcode.com/dataset/mqtt-iot-ids2020
Description
The Message Queuing Telemetry Transport (MQTT) protocol is one of the most widely used standards in Internet of Things (IoT) machine-to-machine communication. The increase in the number of available IoT devices and protocols in use reinforces the need for new and robust Intrusion Detection Systems (IDS). However, building an IoT IDS requires datasets to process, train and evaluate these models.

MQTT-IoT-IDS2020 is the first dataset to simulate an MQTT-based network. The dataset is generated using a simulated MQTT network architecture. The network comprises twelve sensors, a broker, a simulated camera, and an attacker. Five scenarios are recorded: (1) normal operation, (2) aggressive scan, (3) UDP scan, (4) Sparta SSH brute-force, and (5) MQTT brute-force attack. The raw pcap files are saved, then features are extracted. Three abstraction levels of features are extracted from the raw pcap files: (a) packet features, (b) unidirectional flow features and (c) bidirectional flow features. The CSV feature files in the dataset are suited for Machine Learning (ML) usage. The raw pcap files are also suitable for deeper analysis of MQTT IoT network communication and the associated attacks.
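
To make the difference between the unidirectional and bidirectional flow levels concrete, here is a minimal Python sketch that aggregates packet records into flows under each definition. It is an illustration under stated assumptions, not the dataset authors' extraction code: the packet dictionaries, their field names (src, sport, dst, dport, proto, length) and the helper functions are all hypothetical.

```python
from collections import defaultdict


def uni_key(pkt):
    """Unidirectional flow key: direction matters."""
    return (pkt["src"], pkt["sport"], pkt["dst"], pkt["dport"], pkt["proto"])


def bi_key(pkt):
    """Bidirectional flow key: both directions map to the same flow."""
    first, second = sorted([(pkt["src"], pkt["sport"]),
                            (pkt["dst"], pkt["dport"])])
    return (first, second, pkt["proto"])


def aggregate(packets, key_fn):
    """Sum packet and byte counts per flow, using key_fn to define a flow."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        flow = flows[key_fn(pkt)]
        flow["packets"] += 1
        flow["bytes"] += pkt["length"]
    return dict(flows)


# Hypothetical packet-level records: a client talking to a broker on port 1883.
packets = [
    {"src": "10.0.0.5", "sport": 50000, "dst": "10.0.0.1", "dport": 1883,
     "proto": "tcp", "length": 100},
    {"src": "10.0.0.1", "sport": 1883, "dst": "10.0.0.5", "dport": 50000,
     "proto": "tcp", "length": 60},
    {"src": "10.0.0.5", "sport": 50000, "dst": "10.0.0.1", "dport": 1883,
     "proto": "tcp", "length": 40},
]

uni_flows = aggregate(packets, uni_key)  # one flow per direction
bi_flows = aggregate(packets, bi_key)    # one flow for the whole conversation
```

In this toy example the three packets form two unidirectional flows (client to broker, broker to client) but a single bidirectional flow, which is why the two feature levels can yield different row counts for the same capture.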
network_intrusion_detection_dataset_Monte_Carlo
kaggle.com
Updated Sep 7, 2024
Cite
md Shamshuzzoha (2024). network_intrusion_detection_dataset_Monte_Carlo [Dataset]. https://www.kaggle.com/datasets/mdshamshuzzoha/network-intrusion-detection-dataset-monte-carlo/code
Dataset updated
Sep 7, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
md Shamshuzzoha
Description
Dataset

This dataset was created by md Shamshuzzoha

Contents
TII-SSRC-23 Dataset
paperswithcode.com
Cite
Dania Herzalla; Willian T. Lunardi; Martin Andreoni Lopez, TII-SSRC-23 Dataset [Dataset]. https://paperswithcode.com/dataset/tii-ssrc-23
Authors
Dania Herzalla; Willian T. Lunardi; Martin Andreoni Lopez
Description
The TII-SSRC-23 dataset offers a comprehensive collection of network traffic patterns, meticulously compiled to support the development and research of Intrusion Detection Systems (IDS). It presents a dual structure: one part provides a tabular representation of extracted features in CSV format, while the other offers raw network traffic data for each type of traffic in PCAP files. This rich dataset captures both benign and malicious network scenarios, serving as an invaluable resource for researchers in the machine learning field.

URL: https://www.kaggle.com/datasets/daniaherzalla/tii-ssrc-23
DARPA Dataset
paperswithcode.com
Cite
Richard Lippmann; Robert K. Cunningham; David J. Fried; Isaac Graf; Kris R. Kendall; Seth E. Webster; Marc A. Zissman, DARPA Dataset [Dataset]. https://paperswithcode.com/dataset/darpa-1
Authors
Richard Lippmann; Robert K. Cunningham; David J. Fried; Isaac Graf; Kris R. Kendall; Seth E. Webster; Marc A. Zissman
Description
DARPA is a dataset consisting of communications between source IPs and destination IPs. It contains different attacks between IPs.
Large-Scale Attacks in IoT Environment
kaggle.com
zip
Updated May 7, 2025
Cite
Nikita Manaenkov (2025). Large-Scale Attacks in IoT Environment [Dataset]. https://www.kaggle.com/datasets/nikitamanaenkov/large-scale-attacks-in-iot-environment
Available download formats: zip (1474647877 bytes)
Dataset updated
May 7, 2025
Authors
Nikita Manaenkov
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The CICIoT2023 dataset is a large-scale, realistic intrusion detection dataset designed to support security analytics and machine learning research in the Internet of Things (IoT) domain. Created by the Canadian Institute for Cybersecurity (CIC), the dataset captures 33 different types of attacks (including DDoS, DoS, Recon, Web-based, Brute Force, Spoofing, and Mirai) executed by malicious IoT devices against other IoT targets.

The testbed consists of 105 real IoT devices of different types and manufacturers, including smart home devices and industrial equipment, configured in a complex network topology to emulate real-world conditions. The dataset includes benign and malicious traffic in various formats and supports feature extraction for both traditional ML and deep learning models.

This dataset aims to address the lack of diversity and scale in previous IoT security datasets, offering a robust benchmark for evaluating intrusion detection systems (IDS) and enabling research in IoT cybersecurity, anomaly detection, and network forensics.

Source: https://www.mdpi.com/1424-8220/23/13/5941
