13 datasets found

UNSW-IoT
huggingface.co
Updated Aug 16, 2023
Cite
Mireu Lab (2023). UNSW-IoT [Dataset]. https://huggingface.co/datasets/Mireu-Lab/UNSW-IoT
Dataset updated
Aug 16, 2023
Authors
Mireu Lab
Description
Dataset Card for Dataset Name

Dataset Summary

This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure

Data Instances

[More Information Needed]

Data Fields

[More Information Needed]

Data Splits

[More Information Needed]

Dataset Creation… See the full description on the dataset page: https://huggingface.co/datasets/Mireu-Lab/UNSW-IoT.
TON_IoT_network
huggingface.co
Updated Feb 22, 2025
Cite
Cody Lewis (2025). TON_IoT_network [Dataset]. https://huggingface.co/datasets/codymlewis/TON_IoT_network
Dataset updated
Feb 22, 2025
Authors
Cody Lewis
Description
TON IoT Network

The TON IoT train test network dataset provided by https://research.unsw.edu.au/projects/toniot-datasets

Dataset Details

The datasets have been called 'ToN_IoT' as they include heterogeneous data sources collected from Telemetry datasets of IoT and IIoT sensors, Operating systems datasets of Windows 7 and 10 as well as Ubuntu 14 and 18 TLS and Network traffic datasets. The datasets were collected from a realistic and large-scale network designed at the… See the full description on the dataset page: https://huggingface.co/datasets/codymlewis/TON_IoT_network.
Bot_IoT
kaggle.com
Updated Mar 14, 2023
Cite
Vignesh Venkateswaran (2023). Bot_IoT [Dataset]. https://www.kaggle.com/datasets/vigneshvenkateswaran/bot-iot/discussion
Dataset updated
Mar 14, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vignesh Venkateswaran
Description
INFO ABOUT THE BOT-IOT DATASET, NOTE: only the csv files stated in the description are used

The BoT-IoT dataset can be downloaded from HERE. You can also use our new datasets: the TON_IoT and UNSW-NB15.

--------------------------------------------------------------------------

The BoT-IoT dataset was created by designing a realistic network environment in the Cyber Range Lab of UNSW Canberra. The network environment incorporated a combination of normal and botnet traffic. The dataset’s source files are provided in different formats, including the original pcap files, the generated argus files and csv files. The files were separated based on attack category and subcategory to better assist in the labeling process.

The captured pcap files are 69.3 GB in size, with more than 72,000,000 records. The extracted flow traffic, in csv format, is 16.7 GB in size. The dataset includes DDoS, DoS, OS and Service Scan, Keylogging and Data exfiltration attacks, with the DDoS and DoS attacks further organized based on the protocol used.

To ease the handling of the dataset, we extracted 5% of the original dataset using select MySQL queries. The extracted 5% comprises 4 files of approximately 1.07 GB total size and about 3 million records.

--------------------------------------------------------------------------

Free use of the Bot-IoT dataset for academic research purposes is hereby granted in perpetuity. Use for commercial purposes should be agreed with the authors. The authors have asserted their rights under the Copyright. Anyone who intends to use the Bot-IoT dataset must cite the following papers, which describe the dataset in detail:

Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Benjamin Turnbull. "Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset." Future Generation Computer Systems 100 (2019): 779-796. Public Access Here.

Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Jill Slay. "Towards developing network forensic mechanism for botnet activities in the iot based on machine learning techniques." In International Conference on Mobile Networks and Management, pp. 30-44. Springer, Cham, 2017.

Koroniotis, Nickolaos, Nour Moustafa, and Elena Sitnikova. "A new network forensic framework based on deep learning for Internet of Things networks: A particle deep framework." Future Generation Computer Systems 110 (2020): 91-106.

Koroniotis, Nickolaos, and Nour Moustafa. "Enhancing network forensics with particle swarm and deep learning: The particle deep framework." arXiv preprint arXiv:2005.00722 (2020).

Koroniotis, Nickolaos, Nour Moustafa, Francesco Schiliro, Praveen Gauravaram, and Helge Janicke. "A Holistic Review of Cybersecurity and Reliability Perspectives in Smart Airports." IEEE Access (2020).

Koroniotis, Nickolaos. "Designing an effective network forensic framework for the investigation of botnets in the Internet of Things." PhD diss., The University of New South Wales Australia, 2020.

--------------------------------------------------------------------------
Selected features for the UNSWB15 dataset.
plos.figshare.com
xls
Updated Aug 1, 2024
Cite
Mohammed Tawfik (2024). Selected features for the UNSWB15 dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0304082.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0304082.t002
Dataset updated
Aug 1, 2024
Dataset provided by
PLOS ONE
Authors
Mohammed Tawfik
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The proliferation of Internet of Things (IoT) devices and fog computing architectures has introduced major security and cyber threats. Intrusion detection systems have become effective in monitoring network traffic and activities to identify anomalies that are indicative of attacks. However, constraints such as limited computing resources at fog nodes render conventional intrusion detection techniques impractical. This paper proposes a novel framework that integrates stacked autoencoders, CatBoost, and an optimised transformer-CNN-LSTM ensemble tailored for intrusion detection in fog and IoT networks. Autoencoders extract robust features from high-dimensional traffic data while reducing the dimensionality for efficiency at fog nodes. CatBoost refines features through predictive selection. The ensemble model combines self-attention, convolutions, and recurrence for comprehensive traffic analysis in the cloud. Evaluations on the NSL-KDD, UNSW-NB15, and AWID benchmarks demonstrate an accuracy of over 99% in detecting threats across traditional, hybrid enterprise, and wireless environments. Integrated edge preprocessing and cloud-based ensemble learning pipelines enable efficient and accurate anomaly detection. The results highlight the viability of securing real-world fog and IoT infrastructure against continuously evolving cyber-attacks.
The Bot-IoT dataset
ieee-dataport.org
Updated Oct 27, 2022
Cite
Nour Moustafa (2022). The Bot-IoT dataset [Dataset]. https://ieee-dataport.org/documents/bot-iot-dataset
Dataset updated
Oct 27, 2022
Authors
Nour Moustafa
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
in most cases
IoT Benign and Attack Traces Dataset
paperswithcode.com
Updated Jul 26, 2023
Cite
(2023). IoT Benign and Attack Traces Dataset [Dataset]. https://paperswithcode.com/dataset/iot-benign-and-attack-traces
Dataset updated
Jul 26, 2023
Description
IOT BENIGN AND ATTACK TRACES

Data Collected for ACM SOSR 2019

Attack & Benign Data

Instructions

Flow data contains the flow counters of MUD flows; each instance in the file is collected every minute. Annotations contain the start and end time of each attack and the corresponding MUD flows that are impacted by the attack. More information about the device and the attacker can be found here.

Below is an example of the annotations from the Samsung smart camera, e.g.: "1527838552,1527839153,Localfeatures|Arpfeatures,ArpSpoof100L2D"

The above line indicates that the attack started at 1527838552 and ended at 1527839153. "Localfeatures|Arpfeatures" indicates that the attack should impact local communication and the ARP protocol. "ArpSpoof100L2D" means that the attack was an ARP spoof launched at a maximum rate of 100 packets per second.

To identify the attack rows in the flow stats, you can use the condition below:

"if (flowtime >= startTime*1000 and endTime*1000>=flowtime) then attack = true"

This corresponds to lines 4470 to 4479 for the Samsung smart camera.
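The labeling condition quoted above can be sketched in Python as follows. This is a minimal illustration, not the dataset authors' code: the `Annotation` class, the function names, and the sample `flow_times_ms` values are assumptions made for the example. Only the annotation string and the comparison itself come from the description.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    start_s: int         # attack start time, Unix seconds
    end_s: int           # attack end time, Unix seconds
    impacted: List[str]  # impacted MUD flow groups, split on "|"
    attack: str          # attack identifier, e.g. "ArpSpoof100L2D"


def parse_annotation(line: str) -> Annotation:
    """Parse one annotation line of the form start,end,groups,attack."""
    start, end, impacted, attack = line.strip().strip('"').split(",")
    return Annotation(int(start), int(end), impacted.split("|"), attack)


def is_attack(flowtime_ms: int, ann: Annotation) -> bool:
    """Apply the quoted condition: flow time is in ms, annotation times in s."""
    return ann.start_s * 1000 <= flowtime_ms <= ann.end_s * 1000


ann = parse_annotation('"1527838552,1527839153,Localfeatures|Arpfeatures,ArpSpoof100L2D"')

# Hypothetical flow timestamps in milliseconds (not taken from the dataset).
flow_times_ms = [1527838500000, 1527838552000, 1527839000000,
                 1527839153000, 1527839200000]
labels = [is_attack(t, ann) for t in flow_times_ms]
print(labels)
```

Both bounds are inclusive, matching the `>=` comparisons in the quoted condition.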

Cite our data A. Hamza, H. Habibi Gharakheili, T. Benson, V. Sivaraman, "Detecting Volumetric Attacks on IoT Devices via SDN-Based Monitoring of MUD Activity", ACM SOSR, San Jose, California, USA, Apr 2019.

Source code https://github.com/ayyoob/mud-ie

Contact ayyoobhamza@student.unsw.edu.au
Social Media 3.0 Dataset for Integrating Social Media and Internet of Things...
researchdata.edu.au
Updated 2021
Cite
Moustafa Nour; Turnbull Benjamin; Salim Sara; University of New South Wales; University of New South Wales; Sara Salim; Nour Moustafa; Benjamin Turnbull (2021). Social Media 3.0 Dataset for Integrating Social Media and Internet of Things (SM-IoT) Systems [Dataset]. http://doi.org/10.26190/J4G2-PB81
Unique identifier
https://doi.org/10.26190/J4G2-PB81
Dataset updated
2021
Dataset provided by
University of New South Wales
UNSW, Sydney
Authors
Moustafa Nour; Turnbull Benjamin; Salim Sara; University of New South Wales; University of New South Wales; Sara Salim; Nour Moustafa; Benjamin Turnbull
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Social Media 3.0 dataset is a new integration of Social Media (SM) and Internet of Things (IoT) datasets for evaluating the fidelity and efficiency of different privacy preservation models based on Artificial Intelligence (AI) and Machine/Deep Learning algorithms. The directories of the datasets can be found in cloudstor, https://cloudstor.aarnet.edu.au/plus/apps/files/?dir=/&fileid=4570611720

Network traffic datasets created by Single Flow Time Series Analysis

zenodo.org
explore.openaire.eu

csv, pdf

Updated Jul 11, 2024

Cite

Josef Koumar; Karel Hynek; Tomáš Čejka (2024). Network traffic datasets created by Single Flow Time Series Analysis [Dataset]. http://doi.org/10.5281/zenodo.8035724

Available download formats: csv, pdf

Unique identifier

https://doi.org/10.5281/zenodo.8035724

Dataset updated

Jul 11, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Josef Koumar; Karel Hynek; Tomáš Čejka

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Network traffic datasets created by Single Flow Time Series Analysis

Datasets were created for the paper: Network Traffic Classification based on Single Flow Time Series Analysis -- Josef Koumar, Karel Hynek, Tomáš Čejka -- which was published at The 19th International Conference on Network and Service Management (CNSM) 2023. Please cite usage of our datasets as:

J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.

This Zenodo repository contains 23 datasets created from 15 well-known published datasets which are cited in the table below. Each dataset contains 69 features created by Time Series Analysis of Single Flow Time Series. The detailed description of features from datasets is in the file: feature_description.pdf

The following table describes each dataset file:

File name	Detection problem	Citation of original raw dataset
botnet_binary.csv	Binary detection of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
botnet_multiclass.csv	Multi-class classification of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
cryptomining_design.csv	Binary detection of cryptomining; the design part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
cryptomining_evaluation.csv	Binary detection of cryptomining; the evaluation part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
dns_malware.csv	Binary detection of malware DNS	Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
doh_cic.csv	Binary detection of DoH	Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
doh_real_world.csv	Binary detection of DoH	Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
dos.csv	Binary detection of DoS	Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
edge_iiot_binary.csv	Binary detection of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
edge_iiot_multiclass.csv	Multi-class classification of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
https_brute_force.csv	Binary detection of HTTPS Brute Force	Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
ids_cic_binary.csv	Binary detection of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
ids_cic_multiclass.csv	Multi-class classification of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
ids_unsw_nb_15_binary.csv	Binary detection of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
ids_unsw_nb_15_multiclass.csv	Multi-class classification of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
iot_23.csv	Binary detection of IoT malware	Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
ton_iot_binary.csv	Binary detection of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
ton_iot_multiclass.csv	Multi-class classification of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
tor_binary.csv	Binary detection of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
tor_multiclass.csv	Multi-class classification of TOR	Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
vpn_iscx_binary.csv	Binary detection of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_iscx_multiclass.csv	Multi-class classification of VPN	Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_vnat_binary.csv	Binary detection of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
vpn_vnat_multiclass.csv	Multi-class classification of VPN	Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022

Network traffic datasets with novel extended IP flow called NetTiSA flow
data.niaid.nih.gov
Updated Apr 18, 2024
Cite
Karel Hynek (2024). Network traffic datasets with novel extended IP flow called NetTiSA flow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8301042
Dataset updated
Apr 18, 2024
Dataset provided by
Josef Koumar
Karel Hynek
Jaroslav Pešek
Tomáš Čejka
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Network traffic datasets with novel extended IP flow called NetTiSA flow

Datasets were created for the paper: NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification -- Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka -- which is published in The International Journal of Computer and Telecommunications Networking https://doi.org/10.1016/j.comnet.2023.110147

Please cite the usage of our datasets as:

Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka, "NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification", Computer Networks, Volume 240, 2024, 110147, ISSN 1389-1286

@article{KOUMAR2024110147,
  title = {NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification},
  journal = {Computer Networks},
  volume = {240},
  pages = {110147},
  year = {2024},
  issn = {1389-1286},
  doi = {https://doi.org/10.1016/j.comnet.2023.110147},
  url = {https://www.sciencedirect.com/science/article/pii/S1389128623005923},
  author = {Josef Koumar and Karel Hynek and Jaroslav Pešek and Tomáš Čejka}
}

This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains the NetTiSA flow feature vector.

NetTiSA flow feature vector

The novel extended IP flow called NetTiSA (Network Time Series Analysed) flow contains a universal bandwidth-constrained feature vector consisting of 20 features. We divide the NetTiSA flow classification features into three groups by computation. The first group of features is based on classical bidirectional flow information: the number of transferred bytes and packets. The second group contains statistical and time-based features calculated using time-series analysis of the packet sequences. The third group of features can be computed from the previous groups (i.e., on the flow collector) and improves the classification performance without any impact on the telemetry bandwidth.

Flow features

The flow features are:

Packets is the number of packets in the direction from the source to the destination IP address.

Packets in reverse order is the number of packets in the direction from the destination to the source IP address.

Bytes is the size of the payload in bytes transferred in the direction from the source to the destination IP address.

Bytes in reverse order is the size of the payload in bytes transferred in the direction from the destination to the source IP address.

Statistical and Time-based features

These features are exported in the extended part of the flow. All of them can be computed (exactly or approximately) by stream-wise computation, which is necessary for keeping memory requirements low. The second feature set contains the following features:

Mean represents mean of the payload lengths of packets

Min is the minimal value from payload lengths of all packets in a flow

Max is the maximum value from payload lengths of all packets in a flow

Standard deviation is a measure of the variation of payload lengths from the mean payload length

Root mean square is the measure of the magnitude of payload lengths of packets

Average dispersion is the average absolute difference between each payload length of the packet and the mean value

Kurtosis is the measure describing the extent to which the tails of a distribution differ from the tails of a normal distribution

Mean of relative times is the mean of the relative times, which is a sequence defined as \(st = \{t_1 - t_1, t_2 - t_1, \dots, t_n - t_1\}\)

Mean of time differences is the mean of the time differences, which is a sequence defined as \(dt = \{ t_j - t_i \mid j = i + 1,\ i \in \{1, 2, \dots, n - 1\} \}\)

Min from time differences is the minimal value from all time differences, i.e., min space between packets.

Max from time differences is the maximum value from all time differences, i.e., max space between packets.

Time distribution describes the deviation of time differences between individual packets within the time series. The feature is computed by the following equation: \(tdist = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} \left| \mu_{dt_{n-1}} - dt_i \right| }{ \frac{1}{2} \left( \max\left(dt_{n-1}\right) - \min\left(dt_{n-1}\right) \right) }\)

Switching ratio represents a value change ratio (switching) between payload lengths. The switching ratio is computed by the equation: \(sr = \frac{s_n}{\frac{1}{2} (n - 1)}\)

where \(s_n\) is the number of switches.
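The definitions above can be illustrated with a small batch computation in Python. This is only a sketch under stated assumptions, not the ipfixprobe implementation. It processes whole lists rather than a stream, uses the population variance, takes kurtosis as the fourth standardized moment, and counts a switch whenever two consecutive payload lengths differ. The function name and dictionary keys are hypothetical.

```python
import math


def nettisa_stats(payload_lengths, timestamps):
    """Batch sketch of the statistical and time-based features (needs n >= 2)."""
    n = len(payload_lengths)
    mean = sum(payload_lengths) / n
    # Assumption: population variance (divide by n).
    var = sum((x - mean) ** 2 for x in payload_lengths) / n
    std = math.sqrt(var)
    rms = math.sqrt(sum(x * x for x in payload_lengths) / n)
    avg_disp = sum(abs(x - mean) for x in payload_lengths) / n
    # Assumption: kurtosis as the (non-excess) fourth standardized moment.
    m4 = sum((x - mean) ** 4 for x in payload_lengths) / n
    kurtosis = m4 / var ** 2 if var > 0 else 0.0

    st = [t - timestamps[0] for t in timestamps]              # relative times
    dt = [b - a for a, b in zip(timestamps, timestamps[1:])]  # time differences
    mean_dt = sum(dt) / len(dt)
    spread = max(dt) - min(dt)
    tdist = (sum(abs(mean_dt - d) for d in dt) / len(dt)) / (0.5 * spread) if spread > 0 else 0.0

    # Assumption: a "switch" is a change between consecutive payload lengths.
    switches = sum(1 for a, b in zip(payload_lengths, payload_lengths[1:]) if a != b)
    sr = switches / (0.5 * (n - 1))

    return {
        "mean": mean, "min": min(payload_lengths), "max": max(payload_lengths),
        "std": std, "rms": rms, "average_dispersion": avg_disp,
        "kurtosis": kurtosis, "mean_relative_times": sum(st) / n,
        "mean_time_differences": mean_dt, "min_dt": min(dt), "max_dt": max(dt),
        "time_distribution": tdist, "switching_ratio": sr,
    }


# Hypothetical four-packet flow: payload lengths in bytes, timestamps in seconds.
stats = nettisa_stats([100, 100, 200, 100], [0.0, 1.0, 3.0, 6.0])
print(stats["mean"], stats["time_distribution"], stats["switching_ratio"])
```

A probe would maintain running sums instead of lists to keep memory low, which is the stream-wise computation the description refers to.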

Features computed at the collector

The third set contains features that are computed from the previous two groups prior to classification. Therefore, they do not influence the network telemetry size, and their computation does not put additional load on resource-constrained flow monitoring probes. The NetTiSA flow combined with this feature set is called the Enhanced NetTiSA flow and contains the following features:

Max minus min is the difference between minimum and maximum payload lengths

Percent deviation is the dispersion of the average absolute difference to the mean value

Variance is the spread measure of the data from its mean

Burstiness is the degree of peakedness in the central part of the distribution

Coefficient of variation is a dimensionless quantity that compares the dispersion of a time series to its mean value and is often used to compare the variability of different time series that have different units of measurement

Directions describe a percentage ratio of packet direction computed as \(\frac{d_1}{d_1 + d_0}\), where \(d_1\) is the number of packets in the direction from the source to the destination IP address and \(d_0\) is the number in the opposite direction. Both \(d_1\) and \(d_0\) are part of the classical bidirectional flow.

Duration is the duration of the flow
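As a rough illustration of how a collector might derive these features from an exported flow record, consider the Python sketch below. The record layout and field names are hypothetical. The formulas for percent deviation (average dispersion divided by the mean), coefficient of variation (standard deviation divided by the mean), and duration (last minus first timestamp) are assumptions read off the one-line descriptions. Burstiness is omitted because the description gives no formula for it.

```python
import math


def enhanced_features(flow):
    """Derive collector-side features from a flow record (hypothetical field names)."""
    mean = flow["mean"]
    d1 = flow["packets"]      # packets from source to destination
    d0 = flow["packets_rev"]  # packets from destination to source
    return {
        "max_minus_min": flow["max"] - flow["min"],
        # Assumption: average dispersion relative to the mean.
        "percent_deviation": flow["average_dispersion"] / mean if mean else 0.0,
        "variance": flow["std"] ** 2,
        # Assumption: standard deviation relative to the mean.
        "coefficient_of_variation": flow["std"] / mean if mean else 0.0,
        "directions": d1 / (d1 + d0) if (d1 + d0) else 0.0,
        # Assumption: duration taken from first and last packet timestamps.
        "duration": flow["time_last"] - flow["time_first"],
    }


# Hypothetical flow record; the values are illustrative only.
flow = {
    "packets": 3, "packets_rev": 1,
    "min": 100, "max": 200, "mean": 125.0,
    "std": math.sqrt(1875), "average_dispersion": 37.5,
    "time_first": 0.0, "time_last": 6.0,
}
result = enhanced_features(flow)
print(result)
```

Because every input is already present in the exported flow, none of these values needs to be sent over the telemetry channel.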

The NetTiSA flow is implemented in the IP flow exporter ipfixprobe.

Description of dataset files

The following table describes each dataset file:

File name	Detection problem	Citation of the original raw dataset
botnet_binary.csv	Binary detection of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
botnet_multiclass.csv	Multi-class classification of botnet	S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
cryptomining_design.csv	Binary detection of cryptomining; the design part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
cryptomining_evaluation.csv	Binary detection of cryptomining; the evaluation part	Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
dns_malware.csv	Binary detection of malware DNS	Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
doh_cic.csv	Binary detection of DoH	Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
doh_real_world.csv	Binary detection of DoH	Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
dos.csv	Binary detection of DoS	Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
edge_iiot_binary.csv	Binary detection of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
edge_iiot_multiclass.csv	Multi-class classification of IoT malware	Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
https_brute_force.csv	Binary detection of HTTPS Brute Force	Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
ids_cic_binary.csv	Binary detection of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
ids_cic_multiclass.csv	Multi-class classification of intrusion in IDS	Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
unsw_binary.csv	Binary detection of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
unsw_multiclass.csv	Multi-class classification of intrusion in IDS	Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
iot_23.csv	Binary detection of IoT malware	Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
ton_iot_binary.csv	Binary detection of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
ton_iot_multiclass.csv	Multi-class classification of IoT malware	Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets.
Evolution parameters.
plos.figshare.com
xls
Updated Sep 12, 2024
Cite
Ankita Sharma; Shalli Rani; Maha Driss (2024). Evolution parameters. [Dataset]. http://doi.org/10.1371/journal.pone.0308206.t005
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0308206.t005
Dataset updated
Sep 12, 2024
Dataset provided by
PLOS ONE
Authors
Ankita Sharma; Shalli Rani; Maha Driss
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In response to the rapidly evolving threat landscape in network security, this paper proposes an Evolutionary Machine Learning Algorithm designed for robust intrusion detection. We specifically address challenges such as adaptability to new threats and scalability across diverse network environments. Our approach is validated using two distinct datasets: BoT-IoT, reflecting a range of IoT-specific attacks, and UNSW-NB15, offering a broader context of network intrusion scenarios using GA based hybrid DT-SVM. This selection facilitates a comprehensive evaluation of the algorithm’s effectiveness across varying attack vectors. Performance metrics including accuracy, recall, and false positive rates are meticulously chosen to demonstrate the algorithm’s capability to accurately identify and adapt to both known and novel threats, thereby substantiating the algorithm’s potential as a scalable and adaptable security solution. This study aims to advance the development of intrusion detection systems that are not only reactive but also preemptively adaptive to emerging cyber threats.

During the feature selection step, a GA is used to discover and preserve the most relevant characteristics from the dataset by using evolutionary principles. Through the use of this technology based on genetic algorithms, the subset of features is optimised, enabling the subsequent classification model to focus on the most relevant components of network data. In order to accomplish this, DT-SVM classification and GA-driven feature selection are integrated in an effort to strike a balance between efficiency and accuracy. The system has been purposefully designed to efficiently handle data streams in real-time, ensuring that intrusions are promptly and precisely detected. The empirical results corroborate the study’s assertion that the IDS outperforms traditional methodologies.
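To make the idea of GA-driven feature selection concrete, here is a generic sketch in Python. It is not the paper's algorithm: the population size, the single-point crossover, the mutation rate, and above all `toy_fitness` are assumptions chosen for illustration. In a setting like the one described, the fitness would presumably be the validation score of the classifier trained on the selected columns. Here a stand-in fitness rewards a hypothetical set of relevant feature indices and penalizes larger subsets.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximizes `fitness` (needs n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]            # keep the fitter half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)         # pick two distinct parents
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)


# Stand-in fitness: hypothetical "relevant" features, minus a size penalty.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i in RELEVANT if mask[i])
    return hits - 0.1 * sum(mask)


best = ga_select(8, toy_fitness)
print(best, toy_fitness(best))
```

Keeping the fitter half of each generation means the best fitness found so far never decreases from one generation to the next.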
Comparison with other methods.
plos.figshare.com
xls
Updated Aug 1, 2024
Cite
Mohammed Tawfik (2024). Comparison with other methods. [Dataset]. http://doi.org/10.1371/journal.pone.0304082.t015
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0304082.t015
Dataset updated
Aug 1, 2024
Dataset provided by
PLOS ONE
Authors
Mohammed Tawfik
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
AWID binary classification results.
plos.figshare.com
xls
Updated Aug 1, 2024
Mohammed Tawfik (2024). AWID binary classification results. [Dataset]. http://doi.org/10.1371/journal.pone.0304082.t013
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0304082.t013
Dataset updated
Aug 1, 2024
Dataset provided by
PLOS ONE
Authors
Mohammed Tawfik
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Selected features for the KDD dataset.
plos.figshare.com
xls
Updated Aug 1, 2024
Mohammed Tawfik (2024). Selected features for the KDD dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0304082.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0304082.t003
Dataset updated
Aug 1, 2024
Dataset provided by
PLOS ONE
Authors
Mohammed Tawfik
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
UNSW-IoT

102 scholarly articles cite this dataset (View in Google Scholar)

Bot_IoT

INFO ABOUT THE BOT-IOT DATASET, NOTE: only the csv files stated in the description are used

The BoT-IoT dataset can be downloaded from HERE. You can also use our new datasets: the TON_IoT and UNSW-NB15.

--------------------------------------------------------------------------

To ease the handling of the dataset, we extracted 5% of the original dataset using select MySQL queries. The extracted 5% comprises 4 files with a total size of approximately 1.07 GB and about 3 million records.

--------------------------------------------------------------------------

Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Benjamin Turnbull. "Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-iot dataset." Future Generation Computer Systems 100 (2019): 779-796. Public Access Here.

Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Jill Slay. "Towards developing network forensic mechanism for botnet activities in the iot based on machine learning techniques." In International Conference on Mobile Networks and Management, pp. 30-44. Springer, Cham, 2017.

Koroniotis, Nickolaos, Nour Moustafa, and Elena Sitnikova. "A new network forensic framework based on deep learning for Internet of Things networks: A particle deep framework." Future Generation Computer Systems 110 (2020): 91-106.

Koroniotis, Nickolaos, and Nour Moustafa. "Enhancing network forensics with particle swarm and deep learning: The particle deep framework." arXiv preprint arXiv:2005.00722 (2020).

Koroniotis, Nickolaos, Nour Moustafa, Francesco Schiliro, Praveen Gauravaram, and Helge Janicke. "A Holistic Review of Cybersecurity and Reliability Perspectives in Smart Airports." IEEE Access (2020).

Koroniotis, Nickolaos. "Designing an effective network forensic framework for the investigation of botnets in the Internet of Things." PhD diss., The University of New South Wales Australia, 2020.

--------------------------------------------------------------------------

Selected features for the UNSWB15 dataset.

The Bot-IoT dataset

IoT Benign and Attack Traces Dataset

Social Media 3.0 Dataset for Integrating Social Media and Internet of Things...

Network traffic datasets created by Single Flow Time Series Analysis

Network traffic datasets with novel extended IP flow called NetTiSA flow

Evolution parameters.

Comparison with other methods.

AWID binary classification results.

Selected features for the KDD dataset.