16 datasets found
  1. NF-UQ-NIDS-v2 Network Intrusion Detection Dataset

    • kaggle.com
    zip
    Updated May 14, 2022
    Cite
    Arya Shah (2022). NF-UQ-NIDS-v2 Network Intrusion Detection Dataset [Dataset]. https://www.kaggle.com/datasets/aryashah2k/nfuqnidsv2-network-intrusion-detection-dataset
    Available download formats: zip (2185336021 bytes)
    Dataset updated
    May 14, 2022
    Authors
    Arya Shah
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A comprehensive dataset that merges all the datasets described at: https://staff.itee.uq.edu.au/marius/NIDS_datasets/#RA5

    The newly published dataset demonstrates the benefit of shared feature sets across datasets: multiple smaller datasets can be merged. This will eventually lead to bigger and more universal NIDS datasets containing flows from multiple network setups and different attack settings.

    An additional label feature identifies the original dataset of each flow. This can be used to compare the same attack scenarios conducted over two or more different test-bed networks. The attack categories have been modified to combine all parent categories.

    Attacks named DoS attacks-Hulk, DoS attacks-SlowHTTPTest, DoS attacks-GoldenEye and DoS attacks-Slowloris have been renamed to the parent DoS category. Attacks named DDOS attack-LOIC-UDP, DDOS attack-HOIC and DDoS attacks-LOIC-HTTP have been renamed to DDoS. Attacks named FTP-BruteForce, SSH-Bruteforce, Brute Force -Web and Brute Force -XSS have been combined as a brute-force category. Finally, SQL Injection attacks have been included in the injection attacks category.
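
The renaming described above can be sketched as a lookup table. This is a minimal illustration, not the authors' code: the source labels are copied from the description, while the exact spelling of the parent names "Brute Force" and "Injection" and the helper name `to_parent_category` are assumptions.

```python
# Original attack label -> parent category, as described in the text.
# The spellings "Brute Force" and "Injection" are assumed, not confirmed.
PARENT_CATEGORY = {
    "DoS attacks-Hulk": "DoS",
    "DoS attacks-SlowHTTPTest": "DoS",
    "DoS attacks-GoldenEye": "DoS",
    "DoS attacks-Slowloris": "DoS",
    "DDOS attack-LOIC-UDP": "DDoS",
    "DDOS attack-HOIC": "DDoS",
    "DDoS attacks-LOIC-HTTP": "DDoS",
    "FTP-BruteForce": "Brute Force",
    "SSH-Bruteforce": "Brute Force",
    "Brute Force -Web": "Brute Force",
    "Brute Force -XSS": "Brute Force",
    "SQL Injection": "Injection",
}


def to_parent_category(label: str) -> str:
    """Return the parent category, leaving unlisted labels unchanged."""
    return PARENT_CATEGORY.get(label, label)
```

Labels that are not in the table (for example a benign label) pass through unchanged, so the function can be applied to a whole label column.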

    The NF-UQ-NIDS dataset has a total of 11,994,893 records, out of which 9,208,048 (76.77%) are benign flows and 2,786,845 (23.23%) are attacks. The table below lists the distribution of the final attack categories.

  2. Netflow V1 Datasets

    • kaggle.com
    zip
    Updated Apr 8, 2025
    Cite
    Athena (2025). Netflow V1 Datasets [Dataset]. https://www.kaggle.com/datasets/athena21/netflow-v1-datasets
    Available download formats: zip (109641886 bytes)
    Dataset updated
    Apr 8, 2025
    Authors
    Athena
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains the original NetFlow V1 datasets, as published by the authors listed below. I have not made any changes — just uploaded them here to make access easier for others in the community who are working on machine learning-based network intrusion detection systems (NIDS).

    If you use these datasets in your work, please cite the original paper: Mohanad Sarhan, Siamak Layeghy, Nour Moustafa and Marius Portmann. NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. In: Big Data Technologies and Applications. BDTA 2020, WiCON 2020. Springer, Cham, 2021.

    The collection includes five datasets, converted by the original authors from four different formats into a unified NetFlow format. Each dataset contains 12 basic NetFlow features and is provided in CSV format.

    Credits & Citation

    All credit goes to the original authors: Mohanad Sarhan, Siamak Layeghy, Nour Moustafa, and Marius Portmann. Published in: "NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems", Big Data Technologies and Applications (BDTA 2020, WiCON 2020), Springer, Cham, 2021.

    Note: If you are one of the original authors and would prefer this dataset to be taken down or credited differently, please feel free to reach out. I just wanted to help make the data more accessible to the Kaggle community.

  3. Data from: NF-ToN-IoT-v2

    • researchdata.edu.au
    Updated May 15, 2023
    Cite
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann (2023). NF-ToN-IoT-v2 [Dataset]. http://doi.org/10.48610/38A2D07
    Dataset updated
    May 15, 2023
    Dataset provided by
    The University of Queensland
    Authors
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann
    License

    http://guides.library.uq.edu.au/deposit_your_data/terms_and_conditions

    Description

    NetFlow Version 2 of the datasets is made up of 43 extended NetFlow features. The details of the datasets are published in: Mohanad Sarhan, Siamak Layeghy, and Marius Portmann, Towards a Standard Feature Set for Network Intrusion Detection System Datasets, Mobile Networks and Applications, 103, 108379, 2022. The use of the datasets for academic research purposes is granted in perpetuity after citing the above paper. For commercial purposes, use should be agreed upon with the authors. Please get in touch with the author Mohanad Sarhan for more details.

  4. AIT Netflow Data Set

    • zenodo.org
    bin, zip
    Updated Aug 18, 2023
    Cite
    Francesca Soro; Max Landauer; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger (2023). AIT Netflow Data Set [Dataset]. http://doi.org/10.5281/zenodo.6610489
    Available download formats: bin, zip
    Dataset updated
    Aug 18, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Francesca Soro; Max Landauer; Florian Skopik; Wolfgang Hotwagner; Markus Wurzenberger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    AIT Netflow Data Sets

    This repository contains labeled synthetic netflows suitable for evaluation of intrusion detection systems, federated learning, and alert aggregation. The netflows are generated from the packet captures contained in the AIT-LDS-v2.0. A detailed description of that dataset is available in [1]. The packet captures were collected from eight testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by [2]. Please cite these papers if the data is used for academic publications.

    In brief, each of the datasets corresponds to a testbed representing a small enterprise network including mail server, file share, WordPress server, VPN, firewall, etc. Normal user behavior is simulated to generate background noise over a time span of 4-6 days. At some point, a sequence of attack steps is launched against the network. The following attacks are launched in the network:

    • Scans (nmap, WPScan, dirb)
    • Webshell upload (CVE-2020-24186)
    • Password cracking (John the Ripper)
    • Privilege escalation
    • Remote command execution
    • Data exfiltration (DNSteal)

    This repository contains the following files:

    • : CSV files of labeled TCP and UDP netflows for each testbed.
    • README.md: Instructions on how to reproduce the generation and labeling of the netflows from the AIT-LDS-v2.0. Note that it is only necessary to run the python scripts if you want to extend or change the labeling procedure.
    • 1_format_dataset_info.ipynb: Generates the tables necessary for labeling (see README.md).
    • 2_label_logs.ipynb: Labels the netflows (see README.md).

    Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU projects GUARD (833456) and PANDORA (SI2.835928).

    If you use the dataset, please cite the following publications:

    [1] M. Landauer, F. Skopik, M. Frank, W. Hotwagner, M. Wurzenberger, and A. Rauber. "Maintainable Log Datasets for Evaluation of Intrusion Detection Systems". IEEE Transactions on Dependable and Secure Computing, vol. 20, no. 4, pp. 3466-3482.

    [2] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317.

  5. Data from: NF-BoT-IoT

    • researchdata.edu.au
    Updated May 15, 2023
    Cite
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann (2023). NF-BoT-IoT [Dataset]. http://doi.org/10.48610/62E6D80
    Dataset updated
    May 15, 2023
    Dataset provided by
    The University of Queensland
    Authors
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann
    License

    http://guides.library.uq.edu.au/deposit_your_data/terms_and_conditions

    Description

    NetFlow Version 1 of the datasets is made up of 8 basic NetFlow features. The details of the datasets are published in: Sarhan M., Layeghy S., Moustafa N., Portmann M. (2021) NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. In: Big Data Technologies and Applications. BDTA 2020, WiCON 2020. Springer, Cham. The use of the datasets for academic research purposes is granted in perpetuity after citing the above paper. For commercial purposes, use should be agreed upon with the authors. Please get in touch with the author Mohanad Sarhan for more details.

  6. Network data schema in the Netflow V9 format

    • kaggle.com
    zip
    Updated Feb 1, 2023
    Cite
    Ashutosh Sharma (2023). Network data schema in the Netflow V9 format [Dataset]. https://www.kaggle.com/datasets/ashtcoder/network-data-schema-in-the-netflow-v9-format
    Available download formats: zip (275705142 bytes)
    Dataset updated
    Feb 1, 2023
    Authors
    Ashutosh Sharma
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The network data schema is in the Netflow V9 format. Two files are provided, 'train_net.csv' and 'test_net.csv'; train_net.csv indicates when a particular ALERT occurs. The dataset contains 4 classes: 'None', 'Port Scanning', 'Denial of Service', and 'Malware'.
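
As a rough sketch of how the four classes might be turned into integer targets, the snippet below reads CSV text and encodes a label column. It assumes the label column is literally named `ALERT`; the integer ids, the helper name `encode_alerts`, and the `L4_DST_PORT` column in the made-up sample are illustrative only and not taken from the dataset files.

```python
import csv
import io

# Class names come from the description; the integer ids are arbitrary.
CLASS_IDS = {"None": 0, "Port Scanning": 1, "Denial of Service": 2, "Malware": 3}


def encode_alerts(csv_text: str, label_column: str = "ALERT") -> list:
    """Map each row's class name to its integer id."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [CLASS_IDS[row[label_column]] for row in reader]


# Tiny made-up sample; the real train_net.csv has many more columns.
sample = "L4_DST_PORT,ALERT\n80,None\n22,Port Scanning\n443,Denial of Service\n"
print(encode_alerts(sample))  # [0, 1, 2]
```

One thing to watch when using a dataframe library instead: a class literally named 'None' may be parsed as a missing value unless that behaviour is disabled.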

    Acknowledgements

    SIMARGL Project – Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 833042.

    Cite

    Maria-Elena Mihailescu, Darius Mihai, Mihai Carabas, Mikolaj Komisarek, Marek Pawlicki, Witold Holubowicz, Rafal Kozik: The Proposition and Evaluation of the RoEduNet-SIMARGL2021 Network Intrusion Detection Dataset. Sensors 21(13): 4319 (2021)

  7. NF-UNSW-NB15

    • researchdata.edu.au
    Updated May 15, 2023
    Cite
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann (2023). NF-UNSW-NB15 [Dataset]. http://doi.org/10.48610/5D0832D
    Dataset updated
    May 15, 2023
    Dataset provided by
    The University of Queensland
    Authors
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann
    License

    http://guides.library.uq.edu.au/deposit_your_data/terms_and_conditions

    Description

    NetFlow Version 1 of the datasets is made up of 8 basic NetFlow features. The details of the datasets are published in: Sarhan M., Layeghy S., Moustafa N., Portmann M. (2021) NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. In: Big Data Technologies and Applications. BDTA 2020, WiCON 2020. Springer, Cham. The use of the datasets for academic research purposes is granted in perpetuity after citing the above paper. For commercial purposes, use should be agreed upon with the authors. Please get in touch with the author Mohanad Sarhan for more details.

  8. NF-UQ-NIDS

    • kaggle.com
    zip
    Updated Jan 13, 2023
    Cite
    StrGenIx | Laurens D'hooge (2023). NF-UQ-NIDS [Dataset]. https://www.kaggle.com/datasets/dhoogla/nfuqnids
    Available download formats: zip (82759768 bytes)
    Dataset updated
    Jan 13, 2023
    Authors
    StrGenIx | Laurens D'hooge
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    NF-UQ-NIDS is the combined version of the four network intrusion detection (NIDS) datasets in the NF-collection by the University of Queensland. The aim was to standardize network-security datasets to achieve interoperability and to enable larger analyses. With some relabeling (documentation), the authors merged four independent NIDS datasets.

    All credit goes to the original authors: Dr. Mohanad Sarhan, Dr. Siamak Layeghy, Dr. Nour Moustafa & Dr. Marius Portmann. Please cite their original conference article when using this dataset.

    V1: Base dataset in CSV format as downloaded from here
    V2: Cleaning -> parquet files

    In the parquet files, all data types are already set correctly, and there are 0 records with missing information and 0 duplicate records.

  9. Appraise H2020 - Real labelled netflow dataset

    • kaggle.com
    Updated May 14, 2023
    Cite
    itti (2023). Appraise H2020 - Real labelled netflow dataset [Dataset]. https://www.kaggle.com/datasets/ittibydgoszcz/appraise-h2020-real-labelled-netflow-dataset
    Dataset updated
    May 14, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    itti
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    Field Name | Description
    FLOW_ID | Unique identifier of flow
    IPV4_SRC_ADDR | IPv4 source address
    IPV4_DST_ADDR | IPv4 destination address
    IN_PKTS | Number of incoming packets
    IN_BYTES | Number of incoming bytes
    OUT_PKTS | Number of outgoing packets
    OUT_BYTES | Number of outgoing bytes
    FIRST_SWITCHED | Time of first packet in the flow
    LAST_SWITCHED | Time of last packet in the flow
    L4_SRC_PORT | Layer 4 source port
    L4_DST_PORT | Layer 4 destination port
    TCP_FLAGS | TCP flags
    PROTOCOL | Protocol
    PROTOCOL_MAP | Protocol map
    TOTAL_FLOWS_EXP | Total flows experienced
    L7_PROTO | Layer 7 protocol
    L7_PROTO_NAME | Layer 7 protocol name
    ANOMALY_CATEGORY | Name of classification flow
    ANOMALY | Binary classification flow
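
To make the schema concrete, here is a small sketch of a typed record covering a subset of the fields listed above. It assumes that FIRST_SWITCHED and LAST_SWITCHED are numeric timestamps (the unit is not stated in the description) and that ANOMALY is stored as 0 or 1; the class name `FlowRecord` and the helper `from_row` are illustrative.

```python
from dataclasses import dataclass


@dataclass
class FlowRecord:
    flow_id: str
    ipv4_src_addr: str
    ipv4_dst_addr: str
    in_pkts: int
    in_bytes: int
    out_pkts: int
    out_bytes: int
    first_switched: float  # assumed numeric timestamp, unit unknown
    last_switched: float   # assumed numeric timestamp, unit unknown
    anomaly: int           # assumed 0 = normal, 1 = anomalous

    @property
    def duration(self) -> float:
        """Time between the first and last packet of the flow."""
        return self.last_switched - self.first_switched

    @property
    def total_bytes(self) -> int:
        """Bytes in both directions."""
        return self.in_bytes + self.out_bytes


def from_row(row: dict) -> FlowRecord:
    """Build a record from a CSV row keyed by the field names above."""
    return FlowRecord(
        flow_id=row["FLOW_ID"],
        ipv4_src_addr=row["IPV4_SRC_ADDR"],
        ipv4_dst_addr=row["IPV4_DST_ADDR"],
        in_pkts=int(row["IN_PKTS"]),
        in_bytes=int(row["IN_BYTES"]),
        out_pkts=int(row["OUT_PKTS"]),
        out_bytes=int(row["OUT_BYTES"]),
        first_switched=float(row["FIRST_SWITCHED"]),
        last_switched=float(row["LAST_SWITCHED"]),
        anomaly=int(row["ANOMALY"]),
    )
```

Derived values such as duration and total bytes are common inputs for flow-based detection models, which is why they are exposed as properties here.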

    This work is co-funded under the APPRAISE Project – fAcilitating Public & Private secuRity operAtors to mitigate terrorIsm Scenarios against soft targEts, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 101021981.

  10. Data from: SQL Injection Attack Netflow

    • zenodo.org
    • portalcientifico.unileon.es
    Updated Sep 28, 2022
    Cite
    Ignacio Crespo; Adrián Campazas (2022). SQL Injection Attack Netflow [Dataset]. http://doi.org/10.5281/zenodo.6907252
    Dataset updated
    Sep 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ignacio Crespo; Adrián Campazas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    These datasets contain SQL injection attacks (SQLIA) as malicious NetFlow data. The attacks carried out are Union Query SQL injection and Blind SQL injection. The SQLMAP tool was used to perform the attacks.

    NetFlow traffic was generated using DOROTHEA (DOcker-based fRamework fOr gaTHering nEtflow trAffic). NetFlow is a network protocol developed by Cisco for collecting and monitoring network traffic flow data. A flow is defined as a unidirectional sequence of packets with some common properties that pass through a network device.

    Datasets

    The first dataset (D1) was collected to train the detection models. The second (D2) was collected using different attacks from those used in training, to test the models and ensure their generalization.

    The datasets contain both benign and malicious traffic. All collected datasets are balanced.

    The version of NetFlow used to build the datasets is 5.

    Dataset | Aim | Samples | Benign-malicious traffic ratio
    D1 | Training | 400,003 | 50%
    D2 | Test | 57,239 | 50%

    Infrastructure and implementation

    Two sets of flow data were collected with DOROTHEA. DOROTHEA is a Docker-based framework for NetFlow data collection. It allows you to build interconnected virtual networks to generate and collect flow data using the NetFlow protocol. In DOROTHEA, network traffic packets are sent to a NetFlow generator that has a sensor ipt_netflow installed. The sensor consists of a module for the Linux kernel using Iptables, which processes the packets and converts them to NetFlow flows.

    DOROTHEA is configured to use Netflow V5 and export the flow after it is inactive for 15 seconds or after the flow is active for 1800 seconds (30 minutes).
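
The two export conditions can be sketched as a single predicate. This is an illustration of the timeout rule only, not the ipt_netflow implementation; the function name `should_export`, the use of seconds as plain floats, and the choice of `>=` at the boundaries are assumptions.

```python
INACTIVE_TIMEOUT_S = 15    # export after 15 s without packets
ACTIVE_TIMEOUT_S = 1800    # export after 30 min of flow lifetime


def should_export(first_seen: float, last_seen: float, now: float) -> bool:
    """Return True when either timeout from the description has elapsed."""
    idle_time = now - last_seen    # time since the last packet
    active_time = now - first_seen  # time since the first packet
    return idle_time >= INACTIVE_TIMEOUT_S or active_time >= ACTIVE_TIMEOUT_S
```

The active timeout means a long-lived connection is reported in several flow records rather than one, which matters when counting flows per connection.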

    Benign traffic generation nodes simulate network traffic generated by real users, performing tasks such as searching in web browsers, sending emails, or establishing Secure Shell (SSH) connections. Such tasks run as Python scripts. Users may customize them or even incorporate their own. The network traffic is managed by a gateway that performs two main tasks. On the one hand, it routes packets to the Internet. On the other hand, it sends it to a NetFlow data generation node (this process is carried out similarly to packets received from the Internet).

    The malicious traffic collected (SQLI attacks) was generated using SQLMAP. SQLMAP is a penetration testing tool used to automate the process of detecting and exploiting SQL injection vulnerabilities.

    The attacks were executed on 16 nodes, each launching SQLMAP with the parameters in the following table.

    Parameters | Description
    '--banner', '--current-user', '--current-db', '--hostname', '--is-dba', '--users', '--passwords', '--privileges', '--roles', '--dbs', '--tables', '--columns', '--schema', '--count', '--dump', '--comments', '--schema' | Enumerate users, password hashes, privileges, roles, databases, tables and columns
    --level=5 | Increase the probability of a false positive identification
    --risk=3 | Increase the probability of extracting data
    --random-agent | Select the User-Agent randomly
    --batch | Never ask for user input, use the default behavior
    --answers="follow=Y" | Predefined answers to yes

    Every node executed SQLIA on 200 victim nodes. The victim nodes had deployed a web form vulnerable to Union-type injection attacks, which was connected to the MySQL or SQLServer database engines (50% of the victim nodes deployed MySQL and the other 50% deployed SQLServer).

    The web service was accessible from ports 443 and 80, which are the ports typically used to deploy web services. The IP address space was 182.168.1.1/24 for the benign and malicious traffic-generating nodes. For victim nodes, the address space was 126.52.30.0/24.
    The malicious traffic in the test sets was collected under different conditions. For D1, SQLIA was performed using Union attacks on the MySQL and SQLServer databases.

    However, for D2, BlindSQL SQLIAs were performed against the web form connected to a PostgreSQL database. The IP address spaces of the networks were also different from those of D1. In D2, the IP address space was 152.148.48.1/24 for benign and malicious traffic generating nodes and 140.30.20.1/24 for victim nodes.
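
Because the address spaces differ between D1 and D2, a flow endpoint can be assigned a role from its IP address alone. The sketch below uses Python's standard `ipaddress` module with the ranges quoted in the text; the role names "generator" and "victim" and the function `node_role` are illustrative, and `strict=False` is needed because some ranges are written with host bits set (for example 182.168.1.1/24).

```python
import ipaddress

# Ranges copied from the description; strict=False masks off the host bits.
ADDRESS_SPACES = {
    "D1": {
        "generator": ipaddress.ip_network("182.168.1.1/24", strict=False),
        "victim": ipaddress.ip_network("126.52.30.0/24", strict=False),
    },
    "D2": {
        "generator": ipaddress.ip_network("152.148.48.1/24", strict=False),
        "victim": ipaddress.ip_network("140.30.20.1/24", strict=False),
    },
}


def node_role(dataset: str, address: str) -> str:
    """Classify an address as a traffic generator, a victim, or unknown."""
    ip = ipaddress.ip_address(address)
    for role, network in ADDRESS_SPACES[dataset].items():
        if ip in network:
            return role
    return "unknown"
```

Since the ranges identify the roles, IP addresses should probably be excluded from model features; otherwise a classifier trained on D1 could memorize addresses that never appear in D2.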

    To run the MySQL server we ran MariaDB version 10.4.12.
    Microsoft SQL Server 2017 Express and PostgreSQL version 13 were used.

  11. Detailed attributes for Netflow v5 flow records.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jan 13, 2018
    Cite
    Muhammad Fahad Umer; Muhammad Sher; Yaxin Bi (2018). Detailed attributes for Netflow v5 flow records. [Dataset]. http://doi.org/10.1371/journal.pone.0180945.t001
    Available download formats: xls
    Dataset updated
    Jan 13, 2018
    Dataset provided by
    PLOS ONE
    Authors
    Muhammad Fahad Umer; Muhammad Sher; Yaxin Bi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Detailed attributes for Netflow v5 flow records.

  12. Network Traffic Analytics and Flow Data (NetFlow/IPFIX)

    • gomask.ai
    csv, json
    Updated Nov 1, 2025
    Cite
    GoMask.ai (2025). Network Traffic Analytics and Flow Data (NetFlow/IPFIX) [Dataset]. https://gomask.ai/marketplace/datasets/network-traffic-analytics-and-flow-data-netflowipfix
    Available download formats: json, csv (10 MB)
    Dataset updated
    Nov 1, 2025
    Dataset provided by
    GoMask.ai
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024 - 2025
    Area covered
    Global
    Variables measured
    dst_ip, src_ip, dst_asn, flow_id, src_asn, dst_port, protocol, src_port, device_id, byte_count, and 9 more
    Description

    This dataset provides comprehensive network flow analytics, capturing source and destination details, protocols, traffic volumes, flow durations, and autonomous system numbers. It includes traffic classification and anomaly detection flags, making it ideal for network monitoring, security analysis, and performance optimization across enterprise and service provider environments.

  13. NF-BoT-IoT-v2

    • researchdata.edu.au
    Updated May 15, 2023
    Cite
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann (2023). NF-BoT-IoT-v2 [Dataset]. http://doi.org/10.48610/EC73920
    Dataset updated
    May 15, 2023
    Dataset provided by
    The University of Queensland
    Authors
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann
    License

    http://guides.library.uq.edu.au/deposit_your_data/terms_and_conditions

    Description

    NetFlow Version 2 of the datasets is made up of 43 extended NetFlow features. The details of the datasets are published in: Mohanad Sarhan, Siamak Layeghy, and Marius Portmann, Towards a Standard Feature Set for Network Intrusion Detection System Datasets, Mobile Networks and Applications, 103, 108379, 2022. The use of the datasets for academic research purposes is granted in perpetuity after citing the above paper. For commercial purposes, use should be agreed upon with the authors. Please get in touch with the author Mohanad Sarhan for more details.

  14. SIMARGL2021

    • kaggle.com
    zip
    Updated Sep 17, 2021
    Cite
    H2020 SIMARGL (2021). SIMARGL2021 [Dataset]. https://www.kaggle.com/datasets/h2020simargl/simargl2021-network-intrusion-detection-dataset
    Available download formats: zip (1078252475 bytes)
    Dataset updated
    Sep 17, 2021
    Authors
    H2020 SIMARGL
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Article

    The Proposition and Evaluation of the RoEduNet-SIMARGL2021 Network Intrusion Detection Dataset

    Context

    Cybersecurity is an arms race, with both the security and the adversaries attempting to outsmart one another, coming up with new attacks, new ways to defend against those attacks, and again with new ways to circumvent those defenses. This situation creates a constant need for novel, realistic cybersecurity datasets. This paper introduces the effects of using machine-learning-based intrusion detection methods in network traffic coming from a real-life architecture. The main contribution of this work is a dataset coming from a real-world, academic network. Real-life traffic was collected and, after performing a series of attacks, a dataset was assembled. The network data schema is in the Netflow v9 format and it contains 44 unique features and a label describing each frame.

    Cite

    This dataset is publicly available for use. When using our dataset, please cite our related paper: Maria-Elena Mihailescu, Darius Mihai, Mihai Carabas, Mikolaj Komisarek, Marek Pawlicki, Witold Holubowicz, Rafal Kozik: The Proposition and Evaluation of the RoEduNet-SIMARGL2021 Network Intrusion Detection Dataset. Sensors 21(13): 4319 (2021)

    Acknowledgements

    This work is funded under the SIMARGL Project – Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 833042.

  15. Zero-Day Attack Detection in Logistics Networks

    • kaggle.com
    zip
    Updated Sep 21, 2024
    Cite
    DatasetEngineer (2024). Zero-Day Attack Detection in Logistics Networks [Dataset]. https://www.kaggle.com/datasets/datasetengineer/zero-day-attack-detection-in-logistics-networks
    Available download formats: zip (29965360 bytes)
    Dataset updated
    Sep 21, 2024
    Authors
    DatasetEngineer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset Title: Zero-Day Attack Detection in Airport Logistics Networks

    Overview

    This dataset comprises 400,000 network traffic entries collected from logistics networks at major airports in the United States, including those in Texas and Washington. The dataset provides a real-world view of network activity, featuring a mix of benign and malicious traffic, making it an invaluable resource for researchers and practitioners in cybersecurity and network analysis. Please note that some data has been eliminated for privacy purposes.

    Features

    The dataset consists of 26 features, as outlined below:

    • Time: Timestamp of the network activity, formatted as YYYY-MM-DD HH:MM.
    • Protocol: Type of protocol used for communication (e.g., TCP, UDP).
    • Flag: TCP flags indicating the state of the connection (e.g., SYN, ACK).
    • Family: Classification of the traffic, including normal operations and various attack families (e.g., WannaCry, Phishing).
    • Clusters: Identifier for clustering similar traffic, useful for analyzing patterns.
    • Source Address: IP address of the device originating the traffic.
    • Destination Address: IP address of the destination device within the airport network.
    • BTC: Bitcoin transaction amounts, if applicable.
    • USD: USD transaction amounts, if applicable.
    • Netflow Bytes: Total bytes of data transmitted in the flow.
    • IP Address: Redundant field for clarity, representing the source IP.
    • Threat Level: Classification indicating the threat level of the traffic (e.g., Benign, Zero-Day Attack).
    • Port: Port number used for communication.
    • Prediction: Model prediction indicating whether the traffic is benign or represents an attack.
    • Payload Size: Size of the data payload transmitted.
    • Number of Packets: Count of packets involved in the traffic flow.
    • Application Layer Data: Information about the application layer requests (e.g., HTTP methods).
    • User-Agent: Information about the client software making the request.
    • Geolocation: Airport-related geolocation, indicating the specific airport involved (e.g., DFW, SEA).
    • Logistics ID: Unique identifier for logistics items (e.g., shipment ID).
    • Anomaly Score: Score indicating the likelihood of the traffic being anomalous or malicious.
    • Event Description: Descriptive label for the event, detailing the nature of the traffic.
    • Response Time: Time taken for the server to respond to the request.
    • Session ID: Unique identifier for the network session.
    • Data Transfer Rate: Rate of data transfer, measured in Mbps.
    • Error Code: HTTP or application-level error codes returned (if applicable).

    Dataset Characteristics

    • Total Entries: 400,000
    • Class Distribution: 62% benign traffic and 38% representing zero-day attacks and other threats.
    • Geographical Focus: Traffic data includes activities at major airports, such as Dallas/Fort Worth International Airport (DFW) and Seattle-Tacoma International Airport (SEA).

    Use Cases

    This dataset can be utilized for:

    • Research: Investigating zero-day attack detection techniques.
    • Machine Learning: Training models to classify benign and malicious network traffic.
    • Network Security: Enhancing security measures in logistics networks at airports.

    Conclusion

    The "Zero-Day Attack Detection in Airport Logistics Networks" dataset provides a realistic and comprehensive view of network behavior within airport logistics, offering critical insights for developing effective cybersecurity strategies against zero-day threats.

  16. VHS-22

    • kaggle.com
    zip
    Updated Apr 29, 2022
    Cite
    H2020 SIMARGL (2022). VHS-22 [Dataset]. https://www.kaggle.com/datasets/h2020simargl/vhs-22-network-traffic-dataset/data
    Available download formats: zip (1903940704 bytes)
    Dataset updated
    Apr 29, 2022
    Authors
    H2020 SIMARGL
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    VHS-22 is a heterogeneous, flow-level dataset that combines the ISOT, CICIDS-17, Booters and CTU-13 datasets, as well as traffic from the Malware Traffic Analysis (MTA) site, to increase the variety of malicious and legitimate traffic flows. It contains 27.7 million flows (20.3 million legitimate and 7.4 million attack flows). Each flow is represented by 45 features; apart from classical NetFlow features, VHS-22 contains statistical parameters and network-level features. A detailed description of the features and the results of initial detection experiments are presented in the paper:

    Paweł Szumełda, Natan Orzechowski, Mariusz Rawski, and Artur Janicki. 2022. VHS-22 – A Very Heterogeneous Set of Network Traffic Data for Threat Detection. In Proc. European Interdisciplinary Cybersecurity Conference (EICC 2022), June 15–16, 2022, Barcelona, Spain. ACM, New York, NY, USA, https://doi.org/10.1145/3528580.3532843

    Every day contains different attacks mixed with legitimate traffic:
    01-01-2022: Botnet attacks from the ISOT dataset.
    02-01-2022: Various attacks from the MTA dataset.
    03-01-2022: Web attacks from the CICIDS-17 dataset.
    04-01-2022: Bruteforce attacks from the CICIDS-17 dataset.
    05-01-2022: Botnet attacks from the CICIDS-17 dataset.
    06-01-2022: DDoS attacks from the CICIDS-17 dataset.
    07-01-2022 to 11-01-2022: DDoS attacks from the Booters dataset.
    12-01-2022 to 23-01-2022: Botnet traffic from the CTU-13 dataset.
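The day-by-day schedule above can be expressed as a small lookup, which may be handy for tagging flows with their source dataset. This is a sketch under assumptions: it takes a date string in the DD-MM-YYYY form used in the schedule, and the function name `attack_source` is hypothetical. How dates are actually stored in the VHS-22 .csv files is not stated here and should be checked against the files.

```python
from datetime import datetime

# (first day, last day, attack type, source dataset) for January 2022,
# transcribed from the schedule in the dataset description.
SCHEDULE = [
    (1, 1, "Botnet", "ISOT"),
    (2, 2, "Various", "MTA"),
    (3, 3, "Web", "CICIDS-17"),
    (4, 4, "Bruteforce", "CICIDS-17"),
    (5, 5, "Botnet", "CICIDS-17"),
    (6, 6, "DDoS", "CICIDS-17"),
    (7, 11, "DDoS", "Booters"),
    (12, 23, "Botnet", "CTU-13"),
]


def attack_source(day_string: str):
    """Return (attack type, source dataset) for a DD-MM-YYYY day, or None."""
    day = datetime.strptime(day_string, "%d-%m-%Y").date()
    if day.year != 2022 or day.month != 1:
        return None  # outside the documented capture window
    for first, last, attack, source in SCHEDULE:
        if first <= day.day <= last:
            return attack, source
    return None


print(attack_source("09-01-2022"))  # ('DDoS', 'Booters')
```

Days outside 01-01-2022 to 23-01-2022 return `None`, since the description does not assign any attacks to them.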

    The VHS-22 dataset consists of labeled network flows and all data is publicly available for researchers in .csv format. When using VHS-22, please cite our paper which describes the VHS-22 dataset in detail, as well as the publications describing the source datasets:

    Paweł Szumełda, Natan Orzechowski, Mariusz Rawski, and Artur Janicki. 2022. VHS-22 – A Very Heterogeneous Set of Network Traffic Data for Threat Detection. In Proc. European Interdisciplinary Cybersecurity Conference (EICC 2022), June 15–16, 2022, Barcelona, Spain. ACM, New York, NY, USA, https://doi.org/10.1145/3528580.3532843

    Sherif Saad, Issa Traore, Ali Ghorbani, Bassam Sayed, David Zhao, Wei Lu, John Felix, and Payman Hakimian. 2011. Detecting P2P botnets through network behavior analysis and machine learning. In Proc. International Conference on Privacy, Security and Trust. IEEE, Montreal, Canada, 174–1

    Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani. 2018. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization. In Proc. 4th International Conference on Information Systems Security and Privacy (ICISSP 2018), Funchal, Portugal

    José Jair Santanna, Romain Durban, Anna Sperotto, and Aiko Pras. 2015. Inside booters: An analysis on operational databases. In Proc. International Symposium on Integrated Network Management (INM 2015). IFIP/IEEE, Ottawa, Canada, 432–440. https://doi.org/10.1109/INM.2015.71403

    Riaz Khan, Xiaosong Zhang, Rajesh Kumar, Abubakar Sharif, Noorbakhsh Amiri Golilarz, and Mamoun Alazab. 2019. An Adaptive Multi-Layer Botnet Detection Technique Using Machine Learning Classifiers. Applied Sciences 9 (06 2019), 2375. https://doi.org/10.3390/app91123

    The Malware Traffic Analysis data originate from https://www.malware-traffic-analysis.net, authored by Brad.

    The work has been funded by the SIMARGL Project -- Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 833042.



NF-UQ-NIDS-v2 Network Intrusion Detection Dataset

Version 2 of the Netflow datasets made up of 43 extended NetFlow features

