45 datasets found

CSE-CIC-IDS2018
kaggle.com
huggingface.co
Updated Aug 11, 2022
Cite
StrGenIx | Laurens D'hooge (2022). CSE-CIC-IDS2018 [Dataset]. http://doi.org/10.34740/kaggle/dsv/4059899
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/4059899
Dataset updated
Aug 11, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
StrGenIx | Laurens D'hooge
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This is an academic intrusion detection dataset. All the credit goes to the original authors: Dr. Iman Sharafaldin, Dr. Arash Habibi Lashkari, and Dr. Ali Ghorbani. Please cite their original paper.

It was published by the Canadian Institute for Cybersecurity and is the successor to CIC-IDS2017. The biggest difference is the move away from on-premise infrastructure to AWS to generate the dataset. It also vastly increased the representation of 'Infiltration' traffic compared to CIC-IDS2017.

V1: Base dataset in CSV format as downloaded from here
V2: Cleaning -> parquet files
V3: Reorganize to save storage, only keep original CSVs in V1/V2

In the parquet files, all data types are already set correctly; this clean version has 0 records with missing information and 0 duplicate records. Baseline classification scores with simple models will be available shortly.
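The "0 missing, 0 duplicate" claim is easy to re-check after loading the files. The sketch below uses only the standard library and a hypothetical `audit_rows` helper on toy records that are not taken from the dataset; with pandas, `DataFrame.isna` and `DataFrame.duplicated` would serve the same purpose.

```python
import math
from collections import Counter


def audit_rows(rows):
    """Return (rows containing a missing value, rows repeating an earlier row)."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    missing = sum(1 for row in rows if any(is_missing(v) for v in row))
    counts = Counter(tuple(row) for row in rows)
    # Every occurrence beyond the first counts as a duplicate.
    duplicates = sum(n - 1 for n in counts.values())
    return missing, duplicates


# Hypothetical toy records (duration, packet count, label); not real dataset rows.
clean = [(1.5, 10, "Benign"), (0.2, 3, "Infiltration")]
dirty = clean + [(1.5, 10, "Benign"), (float("nan"), 7, "Benign")]

print(audit_rows(clean))
print(audit_rows(dirty))
```

A clean table should report zero for both counts.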
CSE-CIC-IDS2018
data.mendeley.com
Updated Feb 5, 2024
Cite
Abdisalam Mohamed (2024). CSE-CIC-IDS2018 [Dataset]. http://doi.org/10.17632/29hdbdzx2r.1
Explore at:
Unique identifier
https://doi.org/10.17632/29hdbdzx2r.1
Dataset updated
Feb 5, 2024
Authors
Abdisalam Mohamed
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A cleaned version of the CSE-CIC-IDS2018 dataset.
CSE-CIC-IDS2018-V2
huggingface.co
Cite
Abluva Inc, CSE-CIC-IDS2018-V2 [Dataset]. https://huggingface.co/datasets/abluva/CSE-CIC-IDS2018-V2
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset authored and provided by
Abluva Inc
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This is an updated version of the CSE-CIC-IDS 2018 dataset. The data is normalised, and one new class, "Comb", which is a combination of existing attacks, has been added. To cite the dataset, please reference the original paper with DOI: 10.1109/SmartNets61466.2024.10577645. The paper is published in IEEE SmartNets and can be accessed here: https://www.researchgate.net/publication/382034618_Blender-GAN_Multi-Target_Conditional_Generative_Adversarial_Network_for_Novel_Class_Synthetic_Data_Generation. Citation… See the full description on the dataset page: https://huggingface.co/datasets/abluva/CSE-CIC-IDS2018-V2.
CIC-IDS-2018-parquet
kaggle.com
Updated Jul 16, 2024
Share
Cite
lima mateus (2024). CIC-IDS-2018-parquet [Dataset]. https://www.kaggle.com/datasets/limamateus/cic-ids-2018-parquet/code
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 16, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
lima mateus
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by lima mateus

Released under Apache 2.0

Contents
Citation Trends for "Optimizing Intrusion Detection Systems in Three Phases...
shibatadb.com
Updated Nov 24, 2023
Cite
Yubetsu (2023). Citation Trends for "Optimizing Intrusion Detection Systems in Three Phases on the CSE-CIC-IDS-2018 Dataset" [Dataset]. https://www.shibatadb.com/article/7VjrMiHC
Explore at:
Dataset updated
Nov 24, 2023
Dataset authored and provided by
Yubetsu
License
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Time period covered
2024 - 2025
Variables measured
New Citations per Year
Description
Yearly citation counts for the publication titled "Optimizing Intrusion Detection Systems in Three Phases on the CSE-CIC-IDS-2018 Dataset".
cic-ids-2018-alldata-textual
huggingface.co
Cite
Shiva Prasad Gyawali, cic-ids-2018-alldata-textual [Dataset]. https://huggingface.co/datasets/gyawalishiva/cic-ids-2018-alldata-textual
Explore at:
Authors
Shiva Prasad Gyawali
Description
gyawalishiva/cic-ids-2018-alldata-textual dataset hosted on Hugging Face and contributed by the HF Datasets community
Citation Network Graph
shibatadb.com
Updated Nov 24, 2023
Cite
Yubetsu (2023). Citation Network Graph [Dataset]. https://www.shibatadb.com/article/7VjrMiHC
Explore at:
Dataset updated
Nov 24, 2023
Dataset authored and provided by
Yubetsu
License
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Description
Network of 45 papers and 67 citation links related to "Optimizing Intrusion Detection Systems in Three Phases on the CSE-CIC-IDS-2018 Dataset".

Intrusion Detection System Market Analysis North America, APAC, Europe,...

technavio.com

pdf

Updated Oct 23, 2024

Cite

Technavio (2024). Intrusion Detection System Market Analysis North America, APAC, Europe, Middle East and Africa, South America - US, China, UK, Germany, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/intrusion-detection-system-market-industry-analysis

Explore at:

Available download formats: pdf

Dataset updated

Oct 23, 2024

Dataset provided by

TechNavio

Authors

Technavio

License

https://www.technavio.com/content/privacy-notice

Time period covered

2024 - 2028

Area covered

United Kingdom, United States

Description


Intrusion Detection System Market Size 2024-2028

The intrusion detection system market size is forecast to increase by USD 4.65 billion at a CAGR of 14% between 2023 and 2028.
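The two figures in the forecast pin down an implied base-year market size if one assumes the USD 4.65 billion increase is measured from a 2023 base over five annual compounding periods. That assumption, the function name, and the resulting values are illustrative only; the report excerpt itself does not state a base size.

```python
def implied_base_size(increase, cagr, years):
    """Solve B * ((1 + cagr) ** years - 1) == increase for the base size B."""
    growth = (1 + cagr) ** years - 1
    return increase / growth


# Assumption: 2023 is the base year and growth compounds annually through 2028.
base_2023 = implied_base_size(increase=4.65, cagr=0.14, years=5)
size_2028 = base_2023 * 1.14 ** 5

print(f"implied 2023 size: USD {base_2023:.2f} billion")
print(f"implied 2028 size: USD {size_2028:.2f} billion")
```

Under that reading the market would roughly double over the forecast period, since 1.14 raised to the fifth power is a little over 1.9.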

The market is witnessing significant growth due to the escalating number of cyberattacks and the need to secure IT service infrastructure, particularly in the banking and financial services industry (BFSI). IDS solutions employ two primary identification techniques: signature-based and anomaly detection. Signature-based identification relies on known attack patterns, while anomaly detection identifies deviations from normal behavior.
Additionally, with the rise in digital transactions, there is a growing emphasis on strengthening security architecture through traffic monitoring and intrusion detection. The market is driven by the increasing demand for BFSI applications and the subsequent need to protect against cyber threats. However, the high cost of maintaining IDS solutions remains a challenge. In conclusion, the IDS market is expected to continue growing as organizations prioritize securing their IT infrastructure against cyber threats.

What will be the Size of the Market During the Forecast Period?


The Intrusion Detection System (IDS) market is a significant segment of the cybersecurity industry, playing a crucial role in safeguarding IT infrastructure against various cyber threats. IDS solutions help identify and prevent unauthorized access, malicious activities, and potential security breaches. These systems can be categorized into Network Intrusion Detection Systems (NIDS) and Host-based Intrusion Detection Systems (HIDS). IDS and Intrusion Prevention Systems (IPS) are essential components of an organization's cybersecurity strategy. IPS goes beyond simple identification and provides real-time prevention of attacks. Both IDS and IPS are instrumental in mitigating risks from phishing incidents, cyberattacks, and other malicious threats.
Additionally, cybersecurity is a major concern for various sectors, including BFSI applications, telecom, defense, and cloud computing. With the increasing reliance on IT infrastructure and work from home arrangements, cybersecurity expenditure has seen a significant rise. IDS and IPS solutions are integral to securing data and maintaining information security. Cybercrimes are on the rise, with malicious threat actors constantly evolving their tactics. Traditional signature-based identification methods may not be sufficient to detect advanced threats. Anomaly detection, a key feature of modern IDS and IPS solutions, can help identify unusual patterns and potential threats. IDS and IPS solutions are not limited to protecting traditional IT infrastructure.
They also play a vital role in securing cloud computing environments. IDS and IPS, as part of IDP (Intrusion Detection and Prevention) systems, offer advanced threat detection and prevention capabilities, ensuring comprehensive protection against cyberattacks. Ransomware attacks have emerged as a major concern because of their disruptive impact on business operations. IDS and IPS solutions can help prevent ransomware attacks by identifying and blocking malicious traffic before it can cause damage. In conclusion, IDS and IPS solutions are essential components of an effective cybersecurity strategy. They help organizations protect their IT infrastructure and data against various cyber threats, including phishing incidents, cyberattacks, and malicious threat actors. The market for IDS and IPS solutions is expected to grow as organizations continue to invest in advanced cybersecurity solutions to mitigate risks and maintain business continuity.

How is this market segmented and which is the largest segment?

The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

Deployment

  On-premises
  Cloud-based


Geography

  North America

    US


  APAC

    China
    Japan


  Europe

    Germany
    UK


  Middle East and Africa



  South America

By Deployment Insights

The on-premises segment is estimated to witness significant growth during the forecast period.

The on-premises segment is projected to dominate the market in the US, with substantial growth in terms of revenue. Large enterprises, particularly those with a global footprint, are the primary consumers of on-premises intrusion detection systems. The primary reason for this preference is the control it offers over managing software assets, including data generated and stored within business applications. This deployment model enables organizations to ensure compliance with licensing agreements and automate tasks, making it an attractive choice for many busine

Network traffic datasets created by Single Flow Time Series Analysis
data.niaid.nih.gov
zenodo.org
Updated Jul 11, 2024
Cite
Josef Koumar (2024). Network traffic datasets created by Single Flow Time Series Analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8035723
Explore at:
Dataset updated
Jul 11, 2024
Dataset provided by
Karel Hynek
Josef Koumar
Tomáš Čejka
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Network traffic datasets created by Single Flow Time Series Analysis

Datasets were created for the paper: Network Traffic Classification based on Single Flow Time Series Analysis -- Josef Koumar, Karel Hynek, Tomáš Čejka -- which was published at The 19th International Conference on Network and Service Management (CNSM) 2023. Please cite usage of our datasets as:

J. Koumar, K. Hynek and T. Čejka, "Network Traffic Classification Based on Single Flow Time Series Analysis," 2023 19th International Conference on Network and Service Management (CNSM), Niagara Falls, ON, Canada, 2023, pp. 1-7, doi: 10.23919/CNSM59352.2023.10327876.

This Zenodo repository contains 23 datasets created from 15 well-known published datasets which are cited in the table below. Each dataset contains 69 features created by Time Series Analysis of Single Flow Time Series. The detailed description of features from datasets is in the file: feature_description.pdf

In the following table is a description of each dataset file:

File name | Detection problem | Citation of original raw dataset
botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014.
cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022
dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021.
doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020
doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022
dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019.
edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022.
https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020
ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018.
ids_unsw_nb_15_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
ids_unsw_nb_15_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015.
iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here https://www.stratosphereips.org/datasets-iot23
ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021
tor_binary.csv | Binary detection of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
tor_multiclass.csv | Multi-class classification of TOR | Arash Habibi Lashkari et al. Characterization of Tor Traffic using Time based Features. In ICISSP 2017, pages 253–262. SciTePress, 2017.
vpn_iscx_binary.csv | Binary detection of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_iscx_multiclass.csv | Multi-class classification of VPN | Gerard Draper-Gil et al. Characterization of Encrypted and VPN Traffic Using Time-related. In ICISSP, pages 407–414, 2016.
vpn_vnat_binary.csv | Binary detection of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
vpn_vnat_multiclass.csv | Multi-class classification of VPN | Steven Jorgensen et al. Extensible Machine Learning for Encrypted Network Traffic Application Labeling via Uncertainty Quantification. CoRR, abs/2205.05628, 2022
Confusion matrix.
plos.figshare.com
xls
Updated May 29, 2025
Cite
Wei Cui; Xiao Liao; Yang Yang; Shiying Feng; Mingyan Song (2025). Confusion matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0322329.t002
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0322329.t002
Dataset updated
May 29, 2025
Dataset provided by
PLOS ONE
Authors
Wei Cui; Xiao Liao; Yang Yang; Shiying Feng; Mingyan Song
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With the rapid development of smart grids, power grid systems are becoming increasingly complex, posing significant challenges to their security. Traditional network intrusion detection systems often rely on manually engineered features, which are not only resource-intensive but also struggle to handle the diverse range of attack types. This paper aims to address these challenges by proposing an automated DDoS attack detection algorithm using the Informer model. We introduce a windowing technique to segment network traffic into manageable samples, which are then input into the Informer for feature extraction and classification. This model captures both the temporal dependencies and global attention information in the traffic data. Experimental results on the CICIDS-2018 dataset demonstrate the effectiveness of our approach, showing significant improvements in detection accuracy and efficiency. Our findings suggest that the proposed method offers a promising solution for real-time intrusion detection in complex power grid environments.
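The windowing step mentioned in the abstract can be pictured as a sliding window over time-ordered traffic records. The sketch below is a generic illustration, not the authors' implementation: the function name, window size, and stride are assumptions, since the abstract gives no parameters.

```python
def sliding_windows(records, window_size, stride):
    """Split a time-ordered sequence into fixed-length, possibly overlapping windows."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    # Trailing records that do not fill a whole window are dropped.
    return [
        records[start:start + window_size]
        for start in range(0, len(records) - window_size + 1, stride)
    ]


# Hypothetical stand-in for ten consecutive traffic records.
traffic = list(range(10))
samples = sliding_windows(traffic, window_size=4, stride=2)

for sample in samples:
    print(sample)
```

Each window would then be passed to the sequence model as one sample; a stride smaller than the window size yields overlapping samples.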
VHS-22
kaggle.com
Updated Apr 29, 2022
Cite
H2020 SIMARGL (2022). VHS-22 [Dataset]. https://www.kaggle.com/datasets/h2020simargl/vhs-22-network-traffic-dataset
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 29, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
H2020 SIMARGL
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
VHS-22 is a heterogeneous, flow-level dataset that combines the ISOT, CICIDS-17, Booters and CTU-13 datasets, as well as traffic from the Malware Traffic Analysis (MTA) site, to increase the variety of malicious and legitimate traffic flows. It contains 27.7 million flows (20.3 million legitimate and 7.4 million attack flows). The flows are represented by 45 features; apart from classical NetFlow features, VHS-22 contains statistical parameters and network-level features. Their detailed description and the results of initial detection experiments are presented in the paper:

Paweł Szumełda, Natan Orzechowski, Mariusz Rawski, and Artur Janicki. 2022. VHS-22 – A Very Heterogeneous Set of Network Traffic Data for Threat Detection. In Proc. European Interdisciplinary Cybersecurity Conference (EICC 2022), June 15–16, 2022, Barcelona, Spain. ACM, New York, NY, USA, https://doi.org/10.1145/3528580.3532843

Every day contains different attacks mixed with legitimate traffic.
01-01-2022 Botnet attacks from ISOT dataset.
02-01-2022 Various attacks from MTA dataset.
03-01-2022 Web attacks from CICIDS-17 dataset.
04-01-2022 Bruteforce attacks from CICIDS-17 dataset.
05-01-2022 Botnet attacks from CICIDS-17 dataset.
06-01-2022 DDoS attacks from CICIDS-17 dataset.
07-01-2022 to 11-01-2022 DDoS attacks from Booters dataset.
12-01-2022 to 23-01-2022 Botnet traffic from CTU-13 dataset.
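The per-day schedule above can be turned into a small lookup for tracing a flow's attack traffic back to its source dataset. This is a minimal sketch: the dates are read as DD-MM-YYYY (an assumption, though "23-01-2022" only parses that way), and `SCHEDULE` and `attack_source` are hypothetical names, not part of VHS-22.

```python
from datetime import date

# (first day, last day, attack type, source dataset), transcribed from the schedule.
SCHEDULE = [
    (date(2022, 1, 1), date(2022, 1, 1), "Botnet", "ISOT"),
    (date(2022, 1, 2), date(2022, 1, 2), "Various", "MTA"),
    (date(2022, 1, 3), date(2022, 1, 3), "Web", "CICIDS-17"),
    (date(2022, 1, 4), date(2022, 1, 4), "Bruteforce", "CICIDS-17"),
    (date(2022, 1, 5), date(2022, 1, 5), "Botnet", "CICIDS-17"),
    (date(2022, 1, 6), date(2022, 1, 6), "DDoS", "CICIDS-17"),
    (date(2022, 1, 7), date(2022, 1, 11), "DDoS", "Booters"),
    (date(2022, 1, 12), date(2022, 1, 23), "Botnet", "CTU-13"),
]


def attack_source(day):
    """Return (attack type, source dataset) for a capture day, or None if out of range."""
    for first, last, attack, source in SCHEDULE:
        if first <= day <= last:
            return attack, source
    return None


print(attack_source(date(2022, 1, 9)))
print(attack_source(date(2022, 2, 1)))
```

A lookup like this is handy when building train/test splits that keep source datasets apart.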

The VHS-22 dataset consists of labeled network flows and all data is publicly available for researchers in .csv format. When using VHS-22, please cite our paper which describes the VHS-22 dataset in detail, as well as the publications describing the source datasets:

Paweł Szumełda, Natan Orzechowski, Mariusz Rawski, and Artur Janicki. 2022. VHS-22 – A Very Heterogeneous Set of Network Traffic Data for Threat Detection. In Proc. European Interdisciplinary Cybersecurity Conference (EICC 2022), June 15–16, 2022, Barcelona, Spain. ACM, New York, NY, USA, https://doi.org/10.1145/3528580.3532843

Sherif Saad, Issa Traore, Ali Ghorbani, Bassam Sayed, David Zhao, Wei Lu, John Felix, and Payman Hakimian. 2011. Detecting P2P botnets through network behavior analysis and machine learning. In Proc. International Conference on Privacy, Security and Trust. IEEE, Montreal, Canada, 174–1

Iman Sharafaldin, Arash Habibi Lashkari, and Ali A. Ghorbani. 2018. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization, In Proc. 4th International Conference on Information Systems Security and Privacy (ICISSP 2018), Funchal, Portugal

José Jair Santanna, Romain Durban, Anna Sperotto, and Aiko Pras. 2015. Inside booters: An analysis on operational databases. In Proc. International Symposium on Integrated Network Management (INM 2015). IFIP/IEEE, Ottawa, Canada, 432–440. https://doi.org/10.1109/INM.2015.71403

Riaz Khan, Xiaosong Zhang, Rajesh Kumar, Abubakar Sharif, Noorbakhsh Amiri Golilarz, and Mamoun Alazab. 2019. An Adaptive Multi-Layer Botnet Detection Technique Using Machine Learning Classifiers. Applied Sciences 9 (06 2019), 2375. https://doi.org/10.3390/app91123

The Malware Traffic Analysis data originate from https://www.malware-traffic-analysis.net, authored by Brad.

The work has been funded by the SIMARGL Project -- Secure Intelligent Methods for Advanced RecoGnition of malware and stegomalware, with the support of the European Commission and the Horizon 2020 Program, under Grant Agreement No. 833042.
Algorithm comparison.
plos.figshare.com
xls
Updated May 29, 2025
+ more versions
Cite
Wei Cui; Xiao Liao; Yang Yang; Shiying Feng; Mingyan Song (2025). Algorithm comparison. [Dataset]. http://doi.org/10.1371/journal.pone.0322329.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0322329.t003
Dataset updated
May 29, 2025
Dataset provided by
PLOS ONE
Authors
Wei Cui; Xiao Liao; Yang Yang; Shiying Feng; Mingyan Song
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With the rapid development of smart grids, power grid systems are becoming increasingly complex, posing significant challenges to their security. Traditional network intrusion detection systems often rely on manually engineered features, which are not only resource-intensive but also struggle to handle the diverse range of attack types. This paper aims to address these challenges by proposing an automated DDoS attack detection algorithm using the Informer model. We introduce a windowing technique to segment network traffic into manageable samples, which are then input into the Informer for feature extraction and classification. This model captures both the temporal dependencies and global attention information in the traffic data. Experimental results on the CICIDS-2018 dataset demonstrate the effectiveness of our approach, showing significant improvements in detection accuracy and efficiency. Our findings suggest that the proposed method offers a promising solution for real-time intrusion detection in complex power grid environments.
Network traffic datasets with novel extended IP flow called NetTiSA flow
data.niaid.nih.gov
zenodo.org
Updated Apr 18, 2024
Cite
Tomáš Čejka (2024). Network traffic datasets with novel extended IP flow called NetTiSA flow [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8301042
Explore at:
Dataset updated
Apr 18, 2024
Dataset provided by
Jaroslav Pešek
Karel Hynek
Josef Koumar
Tomáš Čejka
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Network traffic datasets with novel extended IP flow called NetTiSA flow

Datasets were created for the paper: NetTiSA: Extended IP Flow with Time-series Features for Universal Bandwidth-constrained High-speed Network Traffic Classification -- Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka -- which is published in The International Journal of Computer and Telecommunications Networking, https://doi.org/10.1016/j.comnet.2023.110147. Please cite the usage of our datasets as:

Josef Koumar, Karel Hynek, Jaroslav Pešek, Tomáš Čejka, "NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification", Computer Networks, Volume 240, 2024, 110147, ISSN 1389-1286

@article{KOUMAR2024110147,
  title = {NetTiSA: Extended IP flow with time-series features for universal bandwidth-constrained high-speed network traffic classification},
  journal = {Computer Networks},
  volume = {240},
  pages = {110147},
  year = {2024},
  issn = {1389-1286},
  doi = {https://doi.org/10.1016/j.comnet.2023.110147},
  url = {https://www.sciencedirect.com/science/article/pii/S1389128623005923},
  author = {Josef Koumar and Karel Hynek and Jaroslav Pešek and Tomáš Čejka}
}

This Zenodo repository contains 23 datasets created from 15 well-known published datasets, which are cited in the table below. Each dataset contains the NetTiSA flow feature vector.

NetTiSA flow feature vector

The novel extended IP flow called NetTiSA (Network Time Series Analysed) flow contains a universal bandwidth-constrained feature vector consisting of 20 features. We divide the NetTiSA flow classification features into three groups by computation. The first group is based on classical bidirectional flow information: the number of transferred bytes and packets. The second group contains statistical and time-based features calculated using time-series analysis of the packet sequences. The third group can be computed from the previous groups (i.e., on the flow collector) and improves classification performance without any impact on the telemetry bandwidth.

Flow features

The flow features are:

Packets is the number of packets in the direction from the source to the destination IP address.

Packets in reverse order is the number of packets in the direction from the destination to the source IP address.

Bytes is the size of the payload in bytes transferred in the direction from the source to the destination IP address.

Bytes in reverse order is the size of the payload in bytes transferred in the direction from the destination to the source IP address.
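The four counters above amount to a per-direction tally over the packets of one flow. The sketch below is illustrative only: `FlowCounters`, `aggregate_flow`, and the rule that the first packet's source fixes the forward direction are assumptions, and it expects all packets to belong to a single flow.

```python
from dataclasses import dataclass


@dataclass
class FlowCounters:
    n_packets: int = 0       # source -> destination
    n_packets_rev: int = 0   # destination -> source
    n_bytes: int = 0         # payload bytes, source -> destination
    n_bytes_rev: int = 0     # payload bytes, destination -> source


def aggregate_flow(packets):
    """Tally packets given as (src_ip, dst_ip, payload_len) tuples of one flow."""
    counters = FlowCounters()
    forward_src = None
    for src, _dst, payload_len in packets:
        if forward_src is None:
            forward_src = src  # assumption: the first packet defines "forward"
        if src == forward_src:
            counters.n_packets += 1
            counters.n_bytes += payload_len
        else:
            counters.n_packets_rev += 1
            counters.n_bytes_rev += payload_len
    return counters


# Hypothetical three-packet exchange.
flow = aggregate_flow([
    ("10.0.0.1", "10.0.0.2", 100),
    ("10.0.0.2", "10.0.0.1", 1400),
    ("10.0.0.1", "10.0.0.2", 0),
])
print(flow)
```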

Statistical and Time-based features

These features are exported in the extended part of the flow. All of them can be computed (exactly or approximately) by stream-wise computation, which is necessary for keeping memory requirements low. This second group contains the following features:

Mean represents mean of the payload lengths of packets

Min is the minimal value from payload lengths of all packets in a flow

Max is the maximum value from payload lengths of all packets in a flow

Standard deviation is a measure of the variation of payload lengths from the mean payload length

Root mean square is the measure of the magnitude of payload lengths of packets

Average dispersion is the average absolute difference between each payload length of the packet and the mean value

Kurtosis is the measure describing the extent to which the tails of a distribution differ from the tails of a normal distribution

Mean of relative times is the mean of the relative times which is a sequence defined as \(st = \{t_1 - t_1, t_2 - t_1, \dots, t_n - t_1\}\)

Mean of time differences is the mean of the time differences which is a sequence defined as \(dt = \{ t_j - t_i \mid j = i + 1,\ i \in \{1, 2, \dots, n - 1\} \}\).

Min from time differences is the minimal value from all time differences, i.e., min space between packets.

Max from time differences is the maximum value from all time differences, i.e., max space between packets.

Time distribution describes the deviation of time differences between individual packets within the time series. The feature is computed by the following equation: \(tdist = \frac{ \frac{1}{n-1} \sum_{i=1}^{n-1} \left| \mu_{dt_{n-1}} - dt_i \right| }{ \frac{1}{2} \left( \max\left(dt_{n-1}\right) - \min\left(dt_{n-1}\right) \right) }\)

Switching ratio represents a value change ratio (switching) between payload lengths. The switching ratio is computed by equation: \(sr = \frac{s_n}{\frac{1}{2} (n - 1)}\)

where \(s_n\) is the number of switches.
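The definitions above can be checked against a small batch computation. This is a sketch under stated assumptions rather than the ipfixprobe implementation: it works on whole lists instead of stream-wise, uses the population standard deviation, takes kurtosis as the fourth central moment divided by the squared variance, and counts a "switch" whenever two consecutive payload lengths differ. The function and key names are hypothetical.

```python
import math


def nettisa_statistics(payload_lengths, timestamps):
    """Batch version of the statistical and time-based features of one flow."""
    n = len(payload_lengths)
    if n < 2 or len(timestamps) != n:
        raise ValueError("need at least two packets with one timestamp each")

    mean = sum(payload_lengths) / n
    m2 = sum((x - mean) ** 2 for x in payload_lengths) / n  # population variance
    m4 = sum((x - mean) ** 4 for x in payload_lengths) / n  # fourth central moment
    dt = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_dt = sum(dt) / len(dt)
    half_range = 0.5 * (max(dt) - min(dt))
    mean_abs_dev_dt = sum(abs(mean_dt - d) for d in dt) / len(dt)
    # Assumption: a switch is any change between consecutive payload lengths.
    switches = sum(1 for a, b in zip(payload_lengths, payload_lengths[1:]) if a != b)

    return {
        "mean": mean,
        "min": min(payload_lengths),
        "max": max(payload_lengths),
        "std": math.sqrt(m2),
        "rms": math.sqrt(sum(x * x for x in payload_lengths) / n),
        "average_dispersion": sum(abs(x - mean) for x in payload_lengths) / n,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
        "mean_relative_time": sum(t - timestamps[0] for t in timestamps) / n,
        "mean_time_diff": mean_dt,
        "min_time_diff": min(dt),
        "max_time_diff": max(dt),
        "time_distribution": mean_abs_dev_dt / half_range if half_range > 0 else 0.0,
        "switching_ratio": switches / (0.5 * (n - 1)),
    }


# Hypothetical flow: four packets with alternating payload sizes.
stats = nettisa_statistics([100, 200, 100, 200], [0.0, 1.0, 3.0, 6.0])
for name, value in stats.items():
    print(f"{name}: {value}")
```

Note that with the formula as given, a flow whose payload length changes at every packet reaches a switching ratio of 2.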

Features computed at the collector

The third set contains features that are computed from the previous two groups prior to classification. Therefore, they do not influence the network telemetry size and their computation does not put additional load to resource-constrained flow monitoring probes. The NetTiSA flow combined with this feature set is called the Enhanced NetTiSA flow and contains the following features:

Max minus min is the difference between minimum and maximum payload lengths

Percent deviation is the dispersion of the average absolute difference to the mean value

Variance is the spread measure of the data from its mean

Burstiness is the degree of peakedness in the central part of the distribution

Coefficient of variation is a dimensionless quantity that compares the dispersion of a time series to its mean value and is often used to compare the variability of different time series that have different units of measurement

Directions describe a percentage ratio of packet direction computed as \(\frac{d_1}{d_1 + d_0}\), where \(d_1\) is a number of packets in a direction from source to destination IP address and \(d_0\) the opposite direction. Both \(d_1\) and \(d_0\) are inside the classical bidirectional flow.

Duration is the duration of the flow
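Because these features derive from values the exporter already sends, the collector can compute them with a few arithmetic operations. The sketch below is an assumption-laden illustration: the dict keys and function name are hypothetical, percent deviation is read as average dispersion divided by the mean, and burstiness is left out because the description gives no formula for it.

```python
def enhanced_features(stats, packets_fwd, packets_rev, first_ts, last_ts):
    """Derive collector-side features from exporter-side statistics of one flow."""
    mean, std = stats["mean"], stats["std"]
    return {
        "max_minus_min": stats["max"] - stats["min"],
        # Assumption: average dispersion relative to the mean payload length.
        "percent_deviation": stats["average_dispersion"] / mean if mean else 0.0,
        "variance": std ** 2,
        "coefficient_of_variation": std / mean if mean else 0.0,
        # d_1 / (d_1 + d_0), with d_1 the source-to-destination packet count.
        "directions": packets_fwd / (packets_fwd + packets_rev),
        "duration": last_ts - first_ts,
    }


# Hypothetical exporter-side values for one flow.
exported = {"mean": 150.0, "std": 50.0, "min": 100, "max": 200, "average_dispersion": 50.0}
derived = enhanced_features(exported, packets_fwd=3, packets_rev=1, first_ts=0.0, last_ts=6.0)
for name, value in derived.items():
    print(f"{name}: {value}")
```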

The NetTiSA flow is implemented into IP flow exporter ipfixprobe.

Description of dataset files

The following table describes each dataset file:

| File name | Detection problem | Citation of the original raw dataset |
| --- | --- | --- |
| botnet_binary.csv | Binary detection of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
| botnet_multiclass.csv | Multi-class classification of botnet | S. García et al. An Empirical Comparison of Botnet Detection Methods. Computers & Security, 45:100–123, 2014. |
| cryptomining_design.csv | Binary detection of cryptomining; the design part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
| cryptomining_evaluation.csv | Binary detection of cryptomining; the evaluation part | Richard Plný et al. Datasets of Cryptomining Communication. Zenodo, October 2022 |
| dns_malware.csv | Binary detection of malware DNS | Samaneh Mahdavifar et al. Classifying Malicious Domains using DNS Traffic Analysis. In DASC/PiCom/CBDCom/CyberSciTech 2021, pages 60–67. IEEE, 2021. |
| doh_cic.csv | Binary detection of DoH | Mohammadreza MontazeriShatoori et al. Detection of doh tunnels using time-series classification of encrypted traffic. In DASC/PiCom/CBDCom/CyberSciTech 2020, pages 63–70. IEEE, 2020 |
| doh_real_world.csv | Binary detection of DoH | Kamil Jeřábek et al. Collection of datasets with DNS over HTTPS traffic. Data in Brief, 42:108310, 2022 |
| dos.csv | Binary detection of DoS | Nickolaos Koroniotis et al. Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst., 100:779–796, 2019. |
| edge_iiot_binary.csv | Binary detection of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
| edge_iiot_multiclass.csv | Multi-class classification of IoT malware | Mohamed Amine Ferrag et al. Edge-iiotset: A new comprehensive realistic cyber security dataset of iot and iiot applications: Centralized and federated learning, 2022. |
| https_brute_force.csv | Binary detection of HTTPS Brute Force | Jan Luxemburk et al. HTTPS Brute-force dataset with extended network flows, November 2020 |
| ids_cic_binary.csv | Binary detection of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
| ids_cic_multiclass.csv | Multi-class classification of intrusion in IDS | Iman Sharafaldin et al. Toward generating a new intrusion detection dataset and intrusion traffic characterization. ICISSP, 1:108–116, 2018. |
| unsw_binary.csv | Binary detection of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
| unsw_multiclass.csv | Multi-class classification of intrusion in IDS | Nour Moustafa and Jill Slay. Unsw-nb15: a comprehensive data set for network intrusion detection systems (unsw-nb15 network data set). In 2015 military communications and information systems conference (MilCIS), pages 1–6. IEEE, 2015. |
| iot_23.csv | Binary detection of IoT malware | Sebastian Garcia et al. IoT-23: A labeled dataset with malicious and benign IoT network traffic, January 2020. More details here: https://www.stratosphereips.org/datasets-iot23 |
| ton_iot_binary.csv | Binary detection of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. Sustainable Cities and Society, 72:102994, 2021 |
| ton_iot_multiclass.csv | Multi-class classification of IoT malware | Nour Moustafa. A new distributed architecture for evaluating ai-based security systems at the edge: Network ton iot datasets. |
Trace-Share Dataset for Evaluation of Statistical Characteristics...
data.niaid.nih.gov
Updated Jan 24, 2020
Cermak, Milan (2020). Trace-Share Dataset for Evaluation of Statistical Characteristics Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3553062
Dataset updated
Jan 24, 2020
Dataset provided by
Cermak, Milan
Madeja, Tomas
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains all data used during the evaluation of statistical characteristics preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

For more information, see the project repository at https://github.com/Trace-Share.

Selected Attack Traces

We selected 72 different traces of network attacks obtained from various internet databases. File names refer to common names of contained vulnerabilities, malware, or attack tools.

Background Traffic Data

The publicly available dataset CSE-CIC-IDS-2018 was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used, to filter out less significant traffic.
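The restriction to traffic aimed at the victim range can be illustrated with a short Python sketch using the standard `ipaddress` module. The record layout (dictionaries with `src_ip` and `dst_ip` keys) and the sample addresses are assumptions made for illustration; the dataset itself works with PCAP files, and the authors' actual filtering tool is not described here.

```python
import ipaddress

# Victim range stated in the dataset description
VICTIM_NET = ipaddress.ip_network("172.31.69.0/24")


def aimed_at_victims(records, network=VICTIM_NET):
    """Keep only records whose destination address lies inside the network."""
    return [r for r in records if ipaddress.ip_address(r["dst_ip"]) in network]


# Hypothetical records, for illustration only
records = [
    {"src_ip": "192.0.2.10", "dst_ip": "172.31.69.25"},   # kept: destination in range
    {"src_ip": "172.31.69.25", "dst_ip": "192.0.2.10"},   # dropped: only the source is in range
    {"src_ip": "198.51.100.7", "dst_ip": "172.31.70.4"},  # dropped: neighbouring subnet
]
kept = aimed_at_victims(records)
```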

Evaluation Results and Dataset Structure

Traces variants (traces-normalized.zip, traces-adjusted.zip)

./traces-normalized/ — normalized PCAP files and details in YAML format;

./traces-adjusted/ — configuration files for traces combination in YAML format.

Computed statistics (statistics.zip)

./statistics-background/ — background traffic statistics computed by ID2T;

./statistics-combination/ — combined traces statistics computed by ID2T for all adjust options (selected only combinations where ID2T provided all statistics files);

./statistics-difference/ — computed mean and median differences of background and combined traffic traces.

Evaluation results

statistics-difference.ipynb — file containing visualization of statistics differences.
Trace-Share Dataset for Evaluation of Trace Meaning Preservation
data.niaid.nih.gov
zenodo.org
Updated May 7, 2020
Madeja, Tomas (2020). Trace-Share Dataset for Evaluation of Trace Meaning Preservation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3547527
Dataset updated
May 7, 2020
Dataset provided by
Cermak, Milan
Madeja, Tomas
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains all data used during the evaluation of trace meaning preservation. Archives are protected by password "trace-share" to avoid false detection by antivirus software.

For more information, see the project repository at https://github.com/Trace-Share.

Selected Attack Traces

The following list contains trace datasets used for evaluation. Each attack was chosen to have not only a different meaning but also different statistical properties.

dos_http_flood — the capture of GET and POST requests sent to one server by one attacker (HTTP traffic);

ftp_bruteforce — short and unsuccessful attempt to guess a user’s password for FTP service (FTP traffic);

ponyloader_botnet — Pony Loader botnet used for stealing of credentials from 3 target devices reporting to single IP with a large number of intermediate addresses (DNS and HTTP traffic);

scan — the capture of the nmap tool scanning a given subnet using ICMP echo and TCP SYN requests (consists of ARP, ICMP, and TCP traffic);

wannacry_ransomware — the capture of WannaCry ransomware that spreads in a domain with three workstations, a domain controller, and a file-sharing server (SMB and SMBv2 traffic).

Background Traffic Data

The publicly available dataset CSE-CIC-IDS-2018 was used as background traffic data. The evaluation uses data from the day Thursday-01-03-2018, which contains a sufficient proportion of regular traffic without any statistically significant attacks. Only traffic aimed at victim machines (range 172.31.69.0/24) is used, to filter out less significant traffic.

Evaluation Results and Dataset Structure

Traces variants (traces.zip)

./traces-original/ — trace PCAP files and crawled details in YAML format;

./traces-normalized/ — normalized PCAP files and details in YAML format;

./traces-adjusted/ — adjusted PCAP files using various timestamp generation settings, combination configuration in YAML format, and labels provided by ID2T in XML format.

Extracted alerts (alerts.zip)

./alerts-original/ — extracted Suricata alerts, Suricata log, and full Suricata output for all original trace files;

./alerts-normalized/ — extracted Suricata alerts, Suricata log, and full Suricata output for all normalized trace files;

./alerts-adjusted/ — extracted Suricata alerts, Suricata log, and full Suricata output for all adjusted trace files.

Evaluation results

*.csv files in the root directory — extracted alert signatures and their counts for each trace variant.
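As a rough illustration of how alert signatures could be counted for one trace variant, the sketch below tallies signatures from Suricata output. It assumes the alerts are available as EVE JSON lines (one JSON object per line, with `event_type` set to `alert` and the signature under `alert.signature`); the format of the files in alerts.zip is not stated in the description, and the sample signature names are made up.

```python
import json
from collections import Counter


def count_alert_signatures(eve_lines):
    """Count alert signatures in an iterable of Suricata EVE JSON lines."""
    counts = Counter()
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("event_type") == "alert":
            counts[event["alert"]["signature"]] += 1
    return counts


# Made-up sample events, for illustration only
sample = [
    '{"event_type": "alert", "alert": {"signature": "EXAMPLE signature A"}}',
    '{"event_type": "flow"}',
    '{"event_type": "alert", "alert": {"signature": "EXAMPLE signature A"}}',
    '{"event_type": "alert", "alert": {"signature": "EXAMPLE signature B"}}',
]
signature_counts = count_alert_signatures(sample)
```

Running the same function over the original, normalized, and adjusted variants of a trace would give per-variant counts that can be compared side by side.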
DDoS Detection
figshare.com
zip
Updated Feb 17, 2025
Wei Cui (2025). DDoS Detection [Dataset]. http://doi.org/10.6084/m9.figshare.28428494.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.28428494.v1
Dataset updated
Feb 17, 2025
Dataset provided by
figshare
Authors
Wei Cui
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
With the development of smart grids, power grid systems are gradually becoming more complex. This poses new challenges for ensuring the security of power grid systems. The prevailing approach to network intrusion detection relies heavily on manually engineered features, often requiring rigorous expertise and struggling to accommodate a diverse array of attack types. In response to this challenge, we employed a windowing technique to segment network traffic data into manageable samples. These samples are subsequently input into the Informer network for feature extraction and classification, facilitating intrusion detection. Our proposed algorithm simultaneously considers both the temporal information of sessions and overall attention information, autonomously learning features from traffic data. Experimental evaluations using CICIDS-2018 network traffic data demonstrate the algorithm’s effectiveness in DDoS attack detection, yielding promising results.
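The windowing step mentioned in the description can be pictured with a generic sliding-window sketch. The window length, the stride, and the decision to drop a trailing partial window are assumptions chosen for illustration; the description does not state which settings the authors used.

```python
def sliding_windows(records, window, stride):
    """Split a sequence of traffic records into fixed-length, possibly overlapping samples."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    # Assumption: a trailing partial window is dropped rather than padded
    return [
        records[start:start + window]
        for start in range(0, len(records) - window + 1, stride)
    ]


samples = sliding_windows(list(range(6)), window=3, stride=2)
```

With six records, a window of three, and a stride of two, this yields two samples, `[0, 1, 2]` and `[2, 3, 4]`; each sample would then be passed to the model as one input sequence.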
Data types in CICIDS2018 dataset.
plos.figshare.com
xls
Updated Aug 29, 2025
Congyuan Xu; Jun Yang; Panpan Li (2025). Data types in CICIDS2018 dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0331065.t005
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0331065.t005
Dataset updated
Aug 29, 2025
Dataset provided by
PLOS ONE
Authors
Congyuan Xu; Jun Yang; Panpan Li
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The widespread deployment of Internet of Things (IoT) devices has made them prime targets for cyberattacks. Existing intrusion detection systems (IDSs) heavily rely on large-scale labeled datasets, which limits their effectiveness in detecting novel attacks under few-shot scenarios. To address this challenge, we propose a meta-learning-based intrusion detection method called MACML (Marrying Attention and Convolution-based Meta-Learning). It integrates a self-attention mechanism to capture global dependencies and a convolutional neural network to extract local features, thereby enhancing the model’s overall perception of traffic characteristics. MACML adopts an optimization-based meta-learning framework that enables rapid adaptation to new tasks using only a small number of training samples, improving detection performance and generalization capability. We evaluate MACML on the CICIDS2018 and CICIoT2023 datasets. Experimental results show that, with only 10 training samples, MACML achieves an average accuracy of 98.75% and a detection rate of 99.17% on the CICIDS2018 dataset. On the CICIoT2023 dataset, it reaches 94.47% accuracy and a 95.32% detection rate, outperforming existing state-of-the-art methods.
Parkes pulsar observations with undefined project IDs for 2018_BPSR_14
researchdata.edu.au
data.csiro.au
datadownload
Updated Nov 10, 2018
CSIRO (2018). Parkes pulsar observations with undefined project IDs for 2018_BPSR_14 [Dataset]. http://doi.org/10.25919/5BE54E9A3F87F
Available download formats: datadownload
Unique identifier
https://doi.org/10.25919/5BE54E9A3F87F
Dataset updated
Nov 10, 2018
Dataset authored and provided by
CSIRO, http://www.csiro.au/
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 2018 - Dec 31, 2018
Description
Parkes pulsar observations with undefined project IDs for 2018.

This collection contains observations from project P955.

Perimeter Intrusion Detection Systems Market Analysis North America, Europe,...

technavio.com

pdf

Updated Jul 12, 2024

Technavio (2024). Perimeter Intrusion Detection Systems Market Analysis North America, Europe, APAC, Middle East and Africa, South America - US, Germany, China, UK, India - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/perimeter-intrusion-detection-systems-market-industry-analysis

Available download formats: pdf

Dataset updated

Jul 12, 2024

Dataset provided by

TechNavio

Authors

Technavio

License

https://www.technavio.com/content/privacy-notice

Time period covered

2024 - 2028

Area covered

United States

Description

Perimeter Intrusion Detection Systems Market Size 2024-2028

The perimeter intrusion detection systems market size is forecast to increase by USD 6.42 billion, at a CAGR of 10.03% between 2023 and 2028.

Perimeter intrusion detection systems (PIDS) are essential for securing critical infrastructure against unauthorized access and potential threats, particularly in sectors such as oil refineries and banking and financial institutions. The market for these systems is driven by the need to mitigate criminal activities and prevent terrorist attacks. Technological advancements, including signal processing, artificial intelligence, machine learning, data analytics, computing technologies, video analytics, and visual alarm verification, are key trends in the market. These technologies enhance the system's ability to detect and respond to intrusions effectively. Additionally, the increasing demand for real-time threat detection and response, as well as the need for surveillance data security, further boosts the market's growth. Explosions and other catastrophic events at oil refineries underscore the importance of reliable and effective perimeter security systems.

What will be the Size of the Market During the Forecast Period?

The market is witnessing significant growth due to the increasing need for advanced security solutions. These systems play a crucial role in safeguarding critical infrastructure from potential threats such as terrorism, criminal activities, burglaries, thefts, explosions, and other unauthorized intrusions. PIDS utilize various technologies, including sensors, video surveillance systems, and radar, to detect and alert security personnel of any unauthorized access or potential threats. Sensors are the backbone of these systems, with microwave sensors, infrared sensors, fiber optic sensors, and radar sensors being commonly used. Microwave sensors operate by detecting the reflection of microwave energy off intruders or objects, while infrared sensors detect heat signatures. Fiber optic sensors can detect vibrations, strain, temperature changes, and other physical disturbances. Radar sensors use electromagnetic waves to detect objects and movements within their range. Video surveillance systems are another essential component of PIDS. Cameras monitor the perimeter, and video management software processes and stores the footage. Hardware, such as servers and storage devices, ensure the efficient processing and retention of data.
Professional services and managed services further enhance the functionality of these systems. The US oil refineries, power plants, and other critical infrastructure facilities are significant users of PIDS. These systems provide early warning of potential threats, enabling security personnel to respond promptly and effectively. The integration of PIDS with other security systems, such as Access Control Systems (ACS) and Automatic Identification Systems (AIS), further enhances the overall security of these facilities. The US market for Perimeter Intrusion Detection Systems is expected to experience steady growth due to the increasing focus on infrastructure security. The integration of advanced technologies, such as AI and machine learning, is also expected to drive market growth. As the threat landscape continues to evolve, the demand for strong and reliable PIDS solutions will continue to increase.

How is this market segmented and which is the largest segment?

The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

Component

  Solutions
  Services

Geography

  North America
    US
  Europe
    Germany
    UK
  APAC
    China
    India
  Middle East and Africa
  South America

By Component Insights

The solutions segment is estimated to witness significant growth during the forecast period.

Perimeter intrusion detection systems are essential security solutions used in various industries to identify and prevent unauthorized access or suspicious activities. These systems employ advanced technologies such as video analytics, signal processing, artificial intelligence, and machine learning for enhanced security. In industries like oil refineries, where the risk of explosions is high, perimeter intrusion detection systems play a crucial role in safeguarding assets and personnel. Video analytics is a significant component of these systems, utilizing data analytics and computing technologies to analyze video footage for motion detection and other potential threats. Machine learning algorithms are employed to improve the system's accuracy and reduce false alarms.

Visual alarm verification is another feature that helps verify alarms before alerting security personnel, mini

2018 U.S. Congressional Election Tweet Ids
dataverse.harvard.edu
Updated Feb 7, 2019
Laura Wrubel; Justin Littman; Dan Kerchner (2019). 2018 U.S. Congressional Election Tweet Ids [Dataset]. http://doi.org/10.7910/DVN/AEZPLU
Unique identifier
https://doi.org/10.7910/DVN/AEZPLU
Dataset updated
Feb 7, 2019
Dataset provided by
Harvard Dataverse
Authors
Laura Wrubel; Justin Littman; Dan Kerchner
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
United States
Description
This dataset contains the tweet ids of 171,248,476 tweets related to the 2018 U.S. Congressional Election. They were collected between January 22, 2018 and January 3, 2019 from the Twitter API using Social Feed Manager. See each collection's README for dates of collection, accounts, and hashtags used in queries.

These tweet ids are broken up into 5 collections. Each collection was collected either from the GET statuses/user_timeline method of the Twitter REST API (retrieved on a weekly schedule) or the POST statuses/filter method of the Twitter Stream API. The collections are:

Senate candidates (Twitter user timeline): senate_accounts.txt

House candidates (Twitter user timeline): house_accounts.txt

Election filter (Twitter filter): election-filter-[1-3].txt

Partisan Democratic filter (Twitter filter): partisan-dem-[1-4].txt

Partisan Republican filter (Twitter filter): partisan-rep-[1-11].txt

There is a README.txt file for each collection containing additional documentation on how it was collected. There is also an accounts.csv file for those collections collected from the GET statuses/user_timeline method, listing the Twitter accounts that were collected.

The GET statuses/lookup method supports retrieving the complete tweet for a tweet id (known as hydrating). Tools such as Twarc or Hydrator can be used to hydrate tweets.

Per Twitter’s Developer Policy, tweet ids may be publicly shared for academic purposes; tweets may not. Questions about this dataset can be sent to sfm@gwu.edu. George Washington University researchers should contact us for access to the tweets.
