99 datasets found
  1. CICIDS2017

    • ieee-dataport.org
    Updated Feb 28, 2026
    Cite
    Haolei Chen (2026). CICIDS2017 [Dataset]. https://ieee-dataport.org/documents/cicids2017
    Dataset updated
    Feb 28, 2026
    Authors
    Haolei Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset has been found to have a few major shortcomings. These issues are sufficient to bias the detection engine of any typical IDS.

  2. Improved CICIDS2017 and CSECICIDS2018

    • kaggle.com
    zip
    Updated Aug 15, 2023
    Cite
    Ernie (2023). Improved CICIDS2017 and CSECICIDS2018 [Dataset]. https://www.kaggle.com/datasets/ernie55ernie/improved-cicids2017-and-csecicids2018
    Available download formats: zip (10985642855 bytes)
    Dataset updated
    Aug 15, 2023
    Authors
    Ernie
    Description

    This dataset is obtained from Error Prevalence in NIDS datasets: A Case Study on CIC-IDS-2017 and CSE-CIC-IDS-2018.

    It is improved according to the paper [1].

    [1] Liu, Lisa, et al. "Error Prevalence in NIDS Datasets: A Case Study on CIC-IDS-2017 and CSE-CIC-IDS-2018." 2022 IEEE Conference on Communications and Network Security (CNS). IEEE, 2022.

  3. Intrusion Detection Datasets (BCCC-CIC-IDS-2017)

    • kaggle.com
    zip
    Updated Apr 18, 2025
    Cite
    Behaviour-Centric Cybersecurity Center (BCCC) (2025). Intrusion Detection Datasets (BCCC-CIC-IDS-2017) [Dataset]. https://www.kaggle.com/datasets/bcccdatasets/intrusion-detection-datasets-bccc-cic-ids-2017
    Available download formats: zip (393241701 bytes)
    Dataset updated
    Apr 18, 2025
    Authors
    Behaviour-Centric Cybersecurity Center (BCCC)
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Using NTLFlowLyzer, we successfully generated the “BCCC-CIC-IDS2017” dataset by extracting key flows from raw network traffic data of CIC-IDS2017, resulting in CSV files integrating essential network and transport layer features. This new dataset offers a structured approach for analyzing intrusion detection, combining diverse traffic types into multiple sub-categories. The “BCCC-CIC-IDS2017” dataset enriches the depth and variety needed to rigorously evaluate our proposed profiling model, advancing research in network security and enhancing the development of intrusion detection systems.

    The full research paper outlining the details of the dataset and its underlying principles:

    "NTLFlowLyzer: Toward Generating an Intrusion Detection Dataset and Intruders Behavior Profiling through Network Layer Traffic Analysis and Pattern Extraction", MohammadMoein Shafi, Arash Habibi Lashkari, Arousha Haghighian Roudsari, Computers & Security, 104160, ISSN 0167-4048 (2024)

  4. CICIDS2017

    • huggingface.co
    Cite
    Luiz Alberto Crispiniano Garcia, CICIDS2017 [Dataset]. https://huggingface.co/datasets/lacg030175/CICIDS2017
    Authors
    Luiz Alberto Crispiniano Garcia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CICIDS2017 Network Intrusion Detection Dataset

    The CICIDS2017 dataset from the Canadian Institute for Cybersecurity, provided with temporal and random splits for fair evaluation.

    Configurations

    temporal (default): Day-Based Temporal Split

    Note: standard is an alias for temporal — both load the same data.

    Train on Monday-Thursday, test on Friday. The model must generalize to unseen attack types (DDoS, Botnet, PortScan). from datasets import load_dataset ds =… See the full description on the dataset page: https://huggingface.co/datasets/lacg030175/CICIDS2017.
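
A minimal sketch of the day-based temporal split described above. It uses only the Python standard library and assumes each flow record carries a `day` and a `label` field; both names are hypothetical, and the column and split names on the dataset page may differ. It shows why attack types that occur only on Friday never reach the training set.

```python
# Day-based temporal split: train on Monday-Thursday, test on Friday.
# The "day" and "label" keys are illustrative assumptions, not the dataset schema.
TRAIN_DAYS = {"Monday", "Tuesday", "Wednesday", "Thursday"}
TEST_DAYS = {"Friday"}


def temporal_split(flows):
    """Return (train, test) lists of flow records split by capture day."""
    train = [f for f in flows if f["day"] in TRAIN_DAYS]
    test = [f for f in flows if f["day"] in TEST_DAYS]
    return train, test


def unseen_labels(train, test):
    """Labels that occur in the test split but never in the training split."""
    return {f["label"] for f in test} - {f["label"] for f in train}


# Toy records standing in for real flows.
flows = [
    {"day": "Monday", "label": "BENIGN"},
    {"day": "Tuesday", "label": "FTP-Patator"},
    {"day": "Wednesday", "label": "DoS Hulk"},
    {"day": "Thursday", "label": "Web Attack"},
    {"day": "Friday", "label": "BENIGN"},
    {"day": "Friday", "label": "DDoS"},
    {"day": "Friday", "label": "PortScan"},
]
train, test = temporal_split(flows)
print(len(train), len(test))
print(sorted(unseen_labels(train, test)))
```

A random split would mix Friday flows into training, so comparing the two splits gives a rough measure of how much a model depends on having already seen each attack type.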

  5. Intrusion Detection (CICIDS2017 )

    • kaggle.com
    zip
    Updated Dec 1, 2025
    Cite
    Elshewey (2025). Intrusion Detection (CICIDS2017 ) [Dataset]. https://www.kaggle.com/datasets/elshewey/intrusion-detection-cicids2017
    Available download formats: zip (70797530 bytes)
    Dataset updated
    Dec 1, 2025
    Authors
    Elshewey
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description – CICIDS2017 Binary (Normal vs Malicious)

    The CICIDS2017 Binary dataset is a balanced, binary version of the widely used CICIDS2017 Intrusion Detection benchmark, developed by the Canadian Institute for Cybersecurity (CIC) at the University of New Brunswick. The original CICIDS2017 dataset contains realistic network traffic captured over five consecutive days (July 3-7, 2017) from a real network environment with multiple attacker and victim machines. It includes both benign traffic and various up-to-date attack types such as Brute Force, DoS, DDoS, Port Scanning, Web Attacks, and Botnet activities. Using CICFlowMeter, flow-based network features were extracted from the raw traffic, resulting in labeled CSV files with more than 80 features per connection. While the original dataset is highly imbalanced, with normal traffic dominating, this binary version provides a clean, class-balanced dataset suitable for machine learning experiments. This version of the dataset was derived from the official CICIDS2017 flows.

    The dataset includes all numerical flow-based features extracted by CICFlowMeter, such as duration, byte and packet counts, flags, and statistics computed over both forward and backward directions of each flow. While the original dataset contains multi-class labels identifying the attack type, including Normal Traffic, DoS, DDoS, Port Scanning, Brute Force, Web Attacks, and Bots, this binary version introduces a new target column called Attack_Binary. In this column, a value of 0 indicates normal traffic, whereas a value of 1 corresponds to any type of malicious activity.

    The preprocessing performed to create this balanced version involved several steps. First, all cleaned and preprocessed CSV files were combined into a single dataframe containing numeric features and the original Attack Type label. The labels were normalized by converting them to lowercase and removing extra spaces, with "normal traffic" and its variants considered benign. A binary target column was then generated, where any attack type was assigned a value of 1. Because the original dataset is heavily imbalanced, with approximately 2.09 million normal flows versus 0.43 million malicious flows, class balancing was applied using the RandomUnderSampler method from the imbalanced-learn library, resulting in roughly equal numbers of normal and malicious samples. The final CSV contains all numeric flow features along with the Attack_Binary column as the main target for binary classification.
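
The label normalization and balancing steps can be sketched as follows. This is an illustration under stated assumptions, not the author's pipeline: the standard-library `random` module stands in for `RandomUnderSampler` from imbalanced-learn, only case and whitespace variants of "normal traffic" are treated as benign, and the sample labels are made up.

```python
import random


def to_binary(label):
    """Normalize a raw label and map it to 0 (benign) or 1 (any attack)."""
    normalized = " ".join(label.lower().split())  # lowercase, collapse extra spaces
    return 0 if normalized == "normal traffic" else 1


def undersample(rows, seed=0):
    """Randomly drop majority-class rows until both classes have equal size."""
    rng = random.Random(seed)
    benign = [r for r in rows if r["Attack_Binary"] == 0]
    attack = [r for r in rows if r["Attack_Binary"] == 1]
    n = min(len(benign), len(attack))
    balanced = rng.sample(benign, n) + rng.sample(attack, n)
    rng.shuffle(balanced)
    return balanced


# Toy labels: ten benign variants and three attacks (hypothetical values).
raw_labels = ["Normal Traffic"] * 8 + [" normal  traffic "] * 2 + ["DoS", "DDoS", "Bots"]
rows = [{"Attack Type": lab, "Attack_Binary": to_binary(lab)} for lab in raw_labels]
balanced = undersample(rows)
counts = {c: sum(r["Attack_Binary"] == c for r in balanced) for c in (0, 1)}
print(counts)
```

Undersampling discards most benign flows, so a model trained on the balanced file sees a very different class prior from the one it would meet in live traffic.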

    This dataset is particularly useful for binary intrusion detection tasks, benchmarking class imbalance handling methods, and comparing classic machine learning algorithms (such as Random Forest, SVM, and XGBoost) with deep learning approaches (including DNN, CNN, and RNN). It can also be employed for feature selection studies, explainability analyses, and as educational material in cybersecurity, machine learning, and big data courses.

  6. CICIDS-2017

    • huggingface.co
    Cite
    bert van keulen, CICIDS-2017 [Dataset]. https://huggingface.co/datasets/bvk/CICIDS-2017
    Authors
    bert van keulen
    Description

    Raw network data was collected over a period of 5 days, Monday through Friday, and stored in PCAP files. Monday was used to create most of the Benign data, while the Attack-Network implemented various types of attacks over the next 4 days, such as Brute Force connections (FTP and SSH), several types of DoS attacks, as well as a Botnet attack, Infiltration attacks and subsequent Port-Scanning activity. The PCAP data was processed using a tool developed by one of the authors of [1], called… See the full description on the dataset page: https://huggingface.co/datasets/bvk/CICIDS-2017.

  7. cyberbert_dataset

    • huggingface.co
    Updated Apr 10, 2025
    Cite
    Chaitany Agrawal (2025). cyberbert_dataset [Dataset]. https://huggingface.co/datasets/agrawalchaitany/cyberbert_dataset
    Dataset updated
    Apr 10, 2025
    Authors
    Chaitany Agrawal
    License

    https://choosealicense.com/licenses/other/

    Description

    Cleaned CICIDS2017 Dataset

    This dataset is a cleaned and preprocessed version of the CICIDS2017 dataset created by the Canadian Institute for Cybersecurity, University of New Brunswick.

      Modifications

    • Removed duplicate records
    • Normalized feature names
    • Filtered specific attack types
    • Pivoted the different attack data into a single dataset

      Source
    

    Original dataset: CICIDS2017

      License & Citation
    

    This dataset is provided for research purposes. Please refer… See the full description on the dataset page: https://huggingface.co/datasets/agrawalchaitany/cyberbert_dataset.

  8. CICIDS2017 - Comprehensive Network Intrusion Detection Dataset

    • iotdataset.com
    csv
    Updated Jan 20, 2026
    Cite
    University (Canadian Institute for Cybersecurity) (2026). CICIDS2017 - Comprehensive Network Intrusion Detection Dataset [Dataset]. https://iotdataset.com/data/cicids2017-network-intrusion-detection-dataset
    Available download formats: csv
    Dataset updated
    Jan 20, 2026
    Dataset provided by
    IoTDataset.com
    Authors
    University (Canadian Institute for Cybersecurity)
    License

    https://www.unb.ca/cic/datasets/ids-2017.html#download

    Time period covered
    2026
    Description

    The most cited cybersecurity dataset worldwide with 2.8+ million network flows capturing 14 types of realistic attack scenarios including DDoS, brute force, botnet, and web attacks alongside benign traffic for advanced intrusion detection systems.

  9. CIC-IDS2017

    • huggingface.co
    + more versions
    Cite
    Max, CIC-IDS2017 [Dataset]. https://huggingface.co/datasets/c01dsnap/CIC-IDS2017
    Authors
    Max
    License

    https://choosealicense.com/licenses/other/

    Description

    The CICIDS2017 dataset consists of labeled network flows, including full packet payloads in pcap format and the corresponding profiles. The labeled flows (GeneratedLabelledFlows.zip) and the CSV files for machine and deep learning purposes (MachineLearningCSV.zip) are publicly available for researchers. If you are using our dataset, you should cite our related paper, which outlines the details of the dataset and its underlying principles:

    Iman Sharafaldin, Arash Habibi Lashkari, and Ali A.… See the full description on the dataset page: https://huggingface.co/datasets/c01dsnap/CIC-IDS2017.

  10. Features of the CIC-IDS 2017 network intrusion dataset.

    • plos.figshare.com
    xls
    Updated Apr 14, 2025
    Cite
    Sanchit Vashisht; Shalli Rani; Mohammad Shabaz (2025). Features of the CIC-IDS 2017 network intrusion dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0321224.t002
    Available download formats: xls
    Dataset updated
    Apr 14, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sanchit Vashisht; Shalli Rani; Mohammad Shabaz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Features of the CIC-IDS 2017 network intrusion dataset.

  11. Distribution of stream records in CICIDS2017 dataset.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 16, 2023
    Cite
    Rodríguez, Demóstenes Zegarra; Maidin, Siti Sarah; Okey, Ogobuchi Daniel; Udo, Ekikere Umoren; Kleinschmidt, João Henrique (2023). Distribution of stream records in CICIDS2017 dataset. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001087120
    Dataset updated
    Oct 16, 2023
    Authors
    Rodríguez, Demóstenes Zegarra; Maidin, Siti Sarah; Okey, Ogobuchi Daniel; Udo, Ekikere Umoren; Kleinschmidt, João Henrique
    Description

    Distribution of stream records in CICIDS2017 dataset.

  12. CICIDS2017: Cleaned & Preprocessed

    • kaggle.com
    zip
    Updated Jan 12, 2025
    Cite
    Eric Anacleto Ribeiro (2025). CICIDS2017: Cleaned & Preprocessed [Dataset]. https://www.kaggle.com/datasets/ericanacletoribeiro/cicids2017-cleaned-and-preprocessed/discussion
    Available download formats: zip (210143955 bytes)
    Dataset updated
    Jan 12, 2025
    Authors
    Eric Anacleto Ribeiro
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Cleaned and Preprocessed CICIDS2017 Data for Machine Learning

    This dataset provides a cleaned and preprocessed version of the original CICIDS2017 network intrusion detection dataset, prepared for machine learning. It includes the following CSV file:

    1. cicids2017_cleaned.csv: Contains the raw, unscaled feature values after cleaning and preprocessing, ready for further treatment (such as scaling and sampling) after train/test split.

    Original Dataset:

    The CICIDS2017 dataset (available here) is a widely used benchmark dataset in cybersecurity research. It captures network traffic with both benign (normal) activity and various attack scenarios, making it suitable for developing and testing intrusion detection systems. However, the original dataset presents some challenges for direct use in machine learning due to missing values, duplicate entries, inconsistencies, and the need for feature engineering.

    Steps Taken:

    1. File Merging: The original CICIDS2017 dataset is split across multiple CSV files. These files have been merged into a single, unified dataset.
    2. Duplicate Removal: Duplicate rows have been identified and removed to improve data integrity and model performance. The same process was applied to identify and remove duplicate columns.
    3. Infinite Value Handling: Infinite values have been replaced with NaN and then handled along with other missing values.
    4. Missing Value Handling: Rows with missing values have been removed due to their minimal impact on the dataset (less than 1% of total rows). This decision simplifies data handling while minimizing the risk of introducing bias through imputation.
    5. Inconsistent Whitespace: Leading and trailing whitespace in column names has been removed for consistency.
    6. Data-Driven Feature Selection:
      • Columns with only one unique value have been removed, as they do not contribute to the performance of machine learning models.
      • Correlation Analysis: To reduce multicollinearity, one feature from each pair with a near-perfect correlation (>= 0.99) has been removed. This simplifies the dataset and can improve the interpretability of machine learning models.
      • H-Statistics and Tree Feature Selection: The Kruskal-Wallis test has been combined with the built-in feature selection capabilities of Random Forest to eliminate statistically irrelevant columns from the dataset.
    7. Target Feature Handling:
      • The original 'Label' column has been converted into a new column named 'Attack Type', with similar attack labels grouped into broader categories (e.g., DoS Hulk, DoS GoldenEye are grouped as "DoS").
      • Rare attack types ('Infiltration', 'Heartbleed') have been removed to prevent potential overfitting and improve model generalization.
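
A subset of the steps above can be sketched in plain Python. This is a simplified illustration rather than the notebook's code: the column names and the `LABEL_GROUPS` mapping are assumptions, the steps run in a slightly different order (column names are stripped first), and the correlation and Kruskal-Wallis / Random Forest selection steps are left out.

```python
import math

# Illustrative label grouping; the exact mapping used by the dataset author
# is an assumption here.
LABEL_GROUPS = {"DoS Hulk": "DoS", "DoS GoldenEye": "DoS"}
RARE_LABELS = {"Infiltration", "Heartbleed"}


def is_missing(value):
    """Treat None, NaN and infinite values as missing."""
    return value is None or (
        isinstance(value, float) and (math.isinf(value) or math.isnan(value))
    )


def clean(rows):
    """Apply a subset of the listed cleaning steps to a list of dict rows."""
    # Strip leading/trailing whitespace from column names.
    rows = [{k.strip(): v for k, v in r.items()} for r in rows]
    # Drop rows containing infinite or missing values.
    rows = [r for r in rows if not any(is_missing(v) for v in r.values())]
    # Remove duplicate rows, keeping the first occurrence.
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    # Drop rare attack types and group the remaining labels into 'Attack Type'.
    kept = []
    for r in unique:
        if r["Label"] in RARE_LABELS:
            continue
        features = {k: v for k, v in r.items() if k != "Label"}
        features["Attack Type"] = LABEL_GROUPS.get(r["Label"], r["Label"])
        kept.append(features)
    if not kept:
        return kept
    # Drop feature columns that hold a single unique value.
    constant = {
        k for k in kept[0]
        if k != "Attack Type" and len({r[k] for r in kept}) == 1
    }
    return [{k: v for k, v in r.items() if k not in constant} for r in kept]


rows = [
    {" Flow Duration": 10.0, "Bwd PSH Flags": 0.0, "Label": "BENIGN"},
    {" Flow Duration": 10.0, "Bwd PSH Flags": 0.0, "Label": "BENIGN"},  # duplicate
    {" Flow Duration": float("inf"), "Bwd PSH Flags": 0.0, "Label": "DoS Hulk"},
    {" Flow Duration": 25.0, "Bwd PSH Flags": 0.0, "Label": "DoS GoldenEye"},
    {" Flow Duration": 30.0, "Bwd PSH Flags": 0.0, "Label": "Heartbleed"},
]
cleaned = clean(rows)
print(cleaned)
```

On the full dataset a dataframe library would be the more practical choice; the dict-based version only makes each step explicit.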

    Source Code and Project:

    • Notebook: The Jupyter Notebook used to generate this dataset is available here.
    • Repository: This dataset is part of a larger project to develop a Raspberry Pi-based Network Intrusion Detection System (NIDS) prototype for Small and Medium Enterprises (SMEs). The complete project repository, including the NIDS prototype code, is available on GitHub.

    Kudos to chethuhn, who, among others, uploaded the original CICIDS2017 to Kaggle.

  13. UNSW-NB15 and CIC-IDS2017 Labelled PCAP Data

    • zenodo.org
    • kaggle.com
    csv
    Updated Oct 28, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yasir Ali Farrukh; Irfan Khan; Syed Wali; David Bierbrauer; John A Pavlik; Nathaniel D. Bastian (2022). UNSW-NB15 and CIC-IDS2017 Labelled PCAP Data [Dataset]. http://doi.org/10.5281/zenodo.7258579
    Available download formats: csv
    Dataset updated
    Oct 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yasir Ali Farrukh; Irfan Khan; Syed Wali; David Bierbrauer; John A Pavlik; Nathaniel D. Bastian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Packet Capture (PCAP) files of the UNSW-NB15 and CIC-IDS2017 datasets are processed and labelled using the CSV files. Each packet is labelled by comparing eight distinct features: *Source IP, Destination IP, Source Port, Destination Port, Starting time, Ending time, Protocol and Time to live*. The dimensions of the dataset are Nx1504. All columns of the dataset are integers, so you can use this dataset directly in your machine learning models. Details of the whole processing and transformation are provided in the following GitHub repo:

    https://github.com/Yasir-ali-farrukh/Payload-Byte
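
The matching idea described above can be sketched as follows. This is not the Payload-Byte implementation: the field names, the dictionary layout, and the `"unlabelled"` default are hypothetical, and the sketch only illustrates how a packet could take the label of a flow that agrees on the endpoint fields, protocol and time to live, and whose start and end times bracket the packet.

```python
# Fields that must match exactly between a packet and a flow record
# (hypothetical key names).
MATCH_KEYS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol", "ttl")


def label_packet(packet, flows, default="unlabelled"):
    """Return the label of the first flow whose fields and time window match."""
    for flow in flows:
        same_fields = all(packet[k] == flow[k] for k in MATCH_KEYS)
        in_window = flow["start"] <= packet["time"] <= flow["end"]
        if same_fields and in_window:
            return flow["label"]
    return default


# One toy flow record and two toy packets; times are seconds since capture start.
flows = [{
    "src_ip": "205.174.165.73", "dst_ip": "192.168.10.50",
    "src_port": 40000, "dst_port": 21, "protocol": 6, "ttl": 63,
    "start": 100.0, "end": 160.0, "label": "FTP-Patator",
}]
inside = dict(flows[0], time=120.0)
outside = dict(flows[0], time=500.0)
print(label_packet(inside, flows), label_packet(outside, flows))
```

A linear scan over all flows is slow on large captures; indexing the flows by the exact-match fields first would be the obvious refinement.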

    You can use the tool available at the above-mentioned GitHub repo to generate a labelled dataset from scratch. All details of the processing and transformation are provided in the following paper:

    ```bibtex
    @article{Payload,
    author = "Yasir Ali Farrukh and Irfan Khan and Syed Wali and David Bierbrauer and Nathaniel Bastian",
    title = "{Payload-Byte: A Tool for Extracting and Labeling Packet Capture Files of Modern Network Intrusion Detection Datasets}",
    year = "2022",
    month = "9",
    url = "https://www.techrxiv.org/articles/preprint/Payload-Byte_A_Tool_for_Extracting_and_Labeling_Packet_Capture_Files_of_Modern_Network_Intrusion_Detection_Datasets/20714221",
    doi = "10.36227/techrxiv.20714221.v1"
    }
    ```

  14. CICIDS2017-Images-spectrograms

    • huggingface.co
    Updated Jan 15, 2018
    Cite
    Muhammad Rashid (2018). CICIDS2017-Images-spectrograms [Dataset]. https://huggingface.co/datasets/rashid-rao/CICIDS2017-Images-spectrograms
    Dataset updated
    Jan 15, 2018
    Authors
    Muhammad Rashid
    License

    https://choosealicense.com/licenses/other/

    Description

    This directory consists of 24x24 images.

    The train folder has a total of 1548421 images from 10 classes; the test folder has 663609 images from 10 classes.

    This dataset consists of spectrogram-converted images produced using the method explained in our research article XYZ.

    Dataset Used: Intrusion detection evaluation dataset (CIC-IDS2017)
    Image Size: 28x28
    Classes: BENIGN, Bot, DDoS, DoS GoldenEye, DoS Hulk, DoS Slowhttptest, DoS slowloris, Heartbleed, Infiltration, PortScan

    License: https://www.unb.ca/cic/datasets/ids-2017.html… See the full description on the dataset page: https://huggingface.co/datasets/rashid-rao/CICIDS2017-Images-spectrograms.

  15. CICIDS2017

    • huggingface.co
    Updated Jan 15, 2018
    Cite
    Love (2018). CICIDS2017 [Dataset]. https://huggingface.co/datasets/bencorn/CICIDS2017
    Dataset updated
    Jan 15, 2018
    Authors
    Love
    License

    https://choosealicense.com/licenses/other/

    Description

    CICIDS2017 (Unofficial mirror on Hugging Face)

      Dataset Summary
    

    This repository provides a mirrored copy of the CICIDS2017 dataset files (PCAPs and accompanying archives) for easier access and reproducibility in ML/security research workflows. Important: This is not the original distribution. Please refer to the official source for authoritative documentation, updates, and terms.

      Source / Origin
    

    Original dataset name: CICIDS2017 Original publisher: Canadian… See the full description on the dataset page: https://huggingface.co/datasets/bencorn/CICIDS2017.

  16. The results in the CICIDS2017 dataset.

    • plos.figshare.com
    xls
    Updated Jan 16, 2025
    Cite
    Congyuan Xu; Yong Zhan; Guanghui Chen; Zhiqiang Wang; Siqing Liu; Weichen Hu (2025). The results in the CICIDS2017 dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0317713.t004
    Available download formats: xls
    Dataset updated
    Jan 16, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Congyuan Xu; Yong Zhan; Guanghui Chen; Zhiqiang Wang; Siqing Liu; Weichen Hu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The network intrusion detection system (NIDS) plays a critical role in maintaining network security. However, traditional NIDS relies on a large volume of samples for training, which exhibits insufficient adaptability in rapidly changing network environments and complex attack methods, especially when facing novel and rare attacks. As attack strategies evolve, there is often a lack of sufficient samples to train models, making it difficult for traditional methods to respond quickly and effectively to new threats. Although existing few-shot network intrusion detection systems have begun to address sample scarcity, these systems often fail to effectively capture long-range dependencies within the network environment due to limited observational scope. To overcome these challenges, this paper proposes a novel elevated few-shot network intrusion detection method based on self-attention mechanisms and iterative refinement. This approach leverages the advantages of self-attention to effectively extract key features from network traffic and capture long-range dependencies. Additionally, the introduction of positional encoding ensures the temporal sequence of traffic is preserved during processing, enhancing the model’s ability to capture temporal dynamics. By combining multiple update strategies in meta-learning, the model is initially trained on a general foundation during the training phase, followed by fine-tuning with few-shot data during the testing phase, significantly reducing sample dependency while improving the model’s adaptability and prediction accuracy. Experimental results indicate that this method achieved detection rates of 99.90% and 98.23% on the CICIDS2017 and CICIDS2018 datasets, respectively, using only 10 samples.

  17. CIC-IDS-2017-V2

    • kaggle.com
    zip
    Updated Nov 26, 2024
    + more versions
    Cite
    Abluva Research (2024). CIC-IDS-2017-V2 [Dataset]. https://www.kaggle.com/datasets/abluvaresearch/cic-ids-2017-v2
    Available download formats: zip (384178816 bytes)
    Dataset updated
    Nov 26, 2024
    Authors
    Abluva Research
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    CIC-IDS-V2 is an extended version of the original CIC-IDS 2017 dataset. The dataset is normalised, and one new class called "Comb" is added; it combines synthesised data from multiple non-benign classes.

    To cite the dataset, please reference the original paper with DOI: 10.1109/SmartNets61466.2024.10577645. The paper is published in IEEE SmartNets and can be accessed here.

    Citation info:

    Madhubalan, Akshayraj & Gautam, Amit & Tiwary, Priya. (2024). Blender-GAN: Multi-Target Conditional Generative Adversarial Network for Novel Class Synthetic Data Generation. 1-7. 10.1109/SmartNets61466.2024.10577645.

    This dataset was made by Abluva Inc, a Palo Alto based, research-driven Data Protection firm. Our data protection platform empowers customers to secure data through advanced security mechanisms such as Fine Grained Access control and sophisticated depersonalization algorithms (e.g. Pseudonymization, Anonymization and Randomization). Abluva's Data Protection solutions facilitate data democratization within and outside the organizations, mitigating the concerns related to theft and compliance. The innovative intrusion detection algorithm by Abluva employs patented technologies for an intricately balanced approach that excludes normal access deviations, ensuring intrusion detection without disrupting the business operations. Abluva’s Solution enables organizations to extract further value from their data by enabling secure Knowledge Graphs and deploying Secure Data as a Service among other novel uses of data. Committed to providing a safe and secure environment, Abluva empowers organizations to unlock the full potential of their data.

  18. Ablation results of CICIDS2017 data set.

    • plos.figshare.com
    xls
    Updated May 15, 2025
    Cite
    Haizhen Wang; Xiaojing Yang; Na Jia (2025). Ablation results of CICIDS2017 data set. [Dataset]. http://doi.org/10.1371/journal.pone.0322839.t007
    Available download formats: xls
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Haizhen Wang; Xiaojing Yang; Na Jia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Software Defined Networking (SDN) is an emerging network architecture and management method whose core idea is to separate the network control plane from the data transmission plane. Because of this characteristic, SDN controllers are susceptible to external malicious attacks, the most common of which are Distributed Denial of Service (DDoS) attacks. This paper proposes a DDoS detection method called ConvLSTM-MHA-TWD, based on the Convolutional Long Short-Term Memory Network (ConvLSTM) and three-way decision (TWD). It addresses insufficient feature extraction in SDN environments and improves classification accuracy. The method uses ConvLSTM to extract data features and a multi-head attention (MHA) mechanism to learn long-distance dependencies in the input data, and then constructs a multi-granularity feature space. The ConvLSTM and MHA outputs are added to form a residual connection, which further enhances feature extraction and temporal modeling and mitigates vanishing gradients during model training. Three-way decision theory is then used to decide on network behaviors immediately where possible. For network behaviors that cannot be decided immediately, the decision is delayed, and feature extraction and decision are performed on them again. Finally, the classification results are output. Experiments on the CICIDS2017 and DDoS SDN data sets gave accuracy rates of 0.994 and 0.977, respectively; the method shows better overall performance and is suitable for training on large amounts of data.
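
The three-way decision step can be illustrated with a small sketch. The thresholds `alpha` and `beta` and their default values are assumptions chosen for illustration; the abstract does not state the thresholds the paper uses or how deferred samples are re-processed.

```python
def three_way_decision(p_attack, alpha=0.8, beta=0.2):
    """Map an attack probability to 'attack', 'benign', or 'defer'.

    alpha and beta are illustrative thresholds, not values from the paper.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if p_attack >= alpha:
        return "attack"   # confident enough to accept immediately
    if p_attack <= beta:
        return "benign"   # confident enough to reject immediately
    return "defer"        # boundary region: decide later with more evidence


decisions = [three_way_decision(p) for p in (0.95, 0.50, 0.05)]
print(decisions)
```

Samples that fall in the `defer` region would be passed through feature extraction and decision again, as the abstract describes.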

  19. Network Intrusion dataset(CIC-IDS- 2017)

    • kaggle.com
    zip
    Updated Aug 28, 2023
    Cite
    Chethan H N (2023). Network Intrusion dataset(CIC-IDS- 2017) [Dataset]. https://www.kaggle.com/datasets/chethuhn/network-intrusion-dataset/versions/1
    Available download formats: zip (240838240 bytes)
    Dataset updated
    Aug 28, 2023
    Authors
    Chethan H N
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is the Intrusion Detection Evaluation Dataset (CIC-IDS2017); you can find the dataset by this link.

    This network dataset has two classes: Normal and Anomaly.

    These are the things you can try with this data:

    1) The main aim is to detect anomalies using labelled data.

    2) Also try to detect patterns in the Normal and Anomaly data without labels, using unsupervised methods.

    3) Also try to build a model that detects abnormal behaviour in each system, if you can.

    Performance Metrics

    • Accuracy
    • Precision (weighted, micro, macro)
    • Recall
    • ROC
    • AUC
    • F1-Score (weighted, micro, macro)
    • Sensitivity
    • Classification Report
    • Custom metric

    Select any of these metrics to validate your model, and justify why you chose that particular metric.
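
To make the difference between the weighted, micro and macro averages concrete, here is a minimal standard-library sketch for precision. The function name and the toy predictions are illustrative; in practice a library such as scikit-learn would normally compute these.

```python
from collections import Counter


def precision_averages(y_true, y_pred):
    """Return macro, micro and weighted precision for a label sequence pair."""
    labels = sorted(set(y_true) | set(y_pred))
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    predicted = Counter(y_pred)   # how often each class was predicted
    support = Counter(y_true)     # how often each class truly occurs
    per_class = {
        c: (true_pos[c] / predicted[c] if predicted[c] else 0.0) for c in labels
    }
    macro = sum(per_class.values()) / len(labels)             # classes count equally
    micro = sum(true_pos.values()) / len(y_pred)              # samples count equally
    weighted = sum(per_class[c] * support[c] for c in labels) / len(y_true)
    return {"macro": macro, "micro": micro, "weighted": weighted}


# Imbalanced toy data: 8 Normal and 2 Anomaly samples.
y_true = ["Normal"] * 8 + ["Anomaly"] * 2
y_pred = ["Normal"] * 6 + ["Anomaly"] * 4
scores = precision_averages(y_true, y_pred)
print(scores)
```

On imbalanced data the three averages diverge, and the macro average is the one most affected by poor precision on the minority class. That gap is one concrete reason to prefer one metric over another.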

    These are the details of the dataset; you can find out more about this dataset in this link.

    Features of Dataset

    https://www.researchgate.net/publication/329758083/figure/tbl1/AS:705248024879106@1545155640395/Network-flow-features-of-the-CICIDS2017-dataset.png

    Normal and Attack Activity details of each file

    https://www.researchgate.net/publication/329045441/figure/tbl1/AS:708844267241473@1546013051414/Description-of-files-containing-CICIDS2017-dataset.png

    Monday, July 3, 2017

    • Benign (Normal human activities)

    Tuesday, July 4, 2017

    • Brute Force

    • FTP-Patator (9:20 – 10:20 a.m.)

    • SSH-Patator (14:00 – 15:00 p.m.)

    Attacker: Kali, 205.174.165.73

    Victim: WebServer Ubuntu, 205.174.165.68 (Local IP: 192.168.10.50)

    NAT Process on Firewall:

    Attack: 205.174.165.73 -> 205.174.165.80 (Valid IP of the Firewall) -> 172.16.0.1 -> 192.168.10.50

    Reply: 192.168.10.50 -> 172.16.0.1 -> 205.174.165.80 -> 205.174.165.73
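
    Because of the NAT step, the same victim can appear under its public or its local address. The sketch below shows one way to normalize victim addresses to their local form, using only the address pairs listed in this description. The names `PUBLIC_TO_LOCAL` and `to_local` are hypothetical, and which address actually appears in a given capture or flow file is an assumption to verify against the data.

```python
import ipaddress

# Public -> local victim addresses, copied from the schedule in this description.
PUBLIC_TO_LOCAL = {
    "205.174.165.68": "192.168.10.50",  # WebServer Ubuntu
    "205.174.165.66": "192.168.10.51",  # Ubuntu12
}


def to_local(ip: str) -> str:
    """Return the local address for a NATed victim IP, or the input unchanged."""
    ipaddress.ip_address(ip)  # raises ValueError if ip is not a valid address
    return PUBLIC_TO_LOCAL.get(ip, ip)
```

    For example, `to_local("205.174.165.68")` yields the web server's local address, while the attacker address `205.174.165.73` passes through unchanged because it has no entry in the table.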

    Wednesday, July 5, 2017

    • DoS / DDoS

    • DoS slowloris (9:47 – 10:10 a.m.)

    • DoS Slowhttptest (10:14 – 10:35 a.m.)

    • DoS Hulk (10:43 – 11 a.m.)

    • DoS GoldenEye (11:10 – 11:23 a.m.)

    Attacker: Kali, 205.174.165.73

    Victim: WebServer Ubuntu, 205.174.165.68 (Local IP: 192.168.10.50)

    NAT Process on Firewall:

    Attack: 205.174.165.73 -> 205.174.165.80 (Valid IP of the Firewall) -> 172.16.0.1 -> 192.168.10.50

    Reply: 192.168.10.50 -> 172.16.0.1 -> 205.174.165.80 -> 205.174.165.73

    Heartbleed Port 444 (15:12 - 15:32)

    • Attacker: Kali, 205.174.165.73

    Victim: Ubuntu12, 205.174.165.66 (Local IP: 192.168.10.51)

    NAT Process on Firewall:

    • Attack: 205.174.165.73 -> 205.174.165.80 (Valid IP of the Firewall) -> 172.16.0.11 -> 192.168.10.51

    Reply: 192.168.10.51 -> 172.16.0.1 -> 205.174.165.80 -> 205.174.165.73
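
    The attack windows listed for Wednesday can be turned into a simple lookup, sketched below. The names `WEDNESDAY_WINDOWS` and `attack_window` are hypothetical. The sketch assumes timestamps are already in the same local time as the schedule, and it treats window endpoints as inclusive, which is a choice rather than something the description specifies.

```python
from datetime import date, datetime, time

# Wednesday, July 5, 2017 windows as listed in the schedule: (start, end, attack).
WEDNESDAY_WINDOWS = [
    (time(9, 47), time(10, 10), "DoS slowloris"),
    (time(10, 14), time(10, 35), "DoS Slowhttptest"),
    (time(10, 43), time(11, 0), "DoS Hulk"),
    (time(11, 10), time(11, 23), "DoS GoldenEye"),
    (time(15, 12), time(15, 32), "Heartbleed"),
]


def attack_window(ts: datetime):
    """Return the attack whose window contains ts, or None if no window matches."""
    if ts.date() != date(2017, 7, 5):
        return None
    for start, end, name in WEDNESDAY_WINDOWS:
        if start <= ts.time() <= end:  # endpoints inclusive (assumption)
            return name
    return None
```

    A timestamp falling inside a window does not by itself make a flow malicious, since benign traffic continues during the attacks. A lookup like this is best combined with the attacker and victim addresses given for each attack.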

    Thursday, July 6, 2017

    Morning - Web Attack – Brute Force (9:20 – 10 a.m.)

    • Web Attack – XSS (10:15 – 10:35 a.m.)

    • Web Attack – SQL Injection (10:40 – 10:42 a.m.)

    • Attacker: Kali, 205.174.165.73

    • Victim: WebServer Ubuntu, 205.174.165.68 (Local IP: 192.168.10.50)

    NAT Process on Firewall:

    • Attack: 205.174.165.73 -> 205.174.165.80 (Valid IP of the Firewall) -> 172.16.0.1 -> 192.168.10.50

    • Reply: 192.168.10.50 -> 172.16.0.1 -> 205.174.165.80 -> 205.174.165.73

    Afternoon - Infiltration – Dropbox download

    • Metasploit Win Vista (14:19 and 14:20 – 14:21 p.m.) and (14:33 – 14:35)

    • Attacker: Kali, 205.174.165.73

    • Victim: Windows Vista, 192.168.10.8

    Infiltration – Cool disk – MAC (14:53 p.m. – 15:00 p.m.)

    Attacker: Kali, 205.174.165.73

    Victim: MAC, 192.168.10.25

    Infiltration – Dropbox download

    Win Vista (15:04 – 15:45 p.m.)

    First Step:

    Attacker: Kali, 205.174.165.73

    Victim: Windows Vista, 192.168.10.8

    Second Step (Portscan + Nmap):

    Attacker: Vista, 192.168.10.8

    Victim: All other clients

    Friday, July 7, 2017

    Morning - Botnet ARES (10:02 a.m. – 11:02 a.m.)

    • Attacker: Kali, 205.174.165.73

    • Victims: Win 10, 192.168.10.15 + Win 7, 192.168.10.9 + Win 10, 192.168.10.14 + Win 8, 192.168.10.5 + Vista, 192.168.10.8

    Afternoon - Port Scan:

    Firewall Rule on (13:55 – 13:57, 13:58 – 14:00, 14:01 – 14:04, 14:05 – 14:07, 14:08 - 14:10, 14:11 – 14:13, 14:14 – 14:16, 14:17 – 14:19, 14:20 – 14:21, 14:22 – 14:24, 14:33 – 14:33, 14:35 - 14:35)

    Firewall rules off (sS 14:51-14:53, sT 14:54-14:56, sF 14:57-14:59, sX 15:00-15:02, sN 15:03-15:05, sP 15:06-15:07, sV 15:08-15:10, sU 15:11-15:12, sO 15:13-15:15, sA 15:16-15:18, sW 15:19-15:21, sR 15:22-15:24, sL 15:25-15:25, sI 15:26-15:27, b 15:28-15:29)

    • Attacker: Kali, 205.174.165.73

    • Victim: Ubuntu16, 205.174.165.68 (Local IP: 192.168.10.50)

    NAT Process on Firewall:

    • Attacker: 205.174.165.73 -> 205.174.165.80 (Valid IP of the Firewall) -> 172.16.0.1

    Afternoon - DDoS LOIC (15:56 – 16:16)

    Attackers: Three Win 8.1, 205.174.165.69 - 71

    Victim: Ubuntu16, 205.174.165.68 (Local IP: 192.168.10.50)

    NAT Process on Firewall...

  20. CIC-IDS-Collection

    • kaggle.com
    • huggingface.co
    zip
    Updated Nov 9, 2022
    Cite
    StrGenIx | Laurens D'hooge (2022). CIC-IDS-Collection [Dataset]. https://www.kaggle.com/datasets/dhoogla/cicidscollection
    Explore at:
    Available download formats: zip (864681190 bytes)
    Dataset updated
    Nov 9, 2022
    Authors
    StrGenIx | Laurens D'hooge
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The Canadian Institute for Cybersecurity has published several datasets for network intrusion detection. Four of them: CIC-IDS2017, CIC-DoS2017, CSE-CIC-IDS2018 and CIC-DDoS2019 are collated here into one collection, cleaned up and with harmonized labeling.

    The intent behind this collection is simple: to have a larger, more varied set of NIDS samples for more powerful analyses by researchers. Too often, researchers still rely on the individual datasets even though the full set is compatible out-of-the-box. The parts have been created for the same purpose and they have been processed with the same feature extraction tool chain.

    This collection also takes into account 2 articles in which flawed features were discovered. Those features have been removed from the dataset. See the cleanup notebook for more information.

    If you make use of this combined version, please credit the original authors. The relevant publications are cited here on Kaggle alongside the individual datasets and they are also readily available at the CIC's official dataset distribution page
