100+ datasets found
  1. CIC-IDS 2018 Dataset

    • kaggle.com
    zip
    Updated Aug 13, 2025
    Cite
    nagi (2025). CIC-IDS 2018 Dataset [Dataset]. https://www.kaggle.com/datasets/primus11/cic-ids-2018-dataset/data
    Explore at:
    Available download formats: zip (80066040 bytes)
    Dataset updated
    Aug 13, 2025
    Authors
    nagi
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    CICIDS Dataset

    The Canadian Institute for Cybersecurity Intrusion Detection System (CICIDS) dataset is a modern and comprehensive benchmark dataset for network intrusion detection research.
    It was created by the Canadian Institute for Cybersecurity (CIC) in collaboration with industry partners to address the limitations of older datasets (such as KDD99 and NSL-KDD) by providing realistic traffic patterns, up-to-date attack types, and a balanced mix of normal and malicious activities.

    Key Characteristics

    • Realistic Traffic Generation: Traffic was captured in a controlled but realistic enterprise-like network, including servers, clients, switches, and routers.
    • Diverse Attack Scenarios:
      • Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS)
      • Brute force (SSH, FTP)
      • Web-based attacks (XSS, SQL Injection, Command Injection)
      • Infiltration from inside the network
      • Botnet activities
      • Port scanning and reconnaissance
    • Data Capture: Raw traffic was recorded in PCAP format.
    • Feature Extraction: Processed with CICFlowMeter to generate over 80 features, including:
      • Flow-based: Duration, total forward/backward packets, packet length statistics
      • Time-based: Inter-arrival times, active and idle times
      • Content-based: HTTP methods, DNS queries, and more
    • Labeling: Each network flow is annotated as either benign or belonging to a specific attack type.
    • Balance: Designed to include both normal and attack traffic with realistic distribution patterns.
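
    The flow-based and time-based features listed above can be illustrated with a minimal sketch. It is not CICFlowMeter and does not use its column names; the packet tuple layout, the `flow_features` helper, and the output keys are assumptions chosen for illustration.

```python
from statistics import mean


def flow_features(packets):
    """Summarise one flow.

    packets: list of (timestamp_seconds, direction, length_bytes),
    where direction is "fwd" or "bwd" (hypothetical layout).
    """
    packets = sorted(packets, key=lambda p: p[0])  # order by timestamp
    times = [t for t, _, _ in packets]
    lengths = [n for _, _, n in packets]
    # Inter-arrival times between consecutive packets of the flow.
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return {
        "duration": times[-1] - times[0],
        "fwd_packets": sum(1 for _, d, _ in packets if d == "fwd"),
        "bwd_packets": sum(1 for _, d, _ in packets if d == "bwd"),
        "pkt_len_min": min(lengths),
        "pkt_len_max": max(lengths),
        "pkt_len_mean": mean(lengths),
        "iat_mean": mean(gaps) if gaps else 0.0,
    }


example = [(0.0, "fwd", 60), (0.5, "bwd", 1500), (2.0, "fwd", 40)]
features = flow_features(example)
```

    Real extractors compute many more statistics per flow; this sketch only shows how a handful of them derive from raw packet records.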

    Advantages

    • Reflects modern threats not covered in older datasets.
    • Provides detailed labels for fine-grained attack classification.
    • Suitable for both binary classification (normal vs. attack) and multi-class classification (attack type detection).
    • Enables research in machine learning, deep learning, and feature selection for IDS.
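
    As a rough sketch of how one labeled column can serve both tasks mentioned above, the snippet below derives binary and multi-class targets from per-flow labels. The label strings are made up for the example and may not match the spelling used in the actual CSV files.

```python
def to_binary(label):
    """Map a flow label to 0 (benign) or 1 (any attack)."""
    return 0 if label.strip().lower() == "benign" else 1


def to_multiclass(labels):
    """Map each distinct label to an integer class index."""
    classes = sorted({label.strip() for label in labels})
    index = {name: i for i, name in enumerate(classes)}
    return [index[label.strip()] for label in labels], classes


# Hypothetical labels, for illustration only.
labels = ["Benign", "DoS", "Benign", "Botnet", "Brute Force"]
binary = [to_binary(label) for label in labels]
multi, classes = to_multiclass(labels)
```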

    Usage

    The CICIDS dataset has become a widely adopted benchmark for evaluating Intrusion Detection Systems (IDS) due to its:

    • Rich feature set
    • Real-world attack scenarios
    • Balanced structure for training and testing models

  2. IDS Dataset 2025

    • kaggle.com
    zip
    Updated May 9, 2025
    Cite
    Pranto Kumar (2025). IDS Dataset 2025 [Dataset]. https://www.kaggle.com/datasets/prantokumar/ids-dataset-2025
    Explore at:
    Available download formats: zip (775182589 bytes)
    Dataset updated
    May 9, 2025
    Authors
    Pranto Kumar
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    An Intrusion Detection System (IDS) dataset is a collection of network traffic data, often labeled to distinguish between normal and malicious activities (intrusions or attacks). These datasets are crucial for developing, training, and evaluating Intrusion Detection Systems, which are security tools designed to monitor network traffic for suspicious behavior and alert administrators to potential threats.

  3. Open CAN IDS datasets’ attack metadata.

    • plos.figshare.com
    xls
    Updated Jan 22, 2024
    + more versions
    Cite
    Miki E. Verma; Robert A. Bridges; Michael D. Iannacone; Samuel C. Hollifield; Pablo Moriano; Steven C. Hespeler; Bill Kay; Frank L. Combs (2024). Open CAN IDS datasets’ attack metadata. [Dataset]. http://doi.org/10.1371/journal.pone.0296879.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Miki E. Verma; Robert A. Bridges; Michael D. Iannacone; Samuel C. Hollifield; Pablo Moriano; Steven C. Hespeler; Bill Kay; Frank L. Combs
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions or anomalies on CANs. Producing vehicular CAN data with a variety of intrusions is a difficult task for most researchers as it requires expensive assets and deep expertise. To illuminate this task, we introduce the first comprehensive guide to the existing open CAN intrusion detection system (IDS) datasets. We categorize attacks on CANs including fabrication (adding frames, e.g., flooding or targeting an ID), suspension (removing an ID’s frames), and masquerade attacks (spoofed frames sent in lieu of suspended ones). We provide a quality analysis of each dataset; an enumeration of each dataset’s attacks, benefits, and drawbacks; categorization as real vs. simulated CAN data and real vs. simulated attacks; whether the data is raw CAN data or signal-translated; number of vehicles/CANs; quantity in terms of time; and finally a suggested use case of each dataset. State-of-the-art public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, lacking fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but is missing a corresponding “raw” binary version. This issue pigeon-holes CAN IDS research into testing on limited and often inappropriate data (usually with attacks that are too easily detectable to truly test the method). The scarcity of appropriate data has stymied comparability and reproducibility of results for researchers. As our primary contribution, we present the Real ORNL Automotive Dynamometer (ROAD) CAN IDS dataset, consisting of over 3.5 hours of one vehicle’s CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real (i.e. non-simulated) fuzzing, fabrication, unique advanced attacks, and simulated masquerade attacks. To facilitate a benchmark for CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS research field.
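
    The three attack categories in this description (fabrication, suspension, masquerade) can be made concrete with a toy model of a CAN log. This is only a sketch of the definitions applied to a list of records; the `(timestamp, arbitration_id, data)` tuple layout, the function names, and the sample frames are assumptions and do not reflect the ROAD file format.

```python
def fabricate(frames, injected):
    """Fabrication: extra frames are added to the ambient traffic."""
    return sorted(frames + injected, key=lambda f: f[0])


def suspend(frames, target_id, start, end):
    """Suspension: frames of one ID are removed during [start, end)."""
    return [f for f in frames
            if not (f[1] == target_id and start <= f[0] < end)]


def masquerade(frames, target_id, start, end, spoof_data):
    """Masquerade: spoofed frames appear in lieu of the suspended ones."""
    return [(t, i, spoof_data) if i == target_id and start <= t < end
            else (t, i, d)
            for t, i, d in frames]


# Hypothetical ambient log: (timestamp_seconds, arbitration_id, payload).
ambient = [
    (0.00, 0x100, b"\x01"),
    (0.01, 0x200, b"\x0a"),
    (0.02, 0x100, b"\x02"),
    (0.03, 0x200, b"\x0b"),
]
flooded = fabricate(ambient, [(0.005, 0x000, b"\x00"), (0.015, 0x000, b"\x00")])
silenced = suspend(ambient, 0x200, 0.0, 1.0)
spoofed = masquerade(ambient, 0x200, 0.02, 1.0, b"\xff")
```

    Note how the masquerade log keeps the same frame count and timing as the ambient one, which is one way to see why such attacks are described as stealthier than flooding.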

  4. CIC-IDS-Collection

    • kaggle.com
    • huggingface.co
    zip
    Updated Nov 9, 2022
    Cite
    StrGenIx | Laurens D'hooge (2022). CIC-IDS-Collection [Dataset]. https://www.kaggle.com/datasets/dhoogla/cicidscollection
    Explore at:
    Available download formats: zip (864681190 bytes)
    Dataset updated
    Nov 9, 2022
    Authors
    StrGenIx | Laurens D'hooge
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    The Canadian Institute for Cybersecurity has published several datasets for network intrusion detection. Four of them: CIC-IDS2017, CIC-DoS2017, CSE-CIC-IDS2018 and CIC-DDoS2019 are collated here into one collection, cleaned up and with harmonized labeling.

    The intent behind this collection is simple: to have a larger, more varied set of NIDS samples for more powerful analyses by researchers. Too often, researchers still rely on the individual datasets even though the full set is compatible out-of-the-box. The parts have been created for the same purpose and they have been processed with the same feature extraction tool chain.

    This collection also takes into account 2 articles in which flawed features were discovered. Those features have been removed from the dataset. See the cleanup notebook for more information.

    If you make use of this combined version, please credit the original authors. The relevant publications are cited here on Kaggle alongside the individual datasets, and they are also readily available at the CIC's official dataset distribution page.

  5. Cybersecurity 🪪 Intrusion 🦠 Detection Dataset

    • kaggle.com
    Updated Feb 10, 2025
    Cite
    Dinesh Naveen Kumar Samudrala (2025). Cybersecurity 🪪 Intrusion 🦠 Detection Dataset [Dataset]. https://www.kaggle.com/datasets/dnkumars/cybersecurity-intrusion-detection-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Dinesh Naveen Kumar Samudrala
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This Cybersecurity Intrusion Detection Dataset is designed for detecting cyber intrusions based on network traffic and user behavior. Below, I’ll explain each aspect in detail, including the dataset structure, feature importance, possible analysis approaches, and how it can be used for machine learning.

    1. Understanding the Features

    The dataset consists of network-based and user behavior-based features. Each feature provides valuable information about potential cyber threats.

    A. Network-Based Features

    These features describe network-level information such as packet size, protocol type, and encryption methods.

    1. network_packet_size (Packet Size in Bytes)

      • Represents the size of network packets, ranging from 64 to 1500 bytes.
      • Packets on the lower end (~64 bytes) may indicate control messages, while larger packets (~1500 bytes) often carry bulk data.
      • Attackers may use abnormally small or large packets for reconnaissance or exploitation attempts.
    2. protocol_type (Communication Protocol)

      • The protocol used in the session: TCP, UDP, or ICMP.
      • TCP (Transmission Control Protocol): Reliable, connection-oriented (common for HTTP, HTTPS, SSH).
      • UDP (User Datagram Protocol): Faster but less reliable (used for VoIP, streaming).
      • ICMP (Internet Control Message Protocol): Used for network diagnostics (ping); often abused in Denial-of-Service (DoS) attacks.
    3. encryption_used (Encryption Protocol)

      • Values: AES, DES, None.
      • AES (Advanced Encryption Standard): Strong encryption, commonly used.
      • DES (Data Encryption Standard): Older encryption, weaker security.
      • None: Indicates unencrypted communication, which can be risky.
      • Attackers might use no encryption to avoid detection or weak encryption to exploit vulnerabilities.

    B. User Behavior-Based Features

    These features track user activities, such as login attempts and session duration.

    1. login_attempts (Number of Logins)

      • High values might indicate brute-force attacks (repeated login attempts).
      • Typical users have 1–3 login attempts, while an attack may have hundreds or thousands.
    2. session_duration (Session Length in Seconds)

      • A very long session might indicate unauthorized access or persistence by an attacker.
      • Attackers may try to stay connected to maintain access.
    3. failed_logins (Failed Login Attempts)

      • High failed login counts indicate credential stuffing or dictionary attacks.
      • Many failed attempts followed by a successful login could suggest an account was compromised.
    4. unusual_time_access (Login Time Anomaly)

      • A binary flag (0 or 1) indicating whether access happened at an unusual time.
      • Attackers often operate outside normal business hours to evade detection.
    5. ip_reputation_score (Trustworthiness of IP Address)

      • A score from 0 to 1, where higher values indicate suspicious activity.
      • IP addresses associated with botnets, spam, or previous attacks tend to have higher scores.
    6. browser_type (User’s Browser)

      • Common browsers: Chrome, Firefox, Edge, Safari.
      • Unknown: Could be an indicator of automated scripts or bots.

    2. Target Variable (attack_detected)

    • Binary classification: 1 means an attack was detected, 0 means normal activity.
    • The dataset is useful for supervised machine learning, where a model learns from labeled attack patterns.

    3. Possible Use Cases

    This dataset can be used for intrusion detection systems (IDS) and cybersecurity research. Some key applications include:

    A. Machine Learning-Based Intrusion Detection

    1. Supervised Learning Approaches

      • Classification Models (Logistic Regression, Decision Trees, Random Forest, XGBoost, SVM)
      • Train the model using labeled data (attack_detected as the target).
      • Evaluate using accuracy, precision, recall, F1-score.
    2. Deep Learning Approaches

      • Use Neural Networks (DNN, LSTM, CNN) for pattern recognition.
      • LSTMs work well for time-series-based network traffic analysis.
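
    The evaluation metrics named above follow directly from the confusion counts. A minimal sketch, with attack_detected = 1 as the positive class and made-up predictions:

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for 0/1 labels (1 = attack)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical ground truth and model output.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

    For intrusion detection, recall on the attack class is usually worth reporting alongside accuracy, since a model that predicts "normal" for everything can still score high accuracy on skewed data.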

    B. Anomaly Detection (Unsupervised Learning)

    If attack labels are missing, anomaly detection can be used:

    • Autoencoders: Learn normal traffic and flag anomalies.
    • Isolation Forest: Detects outliers based on feature isolation.
    • One-Class SVM: Learns normal behavior and detects deviations.
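
    The common idea behind these methods is to learn what normal looks like and score deviations from it. The sketch below shows that idea with a much simpler stand-in (a median/MAD baseline on a single feature), not an autoencoder, Isolation Forest, or One-Class SVM. The threshold of 5.0 and the sample durations are arbitrary assumptions.

```python
from statistics import median


def fit_baseline(normal_values):
    """Learn the median and median absolute deviation of normal data."""
    values = list(normal_values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med, mad


def anomaly_score(value, baseline):
    """Distance from the normal median, in units of MAD."""
    med, mad = baseline
    return abs(value - med) / (mad or 1.0)  # avoid division by zero


def is_anomalous(value, baseline, threshold=5.0):
    return anomaly_score(value, baseline) > threshold


# Hypothetical session_duration values (seconds) from normal traffic.
normal_durations = [30, 45, 50, 55, 60, 65, 70, 90]
baseline = fit_baseline(normal_durations)
```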

    C. Rule-Based Detection

    • If certain thresholds are met (e.g., failed_logins > 10 & ip_reputation_score > 0.8), an alert is triggered.
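
    The example rule above translates almost directly into code. The thresholds are the ones quoted in the rule; the `rule_alert` function name and the sample records are illustrative.

```python
def rule_alert(record, max_failed_logins=10, max_ip_score=0.8):
    """Fire when both thresholds from the example rule are exceeded."""
    return (record["failed_logins"] > max_failed_logins
            and record["ip_reputation_score"] > max_ip_score)


# Hypothetical sessions.
suspicious = {"failed_logins": 25, "ip_reputation_score": 0.93}
ordinary = {"failed_logins": 1, "ip_reputation_score": 0.93}
alerts = [rule_alert(suspicious), rule_alert(ordinary)]
```

    Keeping the thresholds as parameters makes it easy to tune them against labeled data instead of hard-coding them.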

    4. Challenges & Considerations

    • Adversarial Attacks: Attackers may modify traffic to evade detection.
    • Concept Drift: Cyber threats...
  6. Logs in ROAD CAN intrusion detection dataset.

    • plos.figshare.com
    xls
    Updated Jan 22, 2024
    Cite
    Miki E. Verma; Robert A. Bridges; Michael D. Iannacone; Samuel C. Hollifield; Pablo Moriano; Steven C. Hespeler; Bill Kay; Frank L. Combs (2024). Logs in ROAD CAN intrusion detection dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0296879.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Miki E. Verma; Robert A. Bridges; Michael D. Iannacone; Samuel C. Hollifield; Pablo Moriano; Steven C. Hespeler; Bill Kay; Frank L. Combs
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions or anomalies on CANs. Producing vehicular CAN data with a variety of intrusions is a difficult task for most researchers as it requires expensive assets and deep expertise. To illuminate this task, we introduce the first comprehensive guide to the existing open CAN intrusion detection system (IDS) datasets. We categorize attacks on CANs including fabrication (adding frames, e.g., flooding or targeting an ID), suspension (removing an ID’s frames), and masquerade attacks (spoofed frames sent in lieu of suspended ones). We provide a quality analysis of each dataset; an enumeration of each dataset’s attacks, benefits, and drawbacks; categorization as real vs. simulated CAN data and real vs. simulated attacks; whether the data is raw CAN data or signal-translated; number of vehicles/CANs; quantity in terms of time; and finally a suggested use case of each dataset. State-of-the-art public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, lacking fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but is missing a corresponding “raw” binary version. This issue pigeon-holes CAN IDS research into testing on limited and often inappropriate data (usually with attacks that are too easily detectable to truly test the method). The scarcity of appropriate data has stymied comparability and reproducibility of results for researchers. As our primary contribution, we present the Real ORNL Automotive Dynamometer (ROAD) CAN IDS dataset, consisting of over 3.5 hours of one vehicle’s CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real (i.e. non-simulated) fuzzing, fabrication, unique advanced attacks, and simulated masquerade attacks. To facilitate a benchmark for CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS research field.

  7. Network Intrusion Detection Datasets

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Ogobuchi Daniel Okey; Demostenes Zegarra Rodriguez (2023). Network Intrusion Detection Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.23118164.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ogobuchi Daniel Okey; Demostenes Zegarra Rodriguez
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    With the continuous expansion of data exchange, the threat of cybercrime and network invasions is also on the rise. This project aims to address these concerns by investigating an innovative approach: an Attentive Transformer Deep Learning Algorithm for Intrusion Detection of IoT Systems using Automatic Xplainable Feature Selection. The primary focus of this project is to develop an effective Intrusion Detection System (IDS) using the aforementioned algorithm. To accomplish this, carefully curated datasets have been used, created through a meticulous process of data extraction from the University of New Brunswick repository. This repository houses the datasets used in this research and can be accessed publicly in order to replicate its findings.

  8. Dataset for Network Intrusion Detection System on SCADA IEC 60870-5-104

    • zenodo.org
    • data.niaid.nih.gov
    Updated Aug 31, 2022
    Cite
    M. Agus Syamsul Arifin; Deris Stiawan; Susanto; Rahmat Budiarto; Mohd Yazid Idris (2022). Dataset for Network Intrusion Detection System on SCADA IEC 60870-5-104 [Dataset]. http://doi.org/10.5281/zenodo.7034534
    Explore at:
    Dataset updated
    Aug 31, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    M. Agus Syamsul Arifin; Deris Stiawan; Susanto; Rahmat Budiarto; Mohd Yazid Idris
    Description

    Security is the main challenge in Supervisory Control and Data Acquisition (SCADA) systems, since SCADA systems must be connected to heterogeneous networks to save costs. SCADA devices such as RTUs have limited resources, so even a small-scale cyber attack on the computer network can have a major impact on the SCADA system. This study covers a SCADA system using the IEC 60870-5-104 protocol, which is widely used in the power plant industry. A physical testbed was built to simulate the electrical distribution process. The distribution section of a SCADA system is more vulnerable than other parts because it is located directly in the community environment, which leaves many openings for attackers. The purpose of this study is to obtain relevant datasets for the SCADA system. The simulation consists of normal communication between the HMI and the RTU, which is then attacked to disrupt the communication. The attacks carried out are port scan, brute force, and DoS. The DoS attacks are ICMP flood, SYN flood, and IEC 104 flood. The IEC 104 flood is a modified attack against the RTU in which the RTU is flooded with an unknown typeid ASDU (Application Service Data Unit). The attacks are carried out from the Kali Linux operating system. All scenarios are recorded and saved in pcap format. To confirm that the dataset contains attack traffic, Snort and Suricata are used to detect it. The study also includes intrusion detection performance results from Snort and Suricata.

  9. TOW-IDS: Automotive Ethernet Intrusion Dataset

    • ieee-dataport.org
    Updated Nov 1, 2022
    Cite
    MEE LAN HAN (2022). TOW-IDS: Automotive Ethernet Intrusion Dataset [Dataset]. https://ieee-dataport.org/documents/tow-ids-automotive-ethernet-intrusion-dataset
    Explore at:
    Dataset updated
    Nov 1, 2022
    Authors
    MEE LAN HAN
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    For academic purposes

  10. Dataset for Detection in Multi-IDS Environment

    • kaggle.com
    zip
    Updated Jan 29, 2025
    Cite
    Arka Ghosh (2025). Dataset for Detection in Multi-IDS Environment [Dataset]. https://www.kaggle.com/datasets/arkaghoshcs/dataset-for-multi-ids-environment
    Explore at:
    Available download formats: zip (23900605 bytes)
    Dataset updated
    Jan 29, 2025
    Authors
    Arka Ghosh
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    The dataset presented aims to support research in developing robust Intrusion Detection Systems (IDS) for modern networks. It simulates a network environment of a fictitious organization with multiple vulnerable hosts and strategic IDS deployments. The experimental setup uses virtual machines to emulate an attacker machine, vulnerable hosts, and IDS devices, connected via Open vSwitches (OVS) with port mirroring to capture traffic. Attack scenarios include multi-hop attacks targeting internal hosts by exploiting vulnerabilities and bypassing traffic restrictions. The raw PcapNG files are complemented with extracted features in CSV format, supporting Machine Learning (ML) analysis. The dataset is designed for training and evaluating IDS models capable of detecting complex, multi-stage attacks in realistic network environments.

  11. Intrusion detection IDS Data cleaned

    • kaggle.com
    zip
    Updated Aug 4, 2024
    Cite
    arar tawil (2024). Intrusion detection IDS Data cleaned [Dataset]. https://www.kaggle.com/datasets/araraltawil/ids-data-cleaned
    Explore at:
    Available download formats: zip (219896832 bytes)
    Dataset updated
    Aug 4, 2024
    Authors
    arar tawil
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Intrusion Detection Systems (IDSs) and Intrusion Prevention Systems (IPSs) are the most important defense tools against sophisticated and ever-growing network attacks. Due to the lack of reliable test and validation datasets, anomaly-based intrusion detection approaches suffer from a lack of consistent and accurate performance evaluations.

    Our evaluations of the eleven datasets available since 1998 show that most are out of date and unreliable. Some of these datasets lack traffic diversity and volume, some do not cover the variety of known attacks, while others anonymize packet payload data, which cannot reflect current trends. Some also lack a feature set and metadata.

    The CICIDS2017 dataset contains benign traffic and the most up-to-date common attacks, resembling true real-world data (PCAPs). It also includes the results of the network traffic analysis using CICFlowMeter, with flows labeled based on the timestamp, source and destination IPs, source and destination ports, protocols, and attack (CSV files). The extracted feature definitions are also available.

    Generating realistic background traffic was our top priority in building this dataset. We used our proposed B-Profile system (Sharafaldin, et al. 2016) to profile the abstract behavior of human interactions and generate naturalistic benign background traffic. For this dataset, we built the abstract behaviour of 25 users based on the HTTP, HTTPS, FTP, SSH, and email protocols.

    The data capturing period started at 9 a.m., Monday, July 3, 2017 and ended at 5 p.m. on Friday July 7, 2017, for a total of 5 days. Monday is the normal day and only includes the benign traffic. The implemented attacks include Brute Force FTP, Brute Force SSH, DoS, Heartbleed, Web Attack, Infiltration, Botnet and DDoS. They have been executed both morning and afternoon on Tuesday, Wednesday, Thursday and Friday.
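
    The capture schedule above can be expressed as a small helper, for example to keep Monday's benign-only traffic separate when building a baseline. This sketch assumes naive timestamps in the capture's local time; the function names are illustrative, and the actual attack windows within each day are not encoded here.

```python
from datetime import datetime

# Capture window as stated in the description (local time assumed).
CAPTURE_START = datetime(2017, 7, 3, 9, 0)
CAPTURE_END = datetime(2017, 7, 7, 17, 0)


def in_capture_window(ts):
    return CAPTURE_START <= ts <= CAPTURE_END


def is_benign_only_day(ts):
    """Monday is described as containing only benign traffic."""
    return in_capture_window(ts) and ts.weekday() == 0


def is_attack_day(ts):
    """Attacks were executed Tuesday through Friday."""
    return in_capture_window(ts) and ts.weekday() in (1, 2, 3, 4)


monday_flow = datetime(2017, 7, 3, 10, 0)
wednesday_flow = datetime(2017, 7, 5, 14, 0)
after_capture = datetime(2017, 7, 7, 18, 0)
```

    A flow timestamped on an attack day is not necessarily malicious; the helper only says which days can contain attacks at all.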

  12. Intrusion Detection System Market Analysis North America, APAC, Europe,...

    • technavio.com
    pdf
    Updated Oct 23, 2024
    Cite
    Technavio (2024). Intrusion Detection System Market Analysis North America, APAC, Europe, Middle East and Africa, South America - US, China, UK, Germany, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/intrusion-detection-system-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Oct 23, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2024 - 2028
    Area covered
    United Kingdom, United States
    Description


    Intrusion Detection System Market Size 2024-2028

    The intrusion detection system market size is forecast to increase by USD 4.65 billion at a CAGR of 14% between 2023 and 2028.
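
    The forecast gives an increment and a growth rate but no base value. As a back-of-the-envelope sketch, the base can be inferred if one assumes that the 14% CAGR compounds over five annual periods (2023 to 2028) and that the USD 4.65 billion increment spans the same period. Both are assumptions, and the result is an implied figure, not a number from the report.

```python
def implied_base(increment, cagr, years):
    """Solve base * ((1 + cagr) ** years - 1) = increment for base."""
    return increment / ((1 + cagr) ** years - 1)


# Figures quoted in the forecast; the five-period reading is an assumption.
base_2023 = implied_base(increment=4.65, cagr=0.14, years=5)
size_2028 = base_2023 + 4.65
```

    Under these assumptions the implied 2023 market size is roughly USD 5.0 billion, growing to about USD 9.7 billion by 2028.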

    The market is witnessing significant growth due to the escalating number of cyberattacks and the need to secure IT service infrastructure, particularly in the banking and financial services industry (BFSI). IDS solutions employ two primary identification techniques: signature-based and anomaly detection. Signature-based identification relies on known attack patterns, while anomaly detection identifies deviations from normal behavior.
    Additionally, with the rise in digital transactions, there is a growing emphasis on securing security architecture through traffic monitoring and intrusion detection. The market is driven by the increasing demand for BFSI applications and the subsequent need to protect against cyber threats. However, the high cost of maintaining IDS solutions remains a challenge. In conclusion, the IDS market is expected to continue growing as organizations prioritize securing their IT infrastructure against cyber threats.
    

    What will be the Size of the Market During the Forecast Period?

    The Intrusion Detection System (IDS) market is a significant segment of the cybersecurity industry, playing a crucial role in safeguarding IT infrastructure against various cyber threats. IDS solutions help identify and prevent unauthorized access, malicious activities, and potential security breaches. These systems can be categorized into Network Intrusion Detection Systems (NIDS) and Host-based Intrusion Detection Systems (HIDS). IDS and Intrusion Prevention Systems (IPS) are essential components of an organization's cybersecurity strategy. IPS goes beyond simple identification and provides real-time prevention of attacks. Both IDS and IPS are instrumental in mitigating risks from phishing incidents, cyberattacks, and other malicious threats.
    Additionally, cybersecurity is a major concern for various sectors, including BFSI applications, telecom, defense, and cloud computing. With the increasing reliance on IT infrastructure and work from home arrangements, cybersecurity expenditure has seen a significant rise. IDS and IPS solutions are integral to securing data and maintaining information security. Cybercrimes are on the rise, with malicious threat actors constantly evolving their tactics. Traditional signature-based identification methods may not be sufficient to detect advanced threats. Anomaly detection, a key feature of modern IDS and IPS solutions, can help identify unusual patterns and potential threats. IDS and IPS solutions are not limited to protecting traditional IT infrastructure.
    Simultaneously, they also play a vital role in securing cloud computing environments. IDS and IPS as part of IDP (Intrusion Detection and Prevention) systems offer advanced threat detection and prevention capabilities, ensuring comprehensive protection against cyberattacks. Ransomware attacks have emerged as a major concern, with their disruptive impact on business operations. IDS and IPS solutions can help prevent ransomware attacks by identifying and blocking malicious traffic before it can cause damage. In conclusion, IDS and IPS solutions are essential components of an effective cybersecurity strategy. They help organizations protect their IT infrastructure, data security, and information security against various cyber threats, including phishing incidents, cyberattacks, and malicious threat actors. The market for IDS and IPS solutions is expected to grow as organizations continue to invest in advanced cybersecurity solutions to mitigate risks and maintain business continuity. 
    

    How is this market segmented and which is the largest segment?

    The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    Deployment
      • On-premises
      • Cloud-based
    Geography
      • North America
        • US
      • APAC
        • China
        • Japan
      • Europe
        • Germany
        • UK
      • Middle East and Africa
      • South America

    By Deployment Insights

    The on-premises segment is estimated to witness significant growth during the forecast period.
    

    The on-premises segment is projected to dominate the market in the US, with substantial growth in terms of revenue. Large enterprises, particularly those with a global footprint, are the primary consumers of on-premises intrusion detection systems. The primary reason for this preference is the control it offers over managing software assets, including data generated and stored within business applications. This deployment model enables organizations to ensure compliance with licensing agreements and automate tasks, making it an attractive choice for many busine

  13. Federated Learning for Distributed Intrusion Detection Systems in Public...

    • zenodo.org
    • data.europa.eu
    bz2
    Updated May 23, 2023
    + more versions
    Cite
    Alireza Bakhshi Zadi Mahmoodi; Panos Kostakos (2023). Federated Learning for Distributed Intrusion Detection Systems in Public Networks - Validation Dataset [Dataset]. http://doi.org/10.5281/zenodo.7956304
    Explore at:
    Available download formats: bz2
    Dataset updated
    May 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alireza Bakhshi Zadi Mahmoodi; Panos Kostakos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was prepared and used as a validation set during the evaluation phase of "Meta IDS" to assess the performance of various machine learning models. It is now made available to users and researchers who seek a reliable and diverse dataset for training and testing their own custom models.

    The validation dataset comprises labeled entries, each indicating whether the packet type is "malicious" or "benign." It covers complex design patterns that are commonly encountered in real-world applications. The dataset is designed to be representative, encompassing edge and fog layers that are in contact with the cloud layer, thereby enabling thorough testing and evaluation of different models. Each sample in the dataset is labeled with the corresponding ground truth, providing a reliable reference for model performance evaluation.

    To ensure convenient distribution and storage, the dataset has been broken down into three separate batches, each containing a portion of the dataset. This allows for convenient downloading and management of the dataset. The three batches are provided as individual compressed files.

    To extract the data, follow these steps:

    • Download and install bzip2 (if not already installed) from the official website or your package manager.
    • Place the compressed dataset file in a directory of your choice.
    • Open a terminal or command prompt and navigate to the directory where the compressed dataset file is located.
    • Execute the following command to uncompress the dataset, replacing "filename.bz2" with the actual name of the compressed dataset file:
      • bzip2 -d filename.bz2
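    The extraction step can also be scripted. The following is a minimal sketch using Python's standard bz2 module; the function name and the file names in the usage note are hypothetical and not part of the dataset. Unlike a plain bzip2 -d, which removes the compressed file once it has been decompressed, this version leaves the archive in place.

```python
import bz2
import shutil
from pathlib import Path


def decompress_bz2(src: Path, dst: Path, chunk_size: int = 1024 * 1024) -> None:
    """Decompress src into dst in fixed-size chunks, leaving src untouched.

    Copying in chunks keeps memory use constant, which matters for
    archives that expand to hundreds of gigabytes.
    """
    with bz2.open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout, chunk_size)
```

    For example, decompress_bz2(Path("batch_1.bz2"), Path("batch_1")) would write the uncompressed data next to the archive, assuming a file with that (hypothetical) name exists.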

    Once uncompressed, the dataset is available in its original format for further exploration, analysis, and model training. Extraction requires approximately 800 GB of storage in total: approximately 302 GB for the first batch, 203 GB for the second, and 297 GB for the third.

    The first batch contains 1,049,527,992 entries, the second 711,043,331 entries, and the third 1,029,303,062 entries. The following table lists each feature name with its explanation and an example value from the extracted dataset.

    Feature | Description | Example Value
    ip.src | Source IP address in the packet | a05d4ecc38da01406c9635ec694917e969622160e728495e3169f62822444e17
    ip.dst | Destination IP address in the packet | a52db0d87623d8a25d0db324d74f0900deb5ca4ec8ad9f346114db134e040ec5
    frame.time_epoch | Epoch time of the frame | 1676165569.930869
    arp.hw.type | Hardware type | 1
    arp.hw.size | Hardware size | 6
    arp.proto.size | Protocol size | 4
    arp.opcode | Opcode | 2
    data.len | Length | 2713
    eth.dst.lg | Destination LG bit | 1
    eth.dst.ig | Destination IG bit | 1
    eth.src.lg | Source LG bit | 1
    eth.src.ig | Source IG bit | 1
    frame.offset_shift | Time shift for this packet | 0
    frame.len | Frame length on the wire | 1208
    frame.cap_len | Frame length stored into the capture file | 215
    frame.marked | Frame is marked | 0
    frame.ignored | Frame is ignored | 0
    frame.encap_type | Encapsulation type | 1
    gre | Generic Routing Encapsulation | 'Generic Routing Encapsulation (IP)'
    ip.version | Version | 6
    ip.hdr_len | Header length | 24
    ip.dsfield.dscp | Differentiated Services Codepoint | 56
    ip.dsfield.ecn | Explicit Congestion Notification | 2
    ip.len | Total length | 614
    ip.flags.rb | Reserved bit | 0
    ip.flags.df | Don't fragment | 1
    ip.flags.mf | More fragments | 0
    ip.frag_offset | Fragment offset | 0
    ip.ttl | Time to live | 31
    ip.proto | Protocol | 47
    ip.checksum.status | Header checksum status | 2
    tcp.srcport | TCP source port | 53425
    tcp.flags | Flags | 0x00000098
    tcp.flags.ns | Nonce | 0
    tcp.flags.cwr | Congestion Window Reduced (CWR) | 1
    udp.srcport | UDP source port | 64413
    udp.dstport | UDP destination port | 54087
    udp.stream | Stream index | 1345
    udp.length | Length | 225
    udp.checksum.status | Checksum status | 3
    packet_type | Type of the packet which is either "benign" or "malicious" | 0
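
    Because each batch expands to hundreds of gigabytes, it may be more practical to read rows straight from the compressed file than to extract it first. The sketch below assumes that each batch is delimited text with a header row whose column names match the feature names in the table; the description does not state the on-disk format, so this is an assumption, as are the function names. The description also does not say which numeric packet_type value stands for "benign" and which for "malicious", so the values are counted as found.

```python
import bz2
import csv
from collections import Counter
from itertools import islice
from typing import Iterator, Optional


def iter_rows(path: str, limit: Optional[int] = None) -> Iterator[dict]:
    """Yield rows as dicts keyed by column name, decompressing on the fly.

    Assumes a CSV layout with a header row (not confirmed by the source).
    """
    with bz2.open(path, mode="rt", newline="") as handle:
        reader = csv.DictReader(handle)
        # islice with limit=None iterates over every row.
        yield from islice(reader, limit)


def label_counts(path: str, limit: Optional[int] = None) -> Counter:
    """Count packet_type values over the first `limit` rows (all rows if None)."""
    return Counter(row["packet_type"] for row in iter_rows(path, limit))
```

    Calling label_counts on a batch with a small limit gives a quick look at the label distribution before committing to a full pass over billions of rows.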

    Furthermore, in compliance with the GDPR and to ensure the privacy of individuals, all IP addresses present in the dataset have been anonymized through hashing. This anonymization process helps protect the identity of individuals while preserving the integrity and utility of the dataset for research and model development purposes.
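
    The description does not say which hash function was used. The example values in the table are 64 hexadecimal characters long, which is consistent with a 256-bit digest; the sketch below assumes SHA-256 purely for illustration and is not the authors' confirmed procedure. The optional key parameter is likewise an assumption, shown because a keyed hash is harder to reverse by enumerating the IPv4 address space than a plain one.

```python
import hashlib
import hmac
from typing import Optional


def anonymize_ip(ip: str, key: Optional[bytes] = None) -> str:
    """Return a 64-character hex pseudonym for an IP address string.

    Without a key this is plain SHA-256 (assumed, not confirmed by the
    source); with a key it is HMAC-SHA-256.
    """
    data = ip.encode("ascii")
    if key is None:
        return hashlib.sha256(data).hexdigest()
    return hmac.new(key, data, hashlib.sha256).hexdigest()
```

    The mapping is deterministic, so the same address always yields the same pseudonym and flows between the same hosts can still be grouped after anonymization.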

    Please note that while the dataset provides valuable insights and a solid foundation for machine learning tasks, it is not a substitute for extensive real-world data collection. However, it serves as a valuable resource for researchers, practitioners, and enthusiasts in the machine learning community, offering a compliant and anonymized dataset for developing and validating custom models in a specific problem domain.

    By leveraging the validation dataset for machine learning model evaluation and custom model training, users can accelerate their research and development efforts, building upon the knowledge gained from my thesis while contributing to the advancement of the field.

  14. Data from: Dataset for IDS testing

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Jun 14, 2020
    Cite
    Lukaseder, Thomas; Wagner, Mathias (2020). Dataset for IDS testing [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3892998
    Explore at:
    Dataset updated
    Jun 14, 2020
    Dataset provided by
    Ulm University
    Authors
    Lukaseder, Thomas; Wagner, Mathias
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset constructed to trigger IDS rules based on the community data set of the Snort Intrusion Detection System

  15. IoTNet24 Dataset for IDS

    • kaggle.com
    zip
    Updated Mar 27, 2024
    Cite
    wittigenZ (2024). IoTNet24 Dataset for IDS [Dataset]. https://www.kaggle.com/datasets/wittigenz/hydras
    Explore at:
    Available download formats: zip (123042 bytes)
    Dataset updated
    Mar 27, 2024
    Authors
    wittigenZ
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Overview: This dataset presents a subset of network traffic data collected from 20 captures of malicious traffic and 3 captures of live benign traffic on Internet of Things (IoT) devices. It is primarily designed for the development and evaluation of Intrusion Detection Systems (IDS) targeted at IoT devices. The dataset, although not balanced, provides valuable insights into the detection of malicious activities within IoT networks. It contains a total of 23,000+ rows, with duplicates removed for clarity and efficiency.

    Data Features: The dataset includes six key features extracted from the Zeek processing performed by the dataset creators. Each feature serves as a crucial input for building IDS models:

    Responder's Port (id.resp_p): This feature denotes the port number of the responder in the network connection. It is represented as an integer.

    Transport Layer Protocol (proto): Indicates the transport layer protocol used in the connection, with possible values being TCP, UDP, or ICMP (although only TCP and UDP are present in this subset). This feature is stored as a string.

    Connection State (conn_state): Describes the state of the connection, using various indicators such as S0, S1, SF, REJ, among others. This feature is optional and stored as a string.

    Number of Packets Sent by Originator (orig_pkts): Represents the count of packets transmitted by the originator in the connection. It is stored as an optional integer.

    Number of IP Level Bytes Sent by Originator (orig_ip_bytes): Indicates the number of IP level bytes transmitted by the originator. It is stored as an optional integer.

    Number of IP Level Bytes Sent by Responder (resp_ip_bytes): Denotes the number of IP level bytes sent by the responder in the connection. This feature is stored as an optional integer.

    Target Label: The dataset is suited for binary classification tasks, particularly for distinguishing between malicious and benign traffic. The target label, represented by the 'label' feature, specifies whether a data point corresponds to malicious or benign activity. It is stored as a string with enumerated values: 'Malicious' or 'Benign'.

    Data Preprocessing Recommendations: Given that the dataset lacks balanced representation and detailed criteria for sample selection, it's essential to preprocess the data before constructing models. To ensure best practices and model generalization, steps such as data balancing, feature scaling, and potentially feature engineering should be considered. A mock-up processing of this dataset into a model can serve as a preliminary step before utilizing the full dataset for training IDS models aimed at IoT devices.
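
    A minimal sketch of the schema and of one simple balancing step is given below. It assumes the CSV columns carry the Zeek field names shown in parentheses above, and that unset optional values appear either as empty strings or as "-"; both are assumptions about the file, and all class and function names are hypothetical. The weights follow the common n_samples / (n_classes * class_count) heuristic rather than any method prescribed by the dataset authors.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ConnRecord:
    id_resp_p: int                 # responder's port
    proto: str                     # transport layer protocol
    conn_state: Optional[str]      # connection state, optional
    orig_pkts: Optional[int]       # packets sent by originator
    orig_ip_bytes: Optional[int]   # IP-level bytes sent by originator
    resp_ip_bytes: Optional[int]   # IP-level bytes sent by responder
    label: str                     # 'Malicious' or 'Benign'


def _opt_str(value: str) -> Optional[str]:
    # Treat empty strings and "-" as missing (assumed encoding).
    return None if value in ("", "-") else value


def _opt_int(value: str) -> Optional[int]:
    text = _opt_str(value)
    return None if text is None else int(text)


def parse_row(row: Dict[str, str]) -> ConnRecord:
    """Convert one CSV row (as a dict of strings) into a typed record."""
    return ConnRecord(
        id_resp_p=int(row["id.resp_p"]),
        proto=row["proto"],
        conn_state=_opt_str(row["conn_state"]),
        orig_pkts=_opt_int(row["orig_pkts"]),
        orig_ip_bytes=_opt_int(row["orig_ip_bytes"]),
        resp_ip_bytes=_opt_int(row["resp_ip_bytes"]),
        label=row["label"],
    )


def balanced_class_weights(labels: List[str]) -> Dict[str, float]:
    """Weight each class inversely to its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

    The resulting weights can be passed to a classifier that supports per-class weighting, which is one alternative to resampling the rows themselves.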

  16. IEC 60870-5-104 Intrusion Detection Dataset

    • zenodo.org
    • data.europa.eu
    bin, pdf
    Updated Jul 16, 2024
    Cite
    Panagiotis; Panagiotis; Konstantinos; Thomas; Thomas; Vasileios; Vasileios; Panagiotis; Panagiotis; Konstantinos (2024). IEC 60870-5-104 Intrusion Detection Dataset [Dataset]. http://doi.org/10.21227/fj7s-f281
    Explore at:
    Available download formats: bin, pdf
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo
    Authors
    Panagiotis; Panagiotis; Konstantinos; Thomas; Thomas; Vasileios; Vasileios; Panagiotis; Panagiotis; Konstantinos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IEC 60870-5-104

    Intrusion Detection Dataset

    Readme File

    ITHACA – University of Western Macedonia - https://ithaca.ece.uowm.gr/

    Authors: Panagiotis Radoglou-Grammatikis, Thomas Lagkas, Vasileios Argyriou, Panagiotis Sarigiannidis

    Publication Date: September 23, 2022

    1. Introduction

    The evolution of the Industrial Internet of Things (IIoT) introduces several benefits, such as real-time monitoring, pervasive control and self-healing. However, despite the valuable services, security and privacy issues still remain given the presence of legacy and insecure communication protocols like IEC 60870-5-104. IEC 60870-5-104 is an industrial protocol widely applied in critical infrastructures, such as the smart electrical grid and industrial healthcare systems. The IEC 60870-5-104 Intrusion Detection Dataset was implemented in the context of the research paper entitled "Modeling, Detecting, and Mitigating Threats Against Industrial Healthcare Systems: A Combined Software Defined Networking and Reinforcement Learning Approach" [1], in the context of two H2020 projects: ELECTRON: rEsilient and seLf-healed EleCTRical pOwer Nanogrid (101021936) and SDN-microSENSE: SDN - microgrid reSilient Electrical eNergy SystEm (833955). This dataset includes labelled Transmission Control Protocol (TCP)/Internet Protocol (IP) network flow statistics (Comma-Separated Values (CSV) format) and IEC 60870-5-104 flow statistics (CSV format) related to twelve IEC 60870-5-104 cyberattacks. In particular, the cyberattacks are related to unauthorised commands and Denial of Service (DoS) activities against IEC 60870-5-104. Moreover, the relevant Packet Capture (PCAP) files are available. The dataset can be utilised for Artificial Intelligence (AI)-based Intrusion Detection Systems (IDS), taking full advantage of Machine Learning (ML) and Deep Learning (DL).

    2. Instructions

    The IEC 60870-5-104 dataset was implemented following the methodology of A. Gharib et al. in [2], including eleven features: (a) Complete Network Configuration, (b) Complete Traffic, (c) Labelled Dataset, (d) Complete Interaction, (e) Complete Capture, (f) Available Protocols, (g) Attack Diversity, (h) Heterogeneity, (i) Feature Set and (j) Metadata.

    A network topology consisting of (a) seven industrial entities, (b) one Human Machine Interface (HMI) and (c) three cyberattackers was used to construct the IEC 60870-5-104 Intrusion Detection Dataset. The industrial entities use IEC TestServer[1], while the HMI uses Qtester104[2]. The cyberattackers, in turn, use Kali Linux[3] equipped with Metasploit[4], OpenMUC j60870[5] and Ettercap[6]. The cyberattacks were performed on the following days.

    • On Saturday, April 25, 2020, a DoS cyberattack (M_SP_NA_1_DoS) was executed for 2 hours, using the M_SP_NA_1 command.
    • On Sunday, April 26, 2020, two cyberattacks were executed, namely (a) DoS (C_CI_NA_1_DoS) and (b) unauthorised injection (C_CI_NA_1), using the C_CI_NA_1 command for 2 hours.
    • On Monday, April 27, 2020, one unauthorised injection attack (C_SE_NA_1) was executed for 4 hours, using the C_SE_NA_1 command.
    • On Tuesday, April 28, 2020, two cyberattacks were executed, namely (a) unauthorised injection (C_SC_NA_1) and (b) DoS (C_SE_NA_1_DoS), using the C_SC_NA_1 and C_SE_NA_1 commands for 2 hours and 4 hours, respectively.
    • On Wednesday, April 29, 2020, one DoS (C_SC_NA_1_DoS) cyberattack was performed for 2 hours, using the C_SC_NA_1 command.
    • On Friday, June 05, 2020, two cyberattacks were executed, namely (a) DoS (C_RD_NA_1_DoS) and (b) unauthorised injection (C_RD_NA_1), using the C_RD_NA_1 command for 2 and 4 hours, respectively.
    • On Saturday, June 06, 2020, two cyberattacks were executed, namely (a) DoS (C_RP_NA_1_DoS) and (b) unauthorised injection (C_RP_NA_1), using the C_RP_NA_1 command for 2 and 4 hours, respectively.
    • On Monday, June 08, 2020, a Man In The Middle (MITM) cyberattack was executed for 2 hours, filtering and dropping the IEC 60870-5-104 packets.

    For each attack, a 7zip file is provided, including the network traffic and the network flow statistics for each entity. Moreover, a relevant diagram is provided, illustrating the corresponding cyberattack. In particular, for each entity, a folder is given, including (a) the relevant pcap file, (b) Transmission Control Protocol (TCP) / Internet Protocol (IP) network flow statistics in Comma-Separated Values (CSV) format and (c) IEC 60870-5-104 flow statistics in CSV format. The TCP/IP network flow statistics were generated by CICFlowMeter[7], while the IEC 60870-5-104 flow statistics were generated by a Custom IEC 60870-5-104 Python Parser[8], taking full advantage of Scapy[9].

    3. Dataset Structure

    The dataset consists of the following files:

    • 20200425_UOWM_IEC104_Dataset_m_sp_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the M_SP_NA_1_DoS attack.
    • 20200426_UOWM_IEC104_Dataset_c_ci_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_CI_NA_1_DoS attack.
    • 20200426_UOWM_IEC104_Dataset_c_ci_na_1.7z: A 7zip file including the pcap and CSV files related to the C_CI_NA_1 attack.
    • 20200427_UOWM_IEC104_Dataset_c_se_na_1.7z: A 7zip file including the pcap and CSV files related to the C_SE_NA_1 attack.
    • 20200428_UOWM_IEC104_Dataset_c_sc_na_1.7z: A 7zip file including the pcap and CSV files related to the C_SC_NA_1 attack.
    • 20200428_UOWM_IEC104_Dataset_c_se_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_SE_NA_1_DoS attack.
    • 20200429_UOWM_IEC104_Dataset_c_sc_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_SC_NA_1_DoS attack.
    • 20200605_UOWM_IEC104_Dataset_c_rd_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_RD_NA_1_DoS attack.
    • 20200605_UOWM_IEC104_Dataset_c_rd_na_1.7z: A 7zip file including the pcap and CSV files related to the C_RD_NA_1 attack.
    • 20200606_UOWM_IEC104_Dataset_c_rp_na_1_DoS.7z: A 7zip file including the pcap and CSV files related to the C_RP_NA_1_DoS attack.
    • 20200606_UOWM_IEC104_Dataset_c_rp_na_1.7z: A 7zip file including the pcap and CSV files related to the C_RP_NA_1 attack.
    • 20200608_UOWM_IEC104_Dataset_mitm_drop.7z: A 7zip file including the pcap and CSV files related to the MITM attack.
    • Balanced_IEC104_Train_Test_CSV_Files.zip: This zip file includes balanced CSV files from CICFlowMeter and the Custom IEC 60870-5-104 Python Parser that could be utilised for training ML and DL methods. The zip file includes different folders for the corresponding flow timeout values used for CICFlowMeter and IEC 60870-5-104 Python Parser, respectively.
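
    The per-attack archive names listed above follow a visible pattern: an eight-digit date, the fixed segment _UOWM_IEC104_Dataset_, and an attack token. A short sketch for turning a file name into a capture date and attack label is shown below; the class and function names are hypothetical, and treating a trailing _DoS as the marker for DoS attacks is an inference from the listed names rather than a documented rule.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime

# Pattern inferred from the archive names in the file list.
_ARCHIVE_RE = re.compile(r"^(\d{8})_UOWM_IEC104_Dataset_(.+)\.7z$")


@dataclass(frozen=True)
class AttackArchive:
    day: date      # date encoded in the file name
    attack: str    # attack token exactly as it appears in the name
    is_dos: bool   # True when the token ends with "_DoS"


def parse_archive_name(name: str) -> AttackArchive:
    """Split a per-attack archive name into date and attack token."""
    match = _ARCHIVE_RE.match(name)
    if match is None:
        raise ValueError(f"not a per-attack archive name: {name!r}")
    stamp, attack = match.groups()
    return AttackArchive(
        day=datetime.strptime(stamp, "%Y%m%d").date(),
        attack=attack,
        is_dos=attack.endswith("_DoS"),
    )
```

    The balanced train/test zip file does not follow this pattern, so parse_archive_name raises ValueError for it by design.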

    Each 7zip file includes respective folders related to the entities/devices (described in the following section) participating in each attack. In particular, for each entity/device, there is a folder including (a) the overall network traffic (pcap file) related to this entity/device during each attack, (b) the TCP/IP network flow statistics (CSV file) from CICFlowMeter for the overall network traffic, (c) the IEC 60870-5-104 network traffic (pcap file) related to this entity/device during each attack, (d) the TCP/IP network flow statistics (CSV file) from CICFlowMeter for the IEC 60870-5-104 network traffic, (e) the IEC 60870-5-104 flow statistics (CSV file) from the Custom IEC 60870-5-104 Python Parser for the IEC 60870-5-104 network traffic and finally, (f) an image showing how the attack was executed. Finally, it is noteworthy that the network flows from both CICFlowMeter and the Custom IEC 60870-5-104 Python Parser in each CSV file are labelled based on the IEC 60870-5-104 cyberattacks executed for the generation of this dataset. The description of these attacks is given in the following section, while the various features from CICFlowMeter and the Custom IEC 60870-5-104 Python Parser are presented in Section 5.

    4. Testbed & IEC 60870-5-104 Attacks

    The testbed created for generating this dataset is composed of five virtual RTU devices emulated by IEC TestServer and two real RTU devices. Moreover, there is another workstation which plays the role of Master Terminal Unit (MTU) and HMI, sending legitimate IEC 60870-5-104 commands to the corresponding RTUs. For this purpose, the workstation uses QTester104. In addition, there are three attackers that act as malicious insiders executing the following cyberattacks against the aforementioned RTUs. Finally, the network traffic data of each entity/device was captured through tshark.

    Table 1: IEC 60870-5-104 Cyberattacks Description

    IEC 60870-5-104 Cyberattack | Description | Dataset Files
    MITM Drop | During this attack, the cyberattacker is placed between two endpoints, thus monitoring and dropping the network traffic

  17. resampled_IDS_datasets

    • huggingface.co
    Updated Jul 17, 2025
    Cite
    Le (2025). resampled_IDS_datasets [Dataset]. http://doi.org/10.57967/hf/4961
    Explore at:
    Dataset updated
    Jul 17, 2025
    Authors
    Le
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Card for resampled_IDS_datasets

    Intrusion Detection Systems (IDS) play a crucial role in securing computer networks against malicious activities. However, their efficacy is consistently hindered by the persistent challenge of class imbalance in real-world datasets. While various methods, such as resampling techniques, ensemble methods, cost-sensitive learning, data augmentation, and so on, have individually addressed imbalance classification issues, there exists a notable… See the full description on the dataset page: https://huggingface.co/datasets/Thi-Thu-Huong/resampled_IDS_datasets.

  18. Automotive CAN signal reverse engineering works.

    • plos.figshare.com
    xls
    Updated Jan 22, 2024
    Cite
    Miki E. Verma; Robert A. Bridges; Michael D. Iannacone; Samuel C. Hollifield; Pablo Moriano; Steven C. Hespeler; Bill Kay; Frank L. Combs (2024). Automotive CAN signal reverse engineering works. [Dataset]. http://doi.org/10.1371/journal.pone.0296879.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jan 22, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Miki E. Verma; Robert A. Bridges; Michael D. Iannacone; Samuel C. Hollifield; Pablo Moriano; Steven C. Hespeler; Bill Kay; Frank L. Combs
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Although ubiquitous in modern vehicles, Controller Area Networks (CANs) lack basic security properties and are easily exploitable. A rapidly growing field of CAN security research has emerged that seeks to detect intrusions or anomalies on CANs. Producing vehicular CAN data with a variety of intrusions is a difficult task for most researchers as it requires expensive assets and deep expertise. To illuminate this task, we introduce the first comprehensive guide to the existing open CAN intrusion detection system (IDS) datasets. We categorize attacks on CANs including fabrication (adding frames, e.g., flooding or targeting an ID), suspension (removing an ID’s frames), and masquerade attacks (spoofed frames sent in lieu of suspended ones). We provide a quality analysis of each dataset; an enumeration of each dataset’s attacks, benefits, and drawbacks; categorization as real vs. simulated CAN data and real vs. simulated attacks; whether the data is raw CAN data or signal-translated; number of vehicles/CANs; quantity in terms of time; and finally a suggested use case of each dataset. State-of-the-art public CAN IDS datasets are limited to real fabrication (simple message injection) attacks and simulated attacks often in synthetic data, lacking fidelity. In general, the physical effects of attacks on the vehicle are not verified in the available datasets. Only one dataset provides signal-translated data but is missing a corresponding “raw” binary version. This issue pigeon-holes CAN IDS research into testing on limited and often inappropriate data (usually with attacks that are too easily detectable to truly test the method). The scarcity of appropriate data has stymied comparability and reproducibility of results for researchers. As our primary contribution, we present the Real ORNL Automotive Dynamometer (ROAD) CAN IDS dataset, consisting of over 3.5 hours of one vehicle’s CAN data. ROAD contains ambient data recorded during a diverse set of activities, and attacks of increasing stealth with multiple variants and instances of real (i.e. non-simulated) fuzzing, fabrication, unique advanced attacks, and simulated masquerade attacks. To facilitate a benchmark for CAN IDS methods that require signal-translated inputs, we also provide the signal time series format for many of the CAN captures. Our contributions aim to facilitate appropriate benchmarking and needed comparability in the CAN IDS research field.

  19. ADFA IDS (Intrusion detection systems) datasets comprising labeled host,...

    • dataone.org
    • dataverse.harvard.edu
    Updated Mar 6, 2024
    Cite
    UNSW Canberra (2024). ADFA IDS (Intrusion detection systems) datasets comprising labeled host, network and windows stealthy attacks settings [Dataset]. http://doi.org/10.7910/DVN/IFTZPF
    Explore at:
    Dataset updated
    Mar 6, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    UNSW Canberra
    Description

    These are ADFA IDS datasets that contain network IDS datasets and host IDS datasets. These datasets were generated by former UNSW Ph.D. students, postdocs, and academic visitors under the supervision of Prof. Jiankun Hu, who acts as the communication contact. Please read through the file "How to use ADFA-IDS-Datasets", Gideon's Ph.D. thesis, and the web page file for details. NGIDS-DS dataset: It was created by former Ph.D. student Mr. Waqas Haider. This dataset contains the network IDS dataset, which was generated at the next-generation cyber range infrastructure of the Australian Centre for Cyber Security (ACCS) in the University of New South Wales (UNSW) at the Australian Defence Force Academy (ADFA), Canberra. It is part of the ongoing projects in the ADFA related to cyber security. ADFA-LD, ADFA-WD-SAA, and ADFA-WD datasets: They were created by former Ph.D. student Mr. Gideon Creech. They contain Windows host IDS datasets and stealthy attack IDS datasets. netflow_ids_label dataset: It was created by the academic visitor Dr. Quang Anh Tran and UNSW postdoc Dr. Frank Jiang, and provides network flow labels for the 1999 DARPA IDS dataset created by MIT. Please read the relevant real-time network flow publication paper attached. TSE-DS dataset: It was created by former Ph.D. students/postdocs Dr. Nam Tran and Dr. Xuefei Yin. It is a labeled false data injection attack detection dataset.

  20. Electricity and Gas IDS Dataset

    • osti.gov
    Updated Nov 1, 2021
    Cite
    DOE (2021). Electricity and Gas IDS Dataset [Dataset]. http://doi.org/10.25584/PNNLDH/1839095
    Explore at:
    Dataset updated
    Nov 1, 2021
    Dataset provided by
    DOE
    Pacific Northwest National Laboratory 2
    Description

    The following dataset was collected from a set of cybersecurity experiments conducted in an Electricity and Natural Gas environment. The architecture was instantiated within the powerNET testbed at Pacific Northwest National Laboratory, and is comprised of both simulated components and hardware-in-the-loop devices. The test environment consisted of a substation and control center network representative of electrical systems. In addition, it also contained a compressor station, and an odorizer and pressure regulation station representative of oil and natural gas systems. The various devices on the electrical and gas systems were organized into multiple networks to mimic real-world deployments. There were 14 testing scenarios overall that covered a wide variety of cybersecurity and infrastructure events.

Cite
nagi (2025). CIC-IDS 2018 Dataset [Dataset]. https://www.kaggle.com/datasets/primus11/cic-ids-2018-dataset/data

CIC-IDS 2018 Dataset

CIC- IDS data for Intrusion detection system

Explore at:
Available download formats: zip (80066040 bytes)
Dataset updated
Aug 13, 2025
Authors
nagi
License

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically

Description

CICIDS Dataset

The Canadian Institute for Cybersecurity Intrusion Detection System (CICIDS) dataset is a modern and comprehensive benchmark dataset for network intrusion detection research.
It was created by the Canadian Institute for Cybersecurity (CIC) in collaboration with industry partners to address the limitations of older datasets (such as KDD99 and NSL-KDD) by providing realistic traffic patterns, up-to-date attack types, and a balanced mix of normal and malicious activities.

Key Characteristics

  • Realistic Traffic Generation: Traffic was captured in a controlled but realistic enterprise-like network, including servers, clients, switches, and routers.
  • Diverse Attack Scenarios:
    • Denial-of-Service (DoS) and Distributed Denial-of-Service (DDoS)
    • Brute force (SSH, FTP)
    • Web-based attacks (XSS, SQL Injection, Command Injection)
    • Infiltration from inside the network
    • Botnet activities
    • Port scanning and reconnaissance
  • Data Capture: Raw traffic was recorded in PCAP format.
  • Feature Extraction: Processed with CICFlowMeter to generate over 80 features, including:
    • Flow-based: Duration, total forward/backward packets, packet length statistics
    • Time-based: Inter-arrival times, active and idle times
    • Content-based: HTTP methods, DNS queries, and more
  • Labeling: Each network flow is annotated as either benign or belonging to a specific attack type.
  • Balance: Designed to include both normal and attack traffic with realistic distribution patterns.

Advantages

  • Reflects modern threats not covered in older datasets.
  • Provides detailed labels for fine-grained attack classification.
  • Suitable for both binary classification (normal vs. attack) and multi-class classification (attack type detection).
  • Enables research in machine learning, deep learning, and feature selection for IDS.
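
The binary and multi-class setups mentioned above can both be derived from the same per-flow label column. The following is a minimal sketch; the function name and the label strings used in the usage note are hypothetical, and the assumption that benign flows carry the literal label "Benign" should be checked against the actual files.

```python
from typing import Dict, List, Sequence, Tuple


def encode_labels(
    labels: Sequence[str], benign: str = "Benign"
) -> Tuple[List[int], List[int], Dict[str, int]]:
    """Derive binary and multi-class integer targets from string labels.

    Returns (binary, multiclass, class_index), where binary is 0 for the
    benign label and 1 for anything else, and class_index maps each
    distinct label to its multi-class id (assigned in sorted order).
    """
    class_index = {name: i for i, name in enumerate(sorted(set(labels)))}
    binary = [0 if label == benign else 1 for label in labels]
    multiclass = [class_index[label] for label in labels]
    return binary, multiclass, class_index
```

For instance, with the illustrative labels ["Benign", "DoS", "Benign", "Bot"], the binary target marks the second and fourth flows as attacks, while the multi-class target keeps "DoS" and "Bot" as separate classes.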

Usage

The CICIDS dataset has become a widely adopted benchmark for evaluating Intrusion Detection Systems (IDS) due to its:

  • Rich feature set
  • Real-world attack scenarios
  • Balanced structure for training and testing models
