75 datasets found
  1. UNSW-NB15

    • huggingface.co
    Updated Mar 19, 2023
    Cite
    Witold Wydmański (2023). UNSW-NB15 [Dataset]. https://huggingface.co/datasets/wwydmanski/UNSW-NB15
    Dataset updated
    Mar 19, 2023
    Authors
    Witold Wydmański
    Description

    Source

    https://www.kaggle.com/datasets/dhoogla/unswnb15?resource=download

      Dataset
    

    This is an academic intrusion detection dataset. All the credit goes to the original authors: Dr. Nour Moustafa and Dr. Jill Slay. Please cite their original paper and all other appropriate articles listed on the UNSW-NB15 page. The full dataset also offers the pcap, BRO and Argus files along with additional documentation. The modifications to the predesignated train-test sets are minimal… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/UNSW-NB15.

  2. UNSW-NB15 complete Dataset

    • kaggle.com
    zip
    Updated Apr 28, 2025
    Cite
    Harshwardhan Bhangale (2025). UNSW-NB15 complete Dataset [Dataset]. https://www.kaggle.com/datasets/harshwardhanbhangale/unsw-complete-dataset
    Available download formats: zip (143768237 bytes)
    Dataset updated
    Apr 28, 2025
    Authors
    Harshwardhan Bhangale
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Overview

    The UNSW-NB15 dataset was generated in the Cyber Range Lab at UNSW Canberra with the IXIA PerfectStorm tool. It captures a hybrid of realistic benign traffic and nine modern attack families (Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode, Worms), all recorded as raw pcap files and distilled into flow-level CSVs.

    Files included in this mirror

    File                          Rows      Purpose
    UNSW_NB15_training-set.csv    175,341   Author-supplied training split
    UNSW_NB15_testing-set.csv     82,332    Author-supplied test split
    UNSW-NB15_features.csv        49        Human-readable feature definitions

    (The full 2.54 M-row shards UNSW-NB15_1–4.csv are available from the official site if you need the entire corpus.)

    Each record contains 49 engineered features—extracted via Argus and Bro/Zeek—plus a label column that marks the traffic as normal (0) or attack (1), with the attack_cat field specifying the attack family.
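
    The relationship between the label column and the attack_cat field can be checked with a short script. The following is only a minimal sketch: the three-row sample, the column subset, and the assumption that benign rows carry the family name "Normal" are illustrative and should be verified against the actual CSV files.

```python
import csv
import io
from collections import Counter

# Hypothetical three-row sample; the real files contain many more columns.
SAMPLE_CSV = """id,dur,proto,attack_cat,label
1,0.12,tcp,Normal,0
2,0.03,udp,Generic,1
3,1.50,tcp,Exploits,1
"""


def summarise(csv_text):
    """Count rows per binary label and per attack family, and collect the
    ids of rows whose label disagrees with attack_cat."""
    label_counts = Counter()
    family_counts = Counter()
    inconsistent = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = int(row["label"])
        family = row["attack_cat"].strip()
        label_counts[label] += 1
        family_counts[family] += 1
        # Assumption: benign traffic is tagged with the family "Normal".
        if (label == 0) != (family == "Normal"):
            inconsistent.append(row["id"])
    return label_counts, family_counts, inconsistent


label_counts, family_counts, inconsistent = summarise(SAMPLE_CSV)
print(dict(label_counts), dict(family_counts), inconsistent)
```

    To run the same check on a downloaded file, the text of that file can be passed in place of the sample string.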

    Common research uses

    Training / benchmarking machine-learning and deep-learning intrusion-detection models

    Feature-selection and class-imbalance studies

    Comparative evaluations against KDD-99, CIC-IDS 2018, Kyoto 2006+, etc.

    Citation

    Nour Moustafa and Jill Slay, “UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set),” Military Communications and Information Systems Conference (MilCIS), 2015. DOI: 10.1109/MilCIS.2015.7348942

    Please cite this paper if you use the dataset in academic work.

    Licence

    Released by UNSW Canberra under the GNU General Public License v3.0 (GPL-3.0). This Kaggle mirror preserves the original licence; see https://www.gnu.org/licenses/gpl-3.0.html and the official project page https://research.unsw.edu.au/projects/unsw-nb15-dataset for full terms.

    Contact

    For questions or pcap access please email the authors (Dr Nour Moustafa: nour.moustafa@unsw.edu.au).

  3. The UNSW-NB15 dataset

    • researchdata.edu.au
    Updated 2019
    Cite
    Moustafa Nour; University of New South Wales; University of New South Wales; The University of New South Wales; Nour Moustafa (2019). The UNSW-NB15 dataset [Dataset]. http://doi.org/10.26190/5D7AC5B1E8485
    Dataset updated
    2019
    Dataset provided by
    University of New South Wales
    UNSW, Sydney
    Authors
    Moustafa Nour; University of New South Wales; University of New South Wales; The University of New South Wales; Nour Moustafa
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Time period covered
    Sep 30, 2015 - Present
    Description

    The raw network packets of the UNSW-NB 15 dataset were created by the IXIA PerfectStorm tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) to generate a hybrid of real modern normal activities and synthetic contemporary attack behaviours. The tcpdump tool was used to capture 100 GB of raw traffic (e.g., pcap files). The data set has nine types of attacks, namely Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode and Worms. The Argus and Bro-IDS tools were used, and twelve algorithms were developed, to generate a total of 49 features with the class label.

  4. UNSW-NB15 dataset

    • kaggle.com
    zip
    Updated Aug 13, 2025
    Cite
    nagi (2025). UNSW-NB15 dataset [Dataset]. https://www.kaggle.com/datasets/primus11/unsw-nb15-dataset
    Available download formats: zip (12487656 bytes)
    Dataset updated
    Aug 13, 2025
    Authors
    nagi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    UNSW-NB15 Dataset

    The UNSW-NB15 dataset is a modern and comprehensive benchmark dataset for network intrusion detection research.
    It was created by the Cyber Range Lab at the Australian Centre for Cyber Security (ACCS) in 2015 to address the limitations of older datasets (such as KDD99 and NSL-KDD) by providing realistic traffic patterns, contemporary attack types, and a balanced representation of normal and malicious activities.

    Key Characteristics

    • Realistic Traffic Generation: Traffic was generated using the IXIA PerfectStorm tool, simulating both legitimate and malicious network behavior in a controlled environment.
    • Diverse Attack Scenarios:
      • Fuzzers
      • Analysis
      • Backdoors
      • DoS (Denial-of-Service)
      • Exploits
      • Generic attacks
      • Reconnaissance
      • Shellcode
      • Worms
    • Data Capture: Raw network traffic was captured in PCAP format.
    • Feature Extraction: Features were generated using Argus and Bro-IDS tools, resulting in 49 attributes, including:
      • Flow-based: Duration, source/destination bytes, packet counts
      • Content-based: Payload characteristics, HTTP methods
      • Time-based: Flow inter-arrival times, active and idle periods
      • Additional generated features: Statistical measures for deeper analysis
    • Labeling: Each record is labeled as either benign or belonging to one of the nine attack categories.
    • Data Volume: Contains 2,540,044 records in total; a smaller predefined partition is provided as training and testing sets.

    Advantages

    • Represents modern cyber threats absent in older datasets.
    • Includes multiple attack categories for fine-grained classification.
    • Suitable for binary classification (normal vs. attack) and multi-class classification (attack type identification).
    • Balanced design for realistic intrusion detection system (IDS) evaluation.

    Usage

    The UNSW-NB15 dataset is widely used as a benchmark in intrusion detection and cybersecurity research due to its:
    • Comprehensive attack coverage
    • Rich set of network flow features
    • Realistic traffic patterns for both training and testing models

  5. Attack types and their description in the UNSW-NB15 dataset.

    • plos.figshare.com
    xls
    Updated May 23, 2024
    Cite
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman (2024). Attack types and their description in the UNSW-NB15 dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0302294.t003
    Available download formats: xls
    Dataset updated
    May 23, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Arshad Hashmi; Omar M. Barukab; Ahmad Hamza Osman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Attack types and their description in the UNSW-NB15 dataset.

  6. UNSW-NB15 Dataset

    • kaggle.com
    zip
    Updated Nov 27, 2025
    Cite
    Devops (2025). UNSW-NB15 Dataset [Dataset]. https://www.kaggle.com/datasets/freshersstaff/unsw-nb15-dataset
    Available download formats: zip (164408675 bytes)
    Dataset updated
    Nov 27, 2025
    Authors
    Devops
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The UNSW-NB15 dataset consists of raw network packets that were generated by a tool called IXIA PerfectStorm in the Cyber Range Lab. It contains a hybrid of real modern normal activities and synthetic contemporary attack behaviours. The dataset has nine types of attacks, including Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic, Reconnaissance, Shellcode and Worms. The Argus and Bro-IDS tools were used, and twelve algorithms were developed to generate 49 features along with the class label. The dataset has a total of 2,540,044 records stored in four CSV files, with the training set and testing set containing 175,341 and 82,332 records respectively. The ground truth table is named UNSW-NB15_GT.csv, and the list of event files is called UNSW-NB15_LIST_EVENTS.csv. The dataset has been used in various research papers for intrusion detection, network forensics, privacy-preserving, and threat intelligence approaches in different systems, such as Network Systems, Internet of Things (IoT), SCADA, Industrial IoT, and Industry 4.0. The authors of the dataset have granted free use of the dataset for academic research purposes, while commercial use requires their approval.

  7. The UNSW-NB15 dataset with binarized features

    • data.niaid.nih.gov
    • nde-dev.biothings.io
    Updated Feb 9, 2021
    Cite
    Umuroglu, Yaman (2021). The UNSW-NB15 dataset with binarized features [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4519766
    Dataset updated
    Feb 9, 2021
    Dataset provided by
    Xilinx Inc.
    Authors
    Umuroglu, Yaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Binarized version of the UNSW-NB15 dataset, where the original features (a mix of strings, categorical values, floating point values etc) are converted to a bit string of 593 bits. Each value in each feature is either 0 or 1, stored as a uint8 value. The uint8 values are represented as numpy arrays, provided separately for training and test data (same train/test split as the original dataset is used). The final binary value in each sample is the expected output.
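
    The layout described above can be sketched with NumPy. This is an illustration under stated assumptions, not the dataset author's loader: it assumes each row holds the 593 feature bits followed by one label bit (the description leaves open whether the label is counted within the 593), and it uses random synthetic bits rather than the real arrays. The name split_bits is hypothetical.

```python
import numpy as np

N_FEATURE_BITS = 593  # bit-string width stated in the description


def split_bits(samples):
    """Split rows of 0/1 uint8 values into feature bits and the final label bit."""
    samples = np.asarray(samples, dtype=np.uint8)
    # Assumption: 593 feature bits followed by exactly one label bit per row.
    if samples.ndim != 2 or samples.shape[1] != N_FEATURE_BITS + 1:
        raise ValueError("expected rows of 593 feature bits plus 1 label bit")
    if samples.max() > 1:
        raise ValueError("values must be 0 or 1")
    return samples[:, :-1], samples[:, -1]


# Synthetic stand-in for the real arrays (random bits, not real traffic).
rng = np.random.default_rng(0)
fake = rng.integers(0, 2, size=(4, N_FEATURE_BITS + 1), dtype=np.uint8)
X, y = split_bits(fake)
print(X.shape, y.shape)
```

    If the real arrays turn out to have a different width, the constant and the shape check are the only places that need adjusting.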

    Among others, this dataset has been used for quantized neural network research:

    Umuroglu, Y., Akhauri, Y., Fraser, N. J., & Blott, M. (2020, August). LogicNets: Co-Designed Neural Networks and Circuits for Extreme-Throughput Applications. In 2020 30th International Conference on Field-Programmable Logic and Applications (FPL) (pp. 291-297). IEEE.

    The method for binarization is identical to the one described in 10.5281/zenodo.3258657:

    "T. Murovič, A. Trost, Massively Parallel Combinational Binary Neural Networks for Edge Processing, Elektrotehniški vestnik, vol. 86, no. 1-2, pp. 47-53, 2019"

    The original UNSW-NB15 dataset is by:

    Moustafa, Nour, and Jill Slay. "UNSW-NB15: a comprehensive data set for network intrusion detection systems (UNSW-NB15 network data set)." Military Communications and Information Systems Conference (MilCIS), 2015. IEEE, 2015.

  8. CIC UNSW-NB15 Augmented Dataset

    • kaggle.com
    Updated Sep 11, 2025
    Cite
    Yasir Hussein Shakir (2025). CIC UNSW-NB15 Augmented Dataset [Dataset]. https://www.kaggle.com/datasets/yasserhessein/cic-unsw-nb15-augmented-dataset
    Dataset updated
    Sep 11, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Yasir Hussein Shakir
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Card: CIC-UNSW-NB15

    1. Overview

    The CIC-UNSW-NB15 is a modern network intrusion detection system (NIDS) dataset. It is a refined and augmented version of the original UNSW-NB15 dataset, created by reprocessing the raw network traffic using CICFlowMeter, a tool developed by the Canadian Institute for Cybersecurity (CIC). This reprocessing results in a different and more extensive set of network flow features, making it valuable for benchmarking machine learning models for network security.

    2. Key Features & Composition

    Total Samples: 440,043 network flow records.

    Classes: 10 classes (1 Benign, 9 Attack categories).

    Balance: Intentionally balanced to an 80% (Benign) to 20% (Malicious) ratio to better reflect real-world network traffic distributions.

    Features: The dataset contains a large set of network flow features (e.g., duration, protocol, packet sizes, inter-arrival times, flags) extracted by CICFlowMeter. The exact number of features is not specified, but standard CICFlowMeter outputs typically contain over 80.

    3. Attack Categories

    The dataset includes the following ten labels (provided as an image):

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4333519%2Fd4257b9595f3e3714ace9fbf19e7ab9b%2F1-s2.0-S0016003224008615-gr1.jpg?generation=1757570743471631&alt=media

    4. Citation

    If you use this dataset in your research, please cite the following paper:

    H. Mohammadian, A. H. Lashkari, A. Ghorbani. “Poisoning and Evasion: Deep Learning-Based NIDS under Adversarial Attacks,” 21st Annual International Conference on Privacy, Security and Trust (PST), 2024.

  9. NF-UNSW-NB15

    • researchdata.edu.au
    Updated May 15, 2023
    Cite
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann (2023). NF-UNSW-NB15 [Dataset]. http://doi.org/10.48610/5D0832D
    Dataset updated
    May 15, 2023
    Dataset provided by
    The University of Queensland
    Authors
    Mr Mohanad Sarhan; Dr Siamak Layeghy; Associate Professor Marius Portmann
    License

    http://guides.library.uq.edu.au/deposit_your_data/terms_and_conditions

    Description

    NetFlow Version 1 of the datasets is made up of 8 basic NetFlow features. The details of the datasets are published in: Sarhan M., Layeghy S., Moustafa N., Portmann M. (2021) NetFlow Datasets for Machine Learning-Based Network Intrusion Detection Systems. In: Big Data Technologies and Applications. BDTA 2020, WiCON 2020. Springer, Cham. The use of the datasets for academic research purposes is granted in perpetuity after citing the above papers. Commercial use must be agreed upon with the authors. Please get in touch with the author Mohanad Sarhan for more details.

  10. NF-UNSW-NB15

    • kaggle.com
    zip
    Updated Jan 13, 2023
    Cite
    StrGenIx | Laurens D'hooge (2023). NF-UNSW-NB15 [Dataset]. https://www.kaggle.com/datasets/dhoogla/nfunswnb15
    Available download formats: zip (15183289 bytes)
    Dataset updated
    Jan 13, 2023
    Authors
    StrGenIx | Laurens D'hooge
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    NF-UNSW-NB15 is the NetFlow version of the UNSW-NB15 dataset. It is one dataset in the NF-collection by the University of Queensland, which aims to standardize network-security datasets to achieve interoperability and larger analyses.

    All credit goes to the original authors: Dr. Mohanad Sarhan, Dr. Siamak Layeghy, Dr. Nour Moustafa & Dr. Marius Portmann. Please cite their original conference article when using this dataset.

    V1: Base dataset in CSV format as downloaded from here
    V2: Cleaning -> parquet files

    In the parquet files all data types are already set correctly, there are 0 records with missing information and 0 duplicate records.
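
    The "0 missing, 0 duplicate" claim is easy to check for any copy of the data. The sketch below uses pandas on a made-up four-row frame; the column names and values are placeholders, and the cleaning steps shown (dropna, drop_duplicates) are an assumption about what such a cleaning could look like, not the uploader's documented procedure.

```python
import pandas as pd


def quality_report(df):
    """Return the number of missing cells and of fully duplicated rows."""
    return {
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Hypothetical miniature frame; column names and values are placeholders.
raw = pd.DataFrame({
    "IN_BYTES": [100, 100, None, 42],
    "PROTOCOL": [6, 6, 17, 6],
    "Label": [0, 0, 1, 0],
})

# One possible cleaning pass: drop incomplete rows, then exact duplicates.
cleaned = raw.dropna().drop_duplicates().reset_index(drop=True)

print(quality_report(raw))
print(quality_report(cleaned))
```

    For the real files, the frame would come from pd.read_parquet, which needs a parquet engine such as pyarrow installed.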

  11. Distribution of training and testing data by connection type from the...

    • plos.figshare.com
    xls
    Updated Mar 28, 2025
    Cite
    Chadia E. L. Asry; Ibtissam Benchaji; Samira Douzi; Bouabid E. L. Ouahidi (2025). Distribution of training and testing data by connection type from the UNSW-NB15 dataset [25]. [Dataset]. http://doi.org/10.1371/journal.pone.0317346.t001
    Available download formats: xls
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Chadia E. L. Asry; Ibtissam Benchaji; Samira Douzi; Bouabid E. L. Ouahidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Distribution of training and testing data by connection type from the UNSW-NB15 dataset [25].

  12. UNSW-NB15 V3

    • dataverse.harvard.edu
    • huggingface.co
    Updated Nov 26, 2024
    Cite
    Research, Abluva (2024). UNSW-NB15 V3 [Dataset]. http://doi.org/10.7910/DVN/FNKBUE
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Research, Abluva
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The dataset is an extended version of UNSW-NB 15. It has 1 additional class synthesised and the data is normalised for ease of use. To cite the dataset, please reference the original paper with DOI: 10.1109/SmartNets61466.2024.10577645. The paper is published in IEEE SmartNets and can be accessed here: https://www.researchgate.net/publication/382034618_Blender-GAN_Multi-Target_Conditional_Generative_Adversarial_Network_for_Novel_Class_Synthetic_Data_Generation. Citation info: Madhubalan, Akshayraj & Gautam, Amit & Tiwary, Priya. (2024). Blender-GAN: Multi-Target Conditional Generative Adversarial Network for Novel Class Synthetic Data Generation. 1-7. 10.1109/SmartNets61466.2024.10577645. This dataset was made by Abluva Inc, a Palo Alto based, research-driven Data Protection firm. Our data protection platform empowers customers to secure data through advanced security mechanisms such as Fine Grained Access control and sophisticated depersonalization algorithms (e.g. Pseudonymization, Anonymization and Randomization). Abluva's Data Protection solutions facilitate data democratization within and outside the organizations, mitigating the concerns related to theft and compliance. The innovative intrusion detection algorithm by Abluva employs patented technologies for an intricately balanced approach that excludes normal access deviations, ensuring intrusion detection without disrupting the business operations. Abluva’s Solution enables organizations to extract further value from their data by enabling secure Knowledge Graphs and deploying Secure Data as a Service among other novel uses of data. Committed to providing a safe and secure environment, Abluva empowers organizations to unlock the full potential of their data.

  13. Experimental results.

    • plos.figshare.com
    xls
    Updated Mar 28, 2025
    Cite
    Chadia E. L. Asry; Ibtissam Benchaji; Samira Douzi; Bouabid E. L. Ouahidi (2025). Experimental results. [Dataset]. http://doi.org/10.1371/journal.pone.0317346.t006
    Available download formats: xls
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Chadia E. L. Asry; Ibtissam Benchaji; Samira Douzi; Bouabid E. L. Ouahidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The swift proliferation and extensive incorporation of the Internet into worldwide networks have rendered the utilization of Intrusion Detection Systems (IDS) essential for preserving network security. Nonetheless, Intrusion Detection Systems have considerable difficulties, especially in precisely identifying attacks from minority classes. Current methodologies in the literature predominantly adhere to one of two strategies: either disregarding minority classes or using resampling techniques to equilibrate class distributions. Nonetheless, these methods may constrain overall system efficacy. This research utilizes Shapley Additive Explanations (SHAP) for feature selection with Recursive Feature Elimination with Cross-Validation (RFECV), employing XGBoost as the classifier. The model attained precision, recall, and F1-scores of 0.8095, 0.8293, and 0.8193, respectively, signifying improved identification of minority class attacks, namely “worms,” within the UNSW NB15 dataset. To enhance the validation of the proposed approach, we utilized the CICIDS2019 and CICIoT2023 datasets, with findings affirming its efficacy in detecting and classifying minority class attacks.
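
    The three quoted metrics can be cross-checked against each other, since F1 is the harmonic mean of precision and recall. The snippet below is a small arithmetic check and nothing more; it assumes the three figures refer to the same class and evaluation, which the abstract implies but does not spell out.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.8095, 0.8293  # values quoted in the abstract
f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.8193, matching the reported F1-score
```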

  14. unsw-nb15

    • huggingface.co
    Updated Nov 21, 2025
    Cite
    Shownok (2025). unsw-nb15 [Dataset]. https://huggingface.co/datasets/AdnanShownok/unsw-nb15
    Dataset updated
    Nov 21, 2025
    Authors
    Shownok
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    AdnanShownok/unsw-nb15 dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. UNSW-NB15 Subset Merged files (1140,045 samples)

    • kaggle.com
    zip
    Updated Jul 26, 2023
    Cite
    Mohamed Aly Bouke (2023). UNSW-NB15 Subset Merged files (1140,045 samples) [Dataset]. https://www.kaggle.com/datasets/mohamedalybouke/unsw-nb15-subset-merged-files-1140045-samples
    Available download formats: zip (119203365 bytes)
    Dataset updated
    Jul 26, 2023
    Authors
    Mohamed Aly Bouke
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Description: UNSW-NB15 Network Traffic for DDoS Attack Detection

    This dataset, named UNSW-NB15, is a comprehensive collection of network traffic data designed for DDoS (Distributed Denial of Service) attack detection. It was generated by the Australian Center for Cyber Security (ACCS) in collaboration with researchers worldwide to address the limitations of previous datasets that were no longer realistic representations of modern threat environments.

    Dataset Creation:

    The UNSW-NB15 dataset was created using the IXIA PerfectStorm tool, a powerful network traffic generation platform. The tool allowed the researchers to produce a hybrid collection of normal and abnormal network traffic, accurately mimicking modern network conditions. The dataset consists of four Comma-separated values (CSV) files, totaling 2,540,047 entries. For simplicity and ease of use, we have merged two of these files, UNSW-NB15_3 and UNSW-NB15_4, into a single CSV file containing 1,140,045 samples.

    To ensure reliable evaluation of the DDoS attack detection model, we split the dataset into a training set, which accounts for 70% of the samples, and a testing set, comprising the remaining 30%. This partitioning allows researchers to develop and validate their models effectively.

    Dataset Features:

    The UNSW-NB15 dataset includes 49 properties for each network record, capturing various network traffic characteristics. These features encompass a mix of nominal, numeric, and time-stamp values. The nominal features in the dataset are proto, service, state, and attack_cat, which are highlighted in blue in Table 3 of the associated research paper. To focus specifically on DDoS attack detection, we have excluded the Label class from the dataset and certain fields (srcip, sport, dstip, dsport, Stime, and Ltime) based on the suggestions of the dataset creators. This ensures that the dataset aligns with the scope and objectives of the research.
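
    The field exclusion and the 70/30 partition described above might be reproduced along the following lines. This is a sketch under assumptions: the description does not say whether the split was random or stratified, so plain random sampling with a fixed seed is used here, and the ten-row frame and the helper name prepare_split are invented for illustration.

```python
import pandas as pd

# Fields the description says were excluded.
DROPPED = ["srcip", "sport", "dstip", "dsport", "Stime", "Ltime", "Label"]


def prepare_split(df, train_frac=0.7, seed=0):
    """Drop the excluded fields, then split the rows 70/30 at random."""
    # errors="ignore" lets the sketch run on frames lacking some of the fields.
    features = df.drop(columns=DROPPED, errors="ignore")
    train = features.sample(frac=train_frac, random_state=seed)
    test = features.drop(index=train.index)
    return train, test


# Hypothetical ten-row frame; values are made up for illustration.
demo = pd.DataFrame({
    "srcip": ["10.0.0.1"] * 10,
    "dsport": [80] * 10,
    "dur": [0.1 * i for i in range(10)],
    "attack_cat": ["DoS" if i % 2 else "Normal" for i in range(10)],
    "Label": [i % 2 for i in range(10)],
})
train, test = prepare_split(demo)
print(len(train), len(test), list(train.columns))
```

    With attack_cat kept as the target, a stratified split may be preferable for rare attack families; that would be a deviation from what the description states.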

    To understand the complete context and background of the dataset creation, as well as the proposed DDoS attack detection tree-based model using Gini index feature selection, we highly recommend referring to the associated research paper:

    "An intelligent DDoS attack detection tree-based model using Gini index feature selection method" by Mohamed Aly Bouke, Azizol Abdullah, Sameer Hamoud ALshatebi, Mohd Taufik Abdullah, and Hayate El Atigh.

    Link to the Paper: https://doi.org/10.1016/j.micpro.2023.104823

  16. UNSW-NB15-small

    • huggingface.co
    Updated Jul 30, 2024
    Cite
    Mouwiya S. A. Al-Qaisieh (2024). UNSW-NB15-small [Dataset]. https://huggingface.co/datasets/Mouwiya/UNSW-NB15-small
    Dataset updated
    Jul 30, 2024
    Authors
    Mouwiya S. A. Al-Qaisieh
    Description

    Mouwiya/UNSW-NB15-small dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. UNSW-NB15 and CIC-IDS2017 Labelled PCAP Data

    • zenodo.org
    • kaggle.com
    csv
    Updated Oct 28, 2022
    Cite
    Yasir Ali Farrukh Farrukh; Irfan Khan; Syed Wali; David Bierbrauer; John A Pavlik; Nathaniel D. Bastian (2022). UNSW-NB15 and CIC-IDS2017 Labelled PCAP Data [Dataset]. http://doi.org/10.5281/zenodo.7258579
    Available download formats: csv
    Dataset updated
    Oct 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yasir Ali Farrukh Farrukh; Irfan Khan; Syed Wali; David Bierbrauer; John A Pavlik; Nathaniel D. Bastian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Packet Capture (PCAP) files of the UNSW-NB15 and CIC-IDS2017 datasets are processed and labelled using the CSV files. Each packet is labelled by comparing eight distinct features: *Source IP, Destination IP, Source Port, Destination Port, Starting time, Ending time, Protocol and Time to live*. The dimensions of the dataset are Nx1504. All columns of the dataset are integers, so you can use this dataset directly in your machine learning models. Details of the whole processing and transformation are provided in the following GitHub repo:

    https://github.com/Yasir-ali-farrukh/Payload-Byte

    You can use the tool available at the above-mentioned GitHub repo to generate the labelled dataset from scratch. All details of the processing and transformation are provided in the following paper:

    ```bibtex
    @article{Payload,
    author = "Yasir Ali Farrukh and Irfan Khan and Syed Wali and David Bierbrauer and Nathaniel Bastian",
    title = "{Payload-Byte: A Tool for Extracting and Labeling Packet Capture Files of Modern Network Intrusion Detection Datasets}",
    year = "2022",
    month = "9",
    url = "https://www.techrxiv.org/articles/preprint/Payload-Byte_A_Tool_for_Extracting_and_Labeling_Packet_Capture_Files_of_Modern_Network_Intrusion_Detection_Datasets/20714221",
    doi = "10.36227/techrxiv.20714221.v1"
    }
    ```

  18. UNSW-NB15

    • huggingface.co
    Updated Apr 28, 2025
    Cite
    Sebastian Górka (2025). UNSW-NB15 [Dataset]. https://huggingface.co/datasets/bastyje/UNSW-NB15
    Dataset updated
    Apr 28, 2025
    Authors
    Sebastian Górka
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    bastyje/UNSW-NB15 dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. Partitioning the dataset into a set of subsets.

    • figshare.com
    xls
    Updated Mar 28, 2025
    Cite
    Chadia E. L. Asry; Ibtissam Benchaji; Samira Douzi; Bouabid E. L. Ouahidi (2025). Partitioning the dataset into a set of subsets. [Dataset]. http://doi.org/10.1371/journal.pone.0317346.t002
    Available download formats: xls
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Chadia E. L. Asry; Ibtissam Benchaji; Samira Douzi; Bouabid E. L. Ouahidi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The swift proliferation and extensive incorporation of the Internet into worldwide networks have rendered the utilization of Intrusion Detection Systems (IDS) essential for preserving network security. Nonetheless, Intrusion Detection Systems have considerable difficulties, especially in precisely identifying attacks from minority classes. Current methodologies in the literature predominantly adhere to one of two strategies: either disregarding minority classes or using resampling techniques to equilibrate class distributions. Nonetheless, these methods may constrain overall system efficacy. This research utilizes Shapley Additive Explanations (SHAP) for feature selection with Recursive Feature Elimination with Cross-Validation (RFECV), employing XGBoost as the classifier. The model attained precision, recall, and F1-scores of 0.8095, 0.8293, and 0.8193, respectively, signifying improved identification of minority class attacks, namely “worms,” within the UNSW NB15 dataset. To enhance the validation of the proposed approach, we utilized the CICIDS2019 and CICIoT2023 datasets, with findings affirming its efficacy in detecting and classifying minority class attacks.

  20. unsw-nb15-preprocessed

    • huggingface.co
    Updated Feb 20, 2025
    Cite
    Louie Cervantes (2025). unsw-nb15-preprocessed [Dataset]. https://huggingface.co/datasets/louiecerv/unsw-nb15-preprocessed
    Dataset updated
    Feb 20, 2025
    Authors
    Louie Cervantes
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for Dataset Name

    This dataset card aims to be a base template for new datasets. It has been generated using this raw template.

    Dataset Details

    Dataset Description

    Curated by: [More Information Needed]
    Funded by [optional]: [More Information Needed]
    Shared by [optional]: [More Information Needed]
    Language(s) (NLP): [More Information Needed]
    License: [More Information Needed]

    Dataset Sources [optional]

    Repository: [More… See the full description on the dataset page: https://huggingface.co/datasets/louiecerv/unsw-nb15-preprocessed.
