100+ datasets found
  1. Code_Vulnerability_Security_DPO

    • huggingface.co
    Updated Apr 21, 2024
    Cite
    Byte (2024). Code_Vulnerability_Security_DPO [Dataset]. https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO
    Explore at:
    Croissant, a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Apr 21, 2024
    Authors
    Byte
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Cybernative.ai Code Vulnerability and Security Dataset

      Dataset Description
    

    The Cybernative.ai Code Vulnerability and Security Dataset is a dataset of synthetic Direct Preference Optimization (DPO) pairs, focusing on the intricate relationship between secure and insecure code across a variety of programming languages. This dataset is meticulously crafted to serve as a pivotal resource for researchers, cybersecurity professionals, and AI developers who are keen on… See the full description on the dataset page: https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO.
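To make the idea of secure/insecure code pairs concrete, here is a minimal sketch of how one such record might be represented and sanity-checked. The field names `prompt`, `chosen`, and `rejected` and the helper `is_usable` are assumptions for illustration; the listing above does not state the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One hypothetical record: a task plus a secure and an insecure answer."""
    prompt: str    # assumed field: the coding task
    chosen: str    # assumed field: the secure implementation
    rejected: str  # assumed field: the vulnerable implementation


def is_usable(pair: PreferencePair) -> bool:
    """Keep only pairs whose two completions are non-empty and actually differ."""
    chosen, rejected = pair.chosen.strip(), pair.rejected.strip()
    return bool(chosen) and bool(rejected) and chosen != rejected


# Illustrative record (not taken from the dataset): parameterized query
# versus string concatenation, a classic SQL-injection contrast.
example = PreferencePair(
    prompt="Look up a user by name in SQLite.",
    chosen='cur.execute("SELECT * FROM users WHERE name = ?", (name,))',
    rejected='cur.execute("SELECT * FROM users WHERE name = " + name)',
)
```

A filter like `is_usable` is one plausible pre-processing step before preference training, since identical or empty completions carry no preference signal.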

  2. Security-TTP-Mapping

    • huggingface.co
    Cite
    Tu Nguyen, Security-TTP-Mapping [Dataset]. http://doi.org/10.57967/hf/1811
    Authors
    Tu Nguyen
    License

    https://choosealicense.com/licenses/cc/

    Description

    The Security Attack Pattern (TTP) Recognition or Mapping Task

    We share in this repo the MITRE ATT&CK mapping datasets, with training, validation and test splits. The datasets can be considered as an emerging and challenging multilabel classification NLP task, with over 600 hierarchical classes. NOTE: due to their security nature, these datasets contain textual information about malware and other security aspects.

      Datasets

      TRAM

    This dataset belongs to CTID… See the full description on the dataset page: https://huggingface.co/datasets/tumeteor/Security-TTP-Mapping.
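As a rough illustration of what a multilabel task over hierarchical classes involves, the sketch below encodes a set of MITRE ATT&CK technique IDs as a multi-hot vector and adds the parent technique of each sub-technique (for example `T1059.001` implies `T1059`). Whether the dataset's labels already include parents is an assumption here, and the function names and the four-label vocabulary are hypothetical.

```python
def expand_with_parents(technique_ids):
    """Add the parent technique for every sub-technique ID (T1059.001 -> T1059)."""
    expanded = set()
    for tid in technique_ids:
        expanded.add(tid)
        if "." in tid:
            expanded.add(tid.split(".", 1)[0])
    return expanded


def multi_hot(technique_ids, label_index):
    """Encode a set of labels as a 0/1 vector over a fixed label vocabulary."""
    active = expand_with_parents(technique_ids)
    return [1 if label in active else 0 for label in label_index]


# Tiny illustrative vocabulary; the task described above has over 600 classes.
LABELS = ["T1059", "T1059.001", "T1566", "T1566.001"]
vector = multi_hot({"T1059.001", "T1566"}, LABELS)
```

With this encoding, a text tagged with a sub-technique also counts as a positive example for its parent technique, which is one common way to respect the hierarchy.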

  3. secqa

    • huggingface.co
    Updated Nov 16, 2025
    Cite
    Zefang Liu (2025). secqa [Dataset]. https://huggingface.co/datasets/zefang-liu/secqa
    Dataset updated
    Nov 16, 2025
    Authors
    Zefang Liu
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    SecQA

    SecQA is a specialized dataset created for the evaluation of Large Language Models (LLMs) in the domain of computer security. It consists of multiple-choice questions, generated using GPT-4 and the Computer Systems Security: Planning for Success textbook, aimed at assessing the understanding and application of LLMs' knowledge in computer security.

      Dataset Details

      Dataset Description

    SecQA is an innovative dataset designed to benchmark the… See the full description on the dataset page: https://huggingface.co/datasets/zefang-liu/secqa.
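Evaluating a model on multiple-choice questions usually comes down to comparing predicted option letters against an answer key. The sketch below shows one simple way to compute accuracy; the function name `accuracy` and the letter-based answer format are assumptions, since the listing does not describe the dataset's column layout.

```python
def accuracy(predictions, answers):
    """Fraction of predicted option letters that match the answer key.

    Letters are compared case-insensitively after trimming whitespace.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


# Hypothetical model outputs scored against a hypothetical answer key.
score = accuracy(["a", "B ", "C"], ["A", "B", "D"])
```

Normalizing case and whitespace before comparison avoids penalizing a model for formatting differences rather than for wrong answers.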

  4. security-analysis-dataset

    • huggingface.co
    Updated Nov 5, 2024
    + more versions
    Cite
    stock mining (2024). security-analysis-dataset [Dataset]. https://huggingface.co/datasets/automatedstockminingorg/security-analysis-dataset
    Dataset updated
    Nov 5, 2024
    Authors
    stock mining
    Description

    automatedstockminingorg/security-analysis-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. security-attacks-MITRE

    • huggingface.co
    Updated Jul 19, 2024
    Cite
    Dattaraj Rao (2024). security-attacks-MITRE [Dataset]. https://huggingface.co/datasets/dattaraj/security-attacks-MITRE
    Dataset updated
    Jul 19, 2024
    Authors
    Dattaraj Rao
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    dattaraj/security-attacks-MITRE dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. CybersecurityQAA

    • huggingface.co
    Updated Nov 6, 2024
    Cite
    Technologies (2024). CybersecurityQAA [Dataset]. https://huggingface.co/datasets/Rowden/CybersecurityQAA
    Dataset updated
    Nov 6, 2024
    Authors
    Technologies
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for Cybersecurity Question-Answer-Assertion (QAA) Dataset

      Dataset Summary
    

    The Cybersecurity QAA dataset is designed to evaluate the capabilities of large language models (LLMs) in delivering cybersecurity advice and information, particularly for UK small and medium-sized enterprises (SMEs). The dataset comprises 1,563 question-answer-assertion triples across various cybersecurity topics, such as network security, data protection, and user access management.… See the full description on the dataset page: https://huggingface.co/datasets/Rowden/CybersecurityQAA.

  7. Cybersecurity-Dataset-v1

    • huggingface.co
    Updated Jun 17, 2025
    Cite
    Alican Kiraz (2025). Cybersecurity-Dataset-v1 [Dataset]. https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-v1
    Dataset updated
    Jun 17, 2025
    Authors
    Alican Kiraz
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Cybersecurity Defense Training Dataset

      Dataset Description
    

    This dataset contains 2,500 high-quality instruction-response pairs focused on defensive cybersecurity education. The dataset is designed to train AI models to provide accurate, detailed, and ethically-aligned guidance on information security principles while refusing to assist with malicious activities.

      Dataset Summary
    

    Language: English
    License: Apache 2.0
    Format: Parquet
    Size: 2,500 rows
    Domain:… See the full description on the dataset page: https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-v1.

  8. Data from: huggingface

    • huggingface.co
    Updated Sep 22, 2022
    Cite
    Cyber Security Ncrack (2022). huggingface [Dataset]. https://huggingface.co/datasets/HTTP404ERROR/huggingface
    Dataset updated
    Sep 22, 2022
    Authors
    Cyber Security Ncrack
    License

    https://choosealicense.com/licenses/afl-3.0/

    Description

    HTTP404ERROR/huggingface dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. security-paper-datasets

    • huggingface.co
    Updated Oct 21, 2023
    + more versions
    Cite
    clouditera (2023). security-paper-datasets [Dataset]. https://huggingface.co/datasets/clouditera/security-paper-datasets
    Dataset updated
    Oct 21, 2023
    Dataset provided by
    Clouditera
    Authors
    clouditera
    Description

    Dataset Card for "security-paper-datasets"

    More Information needed

  10. DecodingTrust

    • huggingface.co
    Updated Aug 11, 2024
    Cite
    Secure Learning Lab (2024). DecodingTrust [Dataset]. https://huggingface.co/datasets/AI-Secure/DecodingTrust
    Dataset updated
    Aug 11, 2024
    Dataset authored and provided by
    Secure Learning Lab
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

      Overview
    

    This repo contains the source code of DecodingTrust. This research endeavor is designed to help researchers better understand the capabilities, limitations, and potential risks associated with deploying these state-of-the-art Large Language Models (LLMs). See our paper for details. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Boxin Wang, Weixin Chen, Hengzhi… See the full description on the dataset page: https://huggingface.co/datasets/AI-Secure/DecodingTrust.

  11. LLM-Sec-Evaluation

    • huggingface.co
    Updated Jul 17, 2023
    Cite
    Max (2023). LLM-Sec-Evaluation [Dataset]. https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation
    Dataset updated
    Jul 17, 2023
    Authors
    Max
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    LLM Security Evaluation

    This repo contains scripts for evaluating LLM security abilities. We gathered hundreds of questions covering different aspects of security, such as vulnerabilities, pentest, threat intelligence, etc. All the questions can be viewed at https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation.

      Supported LLM

    ChatGLM, Baichuan, Vicuna (GGML format)

      Usage

    Because different LLMs require different running environments, we highly recommend… See the full description on the dataset page: https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation.

  12. CyberExploitDB

    • huggingface.co
    Updated Oct 6, 2024
    + more versions
    Cite
    Esteban Cara de Sexo (2024). CyberExploitDB [Dataset]. https://huggingface.co/datasets/Canstralian/CyberExploitDB
    Dataset updated
    Oct 6, 2024
    Authors
    Esteban Cara de Sexo
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    model_details:
      description: A comprehensive database and analysis tool for cyber exploits, vulnerabilities, and related information. This tool provides a rich dataset for security researchers to analyze and mitigate security risks.
    task_categories: data_analysis
    structure:
      data/ exploits.csv vulnerabilities.csv
      assets/ favicon.svg
      .streamlit/ config.toml
      main.py data_processor.py visualizations.py README.md
    intended_use: Designed for security researchers, developers, and… See the full description on the dataset page: https://huggingface.co/datasets/Canstralian/CyberExploitDB.

  13. fortress_public

    • huggingface.co
    Updated Jun 17, 2025
    Cite
    Scale AI (2025). fortress_public [Dataset]. https://huggingface.co/datasets/ScaleAI/fortress_public
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    Scale AI (https://scale.com/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains adversarial prompts and associated rubrics designed to evaluate the safety and security of large language models (LLMs), as described in the paper FORTRESS: Frontier Risk Evaluation for National Security and Public Safety. Please exercise care and caution when using these data, as they contain potentially sensitive or harmful information related to public safety and national security. This dataset should be used for safety evaluations only, and it is prohibited to use… See the full description on the dataset page: https://huggingface.co/datasets/ScaleAI/fortress_public.

  14. MMDecodingTrust-I2T

    • huggingface.co
    Updated Sep 23, 2024
    + more versions
    Cite
    Secure Learning Lab (2024). MMDecodingTrust-I2T [Dataset]. https://huggingface.co/datasets/AI-Secure/MMDecodingTrust-I2T
    Dataset updated
    Sep 23, 2024
    Dataset authored and provided by
    Secure Learning Lab
    Description

    Overview

    This repo contains the image-to-text dataset of MMDT (Multimodal DecodingTrust). This research endeavor is designed to help researchers and practitioners better understand the capabilities, limitations, and potential risks involved in deploying the state-of-the-art Multimodal foundation models (MMFMs). This dataset focuses on the following six primary perspectives of trustworthiness, including safety, hallucination, fairness, privacy, adversarial robustness, and… See the full description on the dataset page: https://huggingface.co/datasets/AI-Secure/MMDecodingTrust-I2T.

  15. infosec-security-qa

    • huggingface.co
    Updated Oct 8, 2025
    + more versions
    Cite
    Dũng Võ (2025). infosec-security-qa [Dataset]. https://huggingface.co/datasets/tuandunghcmut/infosec-security-qa
    Dataset updated
    Oct 8, 2025
    Authors
    Dũng Võ
    Description

    tuandunghcmut/infosec-security-qa dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. riskrubric-results

    • huggingface.co
    Updated Sep 20, 2025
    Cite
    Noma Security (2025). riskrubric-results [Dataset]. https://huggingface.co/datasets/nomasecurity/riskrubric-results
    Dataset updated
    Sep 20, 2025
    Dataset provided by
    Noma Security, Inc.
    Authors
    Noma Security
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    nomasecurity/riskrubric-results dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. rule-security-risks

    • huggingface.co
    Updated Aug 11, 2024
    Cite
    Gratzi (2024). rule-security-risks [Dataset]. https://huggingface.co/datasets/Tgratzi/rule-security-risks
    Dataset updated
    Aug 11, 2024
    Authors
    Gratzi
    Description

    Tgratzi/rule-security-risks dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. security-issues

    • huggingface.co
    Cite
    Evan Phibbs, security-issues [Dataset]. https://huggingface.co/datasets/evanphibbs/security-issues
    Authors
    Evan Phibbs
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card:

    This dataset was uploaded to Hugging Face using TrainChimp.

  19. INTEGRATED-AI-SECURITY-SOLUTIONS

    • huggingface.co
    Updated Dec 29, 2024
    Cite
    ADVANCED-VAPORWARE (2024). INTEGRATED-AI-SECURITY-SOLUTIONS [Dataset]. https://huggingface.co/datasets/ADVANCED-VAPORWARE/INTEGRATED-AI-SECURITY-SOLUTIONS
    Dataset updated
    Dec 29, 2024
    Authors
    ADVANCED-VAPORWARE
    Description

    ADVANCED-VAPORWARE/INTEGRATED-AI-SECURITY-SOLUTIONS dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. airport-security

    • huggingface.co
    Updated Jul 12, 2023
    Cite
    Dominic Blake (2023). airport-security [Dataset]. https://huggingface.co/datasets/domblake/airport-security
    Dataset updated
    Jul 12, 2023
    Authors
    Dominic Blake
    Description

    domblake/airport-security dataset hosted on Hugging Face and contributed by the HF Datasets community

Cite
Byte (2024). Code_Vulnerability_Security_DPO [Dataset]. https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO

Code_Vulnerability_Security_DPO

Code Vulnerability and Security DPO Dataset

CyberNative/Code_Vulnerability_Security_DPO

4 scholarly articles cite this dataset (View in Google Scholar)
