100+ datasets found

Code_Vulnerability_Security_DPO
huggingface.co
Updated Apr 21, 2024
Cite
Byte (2024). Code_Vulnerability_Security_DPO [Dataset]. https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 21, 2024
Authors
Byte
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Cybernative.ai Code Vulnerability and Security Dataset

Dataset Description

The Cybernative.ai Code Vulnerability and Security Dataset is a dataset of synthetic Direct Preference Optimization (DPO) pairs, focusing on the intricate relationship between secure and insecure code across a variety of programming languages. This dataset is meticulously crafted to serve as a pivotal resource for researchers, cybersecurity professionals, and AI developers who are keen on… See the full description on the dataset page: https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO.
Security-TTP-Mapping
huggingface.co
Cite
Tu Nguyen, Security-TTP-Mapping [Dataset]. http://doi.org/10.57967/hf/1811
Unique identifier
https://doi.org/10.57967/hf/1811
Authors
Tu Nguyen
License
https://choosealicense.com/licenses/cc/
Description
The Security Attack Pattern (TTP) Recognition or Mapping Task

We share in this repo the MITRE ATT&CK mapping datasets, with training, validation and test splits. The datasets can be considered as an emerging and challenging multilabel classification NLP task, with over 600 hierarchical classes. NOTE: due to their security nature, these datasets contain textual information about malware and other security aspects.

Datasets TRAM

This dataset belongs to CTID… See the full description on the dataset page: https://huggingface.co/datasets/tumeteor/Security-TTP-Mapping.
secqa
huggingface.co
Updated Nov 16, 2025
Cite
Zefang Liu (2025). secqa [Dataset]. https://huggingface.co/datasets/zefang-liu/secqa
Dataset updated
Nov 16, 2025
Authors
Zefang Liu
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
SecQA

SecQA is a specialized dataset created for the evaluation of Large Language Models (LLMs) in the domain of computer security. It consists of multiple-choice questions, generated using GPT-4 and the Computer Systems Security: Planning for Success textbook, aimed at assessing the understanding and application of LLMs' knowledge in computer security.

Dataset Details Dataset Description

SecQA is an innovative dataset designed to benchmark the… See the full description on the dataset page: https://huggingface.co/datasets/zefang-liu/secqa.
security-analysis-dataset
huggingface.co
Updated Nov 5, 2024
Cite
stock mining (2024). security-analysis-dataset [Dataset]. https://huggingface.co/datasets/automatedstockminingorg/security-analysis-dataset
Dataset updated
Nov 5, 2024
Authors
stock mining
Description
automatedstockminingorg/security-analysis-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
security-attacks-MITRE
huggingface.co
Updated Jul 19, 2024
Cite
Dattaraj Rao (2024). security-attacks-MITRE [Dataset]. https://huggingface.co/datasets/dattaraj/security-attacks-MITRE
Dataset updated
Jul 19, 2024
Authors
Dattaraj Rao
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
dattaraj/security-attacks-MITRE dataset hosted on Hugging Face and contributed by the HF Datasets community
CybersecurityQAA
huggingface.co
Updated Nov 6, 2024
Cite
Technologies (2024). CybersecurityQAA [Dataset]. https://huggingface.co/datasets/Rowden/CybersecurityQAA
Dataset updated
Nov 6, 2024
Authors
Technologies
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card for Cybersecurity Question-Answer-Assertion (QAA) Dataset

Dataset Summary

The Cybersecurity QAA dataset is designed to evaluate the capabilities of large language models (LLMs) in delivering cybersecurity advice and information, particularly for UK small and medium-sized enterprises (SMEs). The dataset comprises 1,563 question-answer-assertion triples across various cybersecurity topics, such as network security, data protection, and user access management.… See the full description on the dataset page: https://huggingface.co/datasets/Rowden/CybersecurityQAA.
Cybersecurity-Dataset-v1
huggingface.co
Updated Jun 17, 2025
Cite
Alican Kiraz (2025). Cybersecurity-Dataset-v1 [Dataset]. https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-v1
Dataset updated
Jun 17, 2025
Authors
Alican Kiraz
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Cybersecurity Defense Training Dataset

Dataset Description

This dataset contains 2,500 high-quality instruction-response pairs focused on defensive cybersecurity education. The dataset is designed to train AI models to provide accurate, detailed, and ethically-aligned guidance on information security principles while refusing to assist with malicious activities.

Dataset Summary

Language: English License: Apache 2.0 Format: Parquet Size: 2,500 rows Domain:… See the full description on the dataset page: https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-v1.
Data from: huggingface
huggingface.co
Updated Sep 22, 2022
Cite
Cyber Security Ncrack (2022). huggingface [Dataset]. https://huggingface.co/datasets/HTTP404ERROR/huggingface
Dataset updated
Sep 22, 2022
Authors
Cyber Security Ncrack
License
https://choosealicense.com/licenses/afl-3.0/
Description
HTTP404ERROR/huggingface dataset hosted on Hugging Face and contributed by the HF Datasets community
security-paper-datasets
huggingface.co
Updated Oct 21, 2023
Cite
clouditera (2023). security-paper-datasets [Dataset]. https://huggingface.co/datasets/clouditera/security-paper-datasets
Dataset updated
Oct 21, 2023
Dataset provided by
Clouditera
Authors
clouditera
Description
Dataset Card for "security-paper-datasets"

More Information needed
DecodingTrust
huggingface.co
Updated Aug 11, 2024
Cite
Secure Learning Lab (2024). DecodingTrust [Dataset]. https://huggingface.co/datasets/AI-Secure/DecodingTrust
Dataset updated
Aug 11, 2024
Dataset authored and provided by
Secure Learning Lab
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

Overview

This repo contains the source code of DecodingTrust. This research endeavor is designed to help researchers better understand the capabilities, limitations, and potential risks associated with deploying these state-of-the-art Large Language Models (LLMs). See our paper for details. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Boxin Wang, Weixin Chen, Hengzhi… See the full description on the dataset page: https://huggingface.co/datasets/AI-Secure/DecodingTrust.
LLM-Sec-Evaluation
huggingface.co
Updated Jul 17, 2023
Cite
Max (2023). LLM-Sec-Evaluation [Dataset]. https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation
Dataset updated
Jul 17, 2023
Authors
Max
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
LLM Security Evaluation

This repo contains scripts for evaluating LLM security abilities. We gathered hundreds of questions covering different aspects of security, such as vulnerabilities, pentest, threat intelligence, etc. All the questions can be viewed at https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation.

Supported LLM

ChatGLM, Baichuan, Vicuna (GGML format)

Usage

Because different LLMs require different running environments, we highly recommended… See the full description on the dataset page: https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation.
CyberExploitDB
huggingface.co
Updated Oct 6, 2024
Cite
Esteban Cara de Sexo (2024). CyberExploitDB [Dataset]. https://huggingface.co/datasets/Canstralian/CyberExploitDB
Dataset updated
Oct 6, 2024
Authors
Esteban Cara de Sexo
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
model_details: description: A comprehensive database and analysis tool for cyber exploits, vulnerabilities, and related information. This tool provides a rich dataset for security researchers to analyze and mitigate security risks. task_categories:

data_analysis

structure:

data/ exploits.csv vulnerabilities.csv

assets/ favicon.svg

.streamlit/ config.toml

main.py data_processor.py visualizations.py README.md

intended_use: Designed for security researchers, developers, and… See the full description on the dataset page: https://huggingface.co/datasets/Canstralian/CyberExploitDB.
fortress_public
huggingface.co
Updated Jun 17, 2025
Cite
Scale AI (2025). fortress_public [Dataset]. https://huggingface.co/datasets/ScaleAI/fortress_public
Dataset updated
Jun 17, 2025
Dataset authored and provided by
Scale AI (https://scale.com/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains adversarial prompts and associated rubrics designed to evaluate the safety and security of large language models (LLMs), as described in the paper FORTRESS: Frontier Risk Evaluation for National Security and Public Safety. Please exercise care and caution when using these data, as they contain potentially sensitive or harmful information related to public safety and national security. This dataset should be used for safety evaluations only, and it is prohibited to use… See the full description on the dataset page: https://huggingface.co/datasets/ScaleAI/fortress_public.
MMDecodingTrust-I2T
huggingface.co
Updated Sep 23, 2024
Cite
Secure Learning Lab (2024). MMDecodingTrust-I2T [Dataset]. https://huggingface.co/datasets/AI-Secure/MMDecodingTrust-I2T
Dataset updated
Sep 23, 2024
Dataset authored and provided by
Secure Learning Lab
Description
Overview

This repo contains the image-to-text dataset of MMDT (Multimodal DecodingTrust). This research endeavor is designed to help researchers and practitioners better understand the capabilities, limitations, and potential risks involved in deploying the state-of-the-art Multimodal foundation models (MMFMs). This dataset focuses on the following six primary perspectives of trustworthiness, including safety, hallucination, fairness, privacy, adversarial robustness, and… See the full description on the dataset page: https://huggingface.co/datasets/AI-Secure/MMDecodingTrust-I2T.
infosec-security-qa
huggingface.co
Updated Oct 8, 2025
Cite
Dũng Võ (2025). infosec-security-qa [Dataset]. https://huggingface.co/datasets/tuandunghcmut/infosec-security-qa
Dataset updated
Oct 8, 2025
Authors
Dũng Võ
Description
tuandunghcmut/infosec-security-qa dataset hosted on Hugging Face and contributed by the HF Datasets community
riskrubric-results
huggingface.co
Updated Sep 20, 2025
Cite
Noma Security (2025). riskrubric-results [Dataset]. https://huggingface.co/datasets/nomasecurity/riskrubric-results
Dataset updated
Sep 20, 2025
Dataset provided by
Noma Security, Inc.
Authors
Noma Security
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
nomasecurity/riskrubric-results dataset hosted on Hugging Face and contributed by the HF Datasets community
rule-security-risks
huggingface.co
Updated Aug 11, 2024
Cite
Gratzi (2024). rule-security-risks [Dataset]. https://huggingface.co/datasets/Tgratzi/rule-security-risks
Dataset updated
Aug 11, 2024
Authors
Gratzi
Description
Tgratzi/rule-security-risks dataset hosted on Hugging Face and contributed by the HF Datasets community
security-issues
huggingface.co
Cite
Evan Phibbs, security-issues [Dataset]. https://huggingface.co/datasets/evanphibbs/security-issues
Authors
Evan Phibbs
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Card:

This dataset was uploaded to Hugging Face using TrainChimp.
INTEGRATED-AI-SECURITY-SOLUTIONS
huggingface.co
Updated Dec 29, 2024
Cite
ADVANCED-VAPORWARE (2024). INTEGRATED-AI-SECURITY-SOLUTIONS [Dataset]. https://huggingface.co/datasets/ADVANCED-VAPORWARE/INTEGRATED-AI-SECURITY-SOLUTIONS
Dataset updated
Dec 29, 2024
Authors
ADVANCED-VAPORWARE
Description
ADVANCED-VAPORWARE/INTEGRATED-AI-SECURITY-SOLUTIONS dataset hosted on Hugging Face and contributed by the HF Datasets community
airport-security
huggingface.co
Updated Jul 12, 2023
Cite
Dominic Blake (2023). airport-security [Dataset]. https://huggingface.co/datasets/domblake/airport-security
Dataset updated
Jul 12, 2023
Authors
Dominic Blake
Description
domblake/airport-security dataset hosted on Hugging Face and contributed by the HF Datasets community

Code_Vulnerability_Security_DPO

4 scholarly articles cite this dataset (View in Google Scholar)
