Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Cybernative.ai Code Vulnerability and Security Dataset
Dataset Description
The Cybernative.ai Code Vulnerability and Security Dataset is a dataset of synthetic Direct Preference Optimization (DPO) pairs, focusing on the intricate relationship between secure and insecure code across a variety of programming languages. This dataset is meticulously crafted to serve as a pivotal resource for researchers, cybersecurity professionals, and AI developers who are keen on… See the full description on the dataset page: https://huggingface.co/datasets/CyberNative/Code_Vulnerability_Security_DPO.
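Reading "DPO pairs" in the usual preference-tuning sense, each record would pair one prompt with a preferred (secure) and a dispreferred (insecure) completion. The sketch below shows that shape only; the class name, field names, and example strings are hypothetical and are not taken from the dataset's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One hypothetical DPO-style record; field names are illustrative."""
    language: str
    prompt: str
    chosen: str    # preferred completion (secure code)
    rejected: str  # dispreferred completion (insecure code)


def to_training_triple(pair: PreferencePair) -> tuple[str, str, str]:
    """Return (prompt, chosen, rejected), the order preference trainers commonly expect."""
    return (pair.prompt, pair.chosen, pair.rejected)


# Illustrative record: a parameterized query versus string concatenation.
example = PreferencePair(
    language="python",
    prompt="Look up a user by name in SQLite.",
    chosen='cur.execute("SELECT * FROM users WHERE name = ?", (name,))',
    rejected='cur.execute("SELECT * FROM users WHERE name = " + name)',
)

triple = to_training_triple(example)
```

Keeping the two completions side by side under one prompt is what lets a preference-based trainer learn to rank the secure variant above the insecure one.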
https://choosealicense.com/licenses/cc/
The Security Attack Pattern (TTP) Recognition or Mapping Task
We share in this repo the MITRE ATT&CK mapping datasets, with training, validation and test splits. The datasets can be considered as an emerging and challenging multilabel classification NLP task, with over 600 hierarchical classes. NOTE: due to their security nature, these datasets contain textual information about malware and other security aspects.
Datasets
TRAM
This dataset belongs to CTID… See the full description on the dataset page: https://huggingface.co/datasets/tumeteor/Security-TTP-Mapping.
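A multilabel task over hierarchical classes can be sketched as a multi-hot target vector in which a sub-technique also activates its parent technique. The example below assumes ATT&CK-style identifiers (a sub-technique such as T1059.001 under T1059) and a tiny hypothetical label vocabulary; whether these datasets propagate labels to parents is an assumption, not something the description states.

```python
def expand_with_parents(labels: set[str]) -> set[str]:
    """Add the parent technique for every sub-technique ID (T1059.001 -> T1059)."""
    expanded = set(labels)
    for label in labels:
        if "." in label:
            expanded.add(label.split(".", 1)[0])
    return expanded


def multi_hot(labels: set[str], vocabulary: list[str]) -> list[int]:
    """Encode a label set as a 0/1 vector aligned with the vocabulary order."""
    active = expand_with_parents(labels)
    return [1 if name in active else 0 for name in vocabulary]


# Hypothetical four-class vocabulary; the real task has over 600 classes.
vocabulary = ["T1059", "T1059.001", "T1566", "T1566.001"]
vector = multi_hot({"T1059.001", "T1566"}, vocabulary)
```

With one output per class, a classifier can be trained with an independent sigmoid per label, which is the usual setup for multilabel text classification.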
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
SecQA
SecQA is a specialized dataset created for the evaluation of Large Language Models (LLMs) in the domain of computer security. It consists of multiple-choice questions, generated using GPT-4 and the Computer Systems Security: Planning for Success textbook, aimed at assessing the understanding and application of LLMs' knowledge in computer security.
Dataset Details
Dataset Description
SecQA is an innovative dataset designed to benchmark the… See the full description on the dataset page: https://huggingface.co/datasets/zefang-liu/secqa.
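Evaluating a model on multiple-choice questions usually reduces to comparing the predicted option letter with the answer key. The following is a minimal sketch under that assumption; the function name and the sample letters are hypothetical, and SecQA's own evaluation procedure may differ.

```python
def accuracy(predictions: list[str], answers: list[str]) -> float:
    """Fraction of predictions matching the answer key, ignoring case and whitespace."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


# Hypothetical answer key and model outputs for four questions.
answers = ["A", "C", "B", "D"]
predictions = ["a", "C", "D", "D "]
score = accuracy(predictions, answers)
```

Normalizing case and whitespace before comparing avoids penalizing a model for formatting differences in an otherwise correct choice.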
automatedstockminingorg/security-analysis-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
dattaraj/security-attacks-MITRE dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for Cybersecurity Question-Answer-Assertion (QAA) Dataset
Dataset Summary
The Cybersecurity QAA dataset is designed to evaluate the capabilities of large language models (LLMs) in delivering cybersecurity advice and information, particularly for UK small and medium-sized enterprises (SMEs). The dataset comprises 1,563 question-answer-assertion triples across various cybersecurity topics, such as network security, data protection, and user access management.… See the full description on the dataset page: https://huggingface.co/datasets/Rowden/CybersecurityQAA.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Cybersecurity Defense Training Dataset
Dataset Description
This dataset contains 2,500 high-quality instruction-response pairs focused on defensive cybersecurity education. The dataset is designed to train AI models to provide accurate, detailed, and ethically-aligned guidance on information security principles while refusing to assist with malicious activities.
Dataset Summary
Language: English License: Apache 2.0 Format: Parquet Size: 2,500 rows Domain:… See the full description on the dataset page: https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-v1.
https://choosealicense.com/licenses/afl-3.0/
HTTP404ERROR/huggingface dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "security-paper-datasets"
More Information needed
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Overview
This repo contains the source code of DecodingTrust. This research endeavor is designed to help researchers better understand the capabilities, limitations, and potential risks associated with deploying these state-of-the-art Large Language Models (LLMs). See our paper for details. DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models Boxin Wang, Weixin Chen, Hengzhi… See the full description on the dataset page: https://huggingface.co/datasets/AI-Secure/DecodingTrust.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
LLM Security Evaluation
This repo contains scripts for evaluating the security abilities of LLMs. We gathered hundreds of questions covering different aspects of security, such as vulnerabilities, penetration testing, and threat intelligence. All the questions can be viewed at https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation.
Supported LLMs
ChatGLM, Baichuan, Vicuna (GGML format)
Usage
Because different LLMs require different running environments, we highly recommend… See the full description on the dataset page: https://huggingface.co/datasets/c01dsnap/LLM-Sec-Evaluation.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
model_details:
description: A comprehensive database and analysis tool for cyber exploits, vulnerabilities, and related information. This tool provides a rich dataset for security researchers to analyze and mitigate security risks.
task_categories: data_analysis
structure:
data/ exploits.csv vulnerabilities.csv
assets/ favicon.svg
.streamlit/ config.toml
main.py data_processor.py visualizations.py README.md
intended_use: Designed for security researchers, developers, and… See the full description on the dataset page: https://huggingface.co/datasets/Canstralian/CyberExploitDB.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains adversarial prompts and associated rubrics designed to evaluate the safety and security of large language models (LLMs), as described in the paper FORTRESS: Frontier Risk Evaluation for National Security and Public Safety. Please exercise care and caution when using these data, as they contain potentially sensitive or harmful information related to public safety and national security. This dataset should be used for safety evaluations only, and it is prohibited to use… See the full description on the dataset page: https://huggingface.co/datasets/ScaleAI/fortress_public.
Overview
This repo contains the image-to-text dataset of MMDT (Multimodal DecodingTrust). This research endeavor is designed to help researchers and practitioners better understand the capabilities, limitations, and potential risks involved in deploying the state-of-the-art Multimodal foundation models (MMFMs). This dataset focuses on the following six primary perspectives of trustworthiness, including safety, hallucination, fairness, privacy, adversarial robustness, and… See the full description on the dataset page: https://huggingface.co/datasets/AI-Secure/MMDecodingTrust-I2T.
tuandunghcmut/infosec-security-qa dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
nomasecurity/riskrubric-results dataset hosted on Hugging Face and contributed by the HF Datasets community
Tgratzi/rule-security-risks dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card:
This dataset was uploaded to Hugging Face using TrainChimp.
ADVANCED-VAPORWARE/INTEGRATED-AI-SECURITY-SOLUTIONS dataset hosted on Hugging Face and contributed by the HF Datasets community
domblake/airport-security dataset hosted on Hugging Face and contributed by the HF Datasets community