Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Author: Leonid Zelenskiy
Supervisor: Andrey Sadovykh
Repository: github.com/RamPrin/LLM_TM
This repository contains a curated dataset for evaluating threat modeling approaches, specifically focused on cybersecurity testing and large language models (LLMs). It includes structured software system specifications and corresponding cybersecurity threats to support experiments in automated threat detection, analysis, and model evaluation.
test_dataset
A CSV file containing cleaned system specifications suitable for natural language processing (NLP) tools. Format:
validation_dataset
A CSV file listing manually identified cybersecurity threats used for validation. Format:
This dataset supports research in automating the threat modeling process, with particular relevance to cybersecurity testing scenarios. It enables benchmarking of model performance in recognizing and generating relevant security threats based on structured system descriptions.
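As a rough illustration of the benchmarking use case, the sketch below scores model-generated threats against a manually identified reference set, per system. The column names `system` and `threat`, the helper names, and the sample rows are all hypothetical, since the actual CSV layout is not given here. Exact string matching is used only for simplicity; a real evaluation would likely need fuzzy or semantic matching.

```python
import csv
import io
from collections import defaultdict


def load_threats(csv_text, system_col="system", threat_col="threat"):
    """Group normalized threat labels by system.

    The column names are assumptions; adjust them to the real CSV header.
    """
    grouped = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[system_col].strip()].add(row[threat_col].strip().lower())
    return dict(grouped)


def score_predictions(predicted, reference):
    """Compute per-system precision, recall and F1 using exact label matches."""
    results = {}
    for system, ref_threats in reference.items():
        pred_threats = predicted.get(system, set())
        true_positives = len(pred_threats & ref_threats)
        precision = true_positives / len(pred_threats) if pred_threats else 0.0
        recall = true_positives / len(ref_threats) if ref_threats else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results[system] = {"precision": precision, "recall": recall, "f1": f1}
    return results


# Made-up rows for illustration only; they are not taken from the dataset.
reference_csv = "system,threat\nOAuth 2.0,Token theft\nOAuth 2.0,Open redirect\n"
predicted_csv = "system,threat\nOAuth 2.0,token theft\nOAuth 2.0,CSRF\n"

scores = score_predictions(load_threats(predicted_csv), load_threats(reference_csv))
for system, metrics in scores.items():
    print(system, metrics)
```

To run this against the real files, read each CSV into a string (or adapt `load_threats` to take a file path) and pass the actual column names.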
System information was extracted from a variety of reliable and public documentation, including RFCs, whitepapers, technical blogs, and academic research. Below is a mapping of each system to its original source.
System | Source |
---|---|
OAuth 2.0 | RFC 6819 |
SSL | SSL Labs |
DNS | Netmeister |
S3 | TrustOnCloud |
Google Cloud Service | NCC Group |
IoT Authentication | SAFECode Whitepaper |
PCI DSS | Shostack Paper |
Certificate Transparency | IETF Draft |
Kubernetes (K8S) | Link 1, Link 2, Link 3 |
CI/CD | GitHub Threat Matrix |
AWS ECS Fargate | Sysdig |
Password Store Manager | Stanford Paper |
IoT Supply Chain | ENISA |
Trinity | Link |
Connected Cars | Trend Micro |
Email Encryption Gateway | NCC Group |
Bitcoin | GitHub |
Containers | Checklist, CloudSecDocs |
Medical Devices | MITRE |
Contact Tracing Applications | LinkedIn Article |
Vehicle Charging | PNNL |
Agentic AI | OWASP GenAI |
Trusted Firmware-M | Trusted Firmware Docs |
ROS 2 Robotic System | ROS2 Design Docs |