100+ datasets found

llm-system-prompts-benchmark
huggingface.co
Updated Jan 10, 2024
Cite
Naomi Bashkansky (2024). llm-system-prompts-benchmark [Dataset]. https://huggingface.co/datasets/Naomibas/llm-system-prompts-benchmark
Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 10, 2024
Authors
Naomi Bashkansky
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for Dataset Name

This dataset is a collection of 100 system prompts for large language models.

Dataset Details

Dataset Description

These 100 system prompts test a model's ability to follow grammatical patterns; answer basic multiple choice questions; act according to a particular persona; memorize information; and speak in French.

Files:

hundred_system_prompts.py: refer to this to see the (prompt, probe, function) triplets, as well as the… See the full description on the dataset page: https://huggingface.co/datasets/Naomibas/llm-system-prompts-benchmark.
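
As a rough illustration of what a (prompt, probe, function) triplet could look like, the sketch below pairs a system prompt with a probe message and a scoring function. The names `Triplet` and `all_caps_score`, the example prompt, and the scoring rule are all hypothetical; they are not taken from hundred_system_prompts.py, which may structure its triplets differently.

```python
from typing import Callable, NamedTuple


class Triplet(NamedTuple):
    """Hypothetical container for one benchmark item (an assumption, not the dataset's own type)."""
    prompt: str                        # system prompt given to the model
    probe: str                         # user message used to test the system prompt
    function: Callable[[str], float]   # scores a model response between 0 and 1


def all_caps_score(response: str) -> float:
    """Fraction of alphabetic characters that are uppercase (assumed scoring rule)."""
    letters = [c for c in response if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c.isupper() for c in letters) / len(letters)


example = Triplet(
    prompt="Always respond in capital letters.",
    probe="What is the capital of France?",
    function=all_caps_score,
)

# Score a made-up response; no model is called in this sketch.
score = example.function("PARIS")
```

Keeping the scoring function next to the prompt and probe means each item can be evaluated on its own, without a shared grading step.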
synthetic_multilingual_llm_prompts
huggingface.co
Updated June 11, 2024
Cite
Gretel.ai (2024). synthetic_multilingual_llm_prompts [Dataset]. https://huggingface.co/datasets/gretelai/synthetic_multilingual_llm_prompts
Explore at:
Croissant
Dataset updated
June 11, 2024
Dataset provided by
Gretel.ai
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
📝🌐 Synthetic Multilingual LLM Prompts

Welcome to the "Synthetic Multilingual LLM Prompts" dataset! This comprehensive collection features 1,250 synthetic LLM prompts generated using Gretel Navigator, available in seven different languages. To ensure accuracy and diversity in prompts, and translation quality and consistency across the different languages, we employed Gretel Navigator both as a generation tool and as an… See the full description on the dataset page: https://huggingface.co/datasets/gretelai/synthetic_multilingual_llm_prompts.
malicious-llm-prompts-v4
huggingface.co
Updated Jan 26, 2025
+ more versions
Cite
Sagar Patel (2025). malicious-llm-prompts-v4 [Dataset]. https://huggingface.co/datasets/codesagar/malicious-llm-prompts-v4
Explore at:
Croissant
Dataset updated
Jan 26, 2025
Authors
Sagar Patel
Description
Dataset Card for "malicious-llm-prompts-v4"

More Information needed
paper-llm-prompts
huggingface.co
Updated July 3, 2023
Cite
B F (2023). paper-llm-prompts [Dataset]. https://huggingface.co/datasets/beephids/paper-llm-prompts
Explore at:
Croissant
Dataset updated
July 3, 2023
Authors
B F
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
beephids/paper-llm-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
awesome-chatgpt-prompts
huggingface.co
Updated Dec 15, 2023
+ more versions
Cite
Fatih Kadir Akın (2023). awesome-chatgpt-prompts [Dataset]. https://huggingface.co/datasets/fka/awesome-chatgpt-prompts
Explore at:
Croissant
Dataset updated
Dec 15, 2023
Authors
Fatih Kadir Akın
License
https://choosealicense.com/licenses/cc0-1.0/
Description
🧠 Awesome ChatGPT Prompts [CSV dataset]

This is a dataset repository of Awesome ChatGPT Prompts. View all prompts on GitHub.

License

CC-0
speedrender-llm-prompt
huggingface.co
Updated Feb 12, 2025
Cite
Kaique Pereira (2025). speedrender-llm-prompt [Dataset]. https://huggingface.co/datasets/kaiquedu/speedrender-llm-prompt
Dataset updated
Feb 12, 2025
Authors
Kaique Pereira
Description
kaiquedu/speedrender-llm-prompt dataset hosted on Hugging Face and contributed by the HF Datasets community
Official_LLM_System_Prompts
huggingface.co
Updated Nov 20, 2025
Cite
Nymbo (2025). Official_LLM_System_Prompts [Dataset]. https://huggingface.co/datasets/Nymbo/Official_LLM_System_Prompts
Dataset updated
Nov 20, 2025
Authors
Nymbo
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Official LLM System Prompts

This short dataset contains a few system prompts leaked from proprietary models. It contains date-stamped prompts from OpenAI, Anthropic, MS Copilot, GitHub Copilot, Grok, and Perplexity.
in-the-wild-jailbreak-prompts
huggingface.co
Cite
TrustAIRLab, in-the-wild-jailbreak-prompts [Dataset]. https://huggingface.co/datasets/TrustAIRLab/in-the-wild-jailbreak-prompts
Explore at:
Croissant
Dataset authored and provided by
TrustAIRLab
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
In-The-Wild Jailbreak Prompts on LLMs

This is the official repository for the ACM CCS 2024 paper "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models by Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, and Yang Zhang. In this project, employing our new framework JailbreakHub, we conduct the first measurement study on jailbreak prompts in the wild, with 15,140 prompts collected from December 2022 to December 2023 (including 1,405… See the full description on the dataset page: https://huggingface.co/datasets/TrustAIRLab/in-the-wild-jailbreak-prompts.
System-Prompt-Library-030825
huggingface.co
Updated Oct 17, 2025
Cite
Daniel Rosehill (2025). System-Prompt-Library-030825 [Dataset]. http://doi.org/10.57967/hf/6319
Unique identifier
https://doi.org/10.57967/hf/6319
Dataset updated
Oct 17, 2025
Authors
Daniel Rosehill
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
System Prompts Dataset - August 2025

Point-in-time export from Daniel Rosehill's system prompt library as of August 3rd, 2025

Overview

This repository contains a comprehensive collection of 944 system prompts designed for various AI applications, agent workflows, and conversational AI systems. While many of these prompts now serve as the foundation for more complex agent-based workflows, they continue to provide essential building blocks for AI system design and… See the full description on the dataset page: https://huggingface.co/datasets/danielrosehill/System-Prompt-Library-030825.
prompt-injections
huggingface.co
Updated May 4, 2025
+ more versions
Cite
HSE LLM @ Saint Petersburg (2025). prompt-injections [Dataset]. https://huggingface.co/datasets/hse-llm/prompt-injections
Dataset updated
May 4, 2025
Dataset authored and provided by
HSE LLM @ Saint Petersburg
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
hse-llm/prompt-injections dataset hosted on Hugging Face and contributed by the HF Datasets community
self-align-prompts
huggingface.co
Updated July 31, 2024
Cite
mii-llm (2024). self-align-prompts [Dataset]. https://huggingface.co/datasets/mii-llm/self-align-prompts
Dataset updated
July 31, 2024
Dataset authored and provided by
mii-llm
Description
mii-llm/self-align-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
vigil-instruction-bypass-all-MiniLM-L6-v2
huggingface.co
Updated Oct 16, 2023
+ more versions
Cite
Adam Swanda (2023). vigil-instruction-bypass-all-MiniLM-L6-v2 [Dataset]. https://huggingface.co/datasets/deadbits/vigil-instruction-bypass-all-MiniLM-L6-v2
Explore at:
Croissant
Dataset updated
Oct 16, 2023
Authors
Adam Swanda
Description
Vigil: LLM Instruction Bypass all-MiniLM-L6-v2

Repo: github.com/deadbits/vigil-llm

Vigil is a Python framework and REST API for assessing Large Language Model (LLM) prompts against a set of scanners to detect prompt injections, jailbreaks, and other potentially risky inputs. This repository contains all-MiniLM-L6-v2 embeddings for all Instruction Bypass style prompts ("Ignore instructions ...") used by Vigil. You can use the parquet2vdb.py utility to load the embeddings in the… See the full description on the dataset page: https://huggingface.co/datasets/deadbits/vigil-instruction-bypass-all-MiniLM-L6-v2.
OpenEndedLLMPrompts
huggingface.co
Updated July 6, 2024
Cite
Shreyan (2024). OpenEndedLLMPrompts [Dataset]. https://huggingface.co/datasets/shreyanmitra/OpenEndedLLMPrompts
Explore at:
Croissant
Dataset updated
July 6, 2024
Authors
Shreyan
Description
Dataset Card for OpenEndedLLMPrompts

A cleaned and consolidated set of questions (without context) and answers for LLM hallucination detection. Each question-answer pair is not the work of the author, but was selected from OpenAssistant/oasst2. If you use any of the data provided, please cite this source in addition to the following paper: Shreyan Mitra and Leilani Gilpin. Detecting LLM Hallucinations Pre-generation (paper pending). The original dataset was provided in a tree… See the full description on the dataset page: https://huggingface.co/datasets/shreyanmitra/OpenEndedLLMPrompts.
deepseek-r1-reasoning-prompts
huggingface.co
Updated Jan 27, 2025
Cite
umar igan (2025). deepseek-r1-reasoning-prompts [Dataset]. https://huggingface.co/datasets/umarigan/deepseek-r1-reasoning-prompts
Dataset updated
Jan 27, 2025
Authors
umar igan
Description
I created a reasoning prompt dataset from the deepseek-r1 model for fine-tuning small language models, so that they can generate better reasoning prompts for use with larger LLMs.

Metadata

The metadata is made available through a series of parquet files with the following schema:

id: A unique identifier for the qa.
question:
answer: Answer from deepseek-r1 think model.
reasoning: Reasoning from deepseek-r1 model.
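
To make the schema concrete, here is a minimal sketch that checks whether a row carries the four fields listed above. Only the field names come from the description; the helper `missing_fields`, the sample row, and the assumption that each row arrives as a plain dictionary are illustrative.

```python
# Field names come from the schema described above; everything else is illustrative.
EXPECTED_FIELDS = ("id", "question", "answer", "reasoning")


def missing_fields(row: dict) -> list:
    """Return the expected field names that are absent from a row."""
    return [name for name in EXPECTED_FIELDS if name not in row]


# A made-up row, not taken from the dataset.
sample_row = {
    "id": "example-0",
    "question": "What is 2 + 2?",
    "answer": "4",
    "reasoning": "Adding 2 and 2 gives 4.",
}

problems = missing_fields(sample_row)
```

A check like this could run once after loading the parquet files, before any fine-tuning code relies on the fields being present.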
prompt-difficulty
huggingface.co
Updated Dec 2, 2025
Cite
Alan Tseng (2025). prompt-difficulty [Dataset]. https://huggingface.co/datasets/agentlans/prompt-difficulty
Dataset updated
Dec 2, 2025
Authors
Alan Tseng
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Prompt Difficulty Meta-Analysis

Introduction

Large language model (LLM) prompts vary widely in complexity, required knowledge, and reasoning demands. Some prompts are straightforward, while others require advanced understanding and multi-step reasoning. This study analyzes the difficulty of English ChatGPT prompts using classifiers trained on multiple difficulty-labelled datasets.
The goal is to produce a consistent, data-driven difficulty score that can be used to… See the full description on the dataset page: https://huggingface.co/datasets/agentlans/prompt-difficulty.
HALoGEN-prompts
huggingface.co
Updated Jan 20, 2025
Cite
Abhilasha Ravichander (2025). HALoGEN-prompts [Dataset]. https://huggingface.co/datasets/lasha-nlp/HALoGEN-prompts
Explore at:
Croissant
Dataset updated
Jan 20, 2025
Authors
Abhilasha Ravichander
Description
HALOGEN🔦: Fantastic LLM Hallucinations and Where to Find Them

This repository contains the prompts of HALOGEN🔦: Fantastic LLM Hallucinations and Where to Find Them by *Abhilasha Ravichander, *Shrusti Ghela, David Wadden, and Yejin Choi.

Website | Paper | HALoGEN prompts | LLM Hallucinations | Decomposers and Verifiers | Scoring Functions

Overview

Despite their impressive ability to generate high-quality and fluent text, generative large language models (LLMs) also… See the full description on the dataset page: https://huggingface.co/datasets/lasha-nlp/HALoGEN-prompts.
real-toxicity-prompts
huggingface.co
+ more versions
Cite
Ai2, real-toxicity-prompts [Dataset]. http://doi.org/10.57967/hf/0002
Explore at:
Croissant
Unique identifier
https://doi.org/10.57967/hf/0002
Dataset provided by
Allen Institute for AI (http://allenai.org/)
Authors
Ai2
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for Real Toxicity Prompts

Dataset Summary

RealToxicityPrompts is a dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models.

Languages

English

Dataset Structure

Data Instances

Each instance represents a prompt and its metadata: { "filename":"0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt", "begin":340, "end":564, "challenging":false… See the full description on the dataset page: https://huggingface.co/datasets/allenai/real-toxicity-prompts.
llm-prompt-recovery
huggingface.co
Cite
Tuhina Tripathi, llm-prompt-recovery [Dataset]. https://huggingface.co/datasets/tuhinatripathi/llm-prompt-recovery
Explore at:
Croissant
Authors
Tuhina Tripathi
License
https://choosealicense.com/licenses/other/
Description
tuhinatripathi/llm-prompt-recovery dataset hosted on Hugging Face and contributed by the HF Datasets community
llmail-inject-challenge
huggingface.co
Cite
Microsoft, llmail-inject-challenge [Dataset]. https://huggingface.co/datasets/microsoft/llmail-inject-challenge
Dataset authored and provided by
Microsoft (http://microsoft.com/)
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Summary

This dataset contains a large number of attack prompts collected as part of the now closed LLMail-Inject: Adaptive Prompt Injection Challenge. We first describe the details of the challenge, and then we provide documentation of the dataset. For the accompanying code, check out: https://github.com/microsoft/llmail-inject-challenge.

Citation

@article{abdelnabi2025, title = {LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection… See the full description on the dataset page: https://huggingface.co/datasets/microsoft/llmail-inject-challenge.
Prompt-R1
huggingface.co
Updated Dec 1, 2025
Cite
Wenjin Liu (2025). Prompt-R1 [Dataset]. https://huggingface.co/datasets/QwenQKing/Prompt-R1
Dataset updated
Dec 1, 2025
Authors
Wenjin Liu
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Prompt-R1: Enhancing LLM interaction on behalf of humans

Prompt-R1: Collaborative Automatic Prompting Framework via End-to-end Reinforcement Learning

📄 Paper | 🚀 Quick Start | 💬 Contact

Overview

Prompt-R1 has addressed a critical challenge in interacting with large language models (LLMs)—the inability of users to provide accurate and effective interaction prompts for complex tasks. Prompt-R1 is an end-to-end reinforcement learning (RL)… See the full description on the dataset page: https://huggingface.co/datasets/QwenQKing/Prompt-R1.

llm-system-prompts-benchmark

Naomibas/llm-system-prompts-benchmark

100 system prompts for benchmarking large language models
