Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was obtained automatically.
Dataset Card for Dataset Name
This dataset is a collection of 100 system prompts for large language models.
Dataset Details
Dataset Description
These 100 system prompts test a model's ability to follow grammatical patterns; answer basic multiple choice questions; act according to a particular persona; memorize information; and speak in French. Files:
hundred_system_prompts.py: refer to this to see the (prompt, probe, function) triplets, as well as the… See the full description on the dataset page: https://huggingface.co/datasets/Naomibas/llm-system-prompts-benchmark.
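A hypothetical triplet in the (prompt, probe, function) pattern described above might look like the following sketch. The strings and the check are illustrative only, not taken from hundred_system_prompts.py:

```python
# Illustrative (prompt, probe, function) triplet: the system prompt constrains
# the model, the probe is a user message, and the function checks whether the
# model's reply honored the system prompt.
prompt = "You are a helpful assistant that always answers in French."
probe = "What color is the sky?"
function = lambda response: "bleu" in response.lower()

triplet = (prompt, probe, function)
```

A benchmark harness would send `prompt` and `probe` to the model and score the reply with `function`.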
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
📝🌐 Synthetic Multilingual LLM Prompts
Welcome to the "Synthetic Multilingual LLM Prompts" dataset! This comprehensive collection features 1,250 synthetic LLM prompts generated using Gretel Navigator, available in seven different languages. To ensure accuracy and diversity in prompts, and translation quality and consistency across the different languages, we employed Gretel Navigator both as a generation tool and as an… See the full description on the dataset page: https://huggingface.co/datasets/gretelai/synthetic_multilingual_llm_prompts.
Dataset Card for "malicious-llm-prompts-v4"
More Information needed
MIT License (https://opensource.org/licenses/MIT)
beephids/paper-llm-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0 (https://choosealicense.com/licenses/cc0-1.0/)
🧠 Awesome ChatGPT Prompts [CSV dataset]
This is a dataset repository of Awesome ChatGPT Prompts. View all prompts on GitHub.
License
CC-0
kaiquedu/speedrender-llm-prompt dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
Official LLM System Prompts
This short dataset contains a few system prompts leaked from proprietary models. It includes date-stamped prompts from OpenAI, Anthropic, MS Copilot, GitHub Copilot, Grok, and Perplexity.
MIT License (https://opensource.org/licenses/MIT)
In-The-Wild Jailbreak Prompts on LLMs
This is the official repository for the ACM CCS 2024 paper "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models by Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, and Yang Zhang. In this project, employing our new framework JailbreakHub, we conduct the first measurement study on jailbreak prompts in the wild, with 15,140 prompts collected from December 2022 to December 2023 (including 1,405… See the full description on the dataset page: https://huggingface.co/datasets/TrustAIRLab/in-the-wild-jailbreak-prompts.
MIT License (https://opensource.org/licenses/MIT)
System Prompts Dataset - August 2025
A point-in-time export from Daniel Rosehill's system prompt library, as of August 3rd, 2025.
Overview
This repository contains a comprehensive collection of 944 system prompts designed for various AI applications, agent workflows, and conversational AI systems. While many of these prompts now serve as the foundation for more complex agent-based workflows, they continue to provide essential building blocks for AI system design and… See the full description on the dataset page: https://huggingface.co/datasets/danielrosehill/System-Prompt-Library-030825.
MIT License (https://opensource.org/licenses/MIT)
hse-llm/prompt-injections dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
Independent Jailbreak Datasets for LLM Guardrail Evaluation
Constructed for the thesis “Contamination Effects: How Training Data Leakage Affects Red Team Evaluation of LLM Jailbreak Detection”. The effectiveness of LLM guardrails is commonly evaluated using open-source red-teaming tools. However, this study reveals that significant data contamination exists between the training sets of binary jailbreak classifiers (ProtectAI, Katanemo, TestSavantAI, etc.) and the test prompts used in… See the full description on the dataset page: https://huggingface.co/datasets/Simsonsun/JailbreakPrompts.
mii-llm/self-align-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
Prompt Difficulty Meta-Analysis
Introduction
Large language model (LLM) prompts vary widely in complexity, required knowledge, and reasoning demands. Some prompts are straightforward, while others require advanced understanding and multi-step reasoning. This study analyzes the difficulty of English ChatGPT prompts using classifiers trained on multiple difficulty-labelled datasets.
The goal is to produce a consistent, data-driven difficulty score that can be used to… See the full description on the dataset page: https://huggingface.co/datasets/agentlans/prompt-difficulty.
Vigil: LLM Instruction Bypass all-MiniLM-L6-v2
Repo: github.com/deadbits/vigil-llm
Vigil is a Python framework and REST API for assessing Large Language Model (LLM) prompts against a set of scanners to detect prompt injections, jailbreaks, and other potentially risky inputs. This repository contains all-MiniLM-L6-v2 embeddings for all Instruction Bypass style prompts ("Ignore instructions ...") used by Vigil. You can use the parquet2vdb.py utility to load the embeddings in the… See the full description on the dataset page: https://huggingface.co/datasets/deadbits/vigil-instruction-bypass-all-MiniLM-L6-v2.
Dataset Card for OpenEndedLLMPrompts
A cleaned and consolidated set of questions (without context) and answers for LLM hallucination detection. Each question-answer pair is not the work of the author but was selected from OpenAssistant/oasst2. If you use any of the data provided, please cite this source in addition to the following paper: Shreyan Mitra and Leilani Gilpin, "Detecting LLM Hallucinations Pre-generation" (paper pending). The original dataset was provided in a tree… See the full description on the dataset page: https://huggingface.co/datasets/shreyanmitra/OpenEndedLLMPrompts.
I created a reasoning prompt dataset from the deepseek-r1 model, with the purpose of fine-tuning small language models so they can generate better reasoning prompts for use with bigger LLMs.
Metadata
The metadata is made available through a series of parquet files with the following schema:
id: a unique identifier for the QA pair.
question: the question text.
answer: the answer from the deepseek-r1 think model.
reasoning: the reasoning trace from the deepseek-r1 model.
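As a rough sketch, one row of those parquet files could be modeled as follows. The field names follow the schema above; the sample values are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class QARecord:
    id: str         # unique identifier for the QA pair
    question: str   # the original question posed to the model
    answer: str     # answer from the deepseek-r1 think model
    reasoning: str  # reasoning trace from the deepseek-r1 model

# Hypothetical row as it might appear in one parquet shard.
row = {
    "id": "qa-0001",
    "question": "Why is the sky blue?",
    "answer": "Because of Rayleigh scattering.",
    "reasoning": "Shorter wavelengths scatter more strongly in the atmosphere...",
}
record = QARecord(**row)
```

In practice one would load the shards with a parquet reader (e.g. pandas or pyarrow) rather than constructing rows by hand.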
HALOGEN🔦: Fantastic LLM Hallucinations and Where to Find Them
This repository contains the prompts of HALOGEN🔦: Fantastic LLM Hallucinations and Where to Find Them by *Abhilasha Ravichander, *Shrusti Ghela, David Wadden, and Yejin Choi. Website | Paper | HALoGEN prompts | LLM Hallucinations | Decomposers and Verifiers | Scoring Functions
Overview
Despite their impressive ability to generate high-quality and fluent text, generative large language models (LLMs) also… See the full description on the dataset page: https://huggingface.co/datasets/lasha-nlp/HALoGEN-prompts.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
Dataset Card for Real Toxicity Prompts
Dataset Summary
RealToxicityPrompts is a dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models.
Languages
English
Dataset Structure
Data Instances
Each instance represents a prompt and its metadata: {"filename": "0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt", "begin": 340, "end": 564, "challenging": false, …} See the full description on the dataset page: https://huggingface.co/datasets/allenai/real-toxicity-prompts.
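Assuming the begin/end fields are character offsets into the source file, as the field names suggest, the metadata of the instance shown above can be read like this (values copied from the example, trailing fields omitted):

```python
import json

# Metadata for one RealToxicityPrompts instance, as shown above
# (only the fields visible in the example; the full record has more).
instance = json.loads(
    '{"filename": "0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt",'
    ' "begin": 340, "end": 564, "challenging": false}'
)

# Length of the snippet's span in the source file: 564 - 340 = 224 characters.
span_length = instance["end"] - instance["begin"]
```

The "challenging" flag marks prompts the dataset authors found especially likely to elicit toxic continuations.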
Other license (https://choosealicense.com/licenses/other/)
tuhinatripathi/llm-prompt-recovery dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
Dataset Summary
This dataset contains a large number of attack prompts collected as part of the now-closed LLMail-Inject: Adaptive Prompt Injection Challenge. We first describe the details of the challenge, and then we provide documentation of the dataset. For the accompanying code, check out: https://github.com/microsoft/llmail-inject-challenge.
Citation
@article{abdelnabi2025, title = {LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection… See the full description on the dataset page: https://huggingface.co/datasets/microsoft/llmail-inject-challenge.