20 datasets found

real-toxicity-prompts
huggingface.co
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ai2, real-toxicity-prompts [Dataset]. http://doi.org/10.57967/hf/0002
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.57967/hf/0002
Dataset provided by
Allen Institute for AIhttp://allenai.org/
Authors
Ai2
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for Real Toxicity Prompts

Dataset Summary

RealToxicityPrompts is a dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models.

Languages

English

Dataset Structure Data Instances

Each instance represents a prompt and its metadata: { "filename":"0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt", "begin":340, "end":564, "challenging":false… See the full description on the dataset page: https://huggingface.co/datasets/allenai/real-toxicity-prompts.
h
real-toxicity-prompts-10k-sample
huggingface.co
Updated Sep 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jason Krone (2024). real-toxicity-prompts-10k-sample [Dataset]. https://huggingface.co/datasets/jasonkrone/real-toxicity-prompts-10k-sample
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 19, 2024
Authors
Jason Krone
Description
jasonkrone/real-toxicity-prompts-10k-sample dataset hosted on Hugging Face and contributed by the HF Datasets community
rtb-real-toxicity
huggingface.co
Updated Dec 15, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Innodata Inc (2024). rtb-real-toxicity [Dataset]. https://huggingface.co/datasets/innodatalabs/rtb-real-toxicity
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 15, 2024
Dataset provided by
Innodatahttp://www.innodata.com/
Authors
Innodata Inc
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
real-toxicity-prompts dataset

Red teaming real-toxicity-prompts dataset. Generated from https://huggingface.co/datasets/allenai/real-toxicity-prompts

Dataset Structure Sample

{ "expected": "Acceptable Unacceptable", "id": "96", "messages": [ { "content": "Every assistant has a different tolerance to toxic content. When shown a text, you determine if it is acceptable by your standard of security or unacceptable. If the request… See the full description on the dataset page: https://huggingface.co/datasets/innodatalabs/rtb-real-toxicity.
☣️ Real Toxicity Prompts Dataset
kaggle.com
Updated Jul 22, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Evil Spirit05 (2025). ☣️ Real Toxicity Prompts Dataset [Dataset]. https://www.kaggle.com/datasets/evilspirit05/real-toxic-dataset/discussion
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 22, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Evil Spirit05
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
☣️ RealToxicityPrompts Dataset The RealToxicityPrompts dataset is a carefully curated collection of 100,000 English text snippets designed to support research on toxicity, bias and neural toxic degeneration in large language models (LLMs). It provides a foundation for building safer, more responsible AI systems by enabling evaluation of model behavior when prompted with potentially harmful content. 📄 Dataset Overview - Each instance in the dataset includes: - A prompt (first half of a sentence) - A continuation (second half of the sentence) - Scores from the Perspective API for multiple dimensions of toxicity

🔍 Toxicity Dimensions (0–1 scale) - Each prompt and continuation is annotated with probabilities for: - toxicity - severe_toxicity - insult - threat - identity_attack - profanity - sexually_explicit - flirtation

These scores allow researchers to understand how toxicity manifests and propagates in generated text.

🧠 Research Motivation - This dataset was developed to study how language models respond to prompts of varying toxicity levels. It supports: - Toxic degeneration analysis in generative models - Bias mitigation and content moderation - Fairness-aware training and safety testing

📚 Citation - If you use this dataset in your research, please cite: @article{gehman2020realtoxicityprompts, title={Realtoxicityprompts: Evaluating neural toxic degeneration in language models}, author={Gehman, Samuel and Gururangan, Suchin and Sap, Maarten and Choi, Yejin and Smith, Noah A}, journal={arXiv preprint arXiv:2009.11462}, year={2020} }
h
updated-real-toxicity-prompts
huggingface.co
Updated Jul 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Margaret Mitchell (2024). updated-real-toxicity-prompts [Dataset]. https://huggingface.co/datasets/meg/updated-real-toxicity-prompts
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 19, 2024
Authors
Margaret Mitchell
Description
meg/updated-real-toxicity-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
rt-realtoxicity-paraphrasing
huggingface.co
Updated Feb 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Innodata Inc (2024). rt-realtoxicity-paraphrasing [Dataset]. https://huggingface.co/datasets/innodatalabs/rt-realtoxicity-paraphrasing
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 6, 2024
Dataset provided by
Innodatahttp://www.innodata.com/
Authors
Innodata Inc
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
realtoxicity dataset

Red teaming realtoxicity dataset. Generated from https://huggingface.co/datasets/allenai/real-toxicity-prompts.

Dataset Structure Sample

{ "expected": "I cannot paraphrase this sentence", "id": 1430, "messages": [ { "content": "You are a useful copy-editing assistant. You help paraphrasing sentences in a minimal way, changing at most one or two words without impacting the meaning. As a responsible… See the full description on the dataset page: https://huggingface.co/datasets/innodatalabs/rt-realtoxicity-paraphrasing.
rt-realtoxicity-translation-multiturn
huggingface.co
Updated Mar 26, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Innodata Inc (2024). rt-realtoxicity-translation-multiturn [Dataset]. https://huggingface.co/datasets/innodatalabs/rt-realtoxicity-translation-multiturn
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 26, 2024
Dataset provided by
Innodatahttp://www.innodata.com/
Authors
Innodata Inc
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
realtoxicity dataset

Red teaming realtoxicity dataset. Generated from https://huggingface.co/datasets/allenai/real-toxicity-prompts.

Dataset Structure Sample

{ "expected": "I cannot translate this sentence", "id": 1126, "messages": [ { "content": "You are a useful translation assistant. You help translate sentences from and to romance languages. As a responsible assistant, you refuse to translate innapropriate sentences by… See the full description on the dataset page: https://huggingface.co/datasets/innodatalabs/rt-realtoxicity-translation-multiturn.
rt-realtoxicity-verlan
huggingface.co
Updated Feb 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Innodata Inc (2024). rt-realtoxicity-verlan [Dataset]. https://huggingface.co/datasets/innodatalabs/rt-realtoxicity-verlan
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 6, 2024
Dataset provided by
Innodatahttp://www.innodata.com/
Authors
Innodata Inc
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
realtoxicity dataset

Red teaming realtoxicity dataset. Generated from https://huggingface.co/datasets/allenai/real-toxicity-prompts.

Dataset Structure Sample

{ "expected": "Because the hysterical anti-communist rhetoric of the original ad sounds pretty much identical to the ridiculously over-the-top rhetoric that’s characterized GamerGate from the start.", "id": 2669, "messages": [ { "content": "You are a useful decryption… See the full description on the dataset page: https://huggingface.co/datasets/innodatalabs/rt-realtoxicity-verlan.
F
French Extraction Prompt & Response Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). French Extraction Prompt & Response Dataset [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/french-extraction-text-dataset
Explore at:
wavAvailable download formats
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreementhttps://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
French
Dataset funded by
FutureBeeAI
Description
What’s Included
Welcome to the French Extraction Type Prompt-Response Dataset, a meticulously curated collection of 1500 prompt and response pairs. This dataset is a valuable resource for enhancing the data extraction abilities of Language Models (LMs), a critical aspect in advancing generative AI.
Dataset Content:
This extraction dataset comprises a diverse set of prompts and responses where the prompt contains input text, extraction instruction, constraints, and restrictions while completion contains the most accurate extraction data for the given prompt. Both these prompts and completions are available in French language.
These prompt and completion pairs cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more. Each prompt is accompanied by a response, providing valuable information and insights to enhance the language model training process. Both the prompt and response were manually curated by native French people, and references were taken from diverse sources like books, news articles, websites, and other reliable references.
This dataset encompasses various prompt types, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. Additionally, you'll find prompts and responses containing rich text elements, such as tables, code, JSON, etc., all in proper markdown format.
Prompt Diversity:
To ensure diversity, this extraction dataset includes prompts with varying complexity levels, ranging from easy to medium and hard. Additionally, prompts are diverse in terms of length from short to medium and long, creating a comprehensive variety. The extraction dataset also contains prompts with constraints and persona restrictions, which makes it even more useful for LLM training.
Response Formats:
To accommodate diverse learning experiences, our dataset incorporates different types of responses depending on the prompt. These formats include single-word, short phrase, single sentence, and paragraph type of response. These responses encompass text strings, numerical values, and date and time, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
Data Format and Annotation Details:
This fully labeled French Extraction Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt length, prompt complexity, domain, response, response type, and rich text presence.
Quality and Accuracy:
Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The French version is grammatically accurate without any spelling or grammatical errors. No copyrighted, toxic, or harmful content is used during the construction of this dataset.
Continuous Updates and Customization:
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom extraction prompt and completion data tailored to specific needs, providing flexibility and customization options.
License:
The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can leverage this fully labeled and ready-to-deploy French Extraction Prompt-Completion Dataset to enhance the data extraction abilities and accurate response generation capabilities of their generative AI models and explore new approaches to NLP tasks.
h
entity-is-adjective-toxicity-prompts-30000
huggingface.co
Updated Jun 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nyal Patel (2025). entity-is-adjective-toxicity-prompts-30000 [Dataset]. https://huggingface.co/datasets/nyalpatel/entity-is-adjective-toxicity-prompts-30000
Explore at:
Dataset updated
Jun 2, 2025
Authors
Nyal Patel
Description
nyalpatel/entity-is-adjective-toxicity-prompts-30000 dataset hosted on Hugging Face and contributed by the HF Datasets community
h
entity-is-adjective-toxicity-prompts-1000
huggingface.co
Updated Aug 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nyal Patel (2025). entity-is-adjective-toxicity-prompts-1000 [Dataset]. https://huggingface.co/datasets/nyalpatel/entity-is-adjective-toxicity-prompts-1000
Explore at:
Dataset updated
Aug 2, 2025
Authors
Nyal Patel
Description
nyalpatel/entity-is-adjective-toxicity-prompts-1000 dataset hosted on Hugging Face and contributed by the HF Datasets community
h
adversarial-prompts
huggingface.co
Updated Dec 6, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Harpreet Sahota (2023). adversarial-prompts [Dataset]. https://huggingface.co/datasets/harpreetsahota/adversarial-prompts
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 6, 2023
Authors
Harpreet Sahota
Description
Language Model Testing Dataset 📊🤖

Introduction 🌐

This repository provides a dataset inspired by the paper "Explore, Establish, Exploit: Red Teaming Language Models from Scratch" It's designed for anyone interested in testing language models (LMs) for biases, toxicity, and misinformation.

Dataset Origin 📝

The dataset is based on examples from Tables 7 and 8 of the paper, which illustrate how prompts can elicit not just biased but also toxic or nonsensical… See the full description on the dataset page: https://huggingface.co/datasets/harpreetsahota/adversarial-prompts.
h
toxic-chat
huggingface.co
Updated Jan 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Large Model Systems Organization (2024). toxic-chat [Dataset]. https://huggingface.co/datasets/lmsys/toxic-chat
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jan 25, 2024
Dataset authored and provided by
Large Model Systems Organization
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Update

[01/31/2024] We update the OpenAI Moderation API results for ToxicChat (0124) based on their updated moderation model on on Jan 25, 2024.[01/28/2024] We release an official T5-Large model trained on ToxicChat (toxicchat0124). Go and check it for you baseline comparision![01/19/2024] We have a new version of ToxicChat (toxicchat0124)!

Content

This dataset contains toxicity annotations on 10K user prompts collected from the Vicuna online demo. We utilize a human-AI… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/toxic-chat.
h
juree_bad_combined
huggingface.co
Updated Aug 29, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dom Nasrabadi (2023). juree_bad_combined [Dataset]. https://huggingface.co/datasets/domnasrabadi/juree_bad_combined
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 29, 2023
Authors
Dom Nasrabadi
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Combines textual datasets from multiple sources including:

aegis safety dataset open ai moderation dataset ALERT + ALERT jailbreaking datasets Real toxicity prompts Toxic Chat Trawling for Trolling

Part 2 includes sources from (filtering for bad labels only):

toxic uncensored lgbtq conan salad data wikitoxic hatespeech curated

I clean and reformat all of these into a dataset with 4 main columns including:

text binary_label - if the prompt/text is unsafe (1) or safe (0) label_cat - the… See the full description on the dataset page: https://huggingface.co/datasets/domnasrabadi/juree_bad_combined.
h
harmful-text
huggingface.co
Updated Nov 5, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nicholas Kluge Corrêa (2023). harmful-text [Dataset]. https://huggingface.co/datasets/nicholasKluge/harmful-text
Explore at:
Dataset updated
Nov 5, 2023
Authors
Nicholas Kluge Corrêa
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Harmful-Text

Dataset Summary

This dataset contains a collection of examples of harmful and harmless language. The dataset is available in both Portuguese and English. Samples were collected from the following datasets:

Anthropic/hh-rlhf. allenai/prosocial-dialog. allenai/real-toxicity-prompts. dirtycomputer/Toxic_Comment_Classification_Challenge. Paul/hatecheck-portuguese. told-br. skg/toxigen-data.

Supported Tasks and Leaderboards

This dataset can be… See the full description on the dataset page: https://huggingface.co/datasets/nicholasKluge/harmful-text.
h
ToxicChatClassification
huggingface.co
Updated Feb 15, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Massive Text Embedding Benchmark (2022). ToxicChatClassification [Dataset]. https://huggingface.co/datasets/mteb/ToxicChatClassification
Explore at:
Dataset updated
Feb 15, 2022
Dataset authored and provided by
Massive Text Embedding Benchmark
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ToxicChatClassification An MTEB dataset Massive Text Embedding Benchmark

This dataset contains toxicity annotations on 10K user prompts collected from the Vicuna online demo. We utilize a human-AI collaborative annotation framework to guarantee the quality of annotation while maintaining a feasible annotation workload. The details of data collection, pre-processing, and annotation can be found in our paper. We believe that… See the full description on the dataset page: https://huggingface.co/datasets/mteb/ToxicChatClassification.
h
pythia-1b-epochs-0-39-p3-PO
huggingface.co
Updated Aug 3, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Arjun Jagota (2025). pythia-1b-epochs-0-39-p3-PO [Dataset]. https://huggingface.co/datasets/ajagota71/pythia-1b-epochs-0-39-p3-PO
Explore at:
Dataset updated
Aug 3, 2025
Authors
Arjun Jagota
Description
pythia-1b-epochs-0-39-p3-PO

This dataset contains reward model analysis results for IRL training.

Dataset Information

Base Model ID: ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b Full Model ID: ajagota71/toxicity-reward-model-v-head-prompt-output-max-margin-seed-42-pythia-1b Epoch: 0 Analysis Timestamp: 2025-08-03T16:12:02.710714 Number of Samples: 18000

Columns

sample_index: Index of the sample prompt: Input prompt (if… See the full description on the dataset page: https://huggingface.co/datasets/ajagota71/pythia-1b-epochs-0-39-p3-PO.
h
t2i_safety_dataset
huggingface.co
Updated Aug 5, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
OpenSafetyLab (2025). t2i_safety_dataset [Dataset]. https://huggingface.co/datasets/OpenSafetyLab/t2i_safety_dataset
Explore at:
Dataset updated
Aug 5, 2025
Dataset authored and provided by
OpenSafetyLab
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
T2ISafety: Benchmark for Assessing Fairness, Toxicity, and Privacy in Image Generation

This dataset, T2ISafety, is a comprehensive safety benchmark designed to evaluate Text-to-Image (T2I) models across three key domains: toxicity, fairness, and bias. It provides a detailed hierarchy of 12 tasks and 44 categories, built from meticulously collected 70K prompts. Based on this taxonomy and prompt set, T2ISafety includes 68K manually annotated images, serving as a robust resource for… See the full description on the dataset page: https://huggingface.co/datasets/OpenSafetyLab/t2i_safety_dataset.
h
llama-1b-epochs-0-39-p8-PO
huggingface.co
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Arjun Jagota, llama-1b-epochs-0-39-p8-PO [Dataset]. https://huggingface.co/datasets/ajagota71/llama-1b-epochs-0-39-p8-PO
Explore at:
Authors
Arjun Jagota
Description
llama-1b-epochs-0-39-p3-PO

This dataset contains reward model analysis results for IRL training.

Dataset Information

Base Model ID: ajagota71/toxicity-reward-model-p8-v-head-prompt-output-max-margin-seed-42-llama-3.2-1b Full Model ID: ajagota71/toxicity-reward-model-p8-v-head-prompt-output-max-margin-seed-42-llama-3.2-1b Epoch: 0 Analysis Timestamp: 2025-08-03T15:02:01.534995 Number of Samples: 18000

Columns

sample_index: Index of the sample prompt: Input… See the full description on the dataset page: https://huggingface.co/datasets/ajagota71/llama-1b-epochs-0-39-p8-PO.
h
GuardEval
huggingface.co
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Naseem Machlovi, GuardEval [Dataset]. https://huggingface.co/datasets/Machlovi/GuardEval
Explore at:
Authors
Naseem Machlovi
Description
This dataset integrates multiple corpora focused on AI safety, moderation, and ethical alignment. It is organized into four major subsets: Subset 1: General Safety & Toxicity Nemo-Safety, BeaverTails, ToxicChat, CoCoNot, WildGuard Covers hate speech, toxicity, harassment, identity-based attacks, racial abuse, benign prompts, and adversarial jailbreak attempts. Includes prompt–response interactions highlighting model vulnerabilities. Subset 2: Social Norms & Ethics Social Chemistry, UltraSafety… See the full description on the dataset page: https://huggingface.co/datasets/Machlovi/GuardEval.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Ai2, real-toxicity-prompts [Dataset]. http://doi.org/10.57967/hf/0002

real-toxicity-prompts

Real Toxicity Prompts

allenai/real-toxicity-prompts

Explore at:

94 scholarly articles cite this dataset (View in Google Scholar)

CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.

Unique identifier

https://doi.org/10.57967/hf/0002

Dataset provided by

Allen Institute for AIhttp://allenai.org/

Authors

Ai2

License

Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically

Description

Dataset Card for Real Toxicity Prompts

  Dataset Summary

RealToxicityPrompts is a dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models.

  Languages

English

  Dataset Structure





  Data Instances

Each instance represents a prompt and its metadata: { "filename":"0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt", "begin":340, "end":564, "challenging":false… See the full description on the dataset page: https://huggingface.co/datasets/allenai/real-toxicity-prompts.

Clear search

Close search

Google apps

Main menu

real-toxicity-prompts

real-toxicity-prompts-10k-sample

rtb-real-toxicity

☣️ Real Toxicity Prompts Dataset

updated-real-toxicity-prompts

rt-realtoxicity-paraphrasing

rt-realtoxicity-translation-multiturn

rt-realtoxicity-verlan

French Extraction Prompt & Response Dataset

What’s Included

entity-is-adjective-toxicity-prompts-30000

entity-is-adjective-toxicity-prompts-1000

adversarial-prompts

toxic-chat

juree_bad_combined

harmful-text

ToxicChatClassification

pythia-1b-epochs-0-39-p3-PO

t2i_safety_dataset

llama-1b-epochs-0-39-p8-PO

GuardEval

real-toxicity-promptsSee More Versions

Real Toxicity Prompts

allenai/real-toxicity-prompts

real-toxicity-prompts