Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Update
[01/31/2024] We updated the OpenAI Moderation API results for ToxicChat (0124) based on their moderation model updated on Jan 25, 2024.
[01/28/2024] We released an official T5-Large model trained on ToxicChat (toxicchat0124). Check it out for your baseline comparison!
[01/19/2024] We have a new version of ToxicChat (toxicchat0124)!
Content
This dataset contains toxicity annotations on 10K user prompts collected from the Vicuna online demo. We utilize a human-AI… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/toxic-chat.
ToxicChat is a novel benchmark dataset constructed from real user queries to an open-source chatbot. Unlike previous toxicity detection benchmarks that primarily rely on social media content, ToxicChat captures the rich and nuanced phenomena inherent in real-world user-AI interactions. This unique dataset reveals significant domain differences compared to social media content, making it a valuable resource for exploring the challenges of toxicity detection in user-AI conversations¹.
Here are some key details about the ToxicChat dataset:
Construction: ToxicChat was created using real user queries collected from an open-source chatbot.
Challenges: It contains phenomena that can be tricky for current toxicity detection models to identify.
Domain Difference: ToxicChat exhibits a significant domain difference when compared to social media content.
Purpose: ToxicChat serves as a benchmark to drive advancements in building a safe and healthy environment for user-AI interactions.
Source: Conversation with Bing, 3/17/2024.
(1) ToxicChat: Unveiling Hidden Challenges of Toxicity Detection in Real .... https://aclanthology.org/2023.findings-emnlp.311/
(2) arXiv:2310.17389v1 [cs.CL], 26 Oct 2023. https://arxiv.org/pdf/2310.17389
(3) README.md · lmsys/toxic-chat at main - Hugging Face. https://huggingface.co/datasets/lmsys/toxic-chat/blob/main/README.md
(4) The Toxicity Dataset - GitHub. https://github.com/surge-ai/toxicity
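As a minimal sketch of working with ToxicChat-style annotations, the function below partitions annotated prompt records by their binary toxicity label. The field names `user_input` and `toxicity` are assumptions about the schema, not confirmed by this page:

```python
def split_by_toxicity(records):
    """Partition annotated prompt records into (toxic, non_toxic) lists.

    Assumes each record carries a binary `toxicity` label, as in the
    ToxicChat annotations described above.
    """
    toxic = [r for r in records if r["toxicity"] == 1]
    non_toxic = [r for r in records if r["toxicity"] == 0]
    return toxic, non_toxic

# Hypothetical records following the assumed schema:
rows = [
    {"user_input": "how do I bake bread?", "toxicity": 0},
    {"user_input": "<a prompt flagged by annotators>", "toxicity": 1},
]
toxic, non_toxic = split_by_toxicity(rows)
```

The same split applies unchanged whether the records come from the 10K Vicuna-demo prompts or any other source with a binary toxicity annotation.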
dffesalbon/dota-2-toxic-chat-data dataset hosted on Hugging Face and contributed by the HF Datasets community
IanLi233/Toxic-Chat-V2 dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by ali waleed
akcit-ijf/toxic-chat dataset hosted on Hugging Face and contributed by the HF Datasets community
❤️🩹 Sensai: Toxic Chat Dataset
Sensai is a toxic chat dataset consisting of live chats from Virtual YouTubers' live streams. Download the dataset from Kaggle Datasets and join the #livechat-dataset channel on the holodata Discord for discussions.
Provenance
Source: YouTube Live Chat events (all streams covered by Holodex, including Hololive, Nijisanji, 774inc, etc.)
Temporal Coverage: From 2021-01-15T05:15:33Z
Update Frequency: At least once per month
Research Ideas… See the full description on the dataset page: https://huggingface.co/datasets/holodata/sensai.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Gaming Communication Management: "EE_Chat" can be used by game developers and gaming companies to understand the communication patterns among players. It can help to analyze in-game messaging, detect toxic behaviour or keywords, manage group interactions, and gather insights on player behaviour.
Online Gaming Experience Improvement: This model can be used to analyze the chatting patterns, popular topics, frequent message times, and the interactions between players. These insights can then be used to improve the chatting function, enhance user experience, and boost overall game engagement.
Chat Support Systems: EE_Chat can be adapted for use in understanding customer service interactions on various platforms. It can be used to categorize messages, analyze response times, and understand the effectiveness of the customer support team.
Interactive Game Streaming: Streamers and content creators can integrate "EE_Chat" into their streaming platforms for live interaction with their audience. The model can help them keep track of messages, prioritize certain types of messages, or filter out unwanted content.
Marketing & Advertising: This model can be used by businesses to analyze discussions in gaming communities. They can understand popular chat channels, the most active times, and trending topics. It can help in identifying high-potential markets and effective advertising placements, and in creating targeted marketing campaigns.
BRlkl/toxic-chat-pt dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Real Toxicity Prompts
Dataset Summary
RealToxicityPrompts is a dataset of 100k sentence snippets from the web for researchers to further address the risk of neural toxic degeneration in models.
Languages
English
Dataset Structure
Data Instances
Each instance represents a prompt and its metadata: { "filename":"0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt", "begin":340, "end":564, "challenging":false… See the full description on the dataset page: https://huggingface.co/datasets/allenai/real-toxicity-prompts.
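The `begin` and `end` fields locate the prompt span as character offsets within the source file named by `filename`. A minimal sketch of working with one instance, using the values from the example above:

```python
instance = {
    "filename": "0766186-bc7f2a64cb271f5f56cf6f25570cd9ed.txt",
    "begin": 340,
    "end": 564,
    "challenging": False,
}

# The offsets give the snippet's location in the source document.
span_length = instance["end"] - instance["begin"]

def slice_snippet(text: str, inst: dict) -> str:
    """Recover the snippet from the full document text using its offsets."""
    return text[inst["begin"]:inst["end"]]
```

Here `slice_snippet` is a hypothetical helper for illustration; the dataset itself ships the prompt text alongside this metadata.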
Dataset Card for WildChat-nontoxic
Note: a newer version with 1 million conversations and demographic information can be found here.
Dataset Description
Paper: https://wenting-zhao.github.io/papers/wildchat.pdf
License: https://allenai.org/licenses/impact-lr
Language(s) (NLP): multi-lingual
Point of Contact: Yuntian Deng
Dataset Summary
WildChat-nontoxic is the nontoxic subset of the WildChat dataset, a collection of 530K conversations… See the full description on the dataset page: https://huggingface.co/datasets/allenai/WildChat-nontoxic.
Toxic Conversation
This is a version of the Jigsaw Unintended Bias in Toxicity Classification dataset. It contains comments from the Civil Comments platform together with annotations of whether each comment is toxic. Ten annotators annotated each example and, as recommended on the task page, a comment is marked as toxic when target >= 0.5. The dataset is imbalanced, with only about 8% of the comments marked as toxic.
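The thresholding rule above can be sketched as a small helper that maps the fractional annotator score to a binary label (the function name is illustrative, not from the dataset's tooling):

```python
def binarize_target(target: float, threshold: float = 0.5) -> int:
    """Map the fractional annotator agreement score to a binary toxic label,
    following the task-page recommendation (toxic when target >= 0.5)."""
    return int(target >= threshold)

# Example scores from ten-annotator agreement, binarized:
labels = [binarize_target(t) for t in (0.0, 0.3, 0.5, 0.9)]
toxic_rate = sum(labels) / len(labels)
```

On the real dataset, `toxic_rate` would come out near 0.08, reflecting the class imbalance noted above.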
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
⚠️ Warning: this dataset contains examples of toxic, offensive, and inappropriate language. The current LLM safety landscape struggles with accurate real-world benchmarking of content moderation systems. This dataset is a subset of a larger benchmark dataset constructed by the Dynamo AI research team. It consists of real-world chats written by humans with toxic intent. In order to ensure impartiality and standardization, we drew upon open-source and human-annotated examples from Allen… See the full description on the dataset page: https://huggingface.co/datasets/dynamoai/dynamoai-benchmark-safety.
https://choosealicense.com/licenses/odc-by/
Dataset Card for WildChat
Dataset Description
Paper: https://arxiv.org/abs/2405.01470
Interactive Search Tool: https://wildvisualizer.com (paper)
License: ODC-BY
Language(s) (NLP): multi-lingual
Point of Contact: Yuntian Deng
Dataset Summary
WildChat is a collection of 1 million conversations between human users and ChatGPT, alongside demographic data, including state, country, hashed IP addresses, and request headers. We collected WildChat by… See the full description on the dataset page: https://huggingface.co/datasets/allenai/WildChat-1M.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
SEA Toxicity Detection
SEA Toxicity Detection evaluates a model's ability to identify toxic content such as hate speech and abusive language in text. It is sampled from MLHSD for Indonesian, TTD for Thai, and ViHSD for Vietnamese.
Supported Tasks and Leaderboards
SEA Toxicity Detection is designed for evaluating chat or instruction-tuned large language models (LLMs). It is part of the SEA-HELM leaderboard from AI Singapore.
Languages
Indonesian (id) Thai… See the full description on the dataset page: https://huggingface.co/datasets/aisingapore/NLU-Toxicity-Detection.
https://choosealicense.com/licenses/odc-by/
Dataset Card for WildChat
Note: a newer version with 1 million conversations and demographic information can be found here.
Dataset Description
Paper: https://arxiv.org/abs/2405.01470
Interactive Search Tool: https://wildvisualizer.com (paper)
License: ODC-BY
Language(s) (NLP): multi-lingual
Point of Contact: Yuntian Deng
Dataset Summary
WildChat is a collection of 650K conversations between human users and ChatGPT. We collected WildChat… See the full description on the dataset page: https://huggingface.co/datasets/allenai/WildChat.
https://choosealicense.com/licenses/cc/
Chatbot Arena Conversations Dataset
This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp. To ensure the safe release… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/chatbot_arena_conversations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IndicAlign
A diverse collection of instruction and toxic alignment datasets for 14 Indic languages. The collection comprises:
IndicAlign - Instruct: Indic-ShareLlama, Dolly-T, OpenAssistant-T, WikiHow, IndoWordNet, Anudesh, Wiki-Conv, Wiki-Chat
IndicAlign - Toxic: HHRLHF-T, Toxic-Matrix
We use IndicTrans2 (Gala et al., 2023) to translate the datasets. We recommend that readers check out our paper on arXiv for detailed information on the curation process of these… See the full description on the dataset page: https://huggingface.co/datasets/ai4bharat/indic-align.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for ProsocialDialog Dataset
Dataset Summary
ProsocialDialog is the first large-scale multi-turn English dialogue dataset to teach conversational agents to respond to problematic content following social norms. Covering diverse unethical, problematic, biased, and toxic situations, ProsocialDialog contains responses that encourage prosocial behavior, grounded in commonsense social rules (i.e., rules-of-thumb, RoTs). Created via a human-AI collaborative… See the full description on the dataset page: https://huggingface.co/datasets/allenai/prosocial-dialog.