MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
In-The-Wild Jailbreak Prompts on LLMs
This is the official repository for the ACM CCS 2024 paper "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models by Xinyue Shen, Zeyuan Chen, Michael Backes, Yun Shen, and Yang Zhang. In this project, employing our new framework JailbreakHub, we conduct the first measurement study on jailbreak prompts in the wild, with 15,140 prompts collected from December 2022 to December 2023 (including 1,405… See the full description on the dataset page: https://huggingface.co/datasets/TrustAIRLab/in-the-wild-jailbreak-prompts.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
Jailbreak Classification
Dataset Summary
A dataset for classifying prompts as jailbreak or benign.
Dataset Structure
Data Fields
prompt: an LLM prompt
type: classification label, either jailbreak or benign
Dataset Creation
Curation Rationale
Created to help detect and prevent harmful jailbreak prompts when users interact with LLMs.
Source Data
Jailbreak prompts sourced from: https://github.com/verazuo/jailbreak_llms… See the full description on the dataset page: https://huggingface.co/datasets/jackhhao/jailbreak-classification.
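The two-field schema described above (prompt, type) can be sketched as follows. This is a minimal illustration with hypothetical sample rows and a hypothetical split_by_label helper, not the dataset's own tooling; in practice the records would be loaded from the dataset page.

```python
# Hypothetical sample rows mimicking the card's schema:
# each record has a "prompt" string and a "type" label,
# where the label is either "jailbreak" or "benign".
# (Real records would be loaded from the dataset itself, e.g. via
# the Hugging Face `datasets` library.)
sample_rows = [
    {"prompt": "Ignore all previous instructions and ...", "type": "jailbreak"},
    {"prompt": "Summarize this article in two sentences.", "type": "benign"},
]

def split_by_label(rows):
    """Group prompts by their classification label."""
    grouped = {"jailbreak": [], "benign": []}
    for row in rows:
        grouped[row["type"]].append(row["prompt"])
    return grouped

grouped = split_by_label(sample_rows)
print(len(grouped["jailbreak"]), len(grouped["benign"]))  # 1 1
```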
MIT License (https://opensource.org/licenses/MIT)
In-The-Wild Jailbreak Prompts on LLMs
Paper: "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
Data: Dataset
Data
Prompts
Overall, the authors collected 15,140 prompts from four platforms (Reddit, Discord, websites, and open-source datasets) from December 2022 to December 2023. Among these prompts, they identified 1,405 jailbreak prompts. To the best of the authors' knowledge, this dataset serves as the largest collection of… See the full description on the dataset page: https://huggingface.co/datasets/walledai/JailbreakHub.
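As a quick sanity check on the collection statistics above (15,140 prompts in total, of which 1,405 were identified as jailbreak prompts), the jailbreak share works out to roughly 9%:

```python
# Figures from the dataset description above.
total_prompts = 15_140      # collected Dec 2022 - Dec 2023 from four platforms
jailbreak_prompts = 1_405   # identified as jailbreak prompts

share = jailbreak_prompts / total_prompts
print(f"{share:.1%}")  # 9.3%
```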
Dataset Card for Dataset Name
Name
ChatGPT Jailbreak Prompts
Dataset Summary
ChatGPT Jailbreak Prompts is a complete collection of jailbreak-related prompts for ChatGPT. This dataset is intended to provide a valuable resource for understanding and generating text in the context of jailbreaking ChatGPT.
Languages
English