https://choosealicense.com/licenses/cc/
Chatbot Arena Conversations Dataset
This dataset contains 33K cleaned conversations with pairwise human preferences. It is collected from 13K unique IP addresses on the Chatbot Arena from April to June 2023. Each sample includes a question ID, two model names, their full conversation text in OpenAI API JSON format, the user vote, the anonymized user ID, the detected language tag, the OpenAI moderation API tag, the additional toxic tag, and the timestamp. To ensure the safe release… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/chatbot_arena_conversations.
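As a quick illustration of the record layout described above, the sketch below loads the dataset with the Hugging Face `datasets` library and prints a few fields of one sample. The exact column names (e.g., `question_id`, `model_a`, `winner`, `conversation_a`) are assumptions inferred from the description, and the dataset is gated, so access must be granted on the dataset page first.

```python
# Sketch: inspect one record of lmsys/chatbot_arena_conversations.
# Assumes the `datasets` library is installed and gated access has been granted;
# column names below are assumptions based on the field list in the description.
from datasets import load_dataset

ds = load_dataset("lmsys/chatbot_arena_conversations", split="train")
sample = ds[0]

print(sample["question_id"])                  # unique question ID
print(sample["model_a"], sample["model_b"])   # the two model names
print(sample["winner"])                       # user vote (model_a / model_b / tie / ...)
print(sample["language"])                     # detected language tag
print(sample["conversation_a"][:1])           # OpenAI-API-style list of {"role", "content"} turns
```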
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of use"… See the full description on the dataset page: https://huggingface.co/datasets/jc-detoxio/lmsys-chat-1m.
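Since the conversations are stored in the OpenAI API JSON format (a list of `{"role", "content"}` messages), a common first step is to stream the dataset and filter on the language and moderation tags. The sketch below assumes the canonical `lmsys/lmsys-chat-1m` repository (the page above links to a re-hosted copy) and field names such as `conversation`, `language`, and `openai_moderation`, which are inferred from the description rather than confirmed here.

```python
# Sketch: stream LMSYS-Chat-1M and keep English conversations that the
# OpenAI moderation tag did not flag. Field names are assumptions based
# on the description above; the dataset is gated, so request access first.
from datasets import load_dataset

ds = load_dataset("lmsys/lmsys-chat-1m", split="train", streaming=True)

def is_clean_english(row):
    # openai_moderation is assumed to be a per-turn list of moderation results
    flagged = any(turn.get("flagged", False) for turn in row["openai_moderation"])
    return row["language"] == "English" and not flagged

for row in filter(is_clean_english, ds):
    # Each conversation is a list of OpenAI-API-style messages:
    # [{"role": "user", "content": ...}, {"role": "assistant", "content": ...}, ...]
    first_user_turn = row["conversation"][0]["content"]
    print(row["model"], "|", first_user_turn[:80])
    break
```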
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Check out our blog post.
Building an affordable and reliable benchmark for LLM chatbots has become a critical challenge. A high-quality benchmark should (1) robustly separate model capabilities, (2) reflect human preferences in real-world use cases, and (3) be updated frequently to avoid overfitting or test-set leakage.
Traditional benchmarks are often static or closed-ended (e.g., MMLU multiple-choice QA), which does not satisfy the above requirements. At the same time, models are evolving faster than ever, underscoring the need for benchmarks with high separability.
We introduce Arena-Hard – a data pipeline to build high-quality benchmarks from live data in Chatbot Arena, which is a crowd-sourced platform for LLM evals.
We compare our new benchmark, Arena-Hard v0.1, to a current leading chat LLM benchmark, MT-Bench. We show that Arena-Hard v0.1 offers significantly stronger separability than MT-Bench, with tighter confidence intervals. It also has higher agreement (89.1%, see the blog post) with the human preference ranking from Chatbot Arena (English-only). We expect this benchmark to be useful for model developers in differentiating their model checkpoints.
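To make the separability and agreement claims concrete, the sketch below shows one way such quantities can be computed: a bootstrap confidence interval over per-prompt verdicts (tighter intervals indicate better separability) and the fraction of model pairs ordered the same way by two rankings. The scoring function, inputs, and model names are illustrative assumptions, not the actual Arena-Hard pipeline.

```python
# Sketch: illustrative separability and agreement metrics.
# This is a stand-in for intuition, not the Arena-Hard code; `verdicts`
# and the two rankings below are hypothetical inputs.
import random
from itertools import combinations

def bootstrap_ci(verdicts, n_boot=1000, alpha=0.05):
    """Bootstrap confidence interval for a model's win rate against a baseline."""
    means = []
    for _ in range(n_boot):
        resample = [random.choice(verdicts) for _ in verdicts]
        means.append(sum(resample) / len(resample))
    means.sort()
    lower = means[int(alpha / 2 * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

def pairwise_agreement(rank_a, rank_b):
    """Fraction of model pairs ordered the same way by two rankings."""
    agree, total = 0, 0
    for m1, m2 in combinations(rank_a, 2):
        total += 1
        if (rank_a[m1] < rank_a[m2]) == (rank_b[m1] < rank_b[m2]):
            agree += 1
    return agree / total

# Hypothetical usage:
verdicts = [1, 0, 1, 1, 0, 1, 1, 1]              # per-prompt wins vs. a baseline
print(bootstrap_ci(verdicts))                     # tighter interval => better separability
benchmark_rank = {"model_x": 1, "model_y": 2, "model_z": 3}
arena_rank     = {"model_x": 1, "model_y": 3, "model_z": 2}
print(pairwise_agreement(benchmark_rank, arena_rank))  # 2 of 3 pairs agree => 0.667
```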