11 datasets found

h
lmsys-chat-1m
huggingface.co
opendatalab.com
Updated Sep 17, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Large Model Systems Organization (2023). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/lmsys/lmsys-chat-1m
Explore at:
Dataset updated
Sep 17, 2023
Dataset authored and provided by
Large Model Systems Organization
Description
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/lmsys-chat-1m.
h
lmsys-chat-1m-qwen2-instruct
huggingface.co
Updated Nov 22, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Brian Williams (2024). lmsys-chat-1m-qwen2-instruct [Dataset]. https://huggingface.co/datasets/bew/lmsys-chat-1m-qwen2-instruct
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 22, 2024
Authors
Brian Williams
Description
bew/lmsys-chat-1m-qwen2-instruct dataset hosted on Hugging Face and contributed by the HF Datasets community
h
lmsys-chat-1m-jsonify-v2
huggingface.co
Updated May 27, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
lmsys-chat-1m-jsonify-v2 [Dataset]. https://huggingface.co/datasets/jsonifize/lmsys-chat-1m-jsonify-v2
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 27, 2024
Dataset authored and provided by
jsonifize
Description
jsonifize/lmsys-chat-1m-jsonify-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
h
lmsys-chat-Qwen2.5-1.5B-Instruct-1epoch-100k
huggingface.co
Updated Nov 26, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Stanley Tang (2024). lmsys-chat-Qwen2.5-1.5B-Instruct-1epoch-100k [Dataset]. https://huggingface.co/datasets/Stanleytowne/lmsys-chat-Qwen2.5-1.5B-Instruct-1epoch-100k
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 26, 2024
Authors
Stanley Tang
Description
Stanleytowne/lmsys-chat-Qwen2.5-1.5B-Instruct-1epoch-100k dataset hosted on Hugging Face and contributed by the HF Datasets community
mt_bench_prompts
huggingface.co
hf-proxy-cf.effarig.site
Updated Jul 3, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
mt_bench_prompts [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/mt_bench_prompts
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 3, 2023
Dataset provided by
Hugging Facehttps://huggingface.co/
Authors
Hugging Face H4
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
MT Bench by LMSYS

This set of evaluation prompts is created by the LMSYS org for better evaluation of chat models. For more information, see the paper.

Dataset loading

To load this dataset, use 🤗 datasets: from datasets import load_dataset data = load_dataset(HuggingFaceH4/mt_bench_prompts, split="train")

Dataset creation

To create the dataset, we do the following for our internal tooling.

rename turns to prompts, add empty reference to… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/mt_bench_prompts.
h
ScaleBiO-Train-lmsys-chat-1m
huggingface.co
Updated Nov 24, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ScaleBiO (2024). ScaleBiO-Train-lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/ScaleBiO/ScaleBiO-Train-lmsys-chat-1m
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 24, 2024
Dataset authored and provided by
ScaleBiO
Description
Dataset Card for "ScaleBiO-Train-lmsys-chat-1m"

More Information needed
h
lmsys-finance
huggingface.co
Updated Apr 3, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
GUIJIN SON (2024). lmsys-finance [Dataset]. https://huggingface.co/datasets/amphora/lmsys-finance
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 3, 2024
Authors
GUIJIN SON
Description
Dataset Card for "lmsys-finance"

This dataset is a curated version of the lmsys-chat-1m dataset, focusing solely on finance-related conversations. The refinement process encompassed:

Removing non-English conversations. Selecting conversations from models: "vicuna-33b", "wizardlm-13b", "gpt-4", "gpt-3.5-turbo", "claude-2", "palm-2", and "claude-instant-1". Excluding conversations with responses under 30 characters. Using 100 financial keywords, choosing conversations with at… See the full description on the dataset page: https://huggingface.co/datasets/amphora/lmsys-finance.
h
Barcenas-lmsys-Dataset
huggingface.co
Updated Oct 20, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniel (2023). Barcenas-lmsys-Dataset [Dataset]. https://huggingface.co/datasets/Danielbrdz/Barcenas-lmsys-Dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 20, 2023
Authors
Daniel
Description
Dataset made on the basis of lmsys/lmsys-chat-1m With data only for the Spanish language.
h
lmsys-chat-tiny-20k
huggingface.co
Updated Oct 10, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
🎀超絶最かわ🎀てんしちゃん (2024). lmsys-chat-tiny-20k [Dataset]. https://huggingface.co/datasets/x-angelkawaii-x/lmsys-chat-tiny-20k
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 10, 2024
Dataset authored and provided by
🎀超絶最かわ🎀てんしちゃん
Description
x-angelkawaii-x/lmsys-chat-tiny-20k dataset hosted on Hugging Face and contributed by the HF Datasets community
h
wild-if-eval
huggingface.co
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gili Lior, wild-if-eval [Dataset]. https://huggingface.co/datasets/gililior/wild-if-eval
Explore at:
Authors
Gili Lior
License
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
WildIFEval Dataset

This dataset was originally introduced in the paper WildIFEval: Instruction Following in the Wild, available on arXiv. Code: https://github.com/gililior/wild-if-eval

Dataset Overview

The WildIFEval dataset is designed for evaluating instruction-following capabilities in language models. It provides decompositions of conversations extracted from the LMSYS-Chat-1M dataset. Each example includes:

conversation_id: A unique identifier for each conversation.… See the full description on the dataset page: https://huggingface.co/datasets/gililior/wild-if-eval.
h
diffing-stats-gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04
huggingface.co
Updated Nov 25, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Science of Finetuning (Neel Nanda's MATS 7.0) (2024). diffing-stats-gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04 [Dataset]. https://huggingface.co/datasets/science-of-finetuning/diffing-stats-gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 25, 2024
Dataset authored and provided by
Science of Finetuning (Neel Nanda's MATS 7.0)
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Contains maximum activating examples for all the features of our crosscoder trained on gemma 2 2B layer 13 available here: https://huggingface.co/Butanium/gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04/blob/main/README.md

base_examples.pt contains all the maximum examples of the feature on a subset of validation test of fineweb chat_examples.pt is the same but for lmsys chat data chat_base_examples.pt is a merge of the two above files. All files are of the type dict[int, list[tuple[float… See the full description on the dataset page: https://huggingface.co/datasets/science-of-finetuning/diffing-stats-gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Large Model Systems Organization (2023). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/lmsys/lmsys-chat-1m

lmsys-chat-1m

lmsys/lmsys-chat-1m

Explore at:

190 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Sep 17, 2023

Dataset authored and provided by

Large Model Systems Organization

Description

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/lmsys-chat-1m.

Clear search

Close search

Google apps

Main menu

lmsys-chat-1m

lmsys-chat-1m-qwen2-instruct

lmsys-chat-1m-jsonify-v2

lmsys-chat-Qwen2.5-1.5B-Instruct-1epoch-100k

mt_bench_prompts

ScaleBiO-Train-lmsys-chat-1m

lmsys-finance

Barcenas-lmsys-Dataset

lmsys-chat-tiny-20k

wild-if-eval

diffing-stats-gemma-2-2b-crosscoder-l13-mu4.1e-02-lr1e-04

lmsys-chat-1mSee More Versions

lmsys/lmsys-chat-1m

lmsys-chat-1m