4 datasets found
  1. h

    Penguin

    • huggingface.co
    Updated Feb 20, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    crz (2025). Penguin [Dataset]. https://huggingface.co/datasets/RuizheChen/Penguin
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 20, 2025
    Authors
    crz
    Description

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of use"… See the full description on the dataset page: https://huggingface.co/datasets/RuizheChen/Penguin.

  2. h

    lmsys-chat-1m

    • huggingface.co
    Updated Sep 26, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/1-800-SHARED-TASKS/lmsys-chat-1m
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 26, 2024
    Dataset authored and provided by
    1-800-SHARED-TASKS
    Description

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of… See the full description on the dataset page: https://huggingface.co/datasets/1-800-SHARED-TASKS/lmsys-chat-1m.

  3. h

    lmsys-chat-1m

    • huggingface.co
    Updated May 8, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aarush Sah (2024). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/AarushSah/lmsys-chat-1m
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 8, 2024
    Authors
    Aarush Sah
    Description

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of… See the full description on the dataset page: https://huggingface.co/datasets/AarushSah/lmsys-chat-1m.

  4. h

    lmsys-chat-1m

    • huggingface.co
    • opendatalab.com
    Updated Sep 17, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Large Model Systems Organization (2023). lmsys-chat-1m [Dataset]. https://huggingface.co/datasets/lmsys/lmsys-chat-1m
    Explore at:
    Dataset updated
    Sep 17, 2023
    Dataset authored and provided by
    Large Model Systems Organization
    Description

    LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

    This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/lmsys-chat-1m.

  5. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
crz (2025). Penguin [Dataset]. https://huggingface.co/datasets/RuizheChen/Penguin

Penguin

RuizheChen/Penguin

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 20, 2025
Authors
crz
Description

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

This dataset contains one million real-world conversations with 25 state-of-the-art LLMs. It is collected from 210K unique IP addresses in the wild on the Vicuna demo and Chatbot Arena website from April to August 2023. Each sample includes a conversation ID, model name, conversation text in OpenAI API JSON format, detected language tag, and OpenAI moderation API tag. User consent is obtained through the "Terms of use"… See the full description on the dataset page: https://huggingface.co/datasets/RuizheChen/Penguin.

Search
Clear search
Close search
Google apps
Main menu