2 datasets found
  1. UltraFeedback

    • huggingface.co
    Updated Nov 26, 2023
    Cite
    RobinZ (2023). UltraFeedback [Dataset]. https://huggingface.co/datasets/zhengr/UltraFeedback
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 26, 2023
    Authors
    RobinZ
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Introduction

    GitHub Repo · UltraRM-13b · UltraCM-13b

    UltraFeedback is a large-scale, fine-grained, diverse preference dataset, used for training powerful reward models and critic models. We collect about 64k prompts from diverse resources (including UltraChat, ShareGPT, Evol-Instruct, TruthfulQA, FalseQA, and FLAN). We then use these prompts to query multiple LLMs (see Table for model lists) and generate 4 different responses for each prompt, resulting in a total of 256k samples. To… See the full description on the dataset page: https://huggingface.co/datasets/zhengr/UltraFeedback.

  2. UltraFeedback

    • huggingface.co
    • opendatalab.com
    Updated Sep 26, 2023
    Cite
    OpenBMB (2023). UltraFeedback [Dataset]. https://huggingface.co/datasets/openbmb/UltraFeedback
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 26, 2023
    Dataset authored and provided by
    OpenBMB
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Introduction

    GitHub Repo · UltraRM-13b · UltraCM-13b

    UltraFeedback is a large-scale, fine-grained, diverse preference dataset, used for training powerful reward models and critic models. We collect about 64k prompts from diverse resources (including UltraChat, ShareGPT, Evol-Instruct, TruthfulQA, FalseQA, and FLAN). We then use these prompts to query multiple LLMs (see Table for model lists) and generate 4 different responses for each prompt, resulting in a total of 256k samples. To… See the full description on the dataset page: https://huggingface.co/datasets/openbmb/UltraFeedback. (A minimal loading sketch follows this listing.)

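Both results point to the same underlying dataset: the zhengr entry mirrors the OpenBMB original. As a minimal sketch, one might inspect the original with the Hugging Face `datasets` library; the use of the "train" split and the exact record fields are assumptions here and should be checked against the dataset card at https://huggingface.co/datasets/openbmb/UltraFeedback.

# Sketch: load openbmb/UltraFeedback and look at its structure.
# Assumes `pip install datasets` and that a "train" split exists (verify on the dataset card).
from datasets import load_dataset

ds = load_dataset("openbmb/UltraFeedback", split="train")

print(ds)            # number of rows and column names
print(ds[0].keys())  # fields of a single record (field names depend on the dataset card)

# Per the description above, each of the ~64k prompts was answered by 4 models,
# so the dataset totals roughly 64_000 * 4 = 256_000 prompt-response samples.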
