Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
HelpSteer2: Open-source dataset for training top-performing reward models
HelpSteer2 is an open-source Helpfulness Dataset (CC-BY-4.0) that supports aligning models to become more helpful, factually correct and coherent, while being adjustable in terms of the complexity and verbosity of its responses. This dataset has been created in partnership with Scale AI. When used to tune a Llama 3.1 70B Instruct Model, we achieve 94.1% on RewardBench, which makes it the best Reward Model asโฆ See the full description on the dataset page: https://huggingface.co/datasets/nvidia/HelpSteer2.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a conversion of nvidia/HelpSteer2 into preference pairs based on the helpfulness score for training DPO. HelpSteer2-DPO is also licensed under CC-BY-4.0.
Dataset Description
In accordance with the following paper, HelpSteer2: Open-source dataset for training top-performing reward models we converted nvidia/HelpSteer2 dataset into a preference dataset by taking the response with the higher helpfulness score as the chosen response, with the remaining response being theโฆ See the full description on the dataset page: https://huggingface.co/datasets/Atsunori/HelpSteer2-DPO.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
juyoungml/HelpSteer2 dataset hosted on Hugging Face and contributed by the HF Datasets community
A reformatted version of nvidia/HelpSteer2 into both a multiturn config conversation and completion config config. A v4 UUID doc_id is shared across the same document in each config, source, conversation, and completion.
Nitral-AI/nvidia-HelpSteer2-ShareGPT dataset hosted on Hugging Face and contributed by the HF Datasets community
gagan3012/helpsteer2-preference-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community
This is the nvidia/Helpsteer2 training split binarized and sorted by length using the Llama3 tokenizer and categorized into multi- vs. single-turn subparts. The 500 splits contain chosen responses between 500-1000 tokens, the 1000 split 1000+ tokens. A multi-turn example requires at least one pair of User and Assistant besides the main resposne to be categorized as such. If you don't care, there is a combined split, which includes everything just binarized, but note that ids are not the sameโฆ See the full description on the dataset page: https://huggingface.co/datasets/root-signals/helpsteer2-binarized-granular-tiny.
Citation
@misc{wang2024helpsteer2preferencecomplementingratingspreferences, title={HelpSteer2-Preference: Complementing Ratings with Preferences}, author={Zhilin Wang and Alexander Bukharin and Olivier Delalleau and Daniel Egert and Gerald Shen and Jiaqi Zeng and Oleksii Kuchaiev and Yi Dong}, year={2024}, eprint={2410.01257}, archivePrefix={arXiv}, primaryClass={cs.LG}, url={https://arxiv.org/abs/2410.01257}, }
@misc{wang2024helpsteer2โฆ See the full description on the dataset page: https://huggingface.co/datasets/Jennny/helpsteer2-helpfulness-preference.
withpi/nvidia-HelpSteer2-group-label_normalized dataset hosted on Hugging Face and contributed by the HF Datasets community
withpi/nvidia-HelpSteer2-group-label-v2_tokenized_16k_euro dataset hosted on Hugging Face and contributed by the HF Datasets community
withpi/nvidia-HelpSteer2-group-label-v2_euro_st_tokenized_32k_1 dataset hosted on Hugging Face and contributed by the HF Datasets community
Delta-Vector/Hydrus-HelpSteer2 dataset hosted on Hugging Face and contributed by the HF Datasets community
mimasss/llama3.2-3b-instruct-helpsteer2 dataset hosted on Hugging Face and contributed by the HF Datasets community
chrisliu298/helpsteer2-standard dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MasterGodzilla/HelpSteer2-Preference-WarmStart dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "HelpSteer2-incoherent"
More Information needed
saepark/preprocessed-helpsteer2-train-10k dataset hosted on Hugging Face and contributed by the HF Datasets community
saumyamalik/helpsteer2-rewardbench-contamination dataset hosted on Hugging Face and contributed by the HF Datasets community
saepark/preprocessed-helpsteer2-test-500 dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This is a binarized preference datasets from nvidia/HelpSteer2. HelpSteer2 is an open-source Helpfulness Dataset (CC-BY-4.0) that supports aligning models to become more helpful, factually correct and coherent, while being adjustable in terms of the complexity and verbosity of its responses. This dataset has been created in partnership with Scale AI. I processed the raw data by prioritizing helpfulness, correctness, and coherence to determine which responses were chosenโฆ See the full description on the dataset page: https://huggingface.co/datasets/AIR-hl/helpsteer2_preference.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
HelpSteer2: Open-source dataset for training top-performing reward models
HelpSteer2 is an open-source Helpfulness Dataset (CC-BY-4.0) that supports aligning models to become more helpful, factually correct and coherent, while being adjustable in terms of the complexity and verbosity of its responses. This dataset has been created in partnership with Scale AI. When used to tune a Llama 3.1 70B Instruct Model, we achieve 94.1% on RewardBench, which makes it the best Reward Model asโฆ See the full description on the dataset page: https://huggingface.co/datasets/nvidia/HelpSteer2.