MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is part of Anthropic's HH data used to train their RLHF assistant (https://github.com/anthropics/hh-rlhf). The data contains the first utterance from the human to the dialog agent and the number of words in that utterance. The sampled version is a random sample of size 200.
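The word-count feature described above can be reproduced with a simple whitespace split. A minimal sketch follows; the field names `utterance` and `num_words` are illustrative, not the dataset's actual column names:

```python
# Minimal sketch: derive the word count of a first utterance, as
# described above. Field names here are illustrative only.

def count_words(utterance: str) -> int:
    """Count whitespace-separated words in an utterance."""
    return len(utterance.split())

examples = [
    {"utterance": "How do I bake sourdough bread at home?"},
    {"utterance": "Tell me a joke."},
]

for ex in examples:
    ex["num_words"] = count_words(ex["utterance"])

print([ex["num_words"] for ex in examples])  # [8, 4]
```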
Dahoas/rm-hh-rlhf dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "hh-rlhf-h4"
More Information needed
HFXM/hh-rlhf-Rule17 dataset hosted on Hugging Face and contributed by the HF Datasets community
zekeZZ/hh-rlhf-dpo dataset hosted on Hugging Face and contributed by the HF Datasets community
Llama 2 Community License: https://choosealicense.com/licenses/llama2/
Chinese translation of hh-rlhf
The helpful and harmless data open-sourced alongside Anthropic's paper Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback, translated with machine-translation tools. File breakdown:
hh_rlhf_train.jsonl: merged Chinese and English training sets, about 170,000 examples after cleaning
hh_rlhf_test.jsonl: merged Chinese and English test sets, about 9,000 examples after cleaning
harmless_base_cn_train.jsonl: 42,394 examples
harmless_base_cn_test.jsonl: 2,304 examples
helpful_base_cn_train.jsonl: 43,722 examples
helpful_base_cn_test.jsonl: 2,346 examples
Experiment report
Related RLHF experiment report: https://zhuanlan.zhihu.com/p/652044120
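The merged .jsonl files listed above store one JSON object per line. A minimal, self-contained sketch of loading such a file with the standard library; the `chosen`/`rejected` keys are an assumption about the record schema, not confirmed by the card:

```python
import json
import tempfile
from pathlib import Path

# Minimal sketch: read a JSON-lines preference file such as
# hh_rlhf_train.jsonl. The record keys below are assumed, not
# taken from the actual files.
def load_jsonl(path):
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Create a tiny stand-in file so the sketch is self-contained.
tmp = Path(tempfile.mkdtemp()) / "sample.jsonl"
records = [
    {"chosen": "Human: Hi\nAssistant: Hello!",
     "rejected": "Human: Hi\nAssistant: Go away."},
]
tmp.write_text("\n".join(json.dumps(r, ensure_ascii=False) for r in records),
               encoding="utf-8")

data = load_jsonl(tmp)
print(len(data))  # 1
```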
HuggingFaceH4/h4-anthropic-hh-rlhf-helpful-base-gen dataset hosted on Hugging Face and contributed by the HF Datasets community
shbyun080/hh-rlhf-en dataset hosted on Hugging Face and contributed by the HF Datasets community
HH-RLHF-Harmless-Base Dataset
Summary
The HH-RLHF-Harmless-Base dataset is a processed version of Anthropic's HH-RLHF dataset, specifically curated to train models using the TRL library for preference learning and alignment tasks. It contains pairs of text samples, each labeled as either "chosen" or "rejected," based on human preferences regarding the harmlessness of the responses. This dataset enables models to learn human preferences in generating harmless responses… See the full description on the dataset page: https://huggingface.co/datasets/xinpeng/hh-rlhf-base.
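The chosen/rejected pair structure described above can be illustrated with a toy example. In HH-style pairs the two transcripts share the same human prompt and diverge at the assistant's reply; the sketch below recovers that shared prompt as a longest common prefix. The transcripts and schema are invented for illustration; the hosted dataset's exact column layout may differ:

```python
# Toy illustration of a preference pair: each example holds a
# "chosen" and a "rejected" dialog transcript. Contents are
# invented for illustration, not drawn from the dataset.
pair = {
    "chosen": "Human: How do I dispose of old paint?\n\n"
              "Assistant: Take it to a hazardous-waste collection site.",
    "rejected": "Human: How do I dispose of old paint?\n\n"
                "Assistant: Just pour it down the drain.",
}

def shared_prompt(chosen: str, rejected: str) -> str:
    """Return the longest common prefix of the two transcripts,
    i.e. the shared human prompt in HH-style pairs."""
    i = 0
    while i < min(len(chosen), len(rejected)) and chosen[i] == rejected[i]:
        i += 1
    return chosen[:i]

prompt = shared_prompt(pair["chosen"], pair["rejected"])
print(prompt.strip())
```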
Jise/hh-rlhf-helpful-base dataset hosted on Hugging Face and contributed by the HF Datasets community
rshwndsz/processed-hh-rlhf dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
Lumen1123/hh-rlhf dataset hosted on Hugging Face and contributed by the HF Datasets community
Baidicoot/trojan-hh-rlhf-golden dataset hosted on Hugging Face and contributed by the HF Datasets community
HH-RLHF-Helpful-Base Dataset
Summary
The HH-RLHF-Helpful-Base dataset is a processed version of Anthropic's HH-RLHF dataset, specifically curated to train models using the TRL library for preference learning and alignment tasks. It contains pairs of text samples, each labeled as either "chosen" or "rejected," based on human preferences regarding the helpfulness of the responses. This dataset enables models to learn human preferences in generating helpful responses… See the full description on the dataset page: https://huggingface.co/datasets/trl-lib/hh-rlhf-helpful-base.
ondevicellm/hh-rlhf-h4 dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "hh-rlhf_with_features_flan_t5_large_lll_relabeled"
More Information needed
HFXM/hh-rlhf-Rule2 dataset hosted on Hugging Face and contributed by the HF Datasets community
fjxdaisy/hh-rlhf-entropy-rule5-b0-84 dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 International (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
## Overview
Hh is a dataset for object detection tasks; it contains annotations for 1,436 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
weepcat/hh-rlhf-eval dataset hosted on Hugging Face and contributed by the HF Datasets community