80 datasets found
  1. hh-rlhf

    • huggingface.co
    • opendatalab.com
    Updated Feb 17, 2023
    Cite
    Hugging Face H4 (2023). hh-rlhf [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/hh-rlhf
    Explore at: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 17, 2023
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face H4
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset is part of Anthropic's HH data, which was used to train their RLHF assistant (https://github.com/anthropics/hh-rlhf). The data contains the first utterance from the human to the dialog agent and the number of words in that utterance. The sampled version is a random sample of size 200.
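
The description above can be illustrated with a minimal sketch of how a first human utterance and its word count might be derived from a single dialog transcript. It assumes the upstream transcripts mark turns with "\n\nHuman:" and "\n\nAssistant:" prefixes and that words are split on whitespace; the function names are hypothetical and this is not necessarily how the dataset authors built it.

```python
# Assumption: turns are marked with "\n\nHuman:" and "\n\nAssistant:".
HUMAN_TAG = "\n\nHuman:"
ASSISTANT_TAG = "\n\nAssistant:"


def first_human_utterance(transcript: str) -> str:
    """Return the text of the first human turn, or "" if none is found."""
    start = transcript.find(HUMAN_TAG)
    if start == -1:
        return ""
    start += len(HUMAN_TAG)
    # The first human turn ends where the first assistant turn begins.
    end = transcript.find(ASSISTANT_TAG, start)
    if end == -1:
        end = len(transcript)
    return transcript[start:end].strip()


def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed definition of 'word')."""
    return len(text.split())


# Hypothetical transcript, not taken from the dataset.
example = "\n\nHuman: What is the capital of France?\n\nAssistant: Paris."
utterance = first_human_utterance(example)
print(utterance, word_count(utterance))
```

Whitespace splitting is the simplest possible definition of a word; a different tokenization would give different counts.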

  2. rm-hh-rlhf

    • huggingface.co
    Updated Apr 19, 2023
    + more versions
    Cite
    Alex Havrilla (2023). rm-hh-rlhf [Dataset]. https://huggingface.co/datasets/Dahoas/rm-hh-rlhf
    Dataset updated
    Apr 19, 2023
    Authors
    Alex Havrilla
    Description

    Dahoas/rm-hh-rlhf dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. hh-rlhf-h4

    • huggingface.co
    Updated Nov 29, 2023
    Cite
    Hugging Face H4 (2023). hh-rlhf-h4 [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/hh-rlhf-h4
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face H4
    Description

    Dataset Card for "hh-rlhf-h4"

    More Information needed

  4. hh-rlhf-Rule17

    • huggingface.co
    Updated Jan 2, 2025
    + more versions
    Cite
    XL (2025). hh-rlhf-Rule17 [Dataset]. https://huggingface.co/datasets/HFXM/hh-rlhf-Rule17
    Dataset updated
    Jan 2, 2025
    Authors
    XL
    Description

    HFXM/hh-rlhf-Rule17 dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. hh-rlhf-dpo

    • huggingface.co
    Updated Feb 27, 2024
    + more versions
    Cite
    Zhili Feng (2024). hh-rlhf-dpo [Dataset]. https://huggingface.co/datasets/zekeZZ/hh-rlhf-dpo
    Dataset updated
    Feb 27, 2024
    Authors
    Zhili Feng
    Description

    zekeZZ/hh-rlhf-dpo dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. hh_rlhf_cn

    • huggingface.co
    Updated Aug 18, 2023
    Cite
    deng yong (2023). hh_rlhf_cn [Dataset]. https://huggingface.co/datasets/dikw/hh_rlhf_cn
    Dataset updated
    Aug 18, 2023
    Authors
    deng yong
    License

    https://choosealicense.com/licenses/llama2/

    Description

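    Chinese translation of hh-rlhf

    Based on the helpful and harmless data open-sourced with Anthropic's paper "Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback", translated with a translation tool.

    • hh_rlhf_train.jsonl: merged Chinese and English training data, about 170,000 entries after cleaning
    • hh_rlhf_test.jsonl: merged Chinese and English test data, about 9,000 entries after cleaning
    • harmless_base_cn_train.jsonl: 42,394 entries
    • harmless_base_cn_test.jsonl: 2,304 entries
    • helpful_base_cn_train.jsonl: 43,722 entries
    • helpful_base_cn_test.jsonl: 2,346 entries

    Experiment report

    Related RLHF experiment report: https://zhuanlan.zhihu.com/p/652044120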

  7. h4-anthropic-hh-rlhf-helpful-base-gen

    • huggingface.co
    Updated Apr 1, 2024
    Cite
    Hugging Face H4 (2024). h4-anthropic-hh-rlhf-helpful-base-gen [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/h4-anthropic-hh-rlhf-helpful-base-gen
    Dataset updated
    Apr 1, 2024
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face H4
    Description

    HuggingFaceH4/h4-anthropic-hh-rlhf-helpful-base-gen dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. hh-rlhf-en

    • huggingface.co
    Updated Feb 11, 2025
    + more versions
    Cite
    Sanghyun Byun (2025). hh-rlhf-en [Dataset]. https://huggingface.co/datasets/shbyun080/hh-rlhf-en
    Dataset updated
    Feb 11, 2025
    Authors
    Sanghyun Byun
    Description

    shbyun080/hh-rlhf-en dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. hh-rlhf-base

    • huggingface.co
    Updated Feb 12, 2025
    + more versions
    Cite
    wang (2025). hh-rlhf-base [Dataset]. https://huggingface.co/datasets/xinpeng/hh-rlhf-base
    Dataset updated
    Feb 12, 2025
    Authors
    wang
    Description

    HH-RLHF-Harmless-Base Dataset

      Summary
    

    The HH-RLHF-Harmless-Base dataset is a processed version of Anthropic's HH-RLHF dataset, specifically curated to train models using the TRL library for preference learning and alignment tasks. It contains pairs of text samples, each labeled as either "chosen" or "rejected," based on human preferences regarding the harmlessness of the responses. This dataset enables models to learn human preferences in generating harmless responses… See the full description on the dataset page: https://huggingface.co/datasets/xinpeng/hh-rlhf-base.

  10. hh-rlhf-helpful-base

    • huggingface.co
    Updated Nov 22, 2024
    + more versions
    Cite
    Jise Shen (2024). hh-rlhf-helpful-base [Dataset]. https://huggingface.co/datasets/Jise/hh-rlhf-helpful-base
    Dataset updated
    Nov 22, 2024
    Authors
    Jise Shen
    Description

    Jise/hh-rlhf-helpful-base dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. processed-hh-rlhf

    • huggingface.co
    Updated Jan 14, 2025
    + more versions
    Cite
    Russel (2025). processed-hh-rlhf [Dataset]. https://huggingface.co/datasets/rshwndsz/processed-hh-rlhf
    Dataset updated
    Jan 14, 2025
    Authors
    Russel
    Description

    rshwndsz/processed-hh-rlhf dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. hh-rlhf

    • huggingface.co
    Updated May 5, 2024
    Cite
    jyz (2024). hh-rlhf [Dataset]. https://huggingface.co/datasets/Lumen1123/hh-rlhf
    Dataset updated
    May 5, 2024
    Authors
    jyz
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Lumen1123/hh-rlhf dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. trojan-hh-rlhf-golden

    • huggingface.co
    Updated Sep 12, 2024
    + more versions
    Cite
    trojan-hh-rlhf-golden [Dataset]. https://huggingface.co/datasets/Baidicoot/trojan-hh-rlhf-golden
    Dataset updated
    Sep 12, 2024
    Authors
    Aidan Ewart
    Description

    Baidicoot/trojan-hh-rlhf-golden dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. hh-rlhf-helpful-base

    • huggingface.co
    Updated Jan 8, 2025
    + more versions
    Cite
    TRL (2025). hh-rlhf-helpful-base [Dataset]. https://huggingface.co/datasets/trl-lib/hh-rlhf-helpful-base
    Dataset updated
    Jan 8, 2025
    Dataset authored and provided by
    TRL
    Description

    HH-RLHF-Helpful-Base Dataset

      Summary
    

    The HH-RLHF-Helpful-Base dataset is a processed version of Anthropic's HH-RLHF dataset, specifically curated to train models using the TRL library for preference learning and alignment tasks. It contains pairs of text samples, each labeled as either "chosen" or "rejected," based on human preferences regarding the helpfulness of the responses. This dataset enables models to learn human preferences in generating helpful responses… See the full description on the dataset page: https://huggingface.co/datasets/trl-lib/hh-rlhf-helpful-base.
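
As a rough illustration of how "chosen"/"rejected" pairs are typically consumed, the sketch below measures how often a scoring function ranks the chosen text above the rejected one. It assumes each record is a dict with "chosen" and "rejected" string fields; the function `preference_accuracy`, the toy records, and the length-based scorer are hypothetical stand-ins, not part of the dataset or of the TRL API.

```python
from typing import Callable, Dict, List


def preference_accuracy(
    pairs: List[Dict[str, str]], score: Callable[[str], float]
) -> float:
    """Fraction of pairs where `score` rates the chosen text above the rejected one."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    wins = sum(
        1 for pair in pairs if score(pair["chosen"]) > score(pair["rejected"])
    )
    return wins / len(pairs)


# Hypothetical records; real rows would come from the dataset itself.
toy_pairs = [
    {"chosen": "Sure, here is a step-by-step answer.", "rejected": "No."},
    {
        "chosen": "I can't help with that.",
        "rejected": "Here is a long but unhelpful ramble about nothing.",
    },
]


def toy_score(text: str) -> float:
    """Stand-in for a reward model: longer text scores higher."""
    return float(len(text))


accuracy = preference_accuracy(toy_pairs, toy_score)
print(accuracy)
```

In practice the scorer would be a trained reward model; the length heuristic only shows the shape of the comparison.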

  15. hh-rlhf-h4

    • huggingface.co
    Updated Jan 31, 2024
    + more versions
    Cite
    On-Device-LLM (2024). hh-rlhf-h4 [Dataset]. https://huggingface.co/datasets/ondevicellm/hh-rlhf-h4
    Dataset updated
    Jan 31, 2024
    Dataset authored and provided by
    On-Device-LLM
    Description

    ondevicellm/hh-rlhf-h4 dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. hh-rlhf_with_features_flan_t5_large_lll_relabeled

    • huggingface.co
    Updated Dec 15, 2023
    + more versions
    Cite
    Germán Kruszewski (2023). hh-rlhf_with_features_flan_t5_large_lll_relabeled [Dataset]. https://huggingface.co/datasets/germank/hh-rlhf_with_features_flan_t5_large_lll_relabeled
    Dataset updated
    Dec 15, 2023
    Authors
    Germán Kruszewski
    Description

    Dataset Card for "hh-rlhf_with_features_flan_t5_large_lll_relabeled"

    More Information needed

  17. hh-rlhf-Rule2

    • huggingface.co
    Updated Dec 4, 2024
    Cite
    XL (2024). hh-rlhf-Rule2 [Dataset]. https://huggingface.co/datasets/HFXM/hh-rlhf-Rule2
    Dataset updated
    Dec 4, 2024
    Authors
    XL
    Description

    HFXM/hh-rlhf-Rule2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. hh-rlhf-entropy-rule5-b0-84

    • huggingface.co
    Updated Dec 24, 2024
    + more versions
    Cite
    hh-rlhf-entropy-rule5-b0-84 [Dataset]. https://huggingface.co/datasets/fjxdaisy/hh-rlhf-entropy-rule5-b0-84
    Dataset updated
    Dec 24, 2024
    Authors
    Jingxuan Fan
    Description

    fjxdaisy/hh-rlhf-entropy-rule5-b0-84 dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. Hh Dataset

    • universe.roboflow.com
    zip
    Updated Mar 19, 2024
    + more versions
    Cite
    Wenqiang Wu (2024). Hh Dataset [Dataset]. https://universe.roboflow.com/wenqiang-wu/hh-wb8si
    Available download formats: zip
    Dataset updated
    Mar 19, 2024
    Dataset authored and provided by
    Wenqiang Wu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Variables measured
    K Lk L K Lklkl Bounding Boxes
    Description

    Hh

    ## Overview
    
    Hh is a dataset for object detection tasks - it contains K Lk L K Lklkl annotations for 1,436 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  20. hh-rlhf-eval

    • huggingface.co
    Updated Dec 17, 2024
    + more versions
    Cite
    WeepCat (2024). hh-rlhf-eval [Dataset]. https://huggingface.co/datasets/weepcat/hh-rlhf-eval
    Dataset updated
    Dec 17, 2024
    Authors
    WeepCat
    Description

    weepcat/hh-rlhf-eval dataset hosted on Hugging Face and contributed by the HF Datasets community

464 scholarly articles cite HuggingFaceH4/hh-rlhf (per Google Scholar).