64 datasets found
  1. reward-bench

    • huggingface.co
    Cite
    Ai2, reward-bench [Dataset]. http://doi.org/10.57967/hf/2457
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    https://choosealicense.com/licenses/odc-by/

    Description


      Reward Bench Evaluation Dataset Card
    

    The RewardBench evaluation dataset evaluates capabilities of reward models over the following categories:

    Chat: Includes the easy chat subsets (alpacaeval-easy, alpacaeval-length, alpacaeval-hard, mt-bench-easy, mt-bench-medium)
    Chat Hard: Includes the hard chat subsets (mt-bench-hard, llmbar-natural, llmbar-adver-neighbor, llmbar-adver-GPTInst, llmbar-adver-GPTOut…
    See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench.
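
As a rough illustration of how the subset names in this entry map onto categories, the sketch below groups only the subsets that are fully visible in the excerpt. The names `CATEGORY_SUBSETS` and `category_of` are hypothetical, and because the Chat Hard list is truncated in the source, the mapping is knowingly incomplete.

```python
from typing import Optional

# Hypothetical mapping built only from subset names fully visible in the excerpt.
# The Chat Hard list is cut off in the source, so the last (possibly truncated)
# name is left out rather than guessed.
CATEGORY_SUBSETS = {
    "Chat": [
        "alpacaeval-easy",
        "alpacaeval-length",
        "alpacaeval-hard",
        "mt-bench-easy",
        "mt-bench-medium",
    ],
    "Chat Hard": [
        "mt-bench-hard",
        "llmbar-natural",
        "llmbar-adver-neighbor",
        "llmbar-adver-GPTInst",
    ],
}


def category_of(subset: str) -> Optional[str]:
    """Return the category that lists `subset`, or None if it is not in the mapping."""
    for category, subsets in CATEGORY_SUBSETS.items():
        if subset in subsets:
            return category
    return None


print(category_of("mt-bench-easy"))
```

Subsets that are not in this partial mapping return `None`, so a caller can tell "unknown here" apart from a real category.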

  2. reward-bench-2

    • huggingface.co
    Updated Jun 3, 2025
    Cite
    Ai2 (2025). reward-bench-2 [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-2
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    https://choosealicense.com/licenses/odc-by/

    Description


      RewardBench 2 Evaluation Dataset Card
    

    The RewardBench 2 evaluation dataset is the new version of RewardBench that is based on unseen human data and designed to be substantially more difficult! RewardBench 2 evaluates capabilities of reward models over the following categories:

    Factuality (NEW!): Tests the ability of RMs to detect hallucinations and other basic errors in completions.
    Precise Instruction Following (NEW!): Tests the ability of RMs…
    See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-2.

  3. RewardBench Dataset

    • paperswithcode.com
    Updated Jan 20, 2025
    Cite
    Nathan Lambert; Valentina Pyatkin; Jacob Morrison; LJ Miranda; Bill Yuchen Lin; Khyathi Chandu; Nouha Dziri; Sachin Kumar; Tom Zick; Yejin Choi; Noah A. Smith; Hannaneh Hajishirzi (2025). RewardBench Dataset [Dataset]. https://paperswithcode.com/dataset/rewardbench
    Authors
    Nathan Lambert; Valentina Pyatkin; Jacob Morrison; LJ Miranda; Bill Yuchen Lin; Khyathi Chandu; Nouha Dziri; Sachin Kumar; Tom Zick; Yejin Choi; Noah A. Smith; Hannaneh Hajishirzi
    Description

    RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models, including those trained with Direct Preference Optimization (DPO). It serves as the first evaluation tool for reward models and provides valuable insights into their performance and reliability¹.

    Here are the key components of RewardBench:

    Common Inference Code: The repository includes common inference code for various reward models, such as Starling, PairRM, OpenAssistant, and more. These models can be evaluated using the provided tools¹.

    Dataset and Evaluation: The RewardBench dataset consists of prompt-win-lose trios spanning chat, reasoning, and safety scenarios. It allows benchmarking reward models on challenging, structured, and out-of-distribution queries. The goal is to enhance scientific understanding of reward models and their behavior².
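
One common way to score a reward model on prompt-win-lose trios is to count how often it rates the winning completion above the losing one. The sketch below assumes that convention. `pairwise_accuracy` and the toy `length_score` scorer are hypothetical names used for illustration and are not taken from the RewardBench code.

```python
from typing import Callable, Iterable, Tuple

# (prompt, winning completion, losing completion)
Trio = Tuple[str, str, str]


def pairwise_accuracy(trios: Iterable[Trio], score: Callable[[str, str], float]) -> float:
    """Fraction of trios where the winner scores strictly higher than the loser."""
    trios = list(trios)
    if not trios:
        raise ValueError("need at least one trio")
    correct = sum(
        1 for prompt, win, lose in trios if score(prompt, win) > score(prompt, lose)
    )
    return correct / len(trios)


def length_score(prompt: str, completion: str) -> float:
    """Toy stand-in for a reward model: longer completions get higher scores."""
    return float(len(completion))


example = [
    ("What is 2 + 2?", "2 + 2 equals 4.", "5"),
    ("Name a primary color.", "Red", "A primary color is something I cannot name."),
]
print(pairwise_accuracy(example, length_score))  # 0.5
```

Ties count as errors in this sketch. The toy length-based scorer gets the second trio wrong, which is the kind of length bias a benchmark of this shape can expose.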

    Scripts for Evaluation:

    scripts/run_rm.py: Used to evaluate individual reward models.
    scripts/run_dpo.py: Used to evaluate direct preference optimization (DPO) models.
    scripts/train_rm.py: A basic reward model training script built on TRL (Transformer Reinforcement Learning)¹.

    Installation and Usage:

    Install PyTorch on your system.
    Install the required dependencies using pip install -e .
    Set the environment variable HF_TOKEN with your token.
    To contribute your model to the leaderboard, open an issue on HuggingFace with the model name.
    For local model evaluation, follow the instructions in the repository¹.

    Remember that RewardBench provides a standardized way to assess reward models, ensuring transparency and comparability across different approaches. 🌟🔍

    (1) GitHub - allenai/reward-bench: RewardBench: the first evaluation tool .... https://github.com/allenai/reward-bench.
    (2) RewardBench: Evaluating Reward Models for Language Modeling. https://arxiv.org/abs/2403.13787.
    (3) RewardBench: Evaluating Reward Models for Language Modeling. https://paperswithcode.com/paper/rewardbench-evaluating-reward-models-for.

  4. reward-bench-results

    • huggingface.co
    Updated Apr 30, 2025
    Cite
    Ai2 (2025). reward-bench-results [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-results
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    Description

    Results for Holistic Evaluation of Reward Models (HERM) Benchmark

    Here, you'll find the raw scores for the HERM project. The repository is structured as follows.

    ├── best-of-n/                        <- Nested directory for different completions on Best of N challenge
    |   ├── alpaca_eval/                  └── results for each reward model
    |   |   ├── tulu-13b/{org}/{model}.json
    |   |   └── zephyr-7b/{org}/{model}.json
    |   └── mt_bench/
    |…

    See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-results.
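
Given the layout in this entry, the location of one model's Best of N results can be assembled from its parts. This is a small sketch that assumes the `best-of-n/alpaca_eval/tulu-13b/{org}/{model}.json` pattern visible in the excerpt generalizes to other benchmark and generator directories; the tree is truncated in the source, so that is not confirmed. The function name and the example org and model are placeholders.

```python
from pathlib import PurePosixPath


def best_of_n_path(benchmark: str, generator: str, org: str, model: str) -> PurePosixPath:
    """Build a repository-relative path following the pattern shown in the excerpt.

    Assumption: every benchmark/generator directory uses the same
    {org}/{model}.json layout as alpaca_eval/tulu-13b.
    """
    return PurePosixPath("best-of-n") / benchmark / generator / org / f"{model}.json"


# "example-org" and "example-model" are placeholders, not real entries.
print(best_of_n_path("alpaca_eval", "tulu-13b", "example-org", "example-model"))
```

`PurePosixPath` keeps forward slashes regardless of the local operating system, which matches how paths appear in a hosted repository.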

  5. multilingual-reward-bench

    • huggingface.co
    Updated May 15, 2025
    Cite
    Cohere Labs Community (2025). multilingual-reward-bench [Dataset]. http://doi.org/10.57967/hf/3352
    Dataset authored and provided by
    Cohere Labs Community
    License

    https://choosealicense.com/licenses/odc-by/

    Description

    Multilingual Reward Bench (v1.0)

    Reward models (RMs) have driven the development of state-of-the-art LLMs today, with unprecedented impact across the globe. However, their performance in multilingual settings still remains understudied. In order to probe reward model behavior on multilingual data, we present M-RewardBench, a benchmark for 23 typologically diverse languages. M-RewardBench contains prompt-chosen-rejected preference triples obtained by curating and translating chat… See the full description on the dataset page: https://huggingface.co/datasets/CohereLabsCommunity/multilingual-reward-bench.

  6. reward-bench-2-results

    • huggingface.co
    Updated Jun 3, 2025
    Cite
    Ai2 (2025). reward-bench-2-results [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-2-results
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    Description

    allenai/reward-bench-2-results dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. agent-reward-bench

    • huggingface.co
    Updated Apr 15, 2025
    Cite
    McGill NLP Group (2025). agent-reward-bench [Dataset]. https://huggingface.co/datasets/McGill-NLP/agent-reward-bench
    Dataset authored and provided by
    McGill NLP Group
    Description

    AgentRewardBench


    AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
    Xing Han Lù, Amirhossein Kazemnejad*, Nicholas Meade, Arkil Patel, Dongchan Shin, Alejandra Zambrano, Karolina Stańczak, Peter Shaw, Christopher J. Pal, Siva Reddy
    *Core Contributor

      Loading dataset
    

    You can use the huggingface_hub library to load the dataset. The dataset is available on Huggingface Hub at… See the full description on the dataset page: https://huggingface.co/datasets/McGill-NLP/agent-reward-bench.

  8. fc-reward-bench

    • huggingface.co
    Cite
    IBM Research, fc-reward-bench [Dataset]. https://huggingface.co/datasets/ibm-research/fc-reward-bench
    Dataset provided by
    IBM (http://ibm.com/)
    IBM Research
    Authors
    IBM Research
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    fc-reward-bench

    fc-reward-bench is a benchmark designed to evaluate reward model performance in function-calling tasks. It features 1,500 unique user inputs derived from the single-turn splits of the BFCL-v3 dataset. Each input is paired with both correct and incorrect function calls. Correct calls are sourced directly from BFCL, while incorrect calls are generated by 25 permissively licensed models.

      Dataset Structure
    

    Each entry in the dataset includes the following… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/fc-reward-bench.

  9. R3-eval-reward-bench

    • huggingface.co
    Updated May 21, 2025
    Cite
    rubricreward (2025). R3-eval-reward-bench [Dataset]. https://huggingface.co/datasets/rubricreward/R3-eval-reward-bench
    Dataset authored and provided by
    rubricreward
    Description

    rubricreward/R3-eval-reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. reward-bench-critique-alpacaeval-easy

    • huggingface.co
    Updated Apr 15, 2024
    Cite
    distilabel-internal-testing (2024). reward-bench-critique-alpacaeval-easy [Dataset]. https://huggingface.co/datasets/distilabel-internal-testing/reward-bench-critique-alpacaeval-easy
    Dataset authored and provided by
    distilabel-internal-testing
    Description


    This dataset is a small subset of allenai/reward-bench to test with our critique models. It was generated in the following way:

    from datasets import Dataset
    import pandas as pd

    from datasets import load_dataset

    ds = load_dataset("allenai/reward-bench", split="filtered")

    data = []
    for row in ds.filter(lambda x: x["subset"] == "alpacaeval-easy"):
        for response in ["chosen", "rejected"]:
            model, is_chosen = (row["chosen_model"], True) if response == "chosen"…

    See the full description on the dataset page: https://huggingface.co/datasets/distilabel-internal-testing/reward-bench-critique-alpacaeval-easy.

  11. MM-RLHF-RewardBench

    • huggingface.co
    Updated Feb 17, 2025
    Cite
    Yi-Fan Zhang (2025). MM-RLHF-RewardBench [Dataset]. https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench
    Authors
    Yi-Fan Zhang
    Description


      The Next Step Forward in Multimodal LLM Alignment
    

    [2025/02/10] 🔥 We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:

    A high-quality MLLM alignment dataset.
    A strong Critique-Based MLLM reward model and its training algorithm.
    A novel…
    See the full description on the dataset page: https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench.

  12. reward-bench-2-converted

    • huggingface.co
    Updated Jun 19, 2025
    Cite
    john02171574 (2025). reward-bench-2-converted [Dataset]. https://huggingface.co/datasets/john02171574/reward-bench-2-converted
    Authors
    john02171574
    Description

    john02171574/reward-bench-2-converted dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. allenai-reward-bench

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    Nour Guermazi (2025). allenai-reward-bench [Dataset]. https://huggingface.co/datasets/nourguermazi/allenai-reward-bench
    Authors
    Nour Guermazi
    Description

    nourguermazi/allenai-reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. reward-bench-chat-original

    • huggingface.co
    Updated Dec 2, 2024
    Cite
    Haoxiang Wang (2024). reward-bench-chat-original [Dataset]. https://huggingface.co/datasets/Haoxiang-Wang/reward-bench-chat-original
    Authors
    Haoxiang Wang
    Description

    Haoxiang-Wang/reward-bench-chat-original dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Data from: LLF-Bench: Benchmark for Interactive Learning from Language...

    • giter.site
    • github.com
    Updated Dec 6, 2024
    Cite
    Microsoft (2024). LLF-Bench: Benchmark for Interactive Learning from Language Feedback [Dataset]. https://giter.site/ac0123456/llf-bench
    Dataset provided by
    Microsoft (http://microsoft.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    LLF Bench is a benchmark that provides a diverse collection of interactive learning problems where the agent gets language feedback instead of rewards (as in RL) or action feedback (as in imitation learning).

  16. reward-bench-reasoning

    • huggingface.co
    Updated Jul 31, 2024
    Cite
    Sylvia Chen (2024). reward-bench-reasoning [Dataset]. https://huggingface.co/datasets/hsicat/reward-bench-reasoning
    Authors
    Sylvia Chen
    Description

    This evaluation dataset is the reasoning subset from allenai/reward-bench.

  17. reward-bench

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    Jingxuan Sun (2025). reward-bench [Dataset]. https://huggingface.co/datasets/PirxTion/reward-bench
    Authors
    Jingxuan Sun
    Description

    PirxTion/reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. reward-bench-math

    • huggingface.co
    Updated Jul 3, 2025
    Cite
    Madeleine Hueber (2025). reward-bench-math [Dataset]. https://huggingface.co/datasets/madhueb/reward-bench-math
    Authors
    Madeleine Hueber
    Description

    madhueb/reward-bench-math dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. R3-eval-reward-bench-new

    • huggingface.co
    Updated Jul 3, 2025
    Cite
    rubricreward (2025). R3-eval-reward-bench-new [Dataset]. https://huggingface.co/datasets/rubricreward/R3-eval-reward-bench-new
    Dataset authored and provided by
    rubricreward
    Description

    rubricreward/R3-eval-reward-bench-new dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. reward-bench-hacking-rewards-harmless-train-normal

    • huggingface.co
    Updated Jan 20, 2025
    Cite
    Ayush Singh (2025). reward-bench-hacking-rewards-harmless-train-normal [Dataset]. https://huggingface.co/datasets/Ayush-Singh/reward-bench-hacking-rewards-harmless-train-normal
    Authors
    Ayush Singh
    Description

    Ayush-Singh/reward-bench-hacking-rewards-harmless-train-normal dataset hosted on Hugging Face and contributed by the HF Datasets community

allenai/reward-bench

268 scholarly articles cite this dataset (View in Google Scholar)