64 datasets found

reward-bench
huggingface.co
Cite
Ai2, reward-bench [Dataset]. http://doi.org/10.57967/hf/2457
Unique identifier
https://doi.org/10.57967/hf/2457
Dataset provided by
Allen Institute for AI (http://allenai.org/)
Authors
Ai2
License
https://choosealicense.com/licenses/odc-by/
Description
Code | Leaderboard | Prior Preference Sets | Results | Paper

Reward Bench Evaluation Dataset Card

The RewardBench evaluation dataset evaluates capabilities of reward models over the following categories:

Chat: Includes the easy chat subsets (alpacaeval-easy, alpacaeval-length, alpacaeval-hard, mt-bench-easy, mt-bench-medium)
Chat Hard: Includes the hard chat subsets (mt-bench-hard, llmbar-natural, llmbar-adver-neighbor, llmbar-adver-GPTInst, llmbar-adver-GPTOut… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench.
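As a rough illustration of how the subsets named in this description roll up into categories, the sketch below records only the subset names that are fully visible here. The names CATEGORY_SUBSETS and category_of are hypothetical, and the Chat Hard list is incomplete: the last visible name is cut off by the ellipsis, so it is left out.

```python
# Subset names copied from the dataset description. The "Chat Hard" list is
# incomplete because the description is truncated; the final, possibly
# cut-off name is deliberately omitted.
CATEGORY_SUBSETS = {
    "Chat": [
        "alpacaeval-easy",
        "alpacaeval-length",
        "alpacaeval-hard",
        "mt-bench-easy",
        "mt-bench-medium",
    ],
    "Chat Hard": [
        "mt-bench-hard",
        "llmbar-natural",
        "llmbar-adver-neighbor",
        "llmbar-adver-GPTInst",
    ],
}


def category_of(subset):
    """Return the category that lists `subset`, or None if it is not listed."""
    for category, subsets in CATEGORY_SUBSETS.items():
        if subset in subsets:
            return category
    return None
```

A lookup such as category_of("mt-bench-hard") returns "Chat Hard"; names outside the two visible lists return None.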
reward-bench-2
huggingface.co
Updated Jun 3, 2025
Cite
Ai2 (2025). reward-bench-2 [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-2
Dataset updated
Jun 3, 2025
Dataset provided by
Allen Institute for AI (http://allenai.org/)
Authors
Ai2
License
https://choosealicense.com/licenses/odc-by/
Description
Code | Leaderboard | Results | Paper

RewardBench 2 Evaluation Dataset Card

The RewardBench 2 evaluation dataset is the new version of RewardBench that is based on unseen human data and designed to be substantially more difficult! RewardBench 2 evaluates capabilities of reward models over the following categories:

Factuality (NEW!): Tests the ability of RMs to detect hallucinations and other basic errors in completions.
Precise Instruction Following (NEW!): Tests the ability of RMs… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-2.
RewardBench Dataset
paperswithcode.com
Updated Jan 20, 2025
Cite
Nathan Lambert; Valentina Pyatkin; Jacob Morrison; LJ Miranda; Bill Yuchen Lin; Khyathi Chandu; Nouha Dziri; Sachin Kumar; Tom Zick; Yejin Choi; Noah A. Smith; Hannaneh Hajishirzi (2025). RewardBench Dataset [Dataset]. https://paperswithcode.com/dataset/rewardbench
Dataset updated
Jan 20, 2025
Authors
Nathan Lambert; Valentina Pyatkin; Jacob Morrison; LJ Miranda; Bill Yuchen Lin; Khyathi Chandu; Nouha Dziri; Sachin Kumar; Tom Zick; Yejin Choi; Noah A. Smith; Hannaneh Hajishirzi
Description
RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models, including those trained with Direct Preference Optimization (DPO). It serves as the first evaluation tool for reward models and provides valuable insights into their performance and reliability¹.

Here are the key components of RewardBench:

Common Inference Code: The repository includes common inference code for various reward models, such as Starling, PairRM, OpenAssistant, and more. These models can be evaluated using the provided tools¹.

Dataset and Evaluation: The RewardBench dataset consists of prompt-win-lose trios spanning chat, reasoning, and safety scenarios. It allows benchmarking reward models on challenging, structured, and out-of-distribution queries. The goal is to enhance scientific understanding of reward models and their behavior².
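To make the prompt-win-lose structure concrete, here is a minimal sketch of how a reward model could be scored on such trios. It assumes a trio counts as correct when the model scores the winning completion strictly higher than the losing one; that rule, the function names, and the toy length-based scorer are illustrative assumptions, not the repository's code.

```python
from typing import Callable, Iterable, Tuple

# (prompt, winning completion, losing completion)
Trio = Tuple[str, str, str]


def pairwise_accuracy(trios: Iterable[Trio],
                      score: Callable[[str, str], float]) -> float:
    """Fraction of trios where the winning completion gets the higher score."""
    trios = list(trios)
    if not trios:
        raise ValueError("at least one trio is required")
    correct = sum(
        1 for prompt, win, lose in trios
        if score(prompt, win) > score(prompt, lose)
    )
    return correct / len(trios)


def length_score(prompt: str, completion: str) -> float:
    """Toy stand-in for a reward model: longer completions score higher."""
    return float(len(completion))


example_trios = [
    ("What is 2 + 2?", "2 + 2 equals 4.", "5"),
    ("Name a primary color.", "Red", "I do not know any colors."),
]

print(pairwise_accuracy(example_trios, length_score))  # prints 0.5
```

The toy scorer gets the first trio right and the second wrong, which shows why a length-biased reward model would do poorly on trios where the better answer is the shorter one.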

Scripts for Evaluation:

scripts/run_rm.py: Used to evaluate individual reward models.
scripts/run_dpo.py: Used to evaluate direct preference optimization (DPO) models.
scripts/train_rm.py: A basic reward model training script built on TRL (Transformer Reinforcement Learning)¹.

Installation and Usage:

Install PyTorch on your system.
Install the required dependencies with the command pip install -e . (the trailing dot is part of the command).
Set the environment variable HF_TOKEN with your token.
To contribute your model to the leaderboard, open an issue on HuggingFace with the model name.
For local model evaluation, follow the instructions in the repository¹.
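The HF_TOKEN step can be checked before launching anything. The helper below is a small sketch, not part of the repository: the function name get_hf_token and its error message are hypothetical, and it only reads the environment variable named above.

```python
import os


def get_hf_token(env=None):
    """Return the HF_TOKEN value, or raise a clear error if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the helper easy
    to try out without touching the real environment.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "HF_TOKEN is not set; export it before running the evaluation scripts."
        )
    return token
```

Calling get_hf_token() at the top of a run fails early with a readable message instead of partway through an evaluation.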

RewardBench provides a standardized way to assess reward models, which makes results from different approaches transparent and comparable.

(1) GitHub - allenai/reward-bench: RewardBench: the first evaluation tool .... https://github.com/allenai/reward-bench
(2) RewardBench: Evaluating Reward Models for Language Modeling. https://arxiv.org/abs/2403.13787
(3) RewardBench: Evaluating Reward Models for Language Modeling. https://paperswithcode.com/paper/rewardbench-evaluating-reward-models-for
reward-bench-results
huggingface.co
Updated Apr 30, 2025
Cite
Ai2 (2025). reward-bench-results [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-results
Dataset updated
Apr 30, 2025
Dataset provided by
Allen Institute for AI (http://allenai.org/)
Authors
Ai2
Description
Results for Holistic Evaluation of Reward Models (HERM) Benchmark

Here, you'll find the raw scores for the HERM project. The repository is structured as follows.

├── best-of-n/                 <- Nested directory for different completions on Best of N challenge
|   ├── alpaca_eval/           └── results for each reward model
|   |   ├── tulu-13b/{org}/{model}.json
|   |   └── zephyr-7b/{org}/{model}.json
|   └── mt_bench/
|… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-results.
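The visible part of the layout suggests that a Best of N result file is addressed by benchmark, generator model, organization, and reward model name. The sketch below builds such a relative path under that assumption; the function name and the example organization and model names are hypothetical, and the layout under mt_bench/ is not shown in the truncated description.

```python
from pathlib import PurePosixPath


def best_of_n_result_path(benchmark, generator, org, model):
    """Build a relative path like best-of-n/alpaca_eval/tulu-13b/{org}/{model}.json."""
    return PurePosixPath("best-of-n") / benchmark / generator / org / f"{model}.json"


# "example-org" and "example-model" are placeholders, not real entries.
print(best_of_n_result_path("alpaca_eval", "tulu-13b", "example-org", "example-model"))
# prints best-of-n/alpaca_eval/tulu-13b/example-org/example-model.json
```

PurePosixPath is used so the result always has forward slashes, matching repository paths regardless of the local operating system.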
multilingual-reward-bench
huggingface.co
Updated May 15, 2025
Cite
Cohere Labs Community (2025). multilingual-reward-bench [Dataset]. http://doi.org/10.57967/hf/3352
Unique identifier
https://doi.org/10.57967/hf/3352
Dataset updated
May 15, 2025
Dataset authored and provided by
Cohere Labs Community
License
https://choosealicense.com/licenses/odc-by/
Description
Multilingual Reward Bench (v1.0)

Reward models (RMs) have driven the development of state-of-the-art LLMs today, with unprecedented impact across the globe. However, their performance in multilingual settings still remains understudied. In order to probe reward model behavior on multilingual data, we present M-RewardBench, a benchmark for 23 typologically diverse languages. M-RewardBench contains prompt-chosen-rejected preference triples obtained by curating and translating chat… See the full description on the dataset page: https://huggingface.co/datasets/CohereLabsCommunity/multilingual-reward-bench.
reward-bench-2-results
huggingface.co
Updated Jun 3, 2025
Cite
Ai2 (2025). reward-bench-2-results [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-2-results
Dataset updated
Jun 3, 2025
Dataset provided by
Allen Institute for AI (http://allenai.org/)
Authors
Ai2
Description
allenai/reward-bench-2-results dataset hosted on Hugging Face and contributed by the HF Datasets community
agent-reward-bench
huggingface.co
Updated Apr 15, 2025
Cite
McGill NLP Group (2025). agent-reward-bench [Dataset]. https://huggingface.co/datasets/McGill-NLP/agent-reward-bench
Dataset updated
Apr 15, 2025
Dataset authored and provided by
McGill NLP Group
Description
AgentRewardBench

💾Code 📄Paper 🌐Website

🤗Dataset 💻Demo 🏆Leaderboard

AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Xing Han Lù, Amirhossein Kazemnejad*, Nicholas Meade, Arkil Patel, Dongchan Shin, Alejandra Zambrano, Karolina Stańczak, Peter Shaw, Christopher J. Pal, Siva Reddy
*Core Contributor

Loading dataset

You can use the huggingface_hub library to load the dataset. The dataset is available on the Hugging Face Hub at… See the full description on the dataset page: https://huggingface.co/datasets/McGill-NLP/agent-reward-bench.
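Since the card's own loading instructions are cut off here, the following is only a sketch of one way to fetch the files with huggingface_hub. It assumes the repository ID from the dataset page URL above and uses snapshot_download; the helper names download_kwargs and download_dataset are hypothetical, and the card may recommend a different call.

```python
# Repository ID taken from the dataset page URL in this listing.
REPO_ID = "McGill-NLP/agent-reward-bench"


def download_kwargs(local_dir=None):
    """Arguments for huggingface_hub.snapshot_download for this dataset repo."""
    kwargs = {"repo_id": REPO_ID, "repo_type": "dataset"}
    if local_dir is not None:
        kwargs["local_dir"] = local_dir
    return kwargs


def download_dataset(local_dir=None):
    """Download the dataset repository and return the local folder path.

    Requires network access and the huggingface_hub package; the import is
    done here so the rest of the module works without it installed.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Calling download_dataset("data/agent-reward-bench") would place the files in that folder; without an argument the files land in the default Hugging Face cache.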
fc-reward-bench
huggingface.co
Cite
IBM Research, fc-reward-bench [Dataset]. https://huggingface.co/datasets/ibm-research/fc-reward-bench
Dataset provided by
IBM (http://ibm.com/)
IBM Research
Authors
IBM Research
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
fc-reward-bench

fc-reward-bench is a benchmark designed to evaluate reward model performance in function-calling tasks. It features 1,500 unique user inputs derived from the single-turn splits of the BFCL-v3 dataset. Each input is paired with both correct and incorrect function calls. Correct calls are sourced directly from BFCL, while incorrect calls are generated by 25 permissively licensed models.

Dataset Structure

Each entry in the dataset includes the following… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/fc-reward-bench.
R3-eval-reward-bench
huggingface.co
Updated May 21, 2025
Cite
rubricreward (2025). R3-eval-reward-bench [Dataset]. https://huggingface.co/datasets/rubricreward/R3-eval-reward-bench
Dataset updated
May 21, 2025
Dataset authored and provided by
rubricreward
Description
rubricreward/R3-eval-reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community
reward-bench-critique-alpacaeval-easy
huggingface.co
Updated Apr 15, 2024
Cite
distilabel-internal-testing (2024). reward-bench-critique-alpacaeval-easy [Dataset]. https://huggingface.co/datasets/distilabel-internal-testing/reward-bench-critique-alpacaeval-easy
Dataset updated
Apr 15, 2024
Dataset authored and provided by
distilabel-internal-testing
Description
Description

This dataset is a small subset of allenai/reward-bench to test with our critique models. It was generated in the following way:

from datasets import Dataset
import pandas as pd

from datasets import load_dataset

ds = load_dataset("allenai/reward-bench", split="filtered")

data = []
for row in ds.filter(lambda x: x["subset"] == "alpacaeval-easy"):
    for response in ["chosen", "rejected"]:
        model, is_chosen = (row["chosen_model"], True) if response == "chosen"… See the full description on the dataset page: https://huggingface.co/datasets/distilabel-internal-testing/reward-bench-critique-alpacaeval-easy.
MM-RLHF-RewardBench
huggingface.co
Updated Feb 17, 2025
Cite
Yi-Fan Zhang (2025). MM-RLHF-RewardBench [Dataset]. https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench
Dataset updated
Feb 17, 2025
Authors
Yi-Fan Zhang
Description
[📖 arXiv Paper] [📊 MM-RLHF Data] [📝 Homepage] [🏆 Reward Model] [🔮 MM-RewardBench] [🔮 MM-SafetyBench] [📈 Evaluation Suite]

The Next Step Forward in Multimodal LLM Alignment

[2025/02/10] 🔥 We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:

A high-quality MLLM alignment dataset. A strong Critique-Based MLLM reward model and its training algorithm. A novel… See the full description on the dataset page: https://huggingface.co/datasets/yifanzhang114/MM-RLHF-RewardBench.
reward-bench-2-converted
huggingface.co
Updated Jun 19, 2025
Cite
john02171574 (2025). reward-bench-2-converted [Dataset]. https://huggingface.co/datasets/john02171574/reward-bench-2-converted
Dataset updated
Jun 19, 2025
Authors
john02171574
Description
john02171574/reward-bench-2-converted dataset hosted on Hugging Face and contributed by the HF Datasets community
allenai-reward-bench
huggingface.co
Updated Jun 12, 2025
Cite
Nour Guermazi (2025). allenai-reward-bench [Dataset]. https://huggingface.co/datasets/nourguermazi/allenai-reward-bench
Dataset updated
Jun 12, 2025
Authors
Nour Guermazi
Description
nourguermazi/allenai-reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community
reward-bench-chat-original
huggingface.co
Updated Dec 2, 2024
Cite
Haoxiang Wang (2024). reward-bench-chat-original [Dataset]. https://huggingface.co/datasets/Haoxiang-Wang/reward-bench-chat-original
Dataset updated
Dec 2, 2024
Authors
Haoxiang Wang
Description
Haoxiang-Wang/reward-bench-chat-original dataset hosted on Hugging Face and contributed by the HF Datasets community
Data from: LLF-Bench: Benchmark for Interactive Learning from Language...
giter.site
github.com
Updated Dec 6, 2024
Cite
Microsoft (2024). LLF-Bench: Benchmark for Interactive Learning from Language Feedback [Dataset]. https://giter.site/ac0123456/llf-bench
Dataset updated
Dec 6, 2024
Dataset provided by
Microsoft (http://microsoft.com/)
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
LLF Bench is a benchmark that provides a diverse collection of interactive learning problems where the agent gets language feedback instead of rewards (as in RL) or action feedback (as in imitation learning).
reward-bench-reasoning
huggingface.co
Updated Jul 31, 2024
Cite
Sylvia Chen (2024). reward-bench-reasoning [Dataset]. https://huggingface.co/datasets/hsicat/reward-bench-reasoning
Dataset updated
Jul 31, 2024
Authors
Sylvia Chen
Description
This evaluation dataset is the reasoning subset from allenai/reward-bench.
reward-bench
huggingface.co
Updated Jun 12, 2025
Cite
Jingxuan Sun (2025). reward-bench [Dataset]. https://huggingface.co/datasets/PirxTion/reward-bench
Dataset updated
Jun 12, 2025
Authors
Jingxuan Sun
Description
PirxTion/reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community
reward-bench-math
huggingface.co
Updated Jul 3, 2025
Cite
Madeleine Hueber (2025). reward-bench-math [Dataset]. https://huggingface.co/datasets/madhueb/reward-bench-math
Dataset updated
Jul 3, 2025
Authors
Madeleine Hueber
Description
madhueb/reward-bench-math dataset hosted on Hugging Face and contributed by the HF Datasets community
R3-eval-reward-bench-new
huggingface.co
Updated Jul 3, 2025
Cite
rubricreward (2025). R3-eval-reward-bench-new [Dataset]. https://huggingface.co/datasets/rubricreward/R3-eval-reward-bench-new
Dataset updated
Jul 3, 2025
Dataset authored and provided by
rubricreward
Description
rubricreward/R3-eval-reward-bench-new dataset hosted on Hugging Face and contributed by the HF Datasets community
reward-bench-hacking-rewards-harmless-train-normal
huggingface.co
Updated Jan 20, 2025
Cite
Ayush Singh (2025). reward-bench-hacking-rewards-harmless-train-normal [Dataset]. https://huggingface.co/datasets/Ayush-Singh/reward-bench-hacking-rewards-harmless-train-normal
Dataset updated
Jan 20, 2025
Authors
Ayush Singh
Description
Ayush-Singh/reward-bench-hacking-rewards-harmless-train-normal dataset hosted on Hugging Face and contributed by the HF Datasets community

reward-bench

RM Bench

allenai/reward-bench

268 scholarly articles cite this dataset (View in Google Scholar)
