67 datasets found
  1. reward-bench

    • huggingface.co
    Updated Mar 25, 2024
    Cite
    Ai2 (2024). reward-bench [Dataset]. http://doi.org/10.57967/hf/2457
    Explore at:
    268 scholarly articles cite this dataset (View in Google Scholar)
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 25, 2024
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    https://choosealicense.com/licenses/odc-by/

    Description

    Code | Leaderboard | Prior Preference Sets | Results | Paper

      Reward Bench Evaluation Dataset Card
    

    The RewardBench evaluation dataset evaluates the capabilities of reward models over the following categories:

    Chat: Includes the easy chat subsets (alpacaeval-easy, alpacaeval-length, alpacaeval-hard, mt-bench-easy, mt-bench-medium)
    Chat Hard: Includes the hard chat subsets (mt-bench-hard, llmbar-natural, llmbar-adver-neighbor, llmbar-adver-GPTInst, llmbar-adver-GPTOut… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench.
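
    A minimal sketch of loading the benchmark with the Hugging Face datasets library; the "filtered" split and the "subset" field also appear in the generation snippet under entry 12 below, while the other column names in the comments are indicative:

    from datasets import load_dataset

    # Load the evaluation split used elsewhere in this listing ("filtered").
    ds = load_dataset("allenai/reward-bench", split="filtered")

    # Each row pairs a prompt with a chosen and a rejected completion, tagged by subset.
    print(ds.column_names)    # e.g. "prompt", "chosen", "rejected", "subset"
    print(ds[0]["subset"])    # e.g. "alpacaeval-easy"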

  2. RewardBench Dataset

    • paperswithcode.com
    Updated Apr 12, 2024
    Cite
    Nathan Lambert; Valentina Pyatkin; Jacob Morrison; LJ Miranda; Bill Yuchen Lin; Khyathi Chandu; Nouha Dziri; Sachin Kumar; Tom Zick; Yejin Choi; Noah A. Smith; Hannaneh Hajishirzi (2024). RewardBench Dataset [Dataset]. https://paperswithcode.com/dataset/rewardbench
    Explore at:
    Dataset updated
    Apr 12, 2024
    Authors
    Nathan Lambert; Valentina Pyatkin; Jacob Morrison; LJ Miranda; Bill Yuchen Lin; Khyathi Chandu; Nouha Dziri; Sachin Kumar; Tom Zick; Yejin Choi; Noah A. Smith; Hannaneh Hajishirzi
    Description

    RewardBench is a benchmark designed to evaluate the capabilities and safety of reward models, including those trained with Direct Preference Optimization (DPO). It serves as the first evaluation tool for reward models and provides valuable insights into their performance and reliability¹.

    Here are the key components of RewardBench:

    Common Inference Code: The repository includes common inference code for various reward models, such as Starling, PairRM, OpenAssistant, and more. These models can be evaluated using the provided tools¹.

    Dataset and Evaluation: The RewardBench dataset consists of prompt-win-lose trios spanning chat, reasoning, and safety scenarios. It allows benchmarking reward models on challenging, structured, and out-of-distribution queries. The goal is to enhance scientific understanding of reward models and their behavior².

    Scripts for Evaluation:

    scripts/run_rm.py: Used to evaluate individual reward models.
    scripts/run_dpo.py: Used to evaluate direct preference optimization (DPO) models.
    scripts/train_rm.py: A basic reward model training script built on TRL (Transformer Reinforcement Learning)¹.

    Installation and Usage:

    Install PyTorch on your system.
    Install the required dependencies using pip install -e .
    Set the environment variable HF_TOKEN with your token.
    To contribute your model to the leaderboard, open an issue on HuggingFace with the model name.
    For local model evaluation, follow the instructions in the repository¹.
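
    A minimal sketch of that local workflow in Python, assuming the repository is cloned and is the working directory; the script's --model argument is an assumption here, so check the repository README for the exact interface:

    import os
    import subprocess

    # Token for gated models and leaderboard tooling (replace with your own).
    os.environ["HF_TOKEN"] = "hf_your_token_here"

    # Editable install of the repository's dependencies, as described above.
    subprocess.run(["pip", "install", "-e", "."], check=True)

    # Evaluate one reward model locally (arguments are illustrative assumptions).
    subprocess.run(
        ["python", "scripts/run_rm.py", "--model=OpenAssistant/reward-model-deberta-v3-large-v2"],
        check=True,
    )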

    Remember that RewardBench provides a standardized way to assess reward models, ensuring transparency and comparability across different approaches. 🌟🔍

    (1) GitHub - allenai/reward-bench: RewardBench: the first evaluation tool .... https://github.com/allenai/reward-bench.
    (2) RewardBench: Evaluating Reward Models for Language Modeling. https://arxiv.org/abs/2403.13787.
    (3) RewardBench: Evaluating Reward Models for Language Modeling. https://paperswithcode.com/paper/rewardbench-evaluating-reward-models-for.

  3. reward-bench-2

    • huggingface.co
    Updated Jun 3, 2025
    Cite
    Ai2 (2025). reward-bench-2 [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-2
    Explore at:
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    https://choosealicense.com/licenses/odc-by/

    Description

    Code | Leaderboard | Results | Paper

      RewardBench 2 Evaluation Dataset Card
    

    The RewardBench 2 evaluation dataset is the new version of RewardBench, built from unseen human data and designed to be substantially more difficult. RewardBench 2 evaluates the capabilities of reward models over the following categories:

    Factuality (NEW!): Tests the ability of RMs to detect hallucinations and other basic errors in completions.
    Precise Instruction Following (NEW!): Tests the ability of RMs… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-2.

  4. reward-bench-results

    • huggingface.co
    Updated Apr 30, 2025
    Cite
    Ai2 (2025). reward-bench-results [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-results
    Explore at:
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    Description

    Results for Holistic Evaluation of Reward Models (HERM) Benchmark

    Here, you'll find the raw scores for the HERM project. The repository is structured as follows.

    ├── best-of-n/                <- Nested directory for different completions on Best of N challenge
    |   ├── alpaca_eval/          <- results for each reward model
    |   |   ├── tulu-13b/{org}/{model}.json
    |   |   └── zephyr-7b/{org}/{model}.json
    |   └── mt_bench/
    |… See the full description on the dataset page: https://huggingface.co/datasets/allenai/reward-bench-results.
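
    A minimal sketch of fetching one raw score file with huggingface_hub; "some-org/some-model" is a hypothetical stand-in for the {org}/{model} placeholders above:

    import json
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="allenai/reward-bench-results",
        repo_type="dataset",
        filename="best-of-n/alpaca_eval/tulu-13b/some-org/some-model.json",  # stand-in path
    )
    with open(path) as f:
        scores = json.load(f)  # raw scores for that reward model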

  5. multilingual-reward-bench

    • huggingface.co
    Updated May 15, 2025
    Cite
    Cohere Labs Community (2025). multilingual-reward-bench [Dataset]. http://doi.org/10.57967/hf/3352
    Explore at:
    Dataset updated
    May 15, 2025
    Dataset authored and provided by
    Cohere Labs Community
    License

    https://choosealicense.com/licenses/odc-by/

    Description

    Multilingual Reward Bench (v1.0)

    Reward models (RMs) have driven the development of state-of-the-art LLMs today, with unprecedented impact across the globe. However, their performance in multilingual settings remains understudied. To probe reward model behavior on multilingual data, we present M-RewardBench, a benchmark for 23 typologically diverse languages. M-RewardBench contains prompt-chosen-rejected preference triples obtained by curating and translating chat… See the full description on the dataset page: https://huggingface.co/datasets/CohereLabsCommunity/multilingual-reward-bench.
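
    A sketch of the usual pairwise metric over such preference triples: count how often a reward model scores the chosen response above the rejected one. The split, column names, and the trivial scorer below are assumptions rather than the benchmark's documented interface:

    from datasets import load_dataset

    def reward_fn(prompt: str, completion: str) -> float:
        # Trivial stand-in scorer; replace with a real reward model.
        return float(len(completion))

    # A language config and/or split may be required; see the dataset page.
    ds = load_dataset("CohereLabsCommunity/multilingual-reward-bench", split="train")

    wins = sum(
        reward_fn(row["prompt"], row["chosen"]) > reward_fn(row["prompt"], row["rejected"])
        for row in ds  # column names are assumptions
    )
    print(wins / len(ds))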

  6. reward-bench-2-results

    • huggingface.co
    Updated Jun 3, 2025
    Cite
    Ai2 (2025). reward-bench-2-results [Dataset]. https://huggingface.co/datasets/allenai/reward-bench-2-results
    Explore at:
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    Description

    allenai/reward-bench-2-results dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. agent-reward-bench

    • huggingface.co
    Updated Apr 15, 2025
    Cite
    McGill NLP Group (2025). agent-reward-bench [Dataset]. https://huggingface.co/datasets/McGill-NLP/agent-reward-bench
    Explore at:
    Dataset updated
    Apr 15, 2025
    Dataset authored and provided by
    McGill NLP Group
    Description

    AgentRewardBench

    💾Code 📄Paper 🌐Website

    🤗Dataset 💻Demo 🏆Leaderboard

    AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
    Xing Han Lù, Amirhossein Kazemnejad*, Nicholas Meade, Arkil Patel, Dongchan Shin, Alejandra Zambrano, Karolina Stańczak, Peter Shaw, Christopher J. Pal, Siva Reddy*
    *Core Contributor

      Loading dataset
    

    You can use the huggingface_hub library to load the dataset. The dataset is available on Huggingface Hub at… See the full description on the dataset page: https://huggingface.co/datasets/McGill-NLP/agent-reward-bench.
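
    Following the loading note above, a minimal sketch with the huggingface_hub library (this downloads the dataset repository; the file layout is described on the dataset page):

    from huggingface_hub import snapshot_download

    # Download a local copy of the dataset repository and print its location.
    local_dir = snapshot_download(repo_id="McGill-NLP/agent-reward-bench", repo_type="dataset")
    print(local_dir)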

  8. MM-RLHF

    • huggingface.co
    Updated Feb 17, 2025
    + more versions
    Cite
    Yi-Fan Zhang (2025). MM-RLHF [Dataset]. https://huggingface.co/datasets/yifanzhang114/MM-RLHF
    Explore at:
    Dataset updated
    Feb 17, 2025
    Authors
    Yi-Fan Zhang
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    [📖 arXiv Paper] [📊 Training Code] [📝 Homepage] [🏆 Reward Model] [🔮 MM-RewardBench] [🔮 MM-SafetyBench] [📈 Evaluation Suite]

      The Next Step Forward in Multimodal LLM Alignment
    

    [2025/02/10] 🔥 We are proud to open-source MM-RLHF, a comprehensive project for aligning Multimodal Large Language Models (MLLMs) with human preferences. This release includes:

    A high-quality MLLM alignment dataset.
    A strong Critique-Based MLLM reward model and its training algorithm.
    A novel… See the full description on the dataset page: https://huggingface.co/datasets/yifanzhang114/MM-RLHF.

  9. VL-RewardBench Dataset

    • paperswithcode.com
    Updated May 23, 2025
    Cite
    Lei LI; Yuancheng Wei; Zhihui Xie; Xuqing Yang; YiFan Song; Peiyi Wang; Chenxin An; Tianyu Liu; Sujian Li; Bill Yuchen Lin; Lingpeng Kong; Qi Liu (2025). VL-RewardBench Dataset [Dataset]. https://paperswithcode.com/dataset/vl-rewardbench
    Explore at:
    Dataset updated
    May 23, 2025
    Authors
    Lei LI; Yuancheng Wei; Zhihui Xie; Xuqing Yang; YiFan Song; Peiyi Wang; Chenxin An; Tianyu Liu; Sujian Li; Bill Yuchen Lin; Lingpeng Kong; Qi Liu
    Description

    Vision-language generative reward models (VL-GenRMs) play a crucial role in aligning and evaluating multimodal AI systems, yet their own evaluation remains under-explored. Current assessment methods primarily rely on AI-annotated preference labels from traditional VL tasks, which can introduce biases and often fail to effectively challenge state-of-the-art models. To address these limitations, we introduce VL-RewardBench, a comprehensive benchmark spanning general multimodal queries, visual hallucination detection, and complex reasoning tasks. Through our AI-assisted annotation pipeline combining sample selection with human verification, we curate 1,250 high-quality examples specifically designed to probe model limitations. Comprehensive evaluation across 16 leading large vision-language models demonstrates VL-RewardBench's effectiveness as a challenging testbed, where even GPT-4o achieves only 65.4% accuracy, and state-of-the-art open-source models such as Qwen2-VL-72B struggle to surpass random guessing. Importantly, performance on VL-RewardBench strongly correlates (Pearson's r > 0.9) with MMMU-Pro accuracy using Best-of-N sampling with VL-GenRMs. Analysis experiments uncover three critical insights for improving VL-GenRMs: (i) models predominantly fail at basic visual perception tasks rather than reasoning tasks; (ii) inference-time scaling benefits vary dramatically by model capacity; and (iii) training VL-GenRMs to learn to judge substantially boosts judgment capability (+14.7% accuracy for a 7B VL-GenRM). We believe VL-RewardBench, along with the experimental insights, will become a valuable resource for advancing VL-GenRMs.
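
    The Best-of-N correlation above relies on the standard selection rule: sample N candidate responses, score each with the reward model, and keep the top-scoring one. A minimal sketch with a stand-in scorer:

    from typing import Callable, List

    def best_of_n(prompt: str, candidates: List[str],
                  score: Callable[[str, str], float]) -> str:
        """Return the candidate the reward model scores highest for this prompt."""
        return max(candidates, key=lambda c: score(prompt, c))

    # Usage with a trivial length-based scorer (replace with a real VL-GenRM).
    picked = best_of_n("Describe the image.",
                       ["a cat", "a cat sitting on a mat"],
                       lambda p, c: float(len(c)))
    print(picked)  # -> "a cat sitting on a mat"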

  10. R3-eval-reward-bench

    • huggingface.co
    Updated May 21, 2025
    Cite
    rubricreward (2025). R3-eval-reward-bench [Dataset]. https://huggingface.co/datasets/rubricreward/R3-eval-reward-bench
    Explore at:
    Dataset updated
    May 21, 2025
    Dataset authored and provided by
    rubricreward
    Description

    rubricreward/R3-eval-reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. fc-reward-bench

    • huggingface.co
    Cite
    IBM Research, fc-reward-bench [Dataset]. https://huggingface.co/datasets/ibm-research/fc-reward-bench
    Explore at:
    Dataset provided by
    IBM Research
    IBM (http://ibm.com/)
    Authors
    IBM Research
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    fc-reward-bench

    fc-reward-bench is a benchmark designed to evaluate reward model performance in function-calling tasks. It features 1,500 unique user inputs derived from the single-turn splits of the BFCL-v3 dataset. Each input is paired with both correct and incorrect function calls. Correct calls are sourced directly from BFCL, while incorrect calls are generated by 25 permissively licensed models.

      Dataset Structure
    

    Each entry in the dataset includes the following… See the full description on the dataset page: https://huggingface.co/datasets/ibm-research/fc-reward-bench.
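
    Because each input pairs a correct call with incorrect ones, the natural metric is pairwise accuracy: how often a reward model ranks the correct call higher. A sketch with assumed split and field names (the actual schema is on the dataset page):

    from datasets import load_dataset

    def reward_fn(user_input: str, function_call: str) -> float:
        # Trivial stand-in scorer; replace with a real function-calling reward model.
        return float(len(function_call))

    ds = load_dataset("ibm-research/fc-reward-bench", split="train")  # split name assumed

    correct = 0
    for row in ds:
        # "input", "correct_call", and "incorrect_call" are assumed field names.
        if reward_fn(row["input"], row["correct_call"]) > reward_fn(row["input"], row["incorrect_call"]):
            correct += 1
    print(correct / len(ds))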

  12. reward-bench-critique-alpacaeval-easy

    • huggingface.co
    Updated Apr 15, 2024
    Cite
    distilabel-internal-testing (2024). reward-bench-critique-alpacaeval-easy [Dataset]. https://huggingface.co/datasets/distilabel-internal-testing/reward-bench-critique-alpacaeval-easy
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 15, 2024
    Dataset authored and provided by
    distilabel-internal-testing
    Description

    This dataset is a small subset of allenai/reward-bench to test with our critique models. It was generated in the following way:

    from datasets import Dataset
    import pandas as pd
    from datasets import load_dataset

    ds = load_dataset("allenai/reward-bench", split="filtered")

    data = []
    for row in ds.filter(lambda x: x["subset"] == "alpacaeval-easy"):
        for response in ["chosen", "rejected"]:
            model, is_chosen = (row["chosen_model"], True) if response == "chosen"…

    See the full description on the dataset page: https://huggingface.co/datasets/distilabel-internal-testing/reward-bench-critique-alpacaeval-easy.

  13. reward-bench-2-converted

    • huggingface.co
    Updated Jun 19, 2025
    + more versions
    Cite
    john02171574 (2025). reward-bench-2-converted [Dataset]. https://huggingface.co/datasets/john02171574/reward-bench-2-converted
    Explore at:
    Dataset updated
    Jun 19, 2025
    Authors
    john02171574
    Description

    john02171574/reward-bench-2-converted dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. reward-bench-chat-rewritten

    • huggingface.co
    Updated Dec 2, 2024
    + more versions
    Cite
    Haoxiang Wang (2024). reward-bench-chat-rewritten [Dataset]. https://huggingface.co/datasets/Haoxiang-Wang/reward-bench-chat-rewritten
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 2, 2024
    Authors
    Haoxiang Wang
    Description

    Haoxiang-Wang/reward-bench-chat-rewritten dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. allenai-reward-bench

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    Nour Guermazi (2025). allenai-reward-bench [Dataset]. https://huggingface.co/datasets/nourguermazi/allenai-reward-bench
    Explore at:
    Dataset updated
    Jun 12, 2025
    Authors
    Nour Guermazi
    Description

    nourguermazi/allenai-reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. VL-RewardBench

    • huggingface.co
    Updated Nov 29, 2024
    Cite
    Multi-modal Multilingual Instruction (2024). VL-RewardBench [Dataset]. https://huggingface.co/datasets/MMInstruction/VL-RewardBench
    Explore at:
    Dataset updated
    Nov 29, 2024
    Dataset authored and provided by
    Multi-modal Multilingual Instruction
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for VLRewardBench

    Project Page: https://vl-rewardbench.github.io

      Dataset Summary
    

    VLRewardBench is a comprehensive benchmark designed to evaluate vision-language generative reward models (VL-GenRMs) across visual perception, hallucination detection, and reasoning tasks. The benchmark contains 1,250 high-quality examples specifically curated to probe model limitations.

      Dataset Structure
    

    Each instance consists of multimodal queries spanning three key… See the full description on the dataset page: https://huggingface.co/datasets/MMInstruction/VL-RewardBench.

  17. reward-bench-reasoning

    • huggingface.co
    Updated Jul 31, 2024
    Cite
    Sylvia Chen (2024). reward-bench-reasoning [Dataset]. https://huggingface.co/datasets/hsicat/reward-bench-reasoning
    Explore at:
    Dataset updated
    Jul 31, 2024
    Authors
    Sylvia Chen
    Description

    This evaluation dataset is the reasoning subset from allenai/reward-bench.

  18. reward-bench

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    Jingxuan Sun (2025). reward-bench [Dataset]. https://huggingface.co/datasets/PirxTion/reward-bench
    Explore at:
    Dataset updated
    Jun 12, 2025
    Authors
    Jingxuan Sun
    Description

    PirxTion/reward-bench dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. reward-bench-hacking-rewards-harmless-train-normal

    • huggingface.co
    Updated Jan 20, 2025
    Cite
    Ayush Singh (2025). reward-bench-hacking-rewards-harmless-train-normal [Dataset]. https://huggingface.co/datasets/Ayush-Singh/reward-bench-hacking-rewards-harmless-train-normal
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 20, 2025
    Authors
    Ayush Singh
    Description

    Ayush-Singh/reward-bench-hacking-rewards-harmless-train-normal dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. reward-bench-pythia-1.4b-set3-scores

    • huggingface.co
    + more versions
    Cite
    Ayush Singh, reward-bench-pythia-1.4b-set3-scores [Dataset]. https://huggingface.co/datasets/Ayush-Singh/reward-bench-pythia-1.4b-set3-scores
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Authors
    Ayush Singh
    Description

    Ayush-Singh/reward-bench-pythia-1.4b-set3-scores dataset hosted on Hugging Face and contributed by the HF Datasets community
