MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Birchlabs/openai-prm800k-stepwise-critic dataset hosted on Hugging Face and contributed by the HF Datasets community
PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.
MIT License: https://opensource.org/licenses/MIT
PRM800K: A Process Supervision Dataset
[Blog Post]
This repository accompanies the paper Let's Verify Step by Step and presents the PRM800K dataset introduced there. PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset. More information on PRM800K and the project can be found in the paper. We are releasing the raw labels as well as the instructions we gave labelers during… See the full description on the dataset page: https://huggingface.co/datasets/Mai0313/prm800k.
vicky23456/multilingual-PRM800K dataset hosted on Hugging Face and contributed by the HF Datasets community
PRM800K Dataset
Summary
The PRM800K dataset is a processed version of OpenAI's PRM800K, designed to train models using the TRL library for stepwise supervision tasks. It contains 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset. This dataset enables models to learn and verify each step of a solution, enhancing their reasoning capabilities.
Data Structure
Format: Standard
Type: Stepwise supervision… See the full description on the dataset page: https://huggingface.co/datasets/trl-lib/prm800k.
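As a rough illustration of what a stepwise supervision record can look like, the sketch below assumes each example carries a `prompt`, a list of step texts in `completions`, and one boolean per step in `labels`. These field names and the sample row are assumptions made for illustration; the dataset page is the authority on the actual schema. The helper names `validate_example` and `first_error_index` are hypothetical.

```python
from typing import List, Optional, TypedDict


class StepwiseExample(TypedDict):
    """Assumed shape of one stepwise supervision record (illustrative only)."""
    prompt: str
    completions: List[str]  # one entry per reasoning step
    labels: List[bool]      # True if the matching step was judged correct


def validate_example(example: StepwiseExample) -> None:
    """Raise ValueError if steps and labels do not line up one-to-one."""
    if len(example["completions"]) != len(example["labels"]):
        raise ValueError("each completion step needs exactly one label")


def first_error_index(example: StepwiseExample) -> Optional[int]:
    """Return the index of the first step labeled incorrect, or None."""
    for index, is_correct in enumerate(example["labels"]):
        if not is_correct:
            return index
    return None


# Hand-written sample row, not taken from the dataset.
example: StepwiseExample = {
    "prompt": "What is 2 + 3 * 4?",
    "completions": [
        "First compute 3 * 4 = 12.",
        "Then add 2 to get 2 + 12 = 15.",
        "The answer is 15.",
    ],
    "labels": [True, False, False],
}

validate_example(example)
print(first_error_index(example))
```

Keeping one label per step is what lets a reward model be trained on the position where a solution first goes wrong, rather than only on the final answer.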
nlile/prm800k dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.
MIT License: https://opensource.org/licenses/MIT
Birchlabs/openai-prm800k-phase2_train-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
Denormalized dataset created by processing OpenAI's PRM800K process supervision dataset via prm800k-denorm. The dataset is filtered to just the conversations that terminated in successful solutions. All steps were deemed to exhibit progress towards a solution. The dataset description and usage instructions are in the prm800k-denorm README.
MIT License: https://opensource.org/licenses/MIT
Birchlabs/openai-prm800k-phase1_test-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
Dataset card for gaetanlop/openai-prm800k-15k-stage2
This dataset is intended for testing purposes. It is a 15k subset of the OpenAI PRM800K dataset, formatted for use with the StepWiseRewardTrainer from the Hugging Face TRL library. Please cite the original dataset if you find it useful in your work.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
Olivia-umich/prm800k-IF-CELoss dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
Birchlabs/openai-prm800k-phase2_test-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset description
This dataset is a version of tasksource/PRM800K prepared to fine-tune a PRM model using TRL. To replicate the dataset creation, see the main.py script in the files directory. Following the guide to the data in the openai/prm800k repository, it contains, for each prompt, the list of completions. Each completion was obtained by extracting the text from the list of steps, and a new completion was created for cases with multiple alternative steps. Sample row:… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup.
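The expansion of alternative steps into separate completions can be sketched as follows. This is one plausible reading of the description, not the logic of the dataset's main.py script: the input layout (an `alternatives` list and a `chosen` index per step) and the function name `expand_completions` are hypothetical and are simplified from the raw PRM800K format.

```python
from typing import Dict, List


def expand_completions(steps: List[Dict]) -> List[List[str]]:
    """Turn a list of steps with alternatives into a list of completions.

    Each step is assumed to look like
    {"alternatives": ["text a", "text b"], "chosen": 0}  (hypothetical layout).
    Every non-chosen alternative yields its own completion made of the
    chosen steps so far plus that alternative; the fully chosen path is
    emitted last.
    """
    completions: List[List[str]] = []
    prefix: List[str] = []  # texts of the chosen steps seen so far
    for step in steps:
        for index, text in enumerate(step["alternatives"]):
            if index != step["chosen"]:
                completions.append(prefix + [text])
        prefix = prefix + [step["alternatives"][step["chosen"]]]
    completions.append(prefix)
    return completions


# Hand-written toy input, not taken from the dataset.
steps = [
    {"alternatives": ["Let x be the unknown."], "chosen": 0},
    {"alternatives": ["Then 2x = 10.", "Then 2x = 12."], "chosen": 0},
    {"alternatives": ["So x = 5."], "chosen": 0},
]

for completion in expand_completions(steps):
    print(completion)
```

With the toy input above, the second step has two alternatives, so the sketch emits two completions: one that stops at the rejected alternative and one that follows the chosen path to the end.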
MIT License: https://opensource.org/licenses/MIT
Birchlabs/openai-prm800k-phase1_train-stepwise-best dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "prm800k-parsed"
More Information needed
vermouthdky/prm800k-phase1 dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
ericzhao28/prm800k dataset hosted on Hugging Face and contributed by the HF Datasets community
ttc-research/DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "prm800k-chatml"
More Information needed