34 datasets found

openai-prm800k-stepwise-critic
huggingface.co
Updated Oct 21, 2024
Cite
Alex Birch (2024). openai-prm800k-stepwise-critic [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-stepwise-critic
Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Oct 21, 2024
Authors
Alex Birch
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Birchlabs/openai-prm800k-stepwise-critic dataset hosted on Hugging Face and contributed by the HF Datasets community
PRM800K Dataset
paperswithcode.com
library.toponeai.link
Cite
Hunter Lightman; Vineet Kosaraju; Yura Burda; Harri Edwards; Bowen Baker; Teddy Lee; Jan Leike; John Schulman; Ilya Sutskever; Karl Cobbe, PRM800K Dataset [Dataset]. https://paperswithcode.com/dataset/prm800k
Authors
Hunter Lightman; Vineet Kosaraju; Yura Burda; Harri Edwards; Bowen Baker; Teddy Lee; Jan Leike; John Schulman; Ilya Sutskever; Karl Cobbe
Description
PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.
prm800k
huggingface.co
Updated Jan 22, 2023
Cite
Lee Wei (2023). prm800k [Dataset]. https://huggingface.co/datasets/Mai0313/prm800k
Explore at: Croissant
Dataset updated
Jan 22, 2023
Authors
Lee Wei
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
PRM800K: A Process Supervision Dataset

[Blog Post]

This repository accompanies the paper Let's Verify Step by Step and presents the PRM800K dataset introduced there. PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset. More information on PRM800K and the project can be found in the paper. We are releasing the raw labels as well as the instructions we gave labelers during… See the full description on the dataset page: https://huggingface.co/datasets/Mai0313/prm800k.
multilingual-PRM800K
huggingface.co
Updated May 4, 2025
Cite
wang (2025). multilingual-PRM800K [Dataset]. https://huggingface.co/datasets/vicky23456/multilingual-PRM800K
Explore at: Croissant
Dataset updated
May 4, 2025
Authors
wang
Description
vicky23456/multilingual-PRM800K dataset hosted on Hugging Face and contributed by the HF Datasets community
prm800k
huggingface.co
Updated Nov 27, 2024
Cite
TRL (2024). prm800k [Dataset]. https://huggingface.co/datasets/trl-lib/prm800k
Explore at: Croissant
Dataset updated
Nov 27, 2024
Dataset authored and provided by
TRL
Description
PRM800K Dataset

Summary

The PRM800K dataset is a processed version of OpenAI's PRM800K, designed to train models using the TRL library for stepwise supervision tasks. It contains 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset. This dataset enables models to learn and verify each step of a solution, enhancing their reasoning capabilities.

Data Structure

Format: Standard
Type: Stepwise supervision… See the full description on the dataset page: https://huggingface.co/datasets/trl-lib/prm800k.
prm800k
huggingface.co
Updated Oct 16, 2024
Cite
nathan lile (2024). prm800k [Dataset]. https://huggingface.co/datasets/nlile/prm800k
Dataset updated
Oct 16, 2024
Authors
nathan lile
Description
nlile/prm800k dataset hosted on Hugging Face and contributed by the HF Datasets community
PRM800K
opendatalab.com
zip
Updated May 1, 2023
Cite
OpenAI (2023). PRM800K [Dataset]. https://opendatalab.com/OpenDataLab/PRM800K
Available download formats: zip
Dataset updated
May 1, 2023
Dataset provided by
OpenAI (https://openai.com/)
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.
openai-prm800k-phase2_train-stepwise-critique
huggingface.co
Updated Oct 20, 2023
Cite
Alex Birch (2023). openai-prm800k-phase2_train-stepwise-critique [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase2_train-stepwise-critique
Explore at: Croissant
Dataset updated
Oct 20, 2023
Authors
Alex Birch
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Birchlabs/openai-prm800k-phase2_train-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community
openai-prm800k-solutions-only
huggingface.co
Updated Jul 12, 2023
Cite
Alex Birch (2023). openai-prm800k-solutions-only [Dataset]. https://huggingface.co/datasets/sl-alex/openai-prm800k-solutions-only
Explore at: Croissant
Dataset updated
Jul 12, 2023
Authors
Alex Birch
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Denormalized dataset created by processing OpenAI's PRM800K process supervision dataset via prm800k-denorm. The dataset is filtered to just those conversations which terminated in successful solutions. All steps were deemed to exhibit progress towards a solution. The dataset description and usage instructions are in the prm800k-denorm README.
openai-prm800k-phase1_test-stepwise-critique
huggingface.co
Updated Jun 19, 2023
Cite
Alex Birch (2023). openai-prm800k-phase1_test-stepwise-critique [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase1_test-stepwise-critique
Explore at: Croissant
Dataset updated
Jun 19, 2023
Authors
Alex Birch
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Birchlabs/openai-prm800k-phase1_test-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community
openai-prm800k-15k-stage2
huggingface.co
Updated Oct 15, 2024
Cite
Gaetan Lopez (2024). openai-prm800k-15k-stage2 [Dataset]. https://huggingface.co/datasets/gaetanlop/openai-prm800k-15k-stage2
Explore at: Croissant
Dataset updated
Oct 15, 2024
Authors
Gaetan Lopez
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset card for gaetanlop/openai-prm800k-15k-stage2

This dataset is intended for testing purposes. It is a 15k subset of OpenAI's PRM800K, formatted for use with the StepWiseRewardTrainer from the Hugging Face TRL library. Please cite the original dataset if you find it useful in your work.
prm800k-IF-CELoss
huggingface.co
Updated Mar 11, 2025
Cite
Simin Fan (2025). prm800k-IF-CELoss [Dataset]. https://huggingface.co/datasets/Olivia-umich/prm800k-IF-CELoss
Dataset updated
Mar 11, 2025
Authors
Simin Fan
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Olivia-umich/prm800k-IF-CELoss dataset hosted on Hugging Face and contributed by the HF Datasets community
openai-prm800k-phase2_test-stepwise-critique
huggingface.co
Updated Nov 2, 2023
Cite
Alex Birch (2023). openai-prm800k-phase2_test-stepwise-critique [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase2_test-stepwise-critique
Explore at: Croissant
Dataset updated
Nov 2, 2023
Authors
Alex Birch
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Birchlabs/openai-prm800k-phase2_test-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community
prm800k-trl-dedup
huggingface.co
Updated Jan 8, 2025
Cite
Hugging Face H4 (2025). prm800k-trl-dedup [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup
Explore at: Croissant
Dataset updated
Jan 8, 2025
Dataset provided by
Hugging Face (https://huggingface.co/)
Authors
Hugging Face H4
Description
Dataset description

This dataset is a version of tasksource/PRM800K prepared for fine-tuning a PRM using TRL. To replicate the dataset creation, see the main.py script in the files directory. Following the guide to the data in the openai/prm800k repository, it contains, for each prompt, the list of completions. Each completion was obtained by extracting the text from the list of steps; for cases with multiple alternative steps, a new completion was created. Sample row:… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup.
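The idea of creating a new completion for each alternative step can be illustrated with a minimal sketch. This is not the main.py script referenced in the description. The input layout (a list of steps, each with an "alternatives" list and a "chosen" index) and the function name expand_completions are hypothetical, chosen only for illustration.

```python
from typing import Dict, List


def expand_completions(steps: List[Dict]) -> List[List[str]]:
    """Build one completion per alternative step (hypothetical layout).

    Each step is assumed to look like:
        {"alternatives": ["text a", "text b"], "chosen": 0}
    A completion is the list of step texts leading up to and including
    one alternative. The chosen alternative continues the solution;
    every other alternative ends its own, shorter completion.
    """
    completions: List[List[str]] = []
    prefix: List[str] = []  # texts of the chosen steps so far
    for step in steps:
        alternatives = step["alternatives"]
        chosen = step["chosen"]
        for index, text in enumerate(alternatives):
            if index != chosen:
                # A non-chosen alternative yields a separate completion.
                completions.append(prefix + [text])
        prefix = prefix + [alternatives[chosen]]
    # The path through the chosen steps is itself a completion.
    completions.append(prefix)
    return completions


example_steps = [
    {"alternatives": ["Let x = 2."], "chosen": 0},
    {"alternatives": ["Then 2x = 4.", "Then 2x = 5."], "chosen": 0},
]
example_completions = expand_completions(example_steps)
```

Under these assumptions, example_completions holds two completions that share the first step and differ in the second.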
openai-prm800k-phase1_train-stepwise-best
huggingface.co
Updated Aug 18, 2023
Cite
Alex Birch (2023). openai-prm800k-phase1_train-stepwise-best [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase1_train-stepwise-best
Explore at: Croissant
Dataset updated
Aug 18, 2023
Authors
Alex Birch
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Birchlabs/openai-prm800k-phase1_train-stepwise-best dataset hosted on Hugging Face and contributed by the HF Datasets community
prm800k-parsed
huggingface.co
Updated Dec 16, 2023
Cite
Southern university of science and technology (2023). prm800k-parsed [Dataset]. https://huggingface.co/datasets/SUSTech/prm800k-parsed
Explore at: Croissant
Dataset updated
Dec 16, 2023
Dataset authored and provided by
Southern university of science and technology
Description
Dataset Card for "prm800k-parsed"

More Information needed
prm800k-phase1
huggingface.co
Updated Dec 20, 2024
Cite
Keyu Duan (2024). prm800k-phase1 [Dataset]. https://huggingface.co/datasets/vermouthdky/prm800k-phase1
Explore at: Croissant
Dataset updated
Dec 20, 2024
Authors
Keyu Duan
Description
vermouthdky/prm800k-phase1 dataset hosted on Hugging Face and contributed by the HF Datasets community
prm800k
huggingface.co
Updated Jul 13, 2024
Cite
Eric (2024). prm800k [Dataset]. https://huggingface.co/datasets/ericzhao28/prm800k
Explore at: Croissant
Dataset updated
Jul 13, 2024
Authors
Eric
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
ericzhao28/prm800k dataset hosted on Hugging Face and contributed by the HF Datasets community
DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions...
huggingface.co
Updated Mar 14, 2025
Cite
Test Time Compute (2025). DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions [Dataset]. https://huggingface.co/datasets/ttc-research/DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions
Dataset updated
Mar 14, 2025
Dataset authored and provided by
Test Time Compute
Description
ttc-research/DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions dataset hosted on Hugging Face and contributed by the HF Datasets community
prm800k-chatml
huggingface.co
Updated Jan 2, 2025
Cite
Diwank Tomer (2025). prm800k-chatml [Dataset]. https://huggingface.co/datasets/diwank/prm800k-chatml
Explore at: Croissant
Dataset updated
Jan 2, 2025
Authors
Diwank Tomer
Description
Dataset Card for "prm800k-chatml"

More Information needed