34 datasets found
  1. openai-prm800k-stepwise-critic

    • huggingface.co
    Updated Oct 21, 2024
    Cite
    Alex Birch (2024). openai-prm800k-stepwise-critic [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-stepwise-critic
    Dataset updated
    Oct 21, 2024
    Authors
    Alex Birch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Birchlabs/openai-prm800k-stepwise-critic dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. PRM800K Dataset

    • paperswithcode.com
    • library.toponeai.link
    Cite
    Hunter Lightman; Vineet Kosaraju; Yura Burda; Harri Edwards; Bowen Baker; Teddy Lee; Jan Leike; John Schulman; Ilya Sutskever; Karl Cobbe, PRM800K Dataset [Dataset]. https://paperswithcode.com/dataset/prm800k
    Authors
    Hunter Lightman; Vineet Kosaraju; Yura Burda; Harri Edwards; Bowen Baker; Teddy Lee; Jan Leike; John Schulman; Ilya Sutskever; Karl Cobbe
    Description

    PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.

  3. prm800k

    • huggingface.co
    Updated Jan 22, 2023
    Cite
    Lee Wei (2023). prm800k [Dataset]. https://huggingface.co/datasets/Mai0313/prm800k
    Dataset updated
    Jan 22, 2023
    Authors
    Lee Wei
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    PRM800K: A Process Supervision Dataset

    This repository accompanies the paper Let's Verify Step by Step and presents the PRM800K dataset introduced there. PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset. More information on PRM800K and the project can be found in the paper. We are releasing the raw labels as well as the instructions we gave labelers during… See the full description on the dataset page: https://huggingface.co/datasets/Mai0313/prm800k.

  4. multilingual-PRM800K

    • huggingface.co
    Updated May 4, 2025
    Cite
    wang (2025). multilingual-PRM800K [Dataset]. https://huggingface.co/datasets/vicky23456/multilingual-PRM800K
    Dataset updated
    May 4, 2025
    Authors
    wang
    Description

    vicky23456/multilingual-PRM800K dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. prm800k

    • huggingface.co
    Updated Nov 27, 2024
    Cite
    TRL (2024). prm800k [Dataset]. https://huggingface.co/datasets/trl-lib/prm800k
    Dataset updated
    Nov 27, 2024
    Dataset authored and provided by
    TRL
    Description

    PRM800K Dataset

      Summary
    

    The PRM800K dataset is a processed version of OpenAI's PRM800K, designed to train models using the TRL library for stepwise supervision tasks. It contains 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset. This dataset enables models to learn and verify each step of a solution, enhancing their reasoning capabilities.

      Data Structure
    

    Format: Standard
    Type: Stepwise supervision… See the full description on the dataset page: https://huggingface.co/datasets/trl-lib/prm800k.

  6. prm800k

    • huggingface.co
    Updated Oct 16, 2024
    Cite
    nathan lile (2024). prm800k [Dataset]. https://huggingface.co/datasets/nlile/prm800k
    Dataset updated
    Oct 16, 2024
    Authors
    nathan lile
    Description

    nlile/prm800k dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. PRM800K

    • opendatalab.com
    zip
    Updated May 1, 2023
    Cite
    OpenAI (2023). PRM800K [Dataset]. https://opendatalab.com/OpenDataLab/PRM800K
    Available download formats: zip
    Dataset updated
    May 1, 2023
    Dataset provided by
    OpenAI (https://openai.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    PRM800K is a process supervision dataset containing 800,000 step-level correctness labels for model-generated solutions to problems from the MATH dataset.

  8. openai-prm800k-phase2_train-stepwise-critique

    • huggingface.co
    Updated Oct 20, 2023
    Cite
    Alex Birch (2023). openai-prm800k-phase2_train-stepwise-critique [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase2_train-stepwise-critique
    Dataset updated
    Oct 20, 2023
    Authors
    Alex Birch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Birchlabs/openai-prm800k-phase2_train-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. openai-prm800k-solutions-only

    • huggingface.co
    Updated Jul 12, 2023
    Cite
    Alex Birch (2023). openai-prm800k-solutions-only [Dataset]. https://huggingface.co/datasets/sl-alex/openai-prm800k-solutions-only
    Dataset updated
    Jul 12, 2023
    Authors
    Alex Birch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Denormalized dataset created by processing OpenAI's PRM800K process supervision dataset via prm800k-denorm. Dataset filtered to just conversations which terminated in successful solutions. All steps were deemed as exhibiting progress towards a solution. Dataset description and usage instructions in prm800k-denorm README.

  10. openai-prm800k-phase1_test-stepwise-critique

    • huggingface.co
    Updated Jun 19, 2023
    Cite
    Alex Birch (2023). openai-prm800k-phase1_test-stepwise-critique [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase1_test-stepwise-critique
    Dataset updated
    Jun 19, 2023
    Authors
    Alex Birch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Birchlabs/openai-prm800k-phase1_test-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. openai-prm800k-15k-stage2

    • huggingface.co
    Updated Oct 15, 2024
    Cite
    Gaetan Lopez (2024). openai-prm800k-15k-stage2 [Dataset]. https://huggingface.co/datasets/gaetanlop/openai-prm800k-15k-stage2
    Dataset updated
    Oct 15, 2024
    Authors
    Gaetan Lopez
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset card for gaetanlop/openai-prm800k-15k-stage2

    This dataset is intended for testing purposes. It is a 15k subset of OpenAI's PRM800K, formatted for use with the StepWiseRewardTrainer from the Hugging Face TRL library. Please cite the original dataset if you find it useful in your work.

  12. prm800k-IF-CELoss

    • huggingface.co
    Updated Mar 11, 2025
    Cite
    Simin Fan (2025). prm800k-IF-CELoss [Dataset]. https://huggingface.co/datasets/Olivia-umich/prm800k-IF-CELoss
    Dataset updated
    Mar 11, 2025
    Authors
    Simin Fan
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Olivia-umich/prm800k-IF-CELoss dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. openai-prm800k-phase2_test-stepwise-critique

    • huggingface.co
    Updated Nov 2, 2023
    Cite
    Alex Birch (2023). openai-prm800k-phase2_test-stepwise-critique [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase2_test-stepwise-critique
    Dataset updated
    Nov 2, 2023
    Authors
    Alex Birch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Birchlabs/openai-prm800k-phase2_test-stepwise-critique dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. prm800k-trl-dedup

    • huggingface.co
    Updated Jan 8, 2025
    Cite
    Hugging Face H4 (2025). prm800k-trl-dedup [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face H4
    Description

    Dataset description

    This dataset is a version of tasksource/PRM800K prepared to fine-tune a PRM model using TRL. To replicate the dataset creation, see the main.py script in the files directory. Following the guide to the data in the openai/prm800k repository, it contains, for each prompt, the list of completions, where each one was obtained by extracting the text from the list of steps; for cases with multiple alternative steps, a new completion was created. Sample row:… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceH4/prm800k-trl-dedup.

  15. openai-prm800k-phase1_train-stepwise-best

    • huggingface.co
    Updated Aug 18, 2023
    Cite
    Alex Birch (2023). openai-prm800k-phase1_train-stepwise-best [Dataset]. https://huggingface.co/datasets/Birchlabs/openai-prm800k-phase1_train-stepwise-best
    Dataset updated
    Aug 18, 2023
    Authors
    Alex Birch
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Birchlabs/openai-prm800k-phase1_train-stepwise-best dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. prm800k-parsed

    • huggingface.co
    Updated Dec 16, 2023
    Cite
    Southern university of science and technology (2023). prm800k-parsed [Dataset]. https://huggingface.co/datasets/SUSTech/prm800k-parsed
    Dataset updated
    Dec 16, 2023
    Dataset authored and provided by
    Southern university of science and technology
    Description

    Dataset Card for "prm800k-parsed"

    More Information needed

  17. prm800k-phase1

    • huggingface.co
    Updated Dec 20, 2024
    Cite
    Keyu Duan (2024). prm800k-phase1 [Dataset]. https://huggingface.co/datasets/vermouthdky/prm800k-phase1
    Dataset updated
    Dec 20, 2024
    Authors
    Keyu Duan
    Description

    vermouthdky/prm800k-phase1 dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. prm800k

    • huggingface.co
    Updated Jul 13, 2024
    Cite
    Eric (2024). prm800k [Dataset]. https://huggingface.co/datasets/ericzhao28/prm800k
    Dataset updated
    Jul 13, 2024
    Authors
    Eric
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    ericzhao28/prm800k dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions...

    • huggingface.co
    Updated Mar 14, 2025
    Cite
    Test Time Compute (2025). DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions [Dataset]. https://huggingface.co/datasets/ttc-research/DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions
    Dataset updated
    Mar 14, 2025
    Dataset authored and provided by
    Test Time Compute
    Description

    ttc-research/DeepSeek-R1-Distill-Qwen-1.5B-PRM-prm800k-Llama-3.2-1B-Instruct-best_of_n-completions dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. prm800k-chatml

    • huggingface.co
    Updated Jan 2, 2025
    Cite
    Diwank Tomer (2025). prm800k-chatml [Dataset]. https://huggingface.co/datasets/diwank/prm800k-chatml
    Dataset updated
    Jan 2, 2025
    Authors
    Diwank Tomer
    Description

    Dataset Card for "prm800k-chatml"

    More Information needed
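
Note on entry 14 (prm800k-trl-dedup): its description says each completion is built by extracting the text from the list of steps, with a new completion created wherever a step has multiple alternatives. The sketch below shows one plausible reading of that procedure, not the dataset's actual main.py. The record layout (`alternatives`, `chosen`, `text`, `rating`), the function name `build_completions`, and the rule that a positive rating maps to a `True` label are all assumptions made for illustration.

```python
from typing import Any


def build_completions(prompt: str, steps: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Expand one labeled solution into stepwise rows.

    Hypothetical input layout: each step is
    {"alternatives": [{"text": str, "rating": int}, ...], "chosen": int}.
    """
    rows = []
    prefix_texts: list[str] = []
    prefix_labels: list[bool] = []
    for step in steps:
        chosen = step["chosen"]
        # Every non-chosen alternative branches off into its own completion:
        # the steps chosen so far, followed by that alternative.
        for i, alt in enumerate(step["alternatives"]):
            if i == chosen:
                continue
            rows.append({
                "prompt": prompt,
                "completions": prefix_texts + [alt["text"]],
                "labels": prefix_labels + [alt["rating"] > 0],  # assumed labeling rule
            })
        # Extend the main path with the chosen alternative.
        picked = step["alternatives"][chosen]
        prefix_texts = prefix_texts + [picked["text"]]
        prefix_labels = prefix_labels + [picked["rating"] > 0]
    # The full chosen path is the final completion.
    rows.append({"prompt": prompt, "completions": prefix_texts, "labels": prefix_labels})
    return rows


example_steps = [
    {"alternatives": [{"text": "Add the numbers: 2 + 3 = 5.", "rating": 1},
                      {"text": "Multiply: 2 * 3 = 6.", "rating": -1}],
     "chosen": 0},
    {"alternatives": [{"text": "The answer is 5.", "rating": 1}], "chosen": 0},
]
example_rows = build_completions("What is 2 + 3?", example_steps)
```

Here the two-step example yields one branch row for the rejected first-step alternative and one row for the full chosen path. Consult the dataset page for the real row schema before relying on any of these field names.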
