18 datasets found
  1.

    NuminaMath-1.5-RL-Verifiable

    • huggingface.co
    Updated Mar 25, 2025
    Cite
    nathan lile (2025). NuminaMath-1.5-RL-Verifiable [Dataset]. https://huggingface.co/datasets/nlile/NuminaMath-1.5-RL-Verifiable
    Dataset updated
    Mar 25, 2025
    Authors
    nathan lile
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for NuminaMath-1.5-RL-Verifiable

      Dataset Summary
    

    NuminaMath-1.5-RL-Verifiable is a curated subset of the NuminaMath-1.5 dataset, specifically filtered to support reinforcement learning applications requiring verifiable outcomes. This collection consists of 131,063 math word problems from the original dataset that meet strict filtering criteria: all problems have definitive numerical answers, validated problem statements and solutions, and come from… See the full description on the dataset page: https://huggingface.co/datasets/nlile/NuminaMath-1.5-RL-Verifiable.

  2.

    NuminaMath-1.5-proofs-only

    • huggingface.co
    Updated Jun 26, 2025
    Cite
    nathan lile (2025). NuminaMath-1.5-proofs-only [Dataset]. https://huggingface.co/datasets/nlile/NuminaMath-1.5-proofs-only
    Dataset updated
    Jun 26, 2025
    Authors
    nathan lile
    Description

    NuminaMath-1.5 Proofs Only

    This is a filtered subset of the AI-MO/NuminaMath-1.5 dataset containing only proof problems.

      Dataset Information
    

    Total Problems: 110,998
    Filter Criteria: question_type == 'proof'
    Original Dataset: AI-MO/NuminaMath-1.5
    License: CC BY-NC 4.0
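    The filter criterion above can be sketched in a few lines. This is only an illustration under stated assumptions: the sample rows, their `problem` text, and the non-proof `question_type` value are hypothetical, and the helper name `select_proofs` is not part of the dataset or any library. Only the `question_type == 'proof'` condition comes from the card.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def select_proofs(rows: Iterable[Row]) -> List[Row]:
    """Keep only rows whose question_type is exactly 'proof'."""
    return [row for row in rows if row.get("question_type") == "proof"]


# Hypothetical rows standing in for records of the original dataset.
sample_rows: List[Row] = [
    {"problem": "Prove that the sum of two even integers is even.",
     "question_type": "proof"},
    {"problem": "Compute 12 * 7.",
     "question_type": "math-word-problem"},  # illustrative non-proof label
    {"problem": "Show that sqrt(2) is irrational.",
     "question_type": "proof"},
]

proofs = select_proofs(sample_rows)
print(len(proofs), "of", len(sample_rows), "rows kept")
```

    Rows without a `question_type` key are dropped rather than raising an error, which is a design choice of this sketch and not something the card specifies.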

      Usage
    

    This dataset contains high-quality proof problems from various mathematical competitions and sources, formatted in a Chain of Thought (CoT) manner.

      Source Breakdown
    

    The proof… See the full description on the dataset page: https://huggingface.co/datasets/nlile/NuminaMath-1.5-proofs-only.

  3.

    NuminaMath-1.5-EFA-Subset

    • huggingface.co
    Updated Apr 15, 2025
    Cite
    Zaid Khan (2025). NuminaMath-1.5-EFA-Subset [Dataset]. https://huggingface.co/datasets/codezakh/NuminaMath-1.5-EFA-Subset
    Dataset updated
    Apr 15, 2025
    Authors
    Zaid Khan
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains EFAs inferred for a subset of NuminaMath_CoT, specifically the first 5,000 problems. These EFAs were inferred by this model, and the prompts used for training are linked in the model card. The dataset contains multiple EFA candidates for most of the first 5,000 problems in NuminaMath.
    Each row in the dataset is described by the Row class below:

    from pydantic import BaseModel

    class ProblemVariant(BaseModel): """Synthetic problem variants constructed by… See the full description on the dataset page: https://huggingface.co/datasets/codezakh/NuminaMath-1.5-EFA-Subset.

  4.

    collect-data-NuminaMath-1.5-v2

    • huggingface.co
    Updated Feb 20, 2025
    Cite
    Ruikang Zhao (2025). collect-data-NuminaMath-1.5-v2 [Dataset]. https://huggingface.co/datasets/laolaorkk/collect-data-NuminaMath-1.5-v2
    Dataset updated
    Feb 20, 2025
    Authors
    Ruikang Zhao
    Description

    laolaorkk/collect-data-NuminaMath-1.5-v2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  5.

    NuminaMath-1.5-550k

    • huggingface.co
    Updated Mar 1, 2025
    Cite
    Junnan Liu (2025). NuminaMath-1.5-550k [Dataset]. https://huggingface.co/datasets/jnanliu/NuminaMath-1.5-550k
    Dataset updated
    Mar 1, 2025
    Authors
    Junnan Liu
    Description

    jnanliu/NuminaMath-1.5-550k dataset hosted on Hugging Face and contributed by the HF Datasets community

  6.

    NuminaMath-TIR

    • huggingface.co
    • mooodflaresvariety.store
    Updated Jul 19, 2024
    Cite
    Project-Numina (2024). NuminaMath-TIR [Dataset]. https://huggingface.co/datasets/AI-MO/NuminaMath-TIR
    Dataset updated
    Jul 19, 2024
    Dataset authored and provided by
    Project-Numina
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for NuminaMath CoT

      Dataset Summary
    

    Tool-integrated reasoning (TIR) plays a crucial role in this competition. However, collecting and annotating such data is both costly and time-consuming. To address this, we selected approximately 70k problems from the NuminaMath-CoT dataset, focusing on those with numerical outputs, most of which are integers. We then utilized a pipeline leveraging GPT-4 to generate TORA-like reasoning paths, executing the code and… See the full description on the dataset page: https://huggingface.co/datasets/AI-MO/NuminaMath-TIR.

  7.

    OpenR1-Math-220k-NuminaMath-1.5-Big-Math-RL-Verified-Cleaned

    • huggingface.co
    Updated Feb 12, 2025
    Cite
    Daffa Hanif Padantya (2025). OpenR1-Math-220k-NuminaMath-1.5-Big-Math-RL-Verified-Cleaned [Dataset]. https://huggingface.co/datasets/daffapadantya/OpenR1-Math-220k-NuminaMath-1.5-Big-Math-RL-Verified-Cleaned
    Dataset updated
    Feb 12, 2025
    Authors
    Daffa Hanif Padantya
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    daffapadantya/OpenR1-Math-220k-NuminaMath-1.5-Big-Math-RL-Verified-Cleaned dataset hosted on Hugging Face and contributed by the HF Datasets community

  8.

    NuminaMath-1.5-hard

    • huggingface.co
    Cite
    chenggong, NuminaMath-1.5-hard [Dataset]. https://huggingface.co/datasets/chenggong1995/NuminaMath-1.5-hard
    Authors
    chenggong
    Description

    chenggong1995/NuminaMath-1.5-hard dataset hosted on Hugging Face and contributed by the HF Datasets community

  9.

    NuminaMath-1.5-RL-Verifiable_Qwen3-8B_zero_solve

    • huggingface.co
    Updated Jul 1, 2025
    Cite
    Wenting Zhao (2025). NuminaMath-1.5-RL-Verifiable_Qwen3-8B_zero_solve [Dataset]. https://huggingface.co/datasets/wentingzhao/NuminaMath-1.5-RL-Verifiable_Qwen3-8B_zero_solve
    Dataset updated
    Jul 1, 2025
    Authors
    Wenting Zhao
    Description

    wentingzhao/NuminaMath-1.5-RL-Verifiable_Qwen3-8B_zero_solve dataset hosted on Hugging Face and contributed by the HF Datasets community

  10.

    numina-number-theory-proofstep

    • huggingface.co
    Updated Jun 17, 2025
    Cite
    David (2025). numina-number-theory-proofstep [Dataset]. https://huggingface.co/datasets/dlauran/numina-number-theory-proofstep
    Dataset updated
    Jun 17, 2025
    Authors
    David
    Description
  11.

    AI-MO-NuminaMath-TIR-korean-240918

    • huggingface.co
    Updated Dec 5, 2024
    Cite
    ChuGyouk (2024). AI-MO-NuminaMath-TIR-korean-240918 [Dataset]. https://huggingface.co/datasets/ChuGyouk/AI-MO-NuminaMath-TIR-korean-240918
    Dataset updated
    Dec 5, 2024
    Authors
    ChuGyouk
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    IMPORTANT NOTE

    This dataset is a work in progress. Current translation progress: 24.85% (2024-09-18 01:32 KST). I'm taking a short break for personal reasons and will be back in a month.

      TODO-LIST
    

    Finish translation

      Translation
    

    I used gemini-1.5-pro-exp-0827. The prompt used for translation will be disclosed at the end.

      Dataset Card for NuminaMath CoT
    
    
    
    
    
      Dataset Summary
    

    Tool-integrated reasoning (TIR) plays a crucial role in this… See the full description on the dataset page: https://huggingface.co/datasets/ChuGyouk/AI-MO-NuminaMath-TIR-korean-240918.

  12.

    math_reasoning

    • huggingface.co
    Updated Mar 30, 2025
    Cite
    Notbad AI (2025). math_reasoning [Dataset]. https://huggingface.co/datasets/notbadai/math_reasoning
    Dataset updated
    Mar 30, 2025
    Dataset authored and provided by
    Notbad AI
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This is the math reasoning dataset used to train the Notbad v1.0 Mistral 24B reasoning model. The reasoning data were sampled from an RL-based self-improved Mistral-Small-24B-Instruct-2501 model. The questions were sourced from:

    • NuminaMath 1.5
    • GSM8k Training Set
    • MATH Training Set

    You can try Notbad v1.0 Mistral 24B on chat.labml.ai.

  13.

    OpenR1-Math-220k

    • huggingface.co
    Updated Feb 12, 2025
    Cite
    Jorge Alonso (2025). OpenR1-Math-220k [Dataset]. https://huggingface.co/datasets/oieieio/OpenR1-Math-220k
    Dataset updated
    Feb 12, 2025
    Authors
    Jorge Alonso
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    OpenR1-Math-220k

      Dataset description
    

    OpenR1-Math-220k is a large-scale dataset for mathematical reasoning. It consists of 220k math problems with two to four reasoning traces generated by DeepSeek R1 for problems from NuminaMath 1.5. The traces were verified using Math Verify for most samples and Llama-3.3-70B-Instruct as a judge for 12% of the samples, and each problem contains at least one reasoning trace with a correct answer. The dataset consists of two splits:… See the full description on the dataset page: https://huggingface.co/datasets/oieieio/OpenR1-Math-220k.

  14.

    math-sft

    • huggingface.co
    Updated May 11, 2025
    Cite
    Duong Hoang Le (2025). math-sft [Dataset]. https://huggingface.co/datasets/lehduong/math-sft
    Dataset updated
    May 11, 2025
    Authors
    Duong Hoang Le
    Description

    concat: OpenMathInstruct-2, OpenMathReasoning, AceMath, OpenR1-Math, Numinamath-CoT, Numinamath 1.5, OpenThoughts2-1M, MetaMathQA, Maths-College

  15.

    OpenR1-Math-Raw

    • huggingface.co
    Updated Mar 6, 2025
    Cite
    Open R1 (2025). OpenR1-Math-Raw [Dataset]. https://huggingface.co/datasets/open-r1/OpenR1-Math-Raw
    Dataset updated
    Mar 6, 2025
    Dataset authored and provided by
    Open R1
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    OpenR1-Math-Raw

      Dataset description
    

    OpenR1-Math-Raw is a large-scale dataset for mathematical reasoning. It consists of 516k math problems sourced from AI-MO/NuminaMath-1.5 with 1 to 8 reasoning traces generated by DeepSeek R1. The traces were verified using Math Verify and an LLM-as-Judge-based verifier (Llama-3.3-70B-Instruct). The dataset contains:

    • 516,499 problems
    • 1,209,403 R1-generated solutions, with 2.3 solutions per problem on average
    • re-parsed answers… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenR1-Math-Raw.

  16.

    zip2zip-1B-no-split

    • huggingface.co
    Cite
    EPFL Data Science Lab, zip2zip-1B-no-split [Dataset]. https://huggingface.co/datasets/epfl-dlab/zip2zip-1B-no-split
    Dataset authored and provided by
    EPFL Data Science Lab
    Description

    • HuggingFaceFW/fineweb-edu (20%) (common knowledge)
    • devngho/the-stack-llm-annotations-v2 (25%) (code)
    • AI-MO/NuminaMath-1.5 (20%) (math)
    • HuggingFaceH4/ultrachat_200k (20%) (chat)
    • HuggingFaceFW/fineweb-2 (15%) (multilingual: [cmn_Hani, deu_Latn, jpn_Jpan, spa_Latn, fra_Latn, ita_Latn, por_Latn, nld_Latn, arb_Arab])
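    As a rough illustration of how the percentages in this mixture could be turned into per-source budgets, here is a minimal sketch. The percentages are taken from the description; the total of 1,000,000 units, the helper name `allocate`, and the use of integer division are assumptions made for the example and are not stated by the dataset authors.

```python
from typing import Dict

# Mixture percentages as listed in the description.
MIXTURE: Dict[str, int] = {
    "HuggingFaceFW/fineweb-edu": 20,
    "devngho/the-stack-llm-annotations-v2": 25,
    "AI-MO/NuminaMath-1.5": 20,
    "HuggingFaceH4/ultrachat_200k": 20,
    "HuggingFaceFW/fineweb-2": 15,
}


def allocate(total: int, mixture: Dict[str, int]) -> Dict[str, int]:
    """Split `total` units across sources according to integer percentages."""
    if sum(mixture.values()) != 100:
        raise ValueError("mixture percentages must sum to 100")
    return {name: total * pct // 100 for name, pct in mixture.items()}


# Hypothetical total; the real size and unit of the dataset are not given here.
budget = allocate(1_000_000, MIXTURE)
for name, units in budget.items():
    print(f"{name}: {units}")
```

    The sum check guards against a typo in the table; with the listed values the five shares add up to exactly 100%.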

  17.

    OpenR1-Math-220k

    • huggingface.co
    Updated Feb 12, 2025
    Cite
    Open R1 (2025). OpenR1-Math-220k [Dataset]. https://huggingface.co/datasets/open-r1/OpenR1-Math-220k
    Dataset updated
    Feb 12, 2025
    Dataset authored and provided by
    Open R1
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    OpenR1-Math-220k

      Dataset description
    

    OpenR1-Math-220k is a large-scale dataset for mathematical reasoning. It consists of 220k math problems with two to four reasoning traces generated by DeepSeek R1 for problems from NuminaMath 1.5. The traces were verified using Math Verify for most samples and Llama-3.3-70B-Instruct as a judge for 12% of the samples, and each problem contains at least one reasoning trace with a correct answer. The dataset consists of two splits:… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenR1-Math-220k.

  18.

    turkish-math-186k

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    ITU Perceptron (2025). turkish-math-186k [Dataset]. https://huggingface.co/datasets/ituperceptron/turkish-math-186k
    Dataset updated
    Jun 1, 2025
    Dataset authored and provided by
    ITU Perceptron
    Description

    Turkish Mathematics Dataset

    This dataset is a subset of the AI-MO/NuminaMath-1.5 dataset translated into Turkish; the dataset we share contains approximately 186 thousand rows from the original dataset. Detailed information about the columns and other details is available from the original dataset. The gemini-2.0-flash model was used to translate the problems and solutions, and the translation quality of the dataset, particularly the mathematical notation, at a high level… See the full description on the dataset page: https://huggingface.co/datasets/ituperceptron/turkish-math-186k.


