2 datasets found
  1. INTELLECT-MATH-SFT-Data

    • huggingface.co
    Updated Jan 22, 2025
    Prime Intellect (2025). INTELLECT-MATH-SFT-Data [Dataset]. https://huggingface.co/datasets/PrimeIntellect/INTELLECT-MATH-SFT-Data
    Dataset updated: Jan 22, 2025
    Provided by (authors): Prime Intellect
    License: MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning

    INTELLECT-MATH is a 7B-parameter model optimized for mathematical reasoning. It was trained in two stages: an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the PRIME-RL recipe. We demonstrate that the quality of our SFT data can impact the performance and training speed of the RL stage: Due to its… See the full description on the dataset page: https://huggingface.co/datasets/PrimeIntellect/INTELLECT-MATH-SFT-Data.

  2. NuminaMath-QwQ-CoT-5M

    • huggingface.co
    Updated Jan 22, 2025
    Prime Intellect (2025). NuminaMath-QwQ-CoT-5M [Dataset]. https://huggingface.co/datasets/PrimeIntellect/NuminaMath-QwQ-CoT-5M
    Dataset updated: Jan 22, 2025
    Provided by (authors): Prime Intellect
    License: MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning

    INTELLECT-MATH is a 7B-parameter model optimized for mathematical reasoning. It was trained in two stages: an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the PRIME-RL recipe. We demonstrate that the quality of our SFT data can impact the performance and training speed of the RL stage: Due to its… See the full description on the dataset page: https://huggingface.co/datasets/PrimeIntellect/NuminaMath-QwQ-CoT-5M.


