2 datasets found

INTELLECT-MATH-SFT-Data
huggingface.co
Updated Jan 22, 2025
Cite
Prime Intellect (2025). INTELLECT-MATH-SFT-Data [Dataset]. https://huggingface.co/datasets/PrimeIntellect/INTELLECT-MATH-SFT-Data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 22, 2025
Dataset provided by
Authors
Prime Intellect
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning

INTELLECT-MATH is a 7B parameter model optimized for mathematical reasoning. It was trained in two stages, an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the PRIME-RL recipe. We demonstrate that the quality of our SFT data can impact the performance and training speed of the RL stage: Due to its… See the full description on the dataset page: https://huggingface.co/datasets/PrimeIntellect/INTELLECT-MATH-SFT-Data.
NuminaMath-QwQ-CoT-5M
huggingface.co
Updated Jan 22, 2025
Cite
Prime Intellect (2025). NuminaMath-QwQ-CoT-5M [Dataset]. https://huggingface.co/datasets/PrimeIntellect/NuminaMath-QwQ-CoT-5M
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jan 22, 2025
Dataset provided by
Authors
Prime Intellect
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning

INTELLECT-MATH is a 7B parameter model optimized for mathematical reasoning. It was trained in two stages, an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the PRIME-RL recipe. We demonstrate that the quality of our SFT data can impact the performance and training speed of the RL stage: Due to its… See the full description on the dataset page: https://huggingface.co/datasets/PrimeIntellect/NuminaMath-QwQ-CoT-5M.