MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning
INTELLECT-MATH is a 7B-parameter model optimized for mathematical reasoning. It was trained in two stages: an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the PRIME-RL recipe. We demonstrate that the quality of our SFT data can impact the performance and training speed of the RL stage: Due to its… See the full description on the dataset page: https://huggingface.co/datasets/PrimeIntellect/INTELLECT-MATH-SFT-Data.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
INTELLECT-MATH: Frontier Mathematical Reasoning through Better Initializations for Reinforcement Learning
INTELLECT-MATH is a 7B-parameter model optimized for mathematical reasoning. It was trained in two stages: an SFT stage, in which the model was fine-tuned on verified QwQ outputs, and an RL stage, in which the model was trained using the PRIME-RL recipe. We demonstrate that the quality of our SFT data can impact the performance and training speed of the RL stage: Due to its… See the full description on the dataset page: https://huggingface.co/datasets/PrimeIntellect/NuminaMath-QwQ-CoT-5M.