8 datasets found

leaderboard-documents-gsm8k
huggingface.co
Updated Feb 6, 2025
Cite
Jerome White (2025). leaderboard-documents-gsm8k [Dataset]. https://huggingface.co/datasets/jerome-white/leaderboard-documents-gsm8k
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 6, 2025
Authors
Jerome White
Description
The jerome-white/leaderboard-documents-gsm8k dataset is hosted on Hugging Face and contributed by the HF Datasets community.
gsm8k
huggingface.co
Updated Aug 11, 2022
Cite
OpenAI (2022). gsm8k [Dataset]. https://huggingface.co/datasets/openai/gsm8k
Dataset updated
Aug 11, 2022
Dataset authored and provided by
OpenAI (http://openai.com/)
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for GSM8K

Dataset Summary

GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.

These problems take between 2 and 8 steps to solve. Solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.
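As a rough illustration of working with such multi-step solutions, the sketch below pulls the final answer out of a GSM8K-style solution string. It assumes the convention used in the published dataset, where the solution ends with a `#### <number>` line and intermediate calculations carry `<<...>>` calculator annotations. The function names and the sample text are hypothetical, written only for this example.

```python
import re
from typing import Optional

# Assumption: the final answer follows a "#### " marker, and calculator
# annotations look like "<<3*12=36>>" inside the reasoning steps.
FINAL_ANSWER = re.compile(r"####\s*(-?[\d,]*\.?\d+)")
CALC_ANNOTATION = re.compile(r"<<[^<>]*>>")


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the final numeric answer as a string, or None if no marker is found."""
    match = FINAL_ANSWER.search(solution)
    if match is None:
        return None
    # Drop thousands separators so "1,234" becomes "1234".
    return match.group(1).replace(",", "")


def strip_calculator_annotations(solution: str) -> str:
    """Remove the <<...>> annotations, leaving the plain-language steps."""
    return CALC_ANNOTATION.sub("", solution)


# Hypothetical sample written for illustration (not copied from the dataset).
sample = (
    "Each box holds 12 pencils, so 3 boxes hold 3*12 = <<3*12=36>>36 pencils.\n"
    "After giving away 5, there are 36-5 = <<36-5=31>>31 pencils left.\n"
    "#### 31"
)

print(extract_final_answer(sample))
print(strip_calculator_annotations(sample))
```

Returning the answer as a string rather than a number keeps the comparison exact; a caller can convert it if numeric tolerance is wanted.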
mrm8488_phi-4-14B-grpo-gsm8k-3e-details
huggingface.co
Updated Jul 30, 2025
Cite
Open LLM Leaderboard (2025). mrm8488_phi-4-14B-grpo-gsm8k-3e-details [Dataset]. https://huggingface.co/datasets/open-llm-leaderboard/mrm8488_phi-4-14B-grpo-gsm8k-3e-details
Dataset updated
Jul 30, 2025
Dataset authored and provided by
Open LLM Leaderboard
Description
Dataset Card for Evaluation run of mrm8488/phi-4-14B-grpo-gsm8k-3e

Dataset automatically created during the evaluation run of model mrm8488/phi-4-14B-grpo-gsm8k-3e. The dataset is composed of 38 configuration(s), each one corresponding to one of the evaluated tasks. The dataset has been created from 1 run(s). Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split is always pointing to the latest… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/mrm8488_phi-4-14B-grpo-gsm8k-3e-details.
dart-math-pool-gsm8k
huggingface.co
Updated Feb 19, 2025
Cite
HKUST NLP Group (2025). dart-math-pool-gsm8k [Dataset]. https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k
Dataset updated
Feb 19, 2025
Dataset authored and provided by
HKUST NLP Group
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Note: This dataset is the data pool synthesized from the query set of the GSM8K training set, containing all answer-correct samples and other metadata produced during the work. DART-Math-* datasets are extracted from dart-math-pool-* data pools.

🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving

📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub 🐦 Thread@X(Twitter) | 🐶 中文博客@知乎 | 📊 Leaderboard@PapersWithCode | 📑 BibTeX

Datasets:… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k.
dart-math-pool-gsm8k-query-info
huggingface.co
Updated Feb 19, 2025
Cite
HKUST NLP Group (2025). dart-math-pool-gsm8k-query-info [Dataset]. https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k-query-info
Dataset updated
Feb 19, 2025
Dataset authored and provided by
HKUST NLP Group
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Note: This dataset is the synthesis information of queries from the GSM8K training set, such as the numbers of raw/correct samples of each synthesis job. Usually used with dart-math-pool-gsm8k.

🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving

Datasets: DART-Math

DART-Math datasets are the… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k-query-info.
gsm-hard
huggingface.co
Updated Apr 9, 2023
Cite
Reasoning Machines (2023). gsm-hard [Dataset]. https://huggingface.co/datasets/reasoning-machines/gsm-hard
Dataset updated
Apr 9, 2023
Dataset authored and provided by
Reasoning Machines
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Summary

This is a harder version of the GSM8K math reasoning dataset (https://huggingface.co/datasets/gsm8k). We construct this dataset by replacing the numbers in the questions of GSM8K with larger numbers that are less common.
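The number-replacement idea can be sketched in a few lines. This is only an illustration of the general technique, not the dataset authors' procedure: the function name, the choice of 7-digit replacements, and the sample question are all assumptions made for this example, and the reference answer would still have to be recomputed for the new numbers, which this sketch leaves out.

```python
import random
import re

INTEGER = re.compile(r"\d+")


def enlarge_numbers(question: str, rng: random.Random) -> str:
    """Replace every integer in the question with a random 7-digit integer.

    Hypothetical helper: the 7-digit range is an arbitrary choice for this sketch.
    """
    return INTEGER.sub(lambda _m: str(rng.randint(1_000_000, 9_999_999)), question)


# Hypothetical question written for illustration (not taken from GSM8K).
question = "A baker makes 12 loaves a day and sells 9. How many are left after 5 days?"

# A seeded generator keeps the rewrite reproducible across runs.
harder = enlarge_numbers(question, random.Random(0))
print(harder)
```

Passing the random generator in explicitly, rather than using the module-level one, makes it easy to regenerate the same harder variant later.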

Supported Tasks and Leaderboards

This dataset is used to evaluate math reasoning.

Languages

English - Numbers

Dataset Structure

dataset = load_dataset("reasoning-machines/gsm-hard")
DatasetDict({ train: Dataset({…
See the full description on the dataset page: https://huggingface.co/datasets/reasoning-machines/gsm-hard.
gsm8k-prolog
huggingface.co
Updated Sep 9, 2023
Cite
Xiaocheng Yang (2023). gsm8k-prolog [Dataset]. https://huggingface.co/datasets/Thomas-X-Yang/gsm8k-prolog
Dataset updated
Sep 9, 2023
Authors
Xiaocheng Yang
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for GSM8K-Prolog

Dataset Summary

This is the Prolog-annotated version of the GSM8K math reasoning dataset. We used the same dataset splits and questions as in GSM8K and prompted GPT-4 to generate Prolog programs that solve the questions. We then manually corrected some malfunctioning samples.

Supported Tasks and Leaderboards

This dataset can be used to train language models to generate Prolog code in order to solve math questions and evaluate the… See the full description on the dataset page: https://huggingface.co/datasets/Thomas-X-Yang/gsm8k-prolog.
dart-math-hard
huggingface.co
Updated Jun 14, 2024
Cite
HKUST NLP Group (2024). dart-math-hard [Dataset]. https://huggingface.co/datasets/hkust-nlp/dart-math-hard
Dataset updated
Jun 14, 2024
Dataset authored and provided by
HKUST NLP Group
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving

Important: 🔥 Excited to find our DART-Math-DSMath-7B (Prop2Diff), trained on DART-Math-Hard, comparable to the AIMO winner NuminaMath-7B on CoT, but based solely on the MATH & GSM8K prompt set, leaving much room to improve! Besides, our DART method is also fully compatible… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-hard.