jerome-white/leaderboard-documents-gsm8k dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for GSM8K
Dataset Summary
GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.
These problems take between 2 and 8 steps to solve. Solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.
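Reference solutions in GSM8K end with a line of the form "#### <number>" that carries the final answer. The snippet below is a minimal sketch of pulling that number out of a solution string. The function name extract_final_answer and the sample solution are made up for illustration and are not taken from the dataset.

```python
import re


def extract_final_answer(solution: str):
    """Return the number after the trailing '####' marker, or None if absent.

    Assumes the final answer sits on the last line in the form '#### 1,234'.
    """
    match = re.search(r"####\s*(-?[\d,]*\.?\d+)\s*$", solution.strip())
    if match is None:
        return None
    # Drop thousands separators before converting.
    return float(match.group(1).replace(",", ""))


# Hypothetical two-step solution written in the GSM8K style.
sample = "3 * 4 = 12 apples in total.\n12 - 2 = 10 apples are left.\n#### 10"
print(extract_final_answer(sample))
```

Returning a float keeps the comparison simple when a model's answer is parsed the same way; exact-match evaluation on strings would need its own normalization.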
Dataset Card for Evaluation run of mrm8488/phi-4-14B-grpo-gsm8k-3e
Dataset automatically created during the evaluation run of model mrm8488/phi-4-14B-grpo-gsm8k-3e. The dataset is composed of 38 configurations, each one corresponding to one of the evaluated tasks. The dataset has been created from 1 run. Each run can be found as a specific split in each configuration, the split being named using the timestamp of the run. The "train" split always points to the latest… See the full description on the dataset page: https://huggingface.co/datasets/open-llm-leaderboard/mrm8488_phi-4-14B-grpo-gsm8k-3e-details.
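Since each run is stored as a split named after its timestamp, the most recent run can be picked from the split names alone. The sketch below assumes the names are zero-padded timestamps that sort correctly as plain strings; the example names and the helper latest_split are hypothetical, not taken from the dataset page.

```python
def latest_split(split_names):
    """Return the most recent timestamp-named split, ignoring the 'train' alias.

    Assumes zero-padded timestamps, so string order equals time order.
    """
    runs = [name for name in split_names if name != "train"]
    if not runs:
        raise ValueError("no timestamp-named splits found")
    return max(runs)


# Hypothetical split names in an assumed timestamp format.
names = ["train", "2025_01_15T10_30_00.000000", "2025_02_01T08_00_00.000000"]
print(latest_split(names))
```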
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
[!NOTE] This dataset is the data pool synthesized from the query set of the GSM8K training set, containing all answer-correct samples and other metadata produced during the work. DART-Math-* datasets are extracted from dart-math-pool-* data pools.
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📑 BibTeX
Datasets: … See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
[!NOTE] This dataset is the synthesis information of queries from the GSM8K training set, such as the numbers of raw/correct samples of each synthesis job. Usually used with dart-math-pool-gsm8k.
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📑 BibTeX
Datasets: DART-Math
DART-Math datasets are the… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-pool-gsm8k-query-info.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Summary
This is a harder version of the GSM8K math reasoning dataset (https://huggingface.co/datasets/gsm8k). We construct this dataset by replacing the numbers in the questions of GSM8K with larger numbers that are less common.
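The idea of swapping in larger, less common numbers can be illustrated with a short sketch. This is not the dataset authors' pipeline: the function enlarge_numbers, the value range, and the sample question are all assumptions chosen for illustration, and the sketch only rewrites the question text without recomputing the answer.

```python
import random
import re


def enlarge_numbers(question: str, rng: random.Random,
                    low: int = 100_000, high: int = 9_999_999) -> str:
    """Replace each standalone integer in the question with a larger random one.

    Simplified on purpose: decimals and number words are not handled,
    and the matching answer would have to be recomputed separately.
    """
    return re.sub(r"\b\d+\b", lambda m: str(rng.randint(low, high)), question)


# Hypothetical question; a fixed seed keeps the rewrite reproducible.
question = "Tom has 3 boxes with 4 apples each. How many apples does he have?"
print(enlarge_numbers(question, random.Random(0)))
```

Passing an explicit random.Random instance rather than using the module-level generator keeps the perturbation reproducible across runs.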
Supported Tasks and Leaderboards
This dataset is used to evaluate math reasoning.
Languages
English - Numbers
Dataset Structure
dataset = load_dataset("reasoning-machines/gsm-hard") DatasetDict({ train: Dataset({… See the full description on the dataset page: https://huggingface.co/datasets/reasoning-machines/gsm-hard.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for GSM8K-Prolog
Dataset Summary
This is the Prolog-annotated version of the GSM8K math reasoning dataset. We used the same dataset splits and questions as GSM8K and prompted GPT-4 to generate Prolog programs that solve the questions. We then manually corrected some malfunctioning samples.
Supported Tasks and Leaderboards
This dataset can be used to train language models to generate Prolog code in order to solve math questions and evaluate the… See the full description on the dataset page: https://huggingface.co/datasets/Thomas-X-Yang/gsm8k-prolog.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🎯 DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
📝 Paper@arXiv | 🤗 Datasets&Models@HF | 🐱 Code@GitHub | 🐦 Thread@X(Twitter) | 🐶 Chinese Blog@Zhihu | 📊 Leaderboard@PapersWithCode | 📑 BibTeX
[!IMPORTANT] 🔥 Excited to find our DART-Math-DSMath-7B (Prop2Diff) trained on DART-Math-Hard comparable to the AIMO winner NuminaMath-7B on CoT, but based solely on MATH & GSM8K prompt set, leaving much room to improve! Besides, our DART method is also fully compatible… See the full description on the dataset page: https://huggingface.co/datasets/hkust-nlp/dart-math-hard.