Mathematics database.
This dataset code generates mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.
Original paper: Analysing Mathematical Reasoning Abilities of Neural Models (Saxton, Grefenstette, Hill, Kohli).
Example usage: train_examples, val_examples = datasets.load_dataset('math_dataset/arithmetic_mul', split=['train', 'test'], as_supervised=True)
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
CAMEL: Communicative Agents for “Mind” Exploration of Large Scale Language Model Society
Github: https://github.com/lightaime/camel Website: https://www.camel-ai.org/ Arxiv Paper: https://arxiv.org/abs/2303.17760
Dataset Summary
The Math dataset is composed of 50K problem-solution pairs obtained using GPT-4. The problem-solution pairs were generated from 25 math topics, with 25 subtopics for each topic and 80 problems for each "topic,subtopic" pair. We provide the data… See the full description on the dataset page: https://huggingface.co/datasets/camel-ai/math.
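As a quick sanity check on the counts quoted above, the following minimal sketch enumerates the topic and subtopic combinations and confirms that they multiply out to 50K problems. The label strings (`topic_0`, `subtopic_0`, and so on) are placeholders for illustration, not the dataset's actual topic names.

```python
from itertools import product

# Counts taken from the dataset summary; the label strings are placeholders.
NUM_TOPICS = 25
SUBTOPICS_PER_TOPIC = 25
PROBLEMS_PER_PAIR = 80

topics = [f"topic_{i}" for i in range(NUM_TOPICS)]
subtopics = [f"subtopic_{j}" for j in range(SUBTOPICS_PER_TOPIC)]

# Every topic is combined with its own 25 subtopics, giving 25 * 25 pairs.
pairs = list(product(topics, subtopics))
total_problems = len(pairs) * PROBLEMS_PER_PAIR

print(len(pairs), total_problems)  # 625 50000
```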
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card
This dataset contains ~200K grade school math word problems. All the answers in this dataset are generated using Azure GPT4-Turbo. Please refer to Orca-Math: Unlocking the potential of SLMs in Grade School Math for details about the dataset construction.
Dataset Sources
Repository: microsoft/orca-math-word-problems-200k
Paper: Orca-Math: Unlocking the potential of SLMs in Grade School Math
Direct Use
This dataset has been designed to… See the full description on the dataset page: https://huggingface.co/datasets/microsoft/orca-math-word-problems-200k.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
OpenR1-Math-220k
Dataset description
OpenR1-Math-220k is a large-scale dataset for mathematical reasoning. It consists of 220k math problems with two to four reasoning traces generated by DeepSeek R1 for problems from NuminaMath 1.5. The traces were verified using Math Verify for most samples and Llama-3.3-70B-Instruct as a judge for 12% of the samples, and each problem contains at least one reasoning trace with a correct answer. The dataset consists of two splits:… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenR1-Math-220k.
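Because each problem carries several traces and at least one of them is correct, a consumer will typically want to pick a verified trace per problem. The sketch below shows one way to do that on a single record. The field names `generations` and `correctness`, and the example record itself, are assumptions made for illustration; check the dataset page for the actual schema.

```python
from typing import Optional


def first_verified_trace(record: dict) -> Optional[str]:
    """Return the first reasoning trace flagged as correct, or None.

    Assumes hypothetical fields: `generations` (list of trace strings) and
    `correctness` (parallel list of booleans from a verifier).
    """
    traces = record.get("generations", [])
    flags = record.get("correctness", [])
    for trace, is_correct in zip(traces, flags):
        if is_correct:
            return trace
    return None


# Made-up record for illustration only.
example = {
    "problem": "What is 2 + 3?",
    "generations": ["... so the answer is 6.", "... so the answer is 5."],
    "correctness": [False, True],
}

print(first_verified_trace(example))
```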
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for Mathematics Aptitude Test of Heuristics, hard subset (MATH-Hard) dataset
Dataset Summary
The Mathematics Aptitude Test of Heuristics (MATH) dataset consists of problems from mathematics competitions, including the AMC 10, AMC 12, AIME, and more. Each problem in MATH has a full step-by-step solution, which can be used to teach models to generate answer derivations and explanations. For MATH-Hard, only the hardest questions were kept (Level 5).… See the full description on the dataset page: https://huggingface.co/datasets/lighteval/MATH-Hard.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
StackMathQA
StackMathQA: A Curated Collection of 2 Million Mathematical Questions and Answers Sourced from Stack Exchange
StackMathQA is a meticulously curated collection of 2 million mathematical questions and answers, sourced from various Stack Exchange sites. This repository is designed to serve as a comprehensive resource for researchers, educators, and enthusiasts in the field of mathematics and AI research.
Configs
configs: - config_name: stackmathqa1600k… See the full description on the dataset page: https://huggingface.co/datasets/math-ai/StackMathQA.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
🎉 This work, introducing the AutoMathText dataset and the AutoDS method, has been accepted to The 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025 Findings)! 🎉
AutoMathText
AutoMathText is an extensive and carefully curated dataset encompassing around 200 GB of mathematical texts. It's a compilation sourced from a diverse range of platforms including various websites, arXiv, and GitHub (OpenWebMath, RedPajama, Algebraic Stack). This rich repository… See the full description on the dataset page: https://huggingface.co/datasets/math-ai/AutoMathText.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Summary
MATH dataset from https://github.com/hendrycks/math
Citation Information
@article{hendrycksmath2021, title={Measuring Mathematical Problem Solving With the MATH Dataset}, author={Dan Hendrycks and Collin Burns and Saurav Kadavath and Akul Arora and Steven Basart and Eric Tang and Dawn Song and Jacob Steinhardt}, journal={NeurIPS}, year={2021} }
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for GSM8K
Dataset Summary
GSM8K (Grade School Math 8K) is a dataset of 8.5K high quality linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.
These problems take between 2 and 8 steps to solve. Solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.
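When scoring a model on multi-step problems like these, it is common to compare only the final numeric answer. The sketch below assumes that a reference solution ends with a line of the form `#### <number>`; that marker is not described in the excerpt above, so treat it as an assumption and confirm it against the dataset page. The sample solution text is made up for illustration.

```python
import re
from typing import Optional

# Assumed convention: the final answer follows a "####" marker.
_FINAL_ANSWER = re.compile(r"####\s*(-?[\d,]*\.?\d+)")


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the number after the '####' marker, without thousands separators."""
    match = _FINAL_ANSWER.search(solution)
    if match is None:
        return None
    return match.group(1).replace(",", "")


# Illustrative solution text, not copied from the dataset.
sample = "She has 16 - 3 - 4 = 9 eggs left.\nShe earns 9 * 2 = 18 dollars.\n#### 18"
print(extract_final_answer(sample))
```

Returning the answer as a string avoids committing to integer or float parsing; a caller can convert it as needed before comparison.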
Keiran Paster*, Marco Dos Santos*, Zhangir Azerbayev, Jimmy Ba
OpenWebMath is a dataset containing the majority of the high-quality mathematical text from the internet. It is filtered and extracted from over 200B HTML files on Common Crawl down to a set of 6.3 million documents containing a total of 14.7B tokens. OpenWebMath is intended for use in pretraining and finetuning large language models. You can download the dataset using Hugging Face: from datasets import… See the full description on the dataset page: https://huggingface.co/datasets/Alignment-Lab-AI/Open-Web-Math.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
svc-huggingface/minerva-math dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
MM_Math Datasets
We introduce our multimodal mathematics dataset, MM-MATH. This dataset is collected from real middle school exams in China, and all the math problems are open-ended to evaluate the mathematical problem-solving abilities of current multimodal models. MM-MATH is annotated with fine-grained three-dimensional labels: difficulty, grade, and knowledge points. The difficulty level is determined based on the average scores of student exams, the grade labels are derived… See the full description on the dataset page: https://huggingface.co/datasets/THU-KEG/MM_Math.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
Big-Math is the largest open-source dataset of high-quality mathematical problems, curated specifically for reinforcement learning (RL) training in language models. With over 250,000 rigorously filtered and verified problems, Big-Math bridges the gap between quality and quantity, establishing a robust foundation for advancing reasoning in LLMs.
Request Early Access to Private… See the full description on the dataset page: https://huggingface.co/datasets/SynthLabsAI/Big-Math-RL-Verified.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
TemplateMath: Template-based Data Generation (TDG)
This is the official repository for the paper "Training and Evaluating Language Models with Template-based Data Generation", published at the ICLR 2025 DATA-FM Workshop. Our work introduces Template-based Data Generation (TDG), a scalable paradigm to address the critical data bottleneck in training LLMs for complex reasoning tasks. We use TDG to create TemplateGSM, a massive dataset designed to unlock the next level of… See the full description on the dataset page: https://huggingface.co/datasets/math-ai/TemplateGSM.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BytedTsinghua-SIA/DAPO-Math-17k dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "agieval-sat-math"
Dataset taken from https://github.com/microsoft/AGIEval and processed as in that repo. MIT License Copyright (c) Microsoft Corporation. Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of… See the full description on the dataset page: https://huggingface.co/datasets/dmayhem93/agieval-sat-math.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🦣 MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning
MathInstruct is a meticulously curated instruction tuning dataset that is lightweight yet generalizable. MathInstruct is compiled from 13 math rationale datasets, six of which are newly curated by this work. It uniquely focuses on the hybrid use of chain-of-thought (CoT) and program-of-thought (PoT) rationales, and ensures extensive coverage of diverse mathematical fields. Project Page:… See the full description on the dataset page: https://huggingface.co/datasets/TIGER-Lab/MathInstruct.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
OpenR1-Math-Raw
Dataset description
OpenR1-Math-Raw is a large-scale dataset for mathematical reasoning. It consists of 516k math problems sourced from AI-MO/NuminaMath-1.5 with 1 to 8 reasoning traces generated by DeepSeek R1. The traces were verified using Math Verify and an LLM-as-judge verifier (Llama-3.3-70B-Instruct). The dataset contains:
516,499 problems
1,209,403 R1-generated solutions, with 2.3 solutions per problem on average
re-parsed answers… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenR1-Math-Raw.
xDAN2099/lighteval-MATH dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for MathVerse
Dataset Description
The capabilities of Multi-modal Large Language Models (MLLMs) in visual math problem-solving remain insufficiently evaluated and understood. We investigate current benchmarks to incorporate excessive visual content within textual questions, which potentially assist MLLMs in deducing answers without truly interpreting the input diagrams.
To… See the full description on the dataset page: https://huggingface.co/datasets/AI4Math/MathVerse.