MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
DongfuJiang/MATH-500 dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
MATH-500-Overall
About the dataset
This dataset of only 500 examples combines mathematics, physics, and logic problems in English with reasoning and step-by-step problem solving. It was created synthetically from the chain-of-thought (CoT) outputs of Qwen2.5-72B-Instruct and Llama3.3-70B-Instruct.
Brief information
Number of rows: 500
Type of dataset files: parquet
Type of dataset: text, alpaca with system prompts
Language: English
License: MIT
Structure: math: school-level (100 rows)… See the full description on the dataset page: https://huggingface.co/datasets/fluently-sets/MATH-500-Overall.
MATH is a new dataset of 12,500 challenging competition mathematics problems. Each problem in MATH has a full step-by-step solution which can be used to teach models to generate answer derivations and explanations.
appier-ai-research/MATH-500-translated dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for MATH-500-synthetic
This dataset contains math problems from the MATH-500 dataset with greedy and Best-of-N solutions generated by the Qwen/Qwen2.5-1.5B-Instruct model. The Best-of-N solutions were generated by sampling N=16 solutions and scoring them with the Skywork/Skywork-o1-Open-PRM-Qwen-2.5-1.5B reward model. Only correct solutions are included.
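The Best-of-N selection described in this card can be sketched roughly as follows. This is a minimal illustration, not the dataset authors' pipeline: `best_of_n`, `keep_if_correct`, `toy_generate`, and `toy_score` are hypothetical names, the toy functions stand in for the generator model and the reward model, and the exact-match correctness check is an assumption.

```python
import itertools
from typing import Callable, Optional, Tuple


def best_of_n(problem: str,
              generate: Callable[[str], str],
              score: Callable[[str, str], float],
              n: int = 16) -> Tuple[str, float]:
    """Sample n candidate solutions and return the highest-scoring one."""
    candidates = [generate(problem) for _ in range(n)]
    scores = [score(problem, c) for c in candidates]
    best_index = max(range(n), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


def keep_if_correct(solution: str, reference: str) -> Optional[str]:
    """Keep a solution only if it matches the reference answer (assumed exact match)."""
    return solution if solution.strip() == reference.strip() else None


# Toy stand-ins (hypothetical): a real pipeline would call the language model
# to generate solutions and the reward model to score them.
_answers = itertools.cycle(["5", "4", "22"])


def toy_generate(problem: str) -> str:
    # Cycles through fixed candidate answers instead of sampling a model.
    return next(_answers)


def toy_score(problem: str, solution: str) -> float:
    # Pretend reward: the answer "4" is rated highest.
    return 1.0 if solution == "4" else 0.1


best, best_score = best_of_n("What is 2 + 2?", toy_generate, toy_score, n=16)
kept = keep_if_correct(best, "4")
print(best, best_score, kept)
```

Greedy decoding corresponds to taking a single deterministic sample, whereas Best-of-N spends N generations per problem and relies on the scorer to pick among them.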
Comparison of MATH-500 (Quantitative Reasoning) by Model
alperengozeten/MATH-500-SUMMARY dataset hosted on Hugging Face and contributed by the HF Datasets community
In 2024, the Artificial Analysis math index ranked AI models by their mathematical reasoning, using benchmarks such as AIME 2024 and Math-500. o1, QwQ-32B, and DeepSeek R1 led the rankings, showing the highest proficiency in mathematical problem solving.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Contents of the Dataset: Mathematical PDFs
This dataset comprises 500+ mathematical PDF files meticulously curated to cover a wide range of mathematical topics. The primary focus is on key concepts, making it an excellent resource for students, educators, and researchers. The files have been processed and organized for optimal usability in adaptive learning systems and AI-powered educational tools.
Key Features:
Comprehensive Coverage of Topics:
Algebra: Equations, variables, polynomials, and algebraic expressions.
Calculus: Derivatives, integrals, limits, and differential equations.
Geometry: Triangles, circles, angles, and other geometric properties.
Trigonometry: Sine, cosine, tangent, and trigonometric identities.
Statistics: Probability, distributions, mean, variance, and other statistical concepts.
Enhanced Content Processing:
Each document has been pre-processed to extract key concepts, topics, and subtopics.
Enables content clustering and topic indexing for seamless topic retrieval.
Use Cases:
Adaptive Learning Systems: Personalized lesson generation and targeted exercises.
AI-Powered Education Platforms: Semantic search and clustering for better topic recommendations.
Content Analysis: Clustering and summarization for advanced data analysis.
File Details:
Formats: PDF
Source: Internet Archive - Mathematics Collection
Size: 500+ files totaling approximately X GB (adjust based on actual size).
Processing Capabilities:
The dataset has been structured to allow integration with AI models like Gemini for generating personalized explanations and tracking student progress.
Designed for multi-age groups, providing flexibility in learning for students and educators.
About the Source
The dataset was sourced from the Internet Archive's Mathematics Collection, a reputable and open-access repository of educational content. All files comply with public access guidelines and are redistributed here for educational and non-commercial use.
Licensing
The dataset adheres to the applicable licensing guidelines of the source. It is shared under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license, allowing others to remix, adapt, and build upon this content for non-commercial purposes.
gupta-tanish/MATH-500-subset dataset hosted on Hugging Face and contributed by the HF Datasets community
Comparison by model of the average of math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500)
PrimeIntellect/MATH-500 dataset hosted on Hugging Face and contributed by the HF Datasets community
benchang1110/MATH-500-zhtw dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual math proficiency from 2010 to 2011 for The 500 Role Model Academy vs. Florida and Miami-Dade School District
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset tracks annual math proficiency from 2010 to 2021 for Princeton High School vs. Illinois and Princeton HSD 500 School District