MATH-500 Multilingual Problem Set
A multilingual subset from OpenAI's MATH benchmark. Perfect for testing math skills across languages, this dataset includes the same problems in English, French, Italian, Turkish, and Spanish.
Available Languages
English
French
Italian
Turkish
Spanish
Source & Attribution
Original Dataset: Sourced from HuggingFaceH4/MATH-500.
Quick Start
Load the dataset… See the full description on the dataset page: https://huggingface.co/datasets/bezir/MATH-500-multilingual.
weqweasdas/math500 dataset hosted on Hugging Face and contributed by the HF Datasets community
alperengozeten/MATH-500-SUMMARY dataset hosted on Hugging Face and contributed by the HF Datasets community
MATH is a new dataset of 12,500 challenging competition mathematics problems. Each problem in MATH has a full step-by-step solution which can be used to teach models to generate answer derivations and explanations.
RLHFlow/Mistral-MATH500-Test-Result-of-Mistral-ORM dataset hosted on Hugging Face and contributed by the HF Datasets community
violetxi/MATH500-sft-prm800k-llama31-8b-steptok_temp0-0_300 dataset hosted on Hugging Face and contributed by the HF Datasets community
Comparison by Model
In 2024, the Artificial Analysis math index ranked AI models on their mathematical reasoning using benchmarks such as AIME 2024 and Math-500. o1, QwQ-32B, and DeepSeek R1 led the rankings, showing the highest proficiency in mathematical problem solving.
Comparison by Model: the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 and Math-500)
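As a rough illustration of how such an index could be computed, the sketch below takes the plain unweighted mean of the two benchmark scores. The unweighted mean is an assumption (the text only says "average"), the function name math_index is hypothetical, and the scores in the usage line are made-up placeholders, not real results.

```python
def math_index(aime_2024: float, math_500: float) -> float:
    """Unweighted mean of two benchmark scores given in percent.

    Assumption: the index weights both benchmarks equally.
    """
    return (aime_2024 + math_500) / 2


# Placeholder scores for illustration only, not measured results.
example_index = math_index(50.0, 90.0)
print(example_index)  # 70.0
```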
zzy1123/MATH500-verification-prm dataset hosted on Hugging Face and contributed by the HF Datasets community
hzy/20250317-math500-sampling-solutions-32-temp dataset hosted on Hugging Face and contributed by the HF Datasets community
alperengozeten/MATH500-with-hints dataset hosted on Hugging Face and contributed by the HF Datasets community
yoonholee/MATH500-qwen-32B_ dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
website | paper
AceMath-RewardBench Evaluation Dataset Card
The AceMath-RewardBench evaluation dataset evaluates capabilities of a math reward model using the best-of-N (N=8) setting for 7 datasets:
GSM8K: 1319 questions
Math500: 500 questions
Minerva Math: 272 questions
Gaokao 2023 en: 385 questions
OlympiadBench: 675 questions
College Math: 2818 questions
MMLU STEM: 3018 questions
Each example in the dataset contains:
A mathematical question
64 solution attempts with varying…
See the full description on the dataset page: https://huggingface.co/datasets/nvidia/AceMath-RewardBench.
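The best-of-N (N=8) setting mentioned above can be sketched as follows: for each question, a reward model scores a set of candidate solutions, the highest-scored candidate is selected, and the question counts as solved if that candidate is correct. This is a minimal sketch, not the dataset's official evaluation script. It assumes that best-of-8 is estimated by repeatedly drawing 8 of the 64 attempts at random, and the field names scores and is_correct are hypothetical, not the dataset's actual schema.

```python
import random


def best_of_n_correct(scores, is_correct, n, rng):
    """Draw n attempts without replacement and report whether
    the attempt with the highest reward score is correct."""
    picked = rng.sample(range(len(scores)), n)
    best = max(picked, key=lambda i: scores[i])
    return is_correct[best]


def best_of_n_accuracy(examples, n=8, trials=100, seed=0):
    """Average best-of-n correctness over all questions and random draws.

    Each example is a dict with hypothetical keys:
      "scores":     reward-model score per solution attempt
      "is_correct": whether each attempt reaches the right answer
    """
    rng = random.Random(seed)
    hits = [
        best_of_n_correct(ex["scores"], ex["is_correct"], n, rng)
        for ex in examples
        for _ in range(trials)
    ]
    return sum(hits) / len(hits)


# Toy question with 64 attempts, 57 of them correct, scored by an
# idealized reward model that gives 1.0 to correct attempts and 0.0 otherwise.
flags = [True] * 57 + [False] * 7
toy = {"scores": [1.0 if ok else 0.0 for ok in flags], "is_correct": flags}
toy_accuracy = best_of_n_accuracy([toy])
```

With only 7 incorrect attempts, every draw of 8 contains at least one correct attempt, so an idealized reward model always selects a correct solution in this toy case. A real reward model would assign noisier scores, and the resulting accuracy measures how often its top pick is right.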