Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Multilingual Grade School Math Benchmark (MGSM) is a benchmark of grade-school math problems, proposed in the paper "Language Models are Multilingual Chain-of-Thought Reasoners".
The same 250 problems from GSM8K are each translated by human annotators into 10 languages. The 10 languages are:
- Spanish
- French
- German
- Russian
- Chinese
- Japanese
- Thai
- Swahili
- Bengali
- Telugu
You can find the inputs and targets for each of the ten languages (and English) as .tsv files.
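As a rough illustration of how one of these .tsv files could be read, here is a minimal Python sketch. It assumes each row holds two tab-separated fields (the problem text, then the target answer) with no header row; that layout, the function names, and the sample row are assumptions made for the example, not details confirmed by the description above.

```python
import csv
import io
from typing import Iterable, List, Tuple


def parse_mgsm_rows(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse tab-separated (input, target) rows.

    Assumed layout: column 0 is the problem text, column 1 is the target.
    Rows with fewer than two fields (e.g. blank lines) are skipped.
    """
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    pairs = []
    for row in reader:
        if len(row) < 2:
            continue
        pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def load_mgsm_tsv(path: str) -> List[Tuple[str, str]]:
    """Read one language's .tsv file from disk (the path is supplied by the caller)."""
    with open(path, encoding="utf-8", newline="") as f:
        return parse_mgsm_rows(f)


# Made-up sample row, used only to show the call; it is not taken from the benchmark.
sample = "Ana has 3 apples and buys 2 more. How many apples does she have?\t5\n"
print(parse_mgsm_rows(io.StringIO(sample)))
```

Targets are kept as strings here so that any numeric conversion or answer normalization stays an explicit, separate step.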
We also include few-shot exemplars, manually translated into each language, in exemplars.py.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Sources
Paper: BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models
Link: https://huggingface.co/papers/2502.07346
Repository: https://github.com/CONE-MT/BenchMAX
Dataset Description
BenchMAX_Math is the BenchMAX dataset derived from MGSM; it evaluates mathematical reasoning in multilingual scenarios. We extend the original MGSM dataset with six additional languages, i.e., Arabic, Czech, Hungarian, Korean, Serbian, and… See the full description on the dataset page: https://huggingface.co/datasets/LLaMAX/BenchMAX_Math.