Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Dataset Card for AQUA-RAT
Dataset Summary
A large-scale dataset consisting of approximately 100,000 algebraic word problems. The solution to each question is explained step-by-step using natural language. This data is used to train a program generation model that learns to generate the explanation, while generating the program that solves the question.
Supported Tasks and Leaderboards
Languages
en
Dataset Structure
Data Instances… See the full description on the dataset page: https://huggingface.co/datasets/deepmind/aqua_rat.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Dataset Card for Calc-aqua_rat
Summary
This dataset is an instance of the AQuA-RAT dataset extended with in-context calls to a sympy calculator.
Supported Tasks
The dataset is intended for training Chain-of-Thought reasoning models able to use external tools to enhance the factuality of their responses. This dataset presents in-context scenarios where models can outsource the computations in the reasoning chain to a calculator.
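As a rough illustration of what outsourcing a computation in a reasoning chain can look like, here is a minimal sketch. The `[calc: ...]` markup and the `calculate` and `fill_calls` helpers are hypothetical and are not the dataset's actual annotation format; the evaluator is a small standard-library stand-in for arithmetic only, not the sympy calculator the card mentions.

```python
import ast
import operator
import re

# Arithmetic operators the stand-in calculator accepts.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}


def calculate(expr: str):
    """Evaluate a plain arithmetic expression without using eval()."""

    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError(f"unsupported expression: {expr!r}")

    return ev(ast.parse(expr, mode="eval").body)


# Hypothetical inline markup for a calculator call inside a rationale.
CALL = re.compile(r"\[calc: ([^\]]+)\]")


def fill_calls(chain: str) -> str:
    """Replace each calculator call with the expression and its computed result."""
    return CALL.sub(lambda m: f"{m.group(1)} = {calculate(m.group(1)):g}", chain)


chain = "Each box holds 12 pens, so 7 boxes hold [calc: 12 * 7] pens."
print(fill_calls(chain))
# Each box holds 12 pens, so 7 boxes hold 12 * 7 = 84 pens.
```

The point of the sketch is only that the number in the final text comes from the tool rather than from the model's own arithmetic.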
Construction Process
The… See the full description on the dataset page: https://huggingface.co/datasets/MU-NLPC/Calc-aqua_rat.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
AQUA-RAT MCQA Dataset
This dataset contains the AQUA-RAT dataset converted to Multiple Choice Question Answering (MCQA) format with modifications.
Dataset Description
AQUA-RAT is a dataset of algebraic word problems with rationales. This version has been processed to:
Remove all questions where the correct answer was option "E" (5th choice)
Remove the "E" option from all remaining questions (4 choices: A, B, C, D)
Merge validation and test splits into a single test split… See the full description on the dataset page: https://huggingface.co/datasets/RikoteMaster/aqua-rat-mcqa.
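The three processing steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the dataset author's script: it assumes each record is a dict with `question`, `options`, and `correct` fields, that "E" is always the last of five options, and the helper names `drop_option_e` and `merge_splits` are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def drop_option_e(records: List[Record]) -> List[Record]:
    """Drop questions answered by "E", then strip the fifth choice from the rest."""
    kept = []
    for rec in records:
        if rec["correct"] == "E":
            continue  # step 1: remove questions whose correct answer is "E"
        # step 2: keep only the first four choices (assumed to be A-D)
        kept.append({**rec, "options": list(rec["options"])[:4]})
    return kept


def merge_splits(validation: List[Record], test: List[Record]) -> List[Record]:
    """Step 3: combine validation and test into a single test split."""
    return list(validation) + list(test)


# Toy records made up for illustration; they are not taken from AQUA-RAT.
validation = [
    {"question": "2 + 2 = ?", "options": ["A)3", "B)4", "C)5", "D)6", "E)7"], "correct": "B"},
    {"question": "3 * 3 = ?", "options": ["A)6", "B)7", "C)8", "D)10", "E)9"], "correct": "E"},
]
test = [
    {"question": "10 / 2 = ?", "options": ["A)5", "B)4", "C)3", "D)2", "E)1"], "correct": "A"},
]

merged_test = merge_splits(drop_option_e(validation), drop_option_e(test))
print(len(merged_test))  # 2
```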
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset was created by Việt Hưng Nguyễn
Released under Apache 2.0
laurentiubp/aqua-rat dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Our dataset was built by annotating the AQuA-RAT dataset with a new representation language. AQuA-RAT provides the questions, options, rationales, and correct options.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
MNLP M2 MCQA Dataset
A unified multiple-choice question answering (MCQA) benchmark on STEM subjects combining samples from OpenBookQA, SciQ, MMLU-auxiliary, AQUA-Rat, and MedMCQA.
Dataset Summary
This dataset merges five existing science and knowledge-based MCQA datasets into one standardized format:
Source Train samples
OpenBookQA 4 900
SciQ 10 000
MMLU-aux 85 100
AQUA-Rat 50 000
MedMCQA 50 000
Total 200 000
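As a quick sanity check on the table above, the per-source counts can be summed and turned into shares of the training set. The dictionary below simply transcribes the listed numbers; the variable names are illustrative.

```python
# Train sample counts as listed in the table above.
train_samples = {
    "OpenBookQA": 4_900,
    "SciQ": 10_000,
    "MMLU-aux": 85_100,
    "AQUA-Rat": 50_000,
    "MedMCQA": 50_000,
}

total = sum(train_samples.values())
print(total)  # 200000

# Share of the merged training set contributed by each source.
for source, count in train_samples.items():
    print(f"{source}: {count / total:.1%}")
```

The sum matches the stated total, and AQUA-Rat accounts for a quarter of the merged training samples.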
Supported Tasks and… See the full description on the dataset page: https://huggingface.co/datasets/NicoHelemon/MNLP_M2_mcqa_dataset.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Model Name
MMLU-Pro-Plus Baseline Drop MMLU-Pro Baseline Drop Added Exp MMLU Pro Plus Added MMLU-redux 2.0 Baseline Drop AQUA-RAT Baseline Drop
CohereLabs/c4ai-command-a-03-2025 111B ✅ (single inference) ✅ done ✅ (HF naive batch) ✅ done ✅ done - - -
google/gemma-3-12b-it 12B ✅ (HF naive batch) ✅ done ✅ (HF naive batch) ✅ done ✅ done - - -
meta-llama/Llama-4-Scout-17B-16E 17B ✅ (HF naive batch) ✅ done ✅ (HF naive batch) ✅ done ✅ done - - -
Qwen/Qwen3-4B 4B… See the full description on the dataset page: https://huggingface.co/datasets/sleeping-ai/Judgement-baseline.