Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Our dataset is gathered by using a new representation language to annotate over the AQuA-RAT dataset. AQuA-RAT has provided the questions, options, rationale, and the correct options.
• Question: A train running at the speed of 48 km / hr crosses a pole in 9 seconds . what is the length of the train ?
• Rationale: Speed = ( 48 x 5 / 18 ) m / sec = ( 40 / 3 ) m / sec . length of the train = ( speed x time ) . length of the train = ( 40 / 3 x 9 ) m = 120 m . answer is c .
• Options: a ) 140 , b ) 130 , c ) 120 , d ) 170 , e ) 160
• Correct Option is: C
The rationales are noisy, incomplete and sometimes incorrect. We correct these rationales and provide stepwise solutions for a portion of AQuA-RAT.
• Our Annotated Formula: multiply(divide(multiply(48, const_1000), const_3600), 9)
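The annotated formula above can be checked mechanically. The following is a minimal sketch, not the dataset authors' tooling: it assumes that a name of the form const_N stands for the number N, and that the operator names beyond multiply and divide (add, subtract) behave as their names suggest. The function name evaluate is hypothetical.

```python
import ast
import operator

# Assumed operator set; the source example only shows multiply and divide.
OPS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}

def evaluate(formula: str) -> float:
    """Evaluate a nested annotated formula such as multiply(divide(a, b), c)."""
    def walk(node):
        if isinstance(node, ast.Constant):   # numeric literal, e.g. 48
            return node.value
        if isinstance(node, ast.Name):       # named constant, e.g. const_1000
            if not node.id.startswith("const_"):
                raise ValueError(f"unknown name: {node.id}")
            return float(node.id[len("const_"):])  # assumption: const_N means N
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            args = [walk(arg) for arg in node.args]
            return OPS[node.func.id](*args)
        raise ValueError(f"unsupported syntax: {ast.dump(node)}")
    # The formula happens to be valid Python expression syntax, so ast can parse it.
    return walk(ast.parse(formula, mode="eval").body)

length_m = evaluate("multiply(divide(multiply(48, const_1000), const_3600), 9)")
print(round(length_m, 6))  # prints 120.0, matching option c
```

The factor const_1000 / const_3600 is the same km/hr to m/sec conversion that the rationale writes as 5 / 18.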
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for AQuA-Rat
Homepage: https://github.com/google-deepmind/AQuA
This is an unofficial curation of the AQuA-Rat dataset, uploaded here with minimal (i.e., no content-modifying) processing.
Paper: Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems (ACL Anthology)
Modifications:
- Removed the pre-tokenized splits, since tokenization is built into most LM pipelines.
- Fixed the file suffix from .json to .jsonl.
- Changed options column to… See the full description on the dataset page: https://huggingface.co/datasets/mathewhe/aqua_rat.
laurentiubp/aqua-rat dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
A large-scale dataset consisting of approximately 100,000 algebraic word problems. The solution to each question is explained step-by-step using natural language. This data is used to train a program generation model that learns to generate the explanation, while generating the program that solves the question.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Việt Hưng Nguyễn
Released under Apache 2.0
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Darien Schettler
Released under Apache 2.0
quzhe/aqua-rat dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset contains the algebraic word problems with rationales described in our paper:
Wang Ling, Dani Yogatama, Chris Dyer, and Phil Blunsom. (2017) Program Induction by Rationale Generation: Learning to Solve and Explain Algebraic Word Problems. In Proc. ACL. https://arxiv.org/pdf/1705.04146.pdf
The dataset consists of about 100,000 algebraic word problems with natural language rationales. Each problem is a json object consisting of four parts:
question - A natural language definition of the problem to solve
options - 5 possible options (A, B, C, D and E), among which one is correct
rationale - A natural language description of the solution to the problem
correct - The correct option
{ "question": "A grocery sells a bag of ice for $1.25, and makes 20% profit. If it sells 500 bags of ice, how much total profit does it make?", "options": ["A)125", "B)150", "C)225", "D)250", "E)275"], "rationale": "Profit per bag = 1.25 * 0.20 = 0.25 Total profit = 500 * 0.25 = 125 Answer is A.", "correct": "A" }
train.json -> untokenized training set
train.tok.json -> tokenized training set
dev.json -> untokenized development set
dev.tok.json -> tokenized development set
test.json -> untokenized test set
test.tok.json -> tokenized test set
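As a rough illustration of the record format described above, the sketch below parses the example problem and looks up the text of its correct option. It assumes each record is available as a single JSON string (for instance, one line of a file); the helper name correct_option_text is hypothetical and not part of the dataset release.

```python
import json

# The example record from the dataset description, as one JSON string.
line = (
    '{"question": "A grocery sells a bag of ice for $1.25, and makes 20% profit. '
    'If it sells 500 bags of ice, how much total profit does it make?", '
    '"options": ["A)125", "B)150", "C)225", "D)250", "E)275"], '
    '"rationale": "Profit per bag = 1.25 * 0.20 = 0.25 '
    'Total profit = 500 * 0.25 = 125 Answer is A.", '
    '"correct": "A"}'
)

def correct_option_text(record: dict) -> str:
    """Return the text of the option whose letter matches record['correct']."""
    for option in record["options"]:
        # Options look like "A)125": split the letter from the answer text.
        letter, _, text = option.partition(")")
        if letter.strip() == record["correct"]:
            return text.strip()
    raise ValueError("correct letter not found among options")

record = json.loads(line)
print(correct_option_text(record))  # prints 125
```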
This dataset was fully crowdsourced using the technique described in the paper (Ling et al., 2017). The initial published results included in the paper were derived from a previous version of this dataset that cannot be released in full, so results obtained with the published system on this version will differ. Results using our published system will be forthcoming.
https://github.com/deepmind/AQuA
DeepMind AQUA Rat Converted Dataset
This dataset is a refined version of the original DeepMind AQUA-RAT benchmark. Originally designed as a multiple-choice question-answering dataset, AQUA-RAT has been transformed in this version to require a numerical answer in many cases.
Overview
Conversion Approach:
Approximately two-thirds of the dataset now requires a single numerical answer.
For questions that are harder to convert to a single verifiable answer, the original… See the full description on the dataset page: https://huggingface.co/datasets/ideacode/aquarat.
https://creativecommons.org/publicdomain/zero/1.0/
Huggingface Hub: link
We introduce a large-scale dataset of math word problems. Our dataset is gathered by using a new representation language to annotate over the AQuA-RAT dataset with fully-specified operational programs. AQuA-RAT has provided the questions, options, rationale, and the correct options.
- A math word problem solving model can be trained on this dataset in order to better understand how to solve math word problems.
- This dataset can be used to develop new methods for automatically annotating math word problems with fully-specified operational programs.
- This dataset can be used as a benchmark for evaluating the performance of various methods for solving math word problems.
License
License: CC0 1.0 Universal (CC0 1.0), Public Domain Dedication. No Copyright: you can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Files: validation.csv, train.csv, test.csv (all three files have the same columns)

| Column name       | Description                                                |
|:------------------|:-----------------------------------------------------------|
| Problem           | The math word problem. (String)                            |
| Rationale         | The rationale for the math word problem. (String)          |
| options           | The options for the math word problem. (List of strings)   |
| correct           | The correct option for the math word problem. (String)     |
| annotated_formula | The annotated formula for the math word problem. (String)  |
| linear_formula    | The linear formula for the math word problem. (String)     |
| category          | The category for the math word problem. (String)           |
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset from https://huggingface.co/datasets/deepmind/aqua_rat
Adds a new column "encoded_label" with the following mapping:
Label Encoding Mapping: {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4}
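A minimal sketch of how such a column could be derived is shown below. It assumes the answer letter lives in a column named "correct", as in the upstream dataset description; the helper name add_encoded_label and the sample rows are illustrative only.

```python
# Mapping copied from the dataset description.
LABEL_TO_ID = {'A': 0, 'B': 1, 'C': 2, 'D': 3, 'E': 4}

def add_encoded_label(row: dict) -> dict:
    """Return a copy of row with an integer encoded_label column added."""
    # Assumption: the answer letter is stored under the key "correct".
    return {**row, "encoded_label": LABEL_TO_ID[row["correct"]]}

# Illustrative rows, not taken from the dataset.
rows = [{"correct": "A"}, {"correct": "C"}]
print([add_encoded_label(r)["encoded_label"] for r in rows])  # prints [0, 2]
```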
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Our dataset is gathered by using a new representation language to annotate over the AQuA-RAT dataset. AQuA-RAT has provided the questions, options, rationale, and the correct options.
laurentiubp/aquarat-scored dataset hosted on Hugging Face and contributed by the HF Datasets community
LiSoViMa/AquaRat dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
MrMaxMind99/reformatted-aquarat dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Model Name
MMLU-Pro-Plus Baseline Drop MMLU-Pro Baseline Drop Added Exp MMLU Pro Plus Added MMLU-redux 2.0 Baseline Drop AQUA-RAT Baseline Drop
CohereLabs/c4ai-command-a-03-2025 111B ✅ (single inference) ✅ done ✅ (HF naive batch) ✅ done ✅ done
-
-
-
google/gemma-3-12b-it 12B ✅ (HF naive batch) ✅ done ✅ (HF naive batch) ✅ done ✅ done
-
-
-
meta-llama/Llama-4-Scout-17B-16E 17B ✅ (HF naive batch) ✅ done ✅ (HF naive batch) ✅ done ✅ done
-
-
-
Qwen/Qwen3-4B 4B… See the full description on the dataset page: https://huggingface.co/datasets/sleeping-ai/Judgement-baseline.
EmilRyd/aquarat-sft-gt-stylized-modified dataset hosted on Hugging Face and contributed by the HF Datasets community