Wloner0809/AIME25 dataset hosted on Hugging Face and contributed by the HF Datasets community
drproduck/aime25 dataset hosted on Hugging Face and contributed by the HF Datasets community
Source: math-ai/aime25
Modification: remove redundant columns
Keys: problem, answer
Size (test): 30
guanning/aime25 dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset card for aime25
This dataset was made with Curator.
Dataset details
A sample from the dataset: { "question": "Find the sum of all integer bases $b>9$ for which $17_{b}$ is a divisor of $97_{b}$.", "reasoning": "Okay, let's see. The problem is to find the sum of all integer bases b > 9 where 17_b divides 97_b. Hmm.
First, I need to understand what 17_b and 97_b represent in base 10. In base b, the number 17_b would be 1*b + 7, right? Similarly, 97_b… See the full description on the dataset page: https://huggingface.co/datasets/JingzeShi/aime25.
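The base conversion that the sample reasoning begins with can be checked by brute force. The following is a minimal sketch, not taken from the dataset: the function name and the search bound are illustrative assumptions. Since 97_b = 9b + 7 = 9(b + 7) - 56, the value b + 7 must divide 56, so no base above 49 can qualify and a bound of 100 is more than enough.

```python
def bases_where_17_divides_97(max_base: int = 100) -> list[int]:
    """Return the bases b with 9 < b <= max_base for which 17_b divides 97_b."""
    hits = []
    for b in range(10, max_base + 1):
        seventeen = 1 * b + 7      # value of the numeral 17 read in base b
        ninety_seven = 9 * b + 7   # value of the numeral 97 read in base b
        if ninety_seven % seventeen == 0:
            hits.append(b)
    return hits


bases = bases_where_17_divides_97()
print(bases, sum(bases))  # [21, 49] 70
```

Both hits can be verified by hand: in base 21, 17_b = 28 and 97_b = 196 = 7 * 28; in base 49, 17_b = 56 and 97_b = 448 = 8 * 56.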
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
flatlander1024/aime25-o3mini dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
AIME 2025 Dataset
Dataset Description
This dataset contains problems from the American Invitational Mathematics Examination (AIME) 2025-I & II.
HennersBro98/reasoning-aime25-nous dataset hosted on Hugging Face and contributed by the HF Datasets community
PrimeIntellect/AIME-25 dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for Dataset Name
32 solutions generated for aime25 using deepseek-ai/DeepSeek-R1-Distill-Qwen-7B.
Dataset Details
Dataset Description
Columns:
problem: str, original problem statement
answer: str, ground truth answer
completion: List[str], generations by the model
prediction: List[str], prediction extracted from generation
score: List[float], is… See the full description on the dataset page: https://huggingface.co/datasets/drproduck/r1-qwen7b-aime25-n32.
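As a rough illustration of how the listed columns relate, the sketch below computes the share of extracted predictions that match the ground-truth answer for a single row. It relies only on the `answer` and `prediction` columns named above; the function name, the exact-string comparison, and the example row are hypothetical, and the dataset's own `score` column (whose definition is truncated here) may be computed differently.

```python
from typing import Dict, List


def fraction_correct(row: Dict) -> float:
    """Share of predictions in one row that exactly match the ground-truth answer."""
    predictions: List[str] = row["prediction"]
    if not predictions:
        return 0.0
    target = row["answer"].strip()
    matches = sum(1 for p in predictions if p.strip() == target)
    return matches / len(predictions)


# Hypothetical row with 4 generations instead of 32, for brevity.
example_row = {
    "problem": "placeholder problem statement",
    "answer": "70",
    "completion": ["gen 1", "gen 2", "gen 3", "gen 4"],
    "prediction": ["70", "70", "49", "70"],
}
print(fraction_correct(example_row))  # 0.75
```

Exact string matching is the simplest possible comparison; a real evaluation would likely normalize answers (for example, leading zeros or whitespace) before comparing.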
HennersBro98/reasoning-aime25-deepscaler dataset hosted on Hugging Face and contributed by the HF Datasets community
HennersBro98/reasoning-aime25-evaluation-system1 dataset hosted on Hugging Face and contributed by the HF Datasets community
drproduck/r1-qwen7b-awq-aime25-n32 dataset hosted on Hugging Face and contributed by the HF Datasets community
justus27/aime-25-genesys dataset hosted on Hugging Face and contributed by the HF Datasets community
jdchang/distill-r1-qwen-1.5b-aime-25-budget dataset hosted on Hugging Face and contributed by the HF Datasets community
jdchang/distill-r1-qwen-1.5b-aime-25-4096 dataset hosted on Hugging Face and contributed by the HF Datasets community
kaiwenw/distill-r1-qwen-1.5b-aime-25-4096-with-old-prm-indices_84480_92160 dataset hosted on Hugging Face and contributed by the HF Datasets community
o3-2025-04-16 Evaluation Results
Precomputed model outputs for evaluation.
Evaluation Results
Summary
Metric: Accuracy
AIME25: 70.3
LiveCodeBenchv5: 66.8
AMC23: 97.5
MATH500: 86.0
MMLUPro: 38.2
JEEBench: 86.2
GPQADiamond: 80.0
LiveCodeBench: 79.2
CodeElo: 35.2
HLE: 22.8
AIME25
Average Accuracy: 70.3% ± 1.7%
Number of Runs: 10
Run Accuracy Questions Solved Total Questions
1 70.0% 21 30
2 70.0% 21 30
3 70.0% 21 30
4 63.3% 19… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations-dev/o3-2025-04-16_eval_5ed6.
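The per-run accuracies in the rows above follow directly from the solved counts (for example, 21 of 30 is 70.0%). The sketch below reproduces that arithmetic and shows one way to aggregate runs. The helper names are hypothetical, and the page does not state here whether its ± figure is a standard error or a standard deviation; the standard error of the mean is used as one plausible reading.

```python
import statistics
from typing import List, Tuple


def run_accuracy(solved: int, total: int = 30) -> float:
    """Accuracy of a single run, in percent."""
    return 100.0 * solved / total


def mean_and_stderr(accuracies: List[float]) -> Tuple[float, float]:
    """Mean accuracy and standard error of the mean across runs (needs at least 2 runs)."""
    mean = statistics.mean(accuracies)
    stderr = statistics.stdev(accuracies) / len(accuracies) ** 0.5
    return mean, stderr


# Solved counts visible in the first four rows; the remaining runs are truncated.
solved_counts = [21, 21, 21, 19]
accuracies = [run_accuracy(s) for s in solved_counts]
print([round(a, 1) for a in accuracies])  # [70.0, 70.0, 70.0, 63.3]
```

Because only four of the ten runs are visible, aggregating them would not reproduce the reported 70.3% average.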
gpt-4.1-2025-04-14 Evaluation Results
Precomputed model outputs for evaluation.
Evaluation Results
Summary
Metric: Accuracy
AIME25: 33.0
LiveCodeBenchv5: 46.6
AMC23: 83.2
MATH500: 83.6
MMLUPro: 30.8
JEEBench: 78.3
GPQADiamond: 34.5
LiveCodeBench: 65.6
CodeElo: 31.1
HLE: 7.2
AIME24: 0.0
AIME25
Average Accuracy: 33.0% ± 1.3%
Number of Runs: 10
Run Accuracy Questions Solved Total Questions
1 33.3% 10 30
2 33.3% 10 30
3 33.3% 10… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations-dev/gpt-4.1-2025-04-14_eval_5ed6.