Dataset card for AIME 2024
This dataset consists of 30 problems from the 2024 AIME I and AIME II tests. The original source is AI-MO/aimo-validation-aime, which contains a larger set of 90 problems from AIME 2022-2024.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
[!IMPORTANT] Why this dataset is duplicated: This dataset repeats the AIME 2024 dataset 32 times to help calculate metrics such as Best-of-32. How we are trying to fix this: verl is adding support for specifying the number of sampling times for validation, and we will fix it as soon as possible.
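When every problem appears several times, the repeated samples can be grouped by problem before scoring. The sketch below is a minimal illustration, not the dataset authors' evaluation code. It assumes Best-of-N is read as "at least one of the N samples is correct", and the function name `best_of_n`, the `(problem_id, is_correct)` record shape, and the toy data are all hypothetical.

```python
from collections import defaultdict


def best_of_n(records):
    """Score repeated samples grouped by problem.

    `records` is an iterable of (problem_id, is_correct) pairs, one pair per
    sampled response (hypothetical shape, not a field of the dataset).
    Returns (best_of_n_accuracy, mean_accuracy).
    """
    by_problem = defaultdict(list)
    for problem_id, is_correct in records:
        by_problem[problem_id].append(bool(is_correct))
    if not by_problem:
        raise ValueError("no records to score")

    n_problems = len(by_problem)
    # Assumed reading of Best-of-N: solved if any sample is correct.
    solved = sum(any(flags) for flags in by_problem.values())
    # Mean accuracy: average per-problem fraction of correct samples.
    mean_acc = sum(sum(flags) / len(flags) for flags in by_problem.values())
    return solved / n_problems, mean_acc / n_problems


# Toy data: 2 problems with 4 samples each (made up for illustration).
records = [
    ("p1", False), ("p1", True), ("p1", False), ("p1", False),
    ("p2", False), ("p2", False), ("p2", False), ("p2", False),
]
best, mean = best_of_n(records)
```

Grouping by a problem identifier rather than by row position keeps the score correct even if the repeated rows are shuffled.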
pe-nlp/DAPO-AIME-2024 dataset hosted on Hugging Face and contributed by the HF Datasets community
Perflow-Shuai/aime-2024-long-rl dataset hosted on Hugging Face and contributed by the HF Datasets community
In 2024, the Artificial Analysis math index ranked AI models on their mathematical reasoning using benchmarks such as AIME 2024 and MATH-500. o1, QwQ-32B, and DeepSeek R1 led the rankings, showing the highest proficiency in mathematical problem solving.
CMU-AIRe/aime-2015-2024 dataset hosted on Hugging Face and contributed by the HF Datasets community
Comparison of the average of math benchmarks in the Artificial Analysis Intelligence Index (AIME) by Model
Abhiram1009/aime-2024-modified dataset hosted on Hugging Face and contributed by the HF Datasets community
Comparison of the Artificial Analysis Intelligence Index v2.2 (which incorporates 8 evaluations: MMLU-Pro, GPQA Diamond, Humanity's Last Exam, LiveCodeBench, SciCode, AIME, IFBench, AA-LCR) by Model
Original Dataset: HuggingFaceH4/aime_2024
Translator: gemini-2.0-flash
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
Flight paths of drone surveys used to capture imagery and video for the August 3, 2024, Lac Nairne, QC downburst. Ground survey conducted August 7, 2024. A DJI Mavic 3E performed four flights. Please note that drones are also used for scouting the initial area of interest using a live view on the controller, meaning that some flight paths may not be associated with any imagery.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
Additional photos collected via drone for the August 3, 2024, Lac Nairne, QC downburst. Ground survey conducted August 7, 2024. A DJI Mavic 3E was used to capture 34 photos. Does not include videos or drone mapping photos (where applicable).
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
Homepage and repository
Homepage: https://matharena.ai/
Repository: https://github.com/eth-sri/matharena
Dataset Summary
This dataset contains the questions from AIME II 2024 used for the MathArena Leaderboard.
Data Fields
Below one can find the description of each field in the dataset.
problem_idx (int): Index of the problem in the competition
problem (str): Full problem statement
answer (str): Ground-truth answer to the question
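The three fields described here can be modeled as a small typed record. This is a minimal sketch that assumes each row arrives as a plain dictionary keyed by the field names; the class `AimeRecord`, the helper `parse_record`, and the example row are hypothetical and not part of the dataset or the MathArena code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AimeRecord:
    """One row with the documented fields (class name is hypothetical)."""
    problem_idx: int  # index of the problem in the competition
    problem: str      # full problem statement
    answer: str       # ground-truth answer, kept as a string


def parse_record(row: dict) -> AimeRecord:
    """Build a record from a dict, failing loudly on missing fields."""
    missing = {"problem_idx", "problem", "answer"} - row.keys()
    if missing:
        raise KeyError(f"missing fields: {sorted(missing)}")
    return AimeRecord(
        problem_idx=int(row["problem_idx"]),
        problem=str(row["problem"]),
        answer=str(row["answer"]),
    )


# Made-up row for illustration only; not an actual AIME problem.
example = parse_record(
    {"problem_idx": 1, "problem": "Placeholder statement", "answer": "042"}
)
```

Keeping `answer` as a string, as the field description states, preserves any leading zeros in the answer text.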
Source Data
The… See the full description on the dataset page: https://huggingface.co/datasets/MathArena/aime_2024_II.
Comparison of Cost (USD) to run all evaluations in the Artificial Analysis Intelligence Index by Model
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a subset of the Camelyon-17 Breast Cancer Challenge. It contains 224x224 H&E histological image patches where blood has been detected. It was originally sampled to validate the blood detection capabilities of the method presented in [1]. Blood was manually identified by a trained technician.
If you use this dataset, please cite:
Pérez-Bueno, F., Engan, K., Molina, R. (2024). Robust blind color deconvolution and blood detection on histological images using Bayesian K-SVD. In: Journal of Artificial Intelligence in Medicine. https://doi.org/10.1016/j.artmed.2024.102969 [bibtex]
Pérez-Bueno, F., Engan, K., Molina, R. (2023). A Robust BKSVD Method for Blind Color Deconvolution and Blood Detection on H&E Histological Images. In: Artificial Intelligence in Medicine. AIME 2023, vol 13897. https://doi.org/10.1007/978-3-031-34344-5_25 [bibtex]
and the original publication for the Camelyon-17 Challenge (see details on the challenge website)
The folder structure is as follows:
center/image_id/pathology_label/patch_label/
pathology_label can take the following values:
patch_label can take the following values:
Patches are sampled at the maximum available resolution (40x), and each filename includes the starting pixel in the x and y dimensions. For the original high-quality .tiff images, please refer to the Camelyon-17 Challenge.
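The center/image_id/pathology_label/patch_label/ layout can be split back into its components when indexing the patches. The sketch below assumes POSIX-style relative paths that end in a patch file; the names `PatchLocation` and `parse_patch_path` and the example path are hypothetical, and the filename is returned unparsed because its exact pattern is not specified here.

```python
from pathlib import PurePosixPath
from typing import NamedTuple


class PatchLocation(NamedTuple):
    """Components of one patch path (type name is hypothetical)."""
    center: str
    image_id: str
    pathology_label: str
    patch_label: str
    filename: str


def parse_patch_path(path: str) -> PatchLocation:
    """Split a path of the form center/image_id/pathology_label/patch_label/file."""
    parts = PurePosixPath(path).parts
    if len(parts) < 5:
        raise ValueError(f"expected at least 5 path components, got {len(parts)}")
    # Use the last five components so a leading dataset root is tolerated.
    return PatchLocation(*parts[-5:])


# Made-up path for illustration; the real label values are not assumed.
loc = parse_patch_path("center_0/image_001/label_a/label_b/patch.png")
```

Taking the trailing components keeps the parser independent of where the dataset is unpacked.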
The license for this dataset is CC0 following the Camelyon-17 license.
https://www.marketresearchintellect.com/it/privacy-policy
Learn more about Market Research Intellect's report on the AI-powered meeting assistants market, which stood at USD 1.2 billion in 2024 and is projected to expand to USD 3.4 billion by 2033, growing at a CAGR of 15.4%. Discover how new strategies, rising investments, and leading players are shaping the future.
Qwen3-8B AIME Reasoning vs No-Reasoning Dataset (Router Edition)
TL;DR – 933 American Invitational Mathematics Examination (AIME) problems (1983 – 2024) paired with answers generated by Qwen3-8B in two modes: Reasoning off ("no_think") and Reasoning on ("think"). Each example is auto-verified and labelled with the winner policy used by our routing experiments.
Dataset Summary
This dataset was created for the Router Project, a line of research that investigates… See the full description on the dataset page: https://huggingface.co/datasets/AmirMohseni/AIME-1983-2024-Qwen3-8B.