23 datasets found
  1. aimo-validation-amc

    • huggingface.co
    Updated Jul 19, 2024
    + more versions
    Cite
    Project-Numina (2024). aimo-validation-amc [Dataset]. https://huggingface.co/datasets/AI-MO/aimo-validation-amc
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 19, 2024
    Dataset authored and provided by
    Project-Numina
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for AIMO Validation AMC

    All 83 problems come from AMC12 2022 and AMC12 2023 and have been extracted from the AOPS wiki page (https://artofproblemsolving.com/wiki/index.php/AMC_12_Problems_and_Solutions). This dataset serves as an internal validation set during our participation in the AIMO progress prize competition. Data from after 2021 is used to avoid potential overlap with the MATH training set. Here are the different columns in the dataset: problem: the modified problem statement… See the full description on the dataset page: https://huggingface.co/datasets/AI-MO/aimo-validation-amc.

  2. amc23

    • huggingface.co
    Updated Oct 31, 2024
    + more versions
    Cite
    Zhiwei He (2024). amc23 [Dataset]. https://huggingface.co/datasets/zwhe99/amc23
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 31, 2024
    Authors
    Zhiwei He
    Description

    zwhe99/amc23 dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. amc

    • huggingface.co
    Updated Apr 27, 2025
    + more versions
    Cite
    yi (2025). amc [Dataset]. https://huggingface.co/datasets/felixZzz/amc
    Dataset updated
    Apr 27, 2025
    Authors
    yi
    Description

    felixZzz/amc dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. amc

    • huggingface.co
    Cite
    Abdul Waheed, amc [Dataset]. https://huggingface.co/datasets/macabdul9/amc
    Authors
    Abdul Waheed
    Description

    macabdul9/amc dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. AMC

    • huggingface.co
    Updated May 11, 2025
    + more versions
    Cite
    Chandra Mohan Bhuma (2025). AMC [Dataset]. https://huggingface.co/datasets/chandrabhuma/AMC
    Dataset updated
    May 11, 2025
    Authors
    Chandra Mohan Bhuma
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    chandrabhuma/AMC dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. aimo-validation-amc-repeated3

    • huggingface.co
    Updated Apr 13, 2025
    Cite
    Yibo Wang (2025). aimo-validation-amc-repeated3 [Dataset]. https://huggingface.co/datasets/yiboowang/aimo-validation-amc-repeated3
    Dataset updated
    Apr 13, 2025
    Authors
    Yibo Wang
    Description

    yiboowang/aimo-validation-amc-repeated3 dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. math_eval_suite-amc

    • huggingface.co
    Updated May 4, 2025
    + more versions
    Cite
    yi (2025). math_eval_suite-amc [Dataset]. https://huggingface.co/datasets/felixZzz/math_eval_suite-amc
    Dataset updated
    May 4, 2025
    Authors
    yi
    Description

    felixZzz/math_eval_suite-amc dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. Easy2Hard-Bench

    • huggingface.co
    Updated Jul 3, 2024
    Cite
    Furong Huang's Lab at UMD (2024). Easy2Hard-Bench [Dataset]. https://huggingface.co/datasets/furonghuang-lab/Easy2Hard-Bench
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 3, 2024
    Dataset authored and provided by
    Furong Huang's Lab at UMD
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    Easy2Hard-Bench

      Dataset Description
    

    Easy2Hard-Bench is a benchmark consisting of 6 datasets in different domains (mathematics, programming, chess, and various reasoning tasks). The problems from each dataset are labeled with continuous-valued difficulty levels.

    Dataset | Topic | Source | Statistics Used to Infer Difficulty | Source Type | Estimation Method
    E2H-AMC | Math Competitions | AMC, AIME, HMMT | Item difficulties | Human | IRT

    E2H-Codeforces Competitive Programming… See the full description on the dataset page: https://huggingface.co/datasets/furonghuang-lab/Easy2Hard-Bench.

  9. ToM-in-AMC

    • huggingface.co
    Updated Mar 30, 2025
    Cite
    Shunchi Zhang (2025). ToM-in-AMC [Dataset]. https://huggingface.co/datasets/ShunchiZhang/ToM-in-AMC
    Dataset updated
    Mar 30, 2025
    Authors
    Shunchi Zhang
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for ToM-in-AMC

    The dataset consists of ∼1,000 parsed movie scripts from IMSDb, each corresponding to a character understanding task.

      Citation
    

    BibTeX: @inproceedings{yu2024few, title = {Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind}, author = {Yu, Mo and Wang, Qiujing and Zhang, Shunchi and Sang, Yisi and Pu, Kangsheng and Wei, Zekai and Wang, Han and Xu, Liyan and Li, Jing and Yu, Yue and Zhou, Jie}… See the full description on the dataset page: https://huggingface.co/datasets/ShunchiZhang/ToM-in-AMC.

  10. 2024_AMC12

    • huggingface.co
    Cite
    robert washbourne, 2024_AMC12 [Dataset]. https://huggingface.co/datasets/rawsh/2024_AMC12
    Authors
    robert washbourne
    Description

    All problems are copyrighted by the Mathematical Association of America's American Mathematics Competitions.

    Source:

    https://artofproblemsolving.com/wiki/index.php/2024_AMC_12A_Problems
    https://artofproblemsolving.com/wiki/index.php/2024_AMC_12B_Problems

    Removed problems with figures:

    12A: problems 14, 18, 22
    12B: problems 7, 19

  11. MATH-Hard

    • huggingface.co
    Updated Dec 22, 2024
    + more versions
    Cite
    Evaluation datasets (2024). MATH-Hard [Dataset]. https://huggingface.co/datasets/lighteval/MATH-Hard
    Dataset updated
    Dec 22, 2024
    Dataset authored and provided by
    Evaluation datasets
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for Mathematics Aptitude Test of Heuristics, hard subset (MATH-Hard) dataset

      Dataset Summary
    

    The Mathematics Aptitude Test of Heuristics (MATH) dataset consists of problems from mathematics competitions, including the AMC 10, AMC 12, AIME, and more. Each problem in MATH has a full step-by-step solution, which can be used to teach models to generate answer derivations and explanations. For MATH-Hard, only the hardest questions were kept (Level 5).… See the full description on the dataset page: https://huggingface.co/datasets/lighteval/MATH-Hard.

  12. olympiad-math-contest-llama3-20k

    • huggingface.co
    Updated Jun 1, 2024
    Cite
    Kevin Amiri (2024). olympiad-math-contest-llama3-20k [Dataset]. https://huggingface.co/datasets/kevin009/olympiad-math-contest-llama3-20k
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 1, 2024
    Authors
    Kevin Amiri
    Description

    AMC/AIME Mathematics Problem and Solution Dataset

      Dataset Details
    

    Dataset Name: AMC/AIME Mathematics Problem and Solution Dataset
    Version: 1.0
    Release Date: 2024-06-1
    Authors: Kevin Amiri

      Intended Use
    

    Primary Use: The dataset is created and intended for research and an AI Mathematical Olympiad Kaggle competition.
    Intended Users: Researchers in AI & mathematics or science.

      Dataset Composition
    

    Number of Examples: 20,300 problems and solution sets… See the full description on the dataset page: https://huggingface.co/datasets/kevin009/olympiad-math-contest-llama3-20k.

  13. amc2k

    • huggingface.co
    Updated Apr 3, 2025
    Cite
    zeynep cahan (2025). amc2k [Dataset]. https://huggingface.co/datasets/zypchn/amc2k
    Dataset updated
    Apr 3, 2025
    Authors
    zeynep cahan
    Description

    Dataset Sources: AMC 8, AMC 10, AMC 12.
    Both problems and solutions were scraped from their original URLs, preserving LaTeX format.

  14. DeepScaleR-Preview-Dataset

    • huggingface.co
    Updated Feb 10, 2025
    Cite
    Agentica (2025). DeepScaleR-Preview-Dataset [Dataset]. https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 10, 2025
    Dataset authored and provided by
    Agentica
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Data

    Our training dataset consists of approximately 40,000 unique mathematics problem-answer pairs compiled from:

    AIME (American Invitational Mathematics Examination) problems (1984-2023)
    AMC (American Mathematics Competition) problems (prior to 2023)
    Omni-MATH dataset
    Still dataset

      Format
    

    Each row in the JSON dataset contains:

    problem: The mathematical question text, formatted with LaTeX notation.
    solution: Official solution to the problem, including LaTeX formatting… See the full description on the dataset page: https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset.

  15. MATH

    • huggingface.co
    Updated Dec 3, 2024
    Cite
    Minghui Jia (2024). MATH [Dataset]. https://huggingface.co/datasets/Maxwell-Jia/MATH
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 3, 2024
    Authors
    Minghui Jia
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    MATH Dataset

    The Mathematics Aptitude Test of Heuristics (MATH) dataset consists of problems from mathematics competitions, including the AMC 10, AMC 12, AIME, and more. Each problem in MATH has a full step-by-step solution, which can be used to teach models to generate answer derivations and explanations. This is a converted version of the hendrycks/competition_math originally created by Hendrycks et al. The dataset has been converted to parquet format for easier loading and usage.… See the full description on the dataset page: https://huggingface.co/datasets/Maxwell-Jia/MATH.

  16. amc_turkish

    • huggingface.co
    Updated Apr 3, 2025
    Cite
    barandinho (2025). amc_turkish [Dataset]. https://huggingface.co/datasets/barandinho/amc_turkish
    Dataset updated
    Apr 3, 2025
    Authors
    barandinho
    Description

    Description

    Turkish translated version of barandinho/amc_2k_answers (solution column is dropped and not translated).

      Dataset Curation Process
    

    AMC 8, 10 and 12 problems were scraped (Acknowledgment: zypchn).
    Scraped data was then deduplicated with a basic Jaccard similarity method.
    Then the answer column was created from the scraped solutions.
    Finally, Turkish translation was done via claude-3-7-sonnet-20250219 batch processing (cost us approx. $5).
    Note: we discarded rows that include string… See the full description on the dataset page: https://huggingface.co/datasets/barandinho/amc_turkish.
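
    The deduplication step in the curation process above can be sketched as follows. This is a minimal illustration, not the authors' code: the dataset card does not say which tokenizer or cutoff was used, so the whitespace tokenization, the 0.8 threshold, and the function names here are all assumptions.

```python
def tokens(text):
    """Lowercased whitespace tokens as a set (assumed tokenization)."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two token sets."""
    if not a and not b:
        return 1.0  # two empty texts are treated as identical
    return len(a & b) / len(a | b)


def dedupe(problems, threshold=0.8):
    """Keep a problem only if it is below the threshold against all kept ones.

    The threshold value is hypothetical; tune it on real data.
    """
    kept, kept_tokens = [], []
    for text in problems:
        current = tokens(text)
        if all(jaccard(current, seen) < threshold for seen in kept_tokens):
            kept.append(text)
            kept_tokens.append(current)
    return kept


sample = [
    "What is 2 + 2 ?",
    "what is 2 + 2 ?",
    "Find the area of a circle of radius 3.",
]
unique = dedupe(sample)
```

    The pairwise comparison is quadratic in the number of problems, which is usually acceptable for a few thousand scraped items.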

  17. dataset_0

    • huggingface.co
    Updated Apr 13, 2024
    + more versions
    Cite
    XXX (2024). dataset_0 [Dataset]. https://huggingface.co/datasets/xyy888/dataset_0
    Dataset updated
    Apr 13, 2024
    Authors
    XXX
    Description

    MiniF2F is a formal mathematics benchmark (translated across multiple formal systems) consisting of exercise statements from olympiads (AMC, AIME, IMO) as well as high-school and undergraduate maths classes. This dataset contains formal statements in Isabelle. Each statement is paired with an informal statement and an informal proof, as described in Draft, Sketch, Prove [Jiang et al 2023]. The problems in this dataset use the most recent facebookresearch/miniF2F commit on July 3, 2023.

  18. MATH_Difficulty

    • huggingface.co
    Updated Apr 9, 2025
    Cite
    Language, Intelligence, and Model Evaluation Lab (2025). MATH_Difficulty [Dataset]. https://huggingface.co/datasets/lime-nlp/MATH_Difficulty
    Dataset updated
    Apr 9, 2025
    Dataset authored and provided by
    Language, Intelligence, and Model Evaluation Lab
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Difficulty Estimation on MATH

    We annotate the entire MATH dataset with a difficulty score based on the performance of the Qwen 2.5-MATH-7B model. This provides an adaptive signal for curriculum construction and model evaluation. The Mathematics Aptitude Test of Heuristics (MATH) dataset consists of problems from mathematics competitions, including the AMC 10, AMC 12, AIME, and more. Each problem in MATH has a full step-by-step solution, which can be used to teach models to generate… See the full description on the dataset page: https://huggingface.co/datasets/lime-nlp/MATH_Difficulty.

  19. srt_test_dataset

    • huggingface.co
    Updated May 27, 2025
    Cite
    Fahim Tajwar (2025). srt_test_dataset [Dataset]. https://huggingface.co/datasets/ftajwar/srt_test_dataset
    Dataset updated
    May 27, 2025
    Authors
    Fahim Tajwar
    Description

    Test Dataset Compilation For Self-Rewarding Training

    This is our test dataset compilation for our paper, "Can Large Reasoning Models Self-Train?" Please see our project page for more information about our project. In our paper, we use the three following datasets for evaluation:

    AIME 2024
    AIME 2025
    AMC

    Moreover, we also subsample 1% of the DAPO dataset for additional validation purposes. In this dataset, we compile all 4 of them together. This, together with our data preprocessing… See the full description on the dataset page: https://huggingface.co/datasets/ftajwar/srt_test_dataset.
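
    The 1% subsampling mentioned above might look like the sketch below. It is only an illustration under stated assumptions: the card does not describe how the authors drew the sample, so the uniform sampling without replacement, the fixed seed, and the function name are hypothetical.

```python
import random


def subsample(rows, fraction=0.01, seed=0):
    """Draw a reproducible uniform sample without replacement.

    `fraction` and `seed` are illustrative defaults, not values from the paper.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not rows:
        return []
    # Keep at least one row so that tiny inputs still yield a sample.
    k = max(1, round(len(rows) * fraction))
    return random.Random(seed).sample(rows, k)


rows = list(range(1000))  # stand-in for dataset rows
held_out = subsample(rows)
```

    Using a dedicated `random.Random(seed)` instance keeps the draw reproducible without touching the global random state.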

  20. OREAL-RL-Prompts

    • huggingface.co
    Updated Feb 11, 2025
    Cite
    InternLM (2025). OREAL-RL-Prompts [Dataset]. https://huggingface.co/datasets/internlm/OREAL-RL-Prompts
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 11, 2025
    Dataset authored and provided by
    InternLM
    Description

    OREAL-RL-Prompts

      Links
    

    Arxiv | Github | OREAL-7B Model | OREAL-32B Model | Data

      Introduction
    

    This repository contains the prompts used in the RL training phase of the OREAL project. The prompts are collected from MATH, Numina, and historical AMC/AIME (2024 is excluded). The pass rates of the prompts are calculated over 16 inference runs with OREAL-7B-SFT.
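
    A pass rate of this kind is the fraction of sampled attempts judged correct. The sketch below shows the arithmetic for one prompt with 16 attempts; the function name and the boolean outcome list are assumptions for illustration, and how OREAL actually grades each attempt is not described here.

```python
def pass_rate(outcomes):
    """Fraction of attempts marked correct for a single prompt."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one sampled attempt")
    return sum(bool(o) for o in outcomes) / len(outcomes)


# Hypothetical grading results for 16 sampled attempts on one prompt.
attempts = [True] * 4 + [False] * 12
rate = pass_rate(attempts)  # 4 correct out of 16 -> 0.25
```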

Cite
Project-Numina (2024). aimo-validation-amc [Dataset]. https://huggingface.co/datasets/AI-MO/aimo-validation-amc

aimo-validation-amc

AI-MO/aimo-validation-amc

Explore at:
48 scholarly articles cite this dataset (View in Google Scholar)
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)