42 datasets found
  1. MATH-500-multilingual

    • huggingface.co
    Updated Feb 23, 2025
    Cite
    MATH-500-multilingual [Dataset]. https://huggingface.co/datasets/bezir/MATH-500-multilingual
    Dataset updated
    Feb 23, 2025
    Authors
    Abdullah Bezir
    Description

    MATH-500 Multilingual Problem Set ➗

    A multilingual subset from OpenAI's MATH benchmark. Perfect for testing math skills across languages, this dataset includes the same problems in English, French, Italian, Turkish and Spanish.

      Available Languages

    English 🇬🇧
    French 🇫🇷
    Italian 🇮🇹
    Turkish 🇹🇷
    Spanish 🇪🇸

      📂 Source & Attribution

    Original Dataset: Sourced from HuggingFaceH4/MATH-500.

      🚀 Quick Start

    Load the dataset… See the full description on the dataset page: https://huggingface.co/datasets/bezir/MATH-500-multilingual.

  2. math500

    • huggingface.co
    Updated Mar 19, 2025
    Cite
    Wei Xiong (2025). math500 [Dataset]. https://huggingface.co/datasets/weqweasdas/math500
    Dataset updated
    Mar 19, 2025
    Authors
    Wei Xiong
    Description

    weqweasdas/math500 dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. MATH-500-SUMMARY

    • huggingface.co
    Updated Mar 23, 2025
    + more versions
    Cite
    Alperen Gozeten (2025). MATH-500-SUMMARY [Dataset]. https://huggingface.co/datasets/alperengozeten/MATH-500-SUMMARY
    Dataset updated
    Mar 23, 2025
    Authors
    Alperen Gozeten
    Description

    alperengozeten/MATH-500-SUMMARY dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. MATH Dataset

    • paperswithcode.com
    • opendatalab.com
    • +2 more
    Updated Jan 10, 2025
    + more versions
    Cite
    Dan Hendrycks; Collin Burns; Saurav Kadavath; Akul Arora; Steven Basart; Eric Tang; Dawn Song; Jacob Steinhardt (2025). MATH Dataset [Dataset]. https://paperswithcode.com/dataset/math
    Dataset updated
    Jan 10, 2025
    Authors
    Dan Hendrycks; Collin Burns; Saurav Kadavath; Akul Arora; Steven Basart; Eric Tang; Dawn Song; Jacob Steinhardt
    Description

    MATH is a new dataset of 12,500 challenging competition mathematics problems. Each problem in MATH has a full step-by-step solution which can be used to teach models to generate answer derivations and explanations.

  5. Mistral-MATH500-Test-Result-of-Mistral-ORM

    • huggingface.co
    Updated Nov 8, 2024
    + more versions
    Cite
    Mistral-MATH500-Test-Result-of-Mistral-ORM [Dataset]. https://huggingface.co/datasets/RLHFlow/Mistral-MATH500-Test-Result-of-Mistral-ORM
    Dataset updated
    Nov 8, 2024
    Dataset authored and provided by
    RLHFlow
    Description

    RLHFlow/Mistral-MATH500-Test-Result-of-Mistral-ORM dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. MATH500-sft-prm800k-llama31-8b-steptok_temp0-0_300

    • huggingface.co
    Updated Dec 17, 2024
    + more versions
    Cite
    Violet Xiang (2024). MATH500-sft-prm800k-llama31-8b-steptok_temp0-0_300 [Dataset]. https://huggingface.co/datasets/violetxi/MATH500-sft-prm800k-llama31-8b-steptok_temp0-0_300
    Dataset updated
    Dec 17, 2024
    Authors
    Violet Xiang
    Description

    violetxi/MATH500-sft-prm800k-llama31-8b-steptok_temp0-0_300 dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Solar Mini Math 500 by Model on Upstage

    • artificialanalysis.ai
    Updated Dec 10, 2024
    Cite
    Artificial Analysis (2024). Solar Mini Math 500 by Model on Upstage [Dataset]. https://artificialanalysis.ai/providers/upstage
    Dataset updated
    Dec 10, 2024
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of by Model

  8. Major AI models, by math and computational reasoning

    • statista.com
    Updated Mar 15, 2025
    Cite
    Statista (2025). Major AI models, by math and computational reasoning [Dataset]. https://www.statista.com/statistics/1600812/ai-math-benchmarking-ranking/
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2025
    Area covered
    Worldwide
    Description

    In 2024, the Artificial Analysis Math Index ranked AI models on their mathematical reasoning using benchmarks such as AIME 2024 and Math-500. o1, QwQ-32B, and DeepSeek R1 led the rankings, showing the highest proficiency in mathematical problem solving.

  9. Math Index by Nova Endpoint

    • artificialanalysis.ai
    Updated Feb 13, 2025
    Cite
    Artificial Analysis (2025). Math Index by Nova Endpoint [Dataset]. https://artificialanalysis.ai/models/nova-pro
    Dataset updated
    Feb 13, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  10. MATH500-verification-prm

    • huggingface.co
    Updated Mar 24, 2025
    Cite
    Zhaoyi Zhou (2025). MATH500-verification-prm [Dataset]. https://huggingface.co/datasets/zzy1123/MATH500-verification-prm
    Dataset updated
    Mar 24, 2025
    Authors
    Zhaoyi Zhou
    Description

    zzy1123/MATH500-verification-prm dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. Math Index by QwQ Endpoint

    • artificialanalysis.ai
    Updated Mar 6, 2025
    + more versions
    Cite
    Artificial Analysis (2025). Math Index by QwQ Endpoint [Dataset]. https://artificialanalysis.ai/models/qwq-32b
    Dataset updated
    Mar 6, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  12. 20250317-math500-sampling-solutions-32-temp

    • huggingface.co
    Updated Mar 17, 2025
    + more versions
    Cite
    Ziyang (2025). 20250317-math500-sampling-solutions-32-temp [Dataset]. https://huggingface.co/datasets/hzy/20250317-math500-sampling-solutions-32-temp
    Dataset updated
    Mar 17, 2025
    Authors
    Ziyang
    Description

    hzy/20250317-math500-sampling-solutions-32-temp dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. Math Index by Model

    • artificialanalysis.ai
    Updated Feb 19, 2025
    Cite
    Artificial Analysis (2025). Math Index by Model [Dataset]. https://artificialanalysis.ai/
    Dataset updated
    Feb 19, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  14. Math Index by DeepSeek-V2-Chat Endpoint

    • artificialanalysis.ai
    Updated Feb 25, 2025
    + more versions
    Cite
    Artificial Analysis (2025). Quality vs. Context Window by DeepSeek-V2-Chat Endpoint [Dataset]. https://artificialanalysis.ai/models/deepseek-v2
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  15. Math Index by o1 Endpoint

    • artificialanalysis.ai
    Updated Feb 25, 2025
    Cite
    Artificial Analysis (2025). Math Index by o1 Endpoint [Dataset]. https://artificialanalysis.ai/models/o1
    Dataset updated
    Feb 25, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  16. MATH500-with-hints

    • huggingface.co
    Updated Mar 24, 2025
    Cite
    Alperen Gozeten (2025). MATH500-with-hints [Dataset]. https://huggingface.co/datasets/alperengozeten/MATH500-with-hints
    Dataset updated
    Mar 24, 2025
    Authors
    Alperen Gozeten
    Description

    alperengozeten/MATH500-with-hints dataset hosted on Hugging Face and contributed by the HF Datasets community

  17. Math Index by o1-preview Endpoint

    • artificialanalysis.ai
    Updated Feb 13, 2025
    Cite
    Artificial Analysis (2025). Math Index by o1-preview Endpoint [Dataset]. https://artificialanalysis.ai/models/o1-preview
    Dataset updated
    Feb 13, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  18. MATH500-qwen-32B_

    • huggingface.co
    + more versions
    Cite
    Yoonho Lee, MATH500-qwen-32B_ [Dataset]. https://huggingface.co/datasets/yoonholee/MATH500-qwen-32B_
    Authors
    Yoonho Lee
    Description

    yoonholee/MATH500-qwen-32B_ dataset hosted on Hugging Face and contributed by the HF Datasets community

  19. Math Index by Gemini Endpoint

    • artificialanalysis.ai
    Updated Feb 6, 2025
    Cite
    Artificial Analysis (2025). Math Index by Gemini Endpoint [Dataset]. https://artificialanalysis.ai/models/gemini-2-0-flash
    Dataset updated
    Feb 6, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 & Math-500).

  20. AceMath-RewardBench

    • huggingface.co
    Updated Mar 18, 2025
    Cite
    NVIDIA (2025). AceMath-RewardBench [Dataset]. https://huggingface.co/datasets/nvidia/AceMath-RewardBench
    Dataset updated
    Mar 18, 2025
    Dataset provided by
    Nvidia (http://nvidia.com/)
    Authors
    NVIDIA
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description


      AceMath-RewardBench Evaluation Dataset Card
    

    The AceMath-RewardBench evaluation dataset evaluates capabilities of a math reward model using the best-of-N (N=8) setting for 7 datasets:

    GSM8K: 1319 questions
    Math500: 500 questions
    Minerva Math: 272 questions
    Gaokao 2023 en: 385 questions
    OlympiadBench: 675 questions
    College Math: 2818 questions
    MMLU STEM: 3018 questions

    Each example in the dataset contains:

    A mathematical question
    64 solution attempts with varying…

    See the full description on the dataset page: https://huggingface.co/datasets/nvidia/AceMath-RewardBench.
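
The AceMath-RewardBench entry above describes a best-of-N (N=8) evaluation over questions that each carry 64 solution attempts. The following is a minimal sketch of how such a score could be computed, not the dataset's official evaluation script. The function name `best_of_n_accuracy`, the nested-list layout of `scores` and `correct`, and the choice to sample N attempts at random over repeated trials are all assumptions made for illustration; only the question counts are taken from the description.

```python
import random

# Question counts per benchmark, copied from the dataset description above.
QUESTION_COUNTS = {
    "GSM8K": 1319,
    "Math500": 500,
    "Minerva Math": 272,
    "Gaokao 2023 en": 385,
    "OlympiadBench": 675,
    "College Math": 2818,
    "MMLU STEM": 3018,
}


def best_of_n_accuracy(scores, correct, n=8, trials=100, seed=0):
    """Estimate best-of-N accuracy of a reward model.

    scores[q][i]  -- reward-model score for attempt i on question q
    correct[q][i] -- True if attempt i on question q is correct
    Both layouts are hypothetical; the real column names may differ.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        for q_scores, q_correct in zip(scores, correct):
            # Draw n candidate attempts, then keep the one the
            # reward model ranks highest.
            candidates = rng.sample(range(len(q_scores)), n)
            best = max(candidates, key=lambda i: q_scores[i])
            hits += bool(q_correct[best])
    return hits / (trials * len(scores))


# Total number of questions across the seven benchmarks.
total_questions = sum(QUESTION_COUNTS.values())
```

With 64 attempts per question, calling `best_of_n_accuracy(scores, correct)` would draw 8 of them per trial and report how often the top-scored attempt is correct. The seven question counts listed in the description sum to 8987.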
