7 datasets found
  1. MathVista Dataset

    • paperswithcode.com
    • huggingface.co
    Updated Nov 21, 2024
    Cite
    Pan Lu; Hritik Bansal; Tony Xia; Jiacheng Liu; Chunyuan Li; Hannaneh Hajishirzi; Hao Cheng; Kai-Wei Chang; Michel Galley; Jianfeng Gao (2024). MathVista Dataset [Dataset]. https://paperswithcode.com/dataset/mathvista
    Dataset updated
    Nov 21, 2024
    Authors
    Pan Lu; Hritik Bansal; Tony Xia; Jiacheng Liu; Chunyuan Li; Hannaneh Hajishirzi; Hao Cheng; Kai-Wei Chang; Michel Galley; Jianfeng Gao
    Description

    MathVista is a consolidated Mathematical reasoning benchmark within Visual contexts. It consists of three newly created datasets, IQTest, FunctionQA, and PaperQA, which address the missing visual domains and are tailored to evaluate logical reasoning on puzzle test figures, algebraic reasoning over functional plots, and scientific reasoning with academic paper figures, respectively. It also incorporates 9 MathQA datasets and 19 VQA datasets from the literature, which significantly enrich the diversity and complexity of visual perception and mathematical reasoning challenges within our benchmark. In total, MathVista includes 6,141 examples collected from 31 different datasets.

    Project: https://mathvista.github.io/
    Visualization: https://mathvista.github.io/#visualization
    Leaderboard: https://mathvista.github.io/#leaderboard
    Paper: https://arxiv.org/abs/2310.02255
    Data: https://huggingface.co/datasets/AI4Math/MathVista
    Code: https://github.com/lupantech/MathVista

  2. VIM-MathVista

    • huggingface.co
    Updated Mar 14, 2024
    Cite
    VIM-Bench (2024). VIM-MathVista [Dataset]. https://huggingface.co/datasets/VIM-Bench/VIM-MathVista
    Dataset updated
    Mar 14, 2024
    Dataset authored and provided by
    VIM-Bench
    Description

    VIM-Bench/VIM-MathVista dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. MathVista

    • huggingface.co
    Updated Oct 16, 2023
    Cite
    MathVista [Dataset]. https://huggingface.co/datasets/macabdul9/MathVista
    Dataset updated
    Oct 16, 2023
    Authors
    Abdul Waheed
    Description

    macabdul9/MathVista dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. MathVista-CoT-num10

    • huggingface.co
    Updated Mar 24, 2025
    Cite
    MathVista-CoT-num10 [Dataset]. https://huggingface.co/datasets/yuanshengni/MathVista-CoT-num10
    Dataset updated
    Mar 24, 2025
    Authors
    Yuansheng Ni
    Description

    yuanshengni/MathVista-CoT-num10 dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. Math Index by Nova Endpoint

    • artificialanalysis.ai
    Updated Feb 13, 2025
    Cite
    Artificial Analysis (2025). Math Index by Nova Endpoint [Dataset]. https://artificialanalysis.ai/models/nova-pro
    Dataset updated
    Feb 13, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of the Math Index by model. The Math Index represents the average of the math benchmarks in the Artificial Analysis Intelligence Index (AIME 2024 and Math-500).

  6. Intelligence Index by GPT-4.5 Endpoint

    • artificialanalysis.ai
    Updated Mar 15, 2025
    Cite
    Artificial Analysis (2025). Intelligence Index by GPT-4.5 Endpoint [Dataset]. https://artificialanalysis.ai/models/gpt-4-5
    Dataset updated
    Mar 15, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of the Intelligence Index by model. The Intelligence Index incorporates 7 evaluations spanning reasoning, knowledge, math, and coding.

  7. Intelligence Index by Nova Endpoint

    • artificialanalysis.ai
    Updated Feb 13, 2025
    Cite
    Artificial Analysis (2025). Intelligence Index by Nova Endpoint [Dataset]. https://artificialanalysis.ai/models/nova-pro
    Dataset updated
    Feb 13, 2025
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of the Intelligence Index by model. The Intelligence Index incorporates 7 evaluations spanning reasoning, knowledge, math, and coding.



MathVista Dataset

Mathematical Reasoning in Visual Contexts

388 scholarly articles cite this dataset (View in Google Scholar)
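
The composition stated in the MathVista description (three newly created datasets plus 9 MathQA and 19 VQA datasets, for 31 sources and 6,141 examples in total) can be checked with a short tally. The following is a minimal Python sketch: the dictionary labels and function names are illustrative assumptions, not part of any MathVista API, and the per-source average is only a derived figure that says nothing about how examples are actually distributed across sources.

```python
# Source counts as stated in the MathVista description.
# The labels are illustrative, not official field names.
SOURCE_COUNTS = {
    "newly created (IQTest, FunctionQA, PaperQA)": 3,
    "MathQA datasets from the literature": 9,
    "VQA datasets from the literature": 19,
}

STATED_TOTAL_SOURCES = 31     # "31 different datasets"
STATED_TOTAL_EXAMPLES = 6141  # "6,141 examples"


def total_sources(counts):
    """Sum the number of source datasets across all categories."""
    return sum(counts.values())


def mean_examples_per_source(n_examples, n_sources):
    """Average number of examples per source dataset (a derived figure)."""
    return n_examples / n_sources


if __name__ == "__main__":
    n_sources = total_sources(SOURCE_COUNTS)
    print("sources:", n_sources)
    print("matches stated total:", n_sources == STATED_TOTAL_SOURCES)
    mean = mean_examples_per_source(STATED_TOTAL_EXAMPLES, n_sources)
    print("mean examples per source:", round(mean, 1))
```

The category counts add up to the stated 31 sources, so the description is internally consistent.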
