8 datasets found
  1. Math-PUMA_Data_Stage2

    • huggingface.co
    Updated Oct 5, 2024
    + more versions
    Cite
    Math-PUMA (2024). Math-PUMA_Data_Stage2 [Dataset]. https://huggingface.co/datasets/Math-PUMA/Math-PUMA_Data_Stage2
    Formats
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset provided by
    Math-PUMA
    Authors
    Math-PUMA
    License

    GPL-3.0 (https://choosealicense.com/licenses/gpl-3.0/)

    Description

    Citation

    If you find our work helpful, feel free to cite us:

    @article{zhuang2024math,
      title={Math-puma: Progressive upward multimodal alignment to enhance mathematical reasoning},
      author={Zhuang, Wenwen and Huang, Xin and Zhang, Xiantao and Zeng, Jin},
      journal={arXiv preprint arXiv:2408.08640},
      year={2024}
    }

  2. GSM8K Dataset

    • paperswithcode.com
    • tensorflow.org
    • +2 more
    Updated Dec 31, 2024
    Cite
    Karl Cobbe; Vineet Kosaraju; Mohammad Bavarian; Mark Chen; Heewoo Jun; Lukasz Kaiser; Matthias Plappert; Jerry Tworek; Jacob Hilton; Reiichiro Nakano; Christopher Hesse; John Schulman (2024). GSM8K Dataset [Dataset]. https://paperswithcode.com/dataset/gsm8k
    Authors
    Karl Cobbe; Vineet Kosaraju; Mohammad Bavarian; Mark Chen; Heewoo Jun; Lukasz Kaiser; Matthias Plappert; Jerry Tworek; Jacob Hilton; Reiichiro Nakano; Christopher Hesse; John Schulman
    Description

    GSM8K is a dataset of 8.5K high-quality, linguistically diverse grade-school math word problems created by human problem writers. The dataset is segmented into 7.5K training problems and 1K test problems. These problems take between 2 and 8 steps to solve, and solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+, −, ×, ÷) to reach the final answer. A bright middle school student should be able to solve every problem. The dataset can be used for multi-step mathematical reasoning.
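    In the released dataset, each solution marks its final answer with a trailing `#### <answer>` line, which is what evaluation scripts typically compare against. A minimal sketch of pulling that answer out (the helper name and the sample solution text here are illustrative, not taken from any dataset loader):

    ```python
    import re

    def extract_final_answer(solution: str) -> str:
        """Pull the final numeric answer from a GSM8K-style solution string.

        Solutions end with a line of the form '#### <answer>'.
        """
        match = re.search(r"####\s*(-?[\d,\.]+)", solution)
        if match is None:
            raise ValueError("no '#### <answer>' marker found")
        # Strip thousands separators so '1,234' compares equal to '1234'.
        return match.group(1).replace(",", "")

    # A sample solution written in the dataset's style:
    sample = (
        "Natalia sold 48 clips in April, and half as many in May.\n"
        "In May she sold 48 / 2 = 24 clips.\n"
        "Altogether she sold 48 + 24 = 72 clips.\n"
        "#### 72"
    )
    print(extract_final_answer(sample))  # prints 72
    ```

    Comparing only this extracted final value, rather than the full reasoning chain, is the usual way accuracy is scored on this benchmark.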

  3. gsm8k

    • huggingface.co
    Updated Aug 11, 2022
    + more versions
    Cite
    OpenAI (2022). gsm8k [Dataset]. https://huggingface.co/datasets/openai/gsm8k
    Formats
    Croissant
    Dataset authored and provided by
    OpenAI (http://openai.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset Card for GSM8K

      Dataset Summary
    

    GSM8K (Grade School Math 8K) is a dataset of 8.5K high quality linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning.

    These problems take between 2 and 8 steps to solve. Solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+, −, ×, ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.

  4. Comparative experiments of multimodal sentiment analysis models on the dataset CMU-MOSEI

    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    + more versions
    Cite
    Ji Mingyu; Zhou Jiawei; Wei Ning (2023). Comparative experiments of multimodal sentiment analysis models on the dataset CMU-MOSEI. [Dataset]. http://doi.org/10.1371/journal.pone.0273936.t005
    Dataset provided by
    PLOS ONE
    Authors
    Ji Mingyu; Zhou Jiawei; Wei Ning
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparative experiments of multimodal sentiment analysis models on the dataset CMU-MOSEI.

  5. IconQA Dataset

    • paperswithcode.com
    Updated Mar 7, 2024
    Cite
    IconQA Dataset [Dataset]. https://paperswithcode.com/dataset/iconqa
    Authors
    Pan Lu; Liang Qiu; Jiaqi Chen; Tony Xia; Yizhou Zhao; Wei Zhang; Zhou Yu; Xiaodan Liang; Song-Chun Zhu
    Description

    Current visual question answering (VQA) tasks mainly consider answering human-annotated questions about natural images in daily-life contexts. Icon question answering (IconQA) is a benchmark that aims to highlight the importance of abstract diagram understanding and comprehensive cognitive reasoning in real-world diagram word problems. For this benchmark, a large-scale IconQA dataset was built that consists of three sub-tasks: multi-image-choice, multi-text-choice, and filling-in-the-blank. Compared to existing VQA benchmarks, IconQA requires not only perception skills like object recognition and text understanding, but also diverse cognitive reasoning skills, such as geometric reasoning, commonsense reasoning, and arithmetic reasoning.

    Description from: IconQA

  6. MMLU Dataset

    • paperswithcode.com
    Updated Jan 5, 2025
    + more versions
    Cite
    MMLU Dataset [Dataset]. https://paperswithcode.com/dataset/mmlu
    Authors
    Dan Hendrycks; Collin Burns; Steven Basart; Andy Zou; Mantas Mazeika; Dawn Song; Jacob Steinhardt
    Description

    MMLU (Massive Multitask Language Understanding) is a benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings. This makes the benchmark more challenging and more similar to how we evaluate humans. The benchmark covers 57 subjects across STEM, the humanities, the social sciences, and more. It ranges in difficulty from an elementary level to an advanced professional level, and it tests both world knowledge and problem-solving ability. Subjects range from traditional areas, such as mathematics and history, to more specialized areas like law and ethics. The granularity and breadth of the subjects make the benchmark ideal for identifying a model's blind spots.
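    The few-shot setting described above is usually implemented by concatenating a handful of answered development questions before the unanswered test question. A minimal sketch of that prompt format (the header wording and the questions below are illustrative assumptions, not items from the dataset itself):

    ```python
    # Sketch of building a few-shot, four-way multiple-choice prompt in the
    # MMLU style. All question text here is invented for illustration.

    CHOICE_LABELS = "ABCD"

    def format_example(question, choices, answer=None):
        """Render one question; append the answer letter for solved examples."""
        lines = [question]
        for label, choice in zip(CHOICE_LABELS, choices):
            lines.append(f"{label}. {choice}")
        suffix = CHOICE_LABELS[answer] if answer is not None else ""
        # The test question ends with a bare 'Answer:' for the model to complete.
        lines.append(f"Answer: {suffix}".rstrip())
        return "\n".join(lines)

    def build_prompt(subject, dev_examples, test_question, test_choices):
        """Concatenate answered dev examples, then the unanswered test question."""
        header = (
            f"The following are multiple choice questions "
            f"(with answers) about {subject}.\n\n"
        )
        shots = "\n\n".join(format_example(q, c, a) for q, c, a in dev_examples)
        query = format_example(test_question, test_choices)
        return header + shots + "\n\n" + query

    prompt = build_prompt(
        "elementary mathematics",
        [("What is 2 + 2?", ["3", "4", "5", "6"], 1)],  # answer index 1 -> "B"
        "What is 3 x 3?",
        ["6", "9", "12", "18"],
    )
    print(prompt)
    ```

    Scoring then reduces to checking which of the four answer letters the model continues the prompt with, which is what makes the fixed A/B/C/D format convenient for evaluation.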

  7. CMU-MOSI dataset information

    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    + more versions
    Cite
    Ji Mingyu; Zhou Jiawei; Wei Ning (2023). CMU-MOSI dataset information. [Dataset]. http://doi.org/10.1371/journal.pone.0273936.t001
    Dataset provided by
    PLOS ONE
    Authors
    Ji Mingyu; Zhou Jiawei; Wei Ning
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CMU-MOSI dataset information.

  8. R1-Onevision

    • huggingface.co
    Updated Feb 25, 2025
    + more versions
    Cite
    R1-Onevision [Dataset]. https://huggingface.co/datasets/Fancy-MLLM/R1-Onevision
    Authors
    ikunhub
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    R1-Onevision

    [📂 GitHub] [📝 Paper] [🤗 Reasoning Benchmark] [🤗 HF Demo]

      Dataset Overview
    The R1-Onevision dataset is a meticulously crafted resource designed to empower models with advanced multimodal reasoning capabilities. Aimed at bridging the gap between visual and textual understanding, this dataset provides rich, context-aware reasoning tasks across diverse domains, including natural scenes, science, mathematical problems… See the full description on the dataset page: https://huggingface.co/datasets/Fancy-MLLM/R1-Onevision.
