2 datasets found
  1. MMMU

    • huggingface.co
    Updated Dec 4, 2023
    MMMU (2023). MMMU [Dataset]. https://huggingface.co/datasets/MMMU/MMMU
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated: Dec 4, 2023
    Dataset authored and provided by: MMMU
    License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    MMMU (A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI)

    🌐 Homepage | 🏆 Leaderboard | 🤗 Dataset | 🤗 Paper | 📖 arXiv | GitHub

    🔔 News

    🛠️ [2024-05-30]: Fixed duplicate option issues in Materials dataset items (validation_Materials_25; test_Materials_17, 242) and content error in validation_Materials_25.
    🛠️ [2024-04-30]: Fixed missing "-" or "^" signs in Math dataset items (dev_Math_2, validation_Math_11, 12, 16; test_Math_8…

    See the full description on the dataset page: https://huggingface.co/datasets/MMMU/MMMU.

  2. MMMU_with_difficulty_level

    • huggingface.co
    Updated Jul 15, 2025
    Jierun Chen (2025). MMMU_with_difficulty_level [Dataset]. https://huggingface.co/datasets/JierunChen/MMMU_with_difficulty_level
    Dataset updated: Jul 15, 2025
    Authors: Jierun Chen
    License: Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    MMMU with difficulty level tags

    This dataset extends the 🤗 MMMU val benchmark with two additional tags: passrate_for_qwen2.5_vl_7b and difficulty_level_for_qwen2.5_vl_7b. Further details are available in our paper, "The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs".

    🚀 Data Usage

    from datasets import load_dataset

    # Download the dataset from the Hugging Face Hub, then print its splits and columns
    dataset = load_dataset("JierunChen/MMMU_with_difficulty_level")
    print(dataset)

    📑… See the full description on the dataset page: https://huggingface.co/datasets/JierunChen/MMMU_with_difficulty_level.
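As a rough illustration of how the two added tags might be used together, the sketch below groups rows by difficulty tag and averages the pass rate per group. Only the two column names come from the description above. The function name `mean_passrate_by_level`, the label values "easy" and "hard", and the sample rows are hypothetical placeholders, since the description does not state what values the tags take.

```python
from collections import defaultdict
from statistics import mean

# Column names as given in the dataset description.
PASSRATE_KEY = "passrate_for_qwen2.5_vl_7b"
LEVEL_KEY = "difficulty_level_for_qwen2.5_vl_7b"


def mean_passrate_by_level(rows):
    """Group rows by difficulty tag and average the pass rate in each group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[LEVEL_KEY]].append(float(row[PASSRATE_KEY]))
    return {level: mean(values) for level, values in grouped.items()}


# Hypothetical rows for illustration only; the real label names and pass
# rates come from the dataset itself.
sample_rows = [
    {LEVEL_KEY: "easy", PASSRATE_KEY: 1.0},
    {LEVEL_KEY: "easy", PASSRATE_KEY: 0.5},
    {LEVEL_KEY: "hard", PASSRATE_KEY: 0.0},
]
summary = mean_passrate_by_level(sample_rows)
```

With the real data, a split of the loaded dataset could be passed in place of `sample_rows`, since iterating a split yields one dict per row. The split names are not given in the description, so check the output of `print(dataset)` first.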
    

MMMU/MMMU: 2 scholarly articles cite this dataset (View in Google Scholar)
