5 datasets found
  1. mmlu-redux

    • huggingface.co
    Updated Feb 8, 2025
    Cite
    Edinburgh Dataset Analytics Working Group (2025). mmlu-redux [Dataset]. http://doi.org/10.57967/hf/2507
    Dataset updated
    Feb 8, 2025
    Dataset authored and provided by
    Edinburgh Dataset Analytics Working Group
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for MMLU-Redux

    Tip: Please consider using MMLU-Redux-2.0, which contains all 57 MMLU subjects.

    MMLU-Redux is a subset of 3,000 manually re-annotated questions across 30 MMLU subjects.

    News

    [2025.02.08] We corrected one annotation in the High School Mathematics subset, as noted in the PlatinumBench paper.
    [2025.01.23] MMLU-Redux is accepted to NAACL 2025!

    Dataset Details

    Dataset Description

    Each data point in MMLU-Redux contains… See the full description on the dataset page: https://huggingface.co/datasets/edinburgh-dawg/mmlu-redux.

  2. MMLU-Redux-MCQ

    • huggingface.co
    Updated May 29, 2025
    Cite
    aqweteddy (2025). MMLU-Redux-MCQ [Dataset]. https://huggingface.co/datasets/aqweteddy/MMLU-Redux-MCQ
    Dataset updated
    May 29, 2025
    Authors
    aqweteddy
    Description

    aqweteddy/MMLU-Redux-MCQ dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. mmlu-redux-2.0

    • huggingface.co
    Updated Nov 15, 2024
    Cite
    Zhiwei He (2024). mmlu-redux-2.0 [Dataset]. https://huggingface.co/datasets/zwhe99/mmlu-redux-2.0
    Dataset updated
    Nov 15, 2024
    Authors
    Zhiwei He
    Description

    Converts the subsets of edinburgh-dawg/mmlu-redux-2.0 into a subject feature.
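As a rough illustration of what turning per-subject subsets into a subject feature could look like, here is a minimal sketch in plain Python. It is not the author's actual script: the function name `merge_subsets`, the row fields `question` and `answer`, and the toy data are all hypothetical stand-ins.

```python
from typing import Dict, List

Row = Dict[str, object]


def merge_subsets(subsets: Dict[str, List[Row]]) -> List[Row]:
    """Flatten {subset_name: rows} into one list, tagging each row with its subject.

    Hypothetical helper: the real dataset may have been built differently.
    """
    merged: List[Row] = []
    for subject, rows in subsets.items():
        for row in rows:
            # Copy each row so the input data is left unmodified.
            merged.append({**row, "subject": subject})
    return merged


# Hypothetical toy data standing in for two per-subject subsets.
toy_subsets = {
    "anatomy": [{"question": "Q1", "answer": 0}],
    "virology": [{"question": "Q2", "answer": 3}, {"question": "Q3", "answer": 1}],
}
merged = merge_subsets(toy_subsets)
```

With a single flat table, a subject can be selected by filtering on the `subject` value rather than by loading a separate subset.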

  4. mmlu_redux_filtered

    • huggingface.co
    Updated Apr 16, 2025
    Cite
    Miguel A. Mendez L. (2025). mmlu_redux_filtered [Dataset]. https://huggingface.co/datasets/miguelamendez/mmlu_redux_filtered
    Dataset updated
    Apr 16, 2025
    Authors
    Miguel A. Mendez L.
    License

    https://choosealicense.com/licenses/cc/

    Description

    MMLU REDUX FILTERED

    I postprocessed the "edinburgh-dawg/mmlu-redux" dataset into macro-categories and added to each category only the samples where type_error had the label ok. This is the macro-category division:

    Medicine and Health: [anatomy, clinical_knowledge, college_medicine, human_aging]
    Science: [college_chemistry, college_physics, high_school_chemistry, high_school_physics, virology, conceptual_physics, astronomy]
    Mathematics: [college_mathematics, high_school_mathematics, high_school_statistics…

    See the full description on the dataset page: https://huggingface.co/datasets/miguelamendez/mmlu_redux_filtered.
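The filtering described above (keep only samples labelled ok, then bucket them by macro-category) might be sketched as follows. This is an assumption-laden illustration, not the author's code: the function `filter_and_group`, the `subject` field, and the toy rows are hypothetical, and the label field name is a parameter because the description calls it type_error while the upstream column may be named differently (assumed here to be `error_type`). The Mathematics list is truncated in the source, so only the subjects visible in the description are included.

```python
from collections import defaultdict
from typing import Dict, List

Row = Dict[str, object]

# Macro-categories copied from the description; the Mathematics entry is
# truncated in the source, so it holds only the subjects that are visible.
MACRO_CATEGORIES: Dict[str, List[str]] = {
    "Medicine and Health": [
        "anatomy", "clinical_knowledge", "college_medicine", "human_aging",
    ],
    "Science": [
        "college_chemistry", "college_physics", "high_school_chemistry",
        "high_school_physics", "virology", "conceptual_physics", "astronomy",
    ],
    "Mathematics": [
        "college_mathematics", "high_school_mathematics", "high_school_statistics",
    ],
}


def filter_and_group(
    rows: List[Row],
    label_field: str = "error_type",  # assumption: the description says "type_error"
    subject_field: str = "subject",   # assumption: hypothetical per-row subject field
) -> Dict[str, List[Row]]:
    """Keep rows labelled "ok" and group them by macro-category."""
    subject_to_macro = {
        subject: macro
        for macro, subjects in MACRO_CATEGORIES.items()
        for subject in subjects
    }
    grouped: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        if row.get(label_field) != "ok":
            continue  # drop samples flagged with any annotation error
        macro = subject_to_macro.get(row.get(subject_field))
        if macro is not None:  # skip subjects outside the listed categories
            grouped[macro].append(row)
    return dict(grouped)


# Hypothetical toy rows, not real dataset entries.
toy_rows = [
    {"subject": "anatomy", "error_type": "ok", "question": "Q1"},
    {"subject": "anatomy", "error_type": "wrong_groundtruth", "question": "Q2"},
    {"subject": "virology", "error_type": "ok", "question": "Q3"},
    {"subject": "philosophy", "error_type": "ok", "question": "Q4"},
]
grouped = filter_and_group(toy_rows)
```

Rows whose subject is not in any listed category are skipped rather than guessed at, since the full category list is not visible here.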

  5. Judgement-baseline

    • huggingface.co
    Updated May 9, 2025
    Cite
    Sleeping AI (2025). Judgement-baseline [Dataset]. https://huggingface.co/datasets/sleeping-ai/Judgement-baseline
    Dataset updated
    May 9, 2025
    Dataset authored and provided by
    Sleeping AI
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Model Name | Params | MMLU-Pro-Plus Baseline Drop MMLU-Pro Baseline Drop Added Exp MMLU Pro Plus Added MMLU-redux 2.0 Baseline Drop AQUA-RAT Baseline Drop
    CohereLabs/c4ai-command-a-03-2025 111B ✅ (single inference) ✅ done ✅ (HF naive batch) ✅ done ✅ done | - | - | -
    google/gemma-3-12b-it 12B ✅ (HF naive batch) ✅ done ✅ (HF naive batch) ✅ done ✅ done | - | - | -
    meta-llama/Llama-4-Scout-17B-16E 17B ✅ (HF naive batch) ✅ done ✅ (HF naive batch) ✅ done ✅ done | - | - | -

    Qwen/Qwen3-4B 4B… See the full description on the dataset page: https://huggingface.co/datasets/sleeping-ai/Judgement-baseline.



mmlu-redux

MMLU-Redux

edinburgh-dawg/mmlu-redux

51 scholarly articles cite this dataset.
