25 datasets found
  1. MMMLU

    • huggingface.co
    Updated Sep 17, 2024
    Cite
    OpenAI (2024). MMMLU [Dataset]. https://huggingface.co/datasets/openai/MMMLU
    Dataset updated
    Sep 17, 2024
    Dataset authored and provided by
    OpenAI (https://openai.com/)
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Multilingual Massive Multitask Language Understanding (MMMLU)

    The MMLU is a widely recognized benchmark of general knowledge attained by AI models. It covers a broad range of topics from 57 different categories, covering elementary-level knowledge up to advanced professional subjects like law, physics, history, and computer science. We translated the MMLU’s test set into 14 languages using professional human translators. Relying on human translators for this evaluation increases… See the full description on the dataset page: https://huggingface.co/datasets/openai/MMMLU.

  2. mmmlu

    • huggingface.co
    Updated Feb 24, 2023
    Cite
    Nathan Cooper (2023). mmmlu [Dataset]. https://huggingface.co/datasets/ncoop57/mmmlu
    Dataset updated
    Feb 24, 2023
    Authors
    Nathan Cooper
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This new dataset is designed to solve this great NLP task and is crafted with a lot of care.

  3. R3-eval-MMMLU

    • huggingface.co
    Updated Jun 12, 2025
    Cite
    Shou-Yi Hung (2025). R3-eval-MMMLU [Dataset]. https://huggingface.co/datasets/HLeiTR/R3-eval-MMMLU
    Dataset updated
    Jun 12, 2025
    Authors
    Shou-Yi Hung
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    HLeiTR/R3-eval-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. Multilingual MMLU (MMMLU) by Models Model

    • artificialanalysis.ai
    Updated Dec 8, 2024
    Cite
    Artificial Analysis (2024). Multilingual MMLU (MMMLU) by Models Model [Dataset]. https://artificialanalysis.ai/models/multilingual/language/english
    Dataset updated
    Dec 8, 2024
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Multilingual MMLU (MMMLU); Multilingual Grade School Math (MGSM) by Model

  5. sample-MMMLU

    • huggingface.co
    Updated May 29, 2025
    Cite
    Xinyu ZHANG (2025). sample-MMMLU [Dataset]. https://huggingface.co/datasets/crystina-z/sample-MMMLU
    Dataset updated
    May 29, 2025
    Authors
    Xinyu ZHANG
    Description

    crystina-z/sample-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. mmmlu-ja-annotated

    • huggingface.co
    Updated Feb 5, 2025
    Cite
    System K Dev. (2025). mmmlu-ja-annotated [Dataset]. https://huggingface.co/datasets/systemk/mmmlu-ja-annotated
    Dataset updated
    Feb 5, 2025
    Dataset authored and provided by
    System K Dev.
    Description

    systemk/mmmlu-ja-annotated dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. mmmlu-pro-eval-Llama-3.1-70B-Instruct-cot

    • huggingface.co
    Updated Oct 14, 2024
    Cite
    Daniel Vila (2024). mmmlu-pro-eval-Llama-3.1-70B-Instruct-cot [Dataset]. https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Llama-3.1-70B-Instruct-cot
    Dataset updated
    Oct 14, 2024
    Authors
    Daniel Vila
    Description

    dvilasuero/mmmlu-pro-eval-Llama-3.1-70B-Instruct-cot dataset hosted on Hugging Face and contributed by the HF Datasets community

  8. MoE-Router-Dataset-MMMLU-Qwen3-30B-A3B

    • huggingface.co
    Updated Jun 10, 2025
    Cite
    Dohyung Kim (2025). MoE-Router-Dataset-MMMLU-Qwen3-30B-A3B [Dataset]. https://huggingface.co/datasets/werty1248/MoE-Router-Dataset-MMMLU-Qwen3-30B-A3B
    Dataset updated
    Jun 10, 2025
    Authors
    Dohyung Kim
    Description

    Dataset

    English and Korean data from MMLU and MMMLU were fed into the Qwen3-30B-A3B model, and the IDs of the top 8 experts selected by the gate were extracted.

    Outputs were generated in both think and nonthink modes.

    Generation hyperparameters

    max_prompt_tokens = 2048  # maximum MMMLU prompt length: 1500+ tokens
    max_think_tokens = 1024
    max_nonthink_tokens = 1024
    temperature = 0.6
    top_p = 0.95

    Generation source code: https://github.com/werty1248/MoE-Analyzer-vLLM
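
    The description says the IDs of the gate's top 8 experts were recorded. A minimal sketch of that selection step for a single token, assuming the router exposes one score per expert, might look like the following. The function name, the tie-breaking rule, and the example scores are hypothetical and are not taken from the linked repository.

```python
from typing import List

TOP_K = 8  # from the description: the gate's top 8 experts are recorded


def top_k_expert_ids(router_logits: List[float], k: int = TOP_K) -> List[int]:
    """Return the ids of the k experts with the highest router score.

    Ties are broken in favour of the lower expert id (an assumption).
    """
    if k > len(router_logits):
        raise ValueError("k cannot exceed the number of experts")
    # Sort expert ids by descending score, then by ascending id.
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: (-router_logits[i], i))
    return ranked[:k]


# Hypothetical router scores for one token over 16 experts.
example_logits = [0.1, 2.3, -0.5, 1.7, 0.0, 3.1, 0.9, -1.2,
                  2.8, 0.4, 1.1, -0.3, 2.0, 0.6, 1.5, 0.2]
selected = top_k_expert_ids(example_logits)
```

    Here `selected` holds eight expert ids ordered from the highest score down; a real pipeline would repeat this per token and per MoE layer.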

  9. mmmlu-pro-eval-Llama-3.1-8B-Instruct-cot

    • huggingface.co
    Updated Oct 18, 2024
    Cite
    Daniel Vila (2024). mmmlu-pro-eval-Llama-3.1-8B-Instruct-cot [Dataset]. https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Llama-3.1-8B-Instruct-cot
    Dataset updated
    Oct 18, 2024
    Authors
    Daniel Vila
    Description

    dvilasuero/mmmlu-pro-eval-Llama-3.1-8B-Instruct-cot dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. MMMLU-fr

    • huggingface.co
    Updated Oct 22, 2024
    Cite
    le-leadboard (2024). MMMLU-fr [Dataset]. https://huggingface.co/datasets/le-leadboard/MMMLU-fr
    Dataset updated
    Oct 22, 2024
    Dataset authored and provided by
    le-leadboard
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Card for MMMLU-fr

    le-leadboard/MMMLU-fr is part of the OpenLLM French Leaderboard initiative and provides a French adaptation of the MMMLU (Multilingual Massive Multitask Language Understanding) benchmark originally developed by OpenAI. It is an exact clone of the French split of the MMMLU dataset.

    Dataset Summary

    MMMLU-fr is the French adaptation of the MMMLU benchmark, incorporating more complex, reasoning-focused questions with a choice… See the full description on the dataset page: https://huggingface.co/datasets/le-leadboard/MMMLU-fr.

  11. Multilingual MMLU , Higher is better by Models Model

    • artificialanalysis.ai
    Updated Dec 27, 2024
    Cite
    Artificial Analysis (2024). Multilingual MMLU , Higher is better by Models Model [Dataset]. https://artificialanalysis.ai/models/multilingual
    Dataset updated
    Dec 27, 2024
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison of Multilingual MMLU , Higher is better; Multilingual GSM , Higher is better by Model

  12. mmmlu-pro-eval-cot-70B

    • huggingface.co
    Updated Oct 11, 2024
    Cite
    Daniel Vila (2024). mmmlu-pro-eval-cot-70B [Dataset]. https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B
    Dataset updated
    Oct 11, 2024
    Authors
    Daniel Vila
    Description

    Dataset Card for mmmlu-pro-eval-cot-70B

    This dataset has been created with distilabel.

      Dataset Summary
    

    This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B/raw/main/pipeline.yaml"

    or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B.

  13. mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot

    • huggingface.co
    Updated Oct 11, 2024
    Cite
    Daniel Vila (2024). mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot [Dataset]. https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot
    Dataset updated
    Oct 11, 2024
    Authors
    Daniel Vila
    Description

    Dataset Card for mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot

    This dataset has been created with distilabel.

      Dataset Summary
    

    This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot/raw/main/pipeline.yaml"

    or explore the configuration: distilabel pipeline info… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot.

  14. MoE-Router-Dataset-Statistic-MMMLU-Qwen3-30B-A3B

    • huggingface.co
    Updated Jun 10, 2025
    Cite
    Dohyung Kim (2025). MoE-Router-Dataset-Statistic-MMMLU-Qwen3-30B-A3B [Dataset]. https://huggingface.co/datasets/werty1248/MoE-Router-Dataset-Statistic-MMMLU-Qwen3-30B-A3B
    Dataset updated
    Jun 10, 2025
    Authors
    Dohyung Kim
    Description

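    Expert statistics computed from the werty1248/MoE-Router-Dataset-MMMLU-Qwen3-30B-A3B data.

    prompt_stat: statistics of the expert IDs selected while processing the prompt

    output_stat: statistics of the expert IDs selected while generating tokens

    first_stat: statistics of the expert IDs selected in the first 128 generated tokens

    last_stat: statistics of the expert IDs selected in the last 128 generated tokens

    total_stat: prompt_stat + output_stat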
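
    The five statistics listed above can be sketched with simple counters. This is an illustration only: it assumes each token is represented by the list of expert ids chosen for it, and the function names and data layout are hypothetical rather than the dataset author's actual code.

```python
from collections import Counter
from typing import Dict, List

WINDOW = 128  # first_stat / last_stat cover the first / last 128 generated tokens


def count_experts(token_experts: List[List[int]]) -> Counter:
    """Count how often each expert id appears across a list of tokens."""
    counts: Counter = Counter()
    for expert_ids in token_experts:
        counts.update(expert_ids)
    return counts


def expert_stats(prompt: List[List[int]], output: List[List[int]],
                 window: int = WINDOW) -> Dict[str, Counter]:
    """Build the five statistics from per-token expert ids.

    `prompt` and `output` each hold one list of expert ids per token
    (a hypothetical layout chosen for this sketch).
    """
    prompt_stat = count_experts(prompt)
    output_stat = count_experts(output)
    return {
        "prompt_stat": prompt_stat,
        "output_stat": output_stat,
        "first_stat": count_experts(output[:window]),
        "last_stat": count_experts(output[-window:]),
        "total_stat": prompt_stat + output_stat,
    }


# Toy example with a window of 2 tokens instead of 128.
toy = expert_stats(prompt=[[0, 1], [1, 2]],
                   output=[[2, 3], [3, 4], [4, 5]],
                   window=2)
```

    When fewer than 128 tokens are generated, the two windows overlap and first_stat and last_stat both equal output_stat under this sketch.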

    Dataset

    English and Korean data from MMLU and MMMLU were fed into the Qwen3-30B-A3B model, and the IDs of the top 8 experts selected by the gate were extracted.

    Outputs were generated in both think and nonthink modes.

    Generation hyperparameters

    max_prompt_tokens = 2048 # maximum MMMLU prompt tokens: 1500+… See the full description on the dataset page: https://huggingface.co/datasets/werty1248/MoE-Router-Dataset-Statistic-MMMLU-Qwen3-30B-A3B.

  15. MMMLU-JA-JP-EN

    • huggingface.co
    Updated Sep 17, 2024
    Cite
    Koki Maeda (2024). MMMLU-JA-JP-EN [Dataset]. https://huggingface.co/datasets/Silviase/MMMLU-JA-JP-EN
    Dataset updated
    Sep 17, 2024
    Authors
    Koki Maeda
    Description

    Mmmlu (English Translation)

    This is an English translation of the openai/MMMLU dataset, translated using plamo-translate.

      Dataset Description
    

    This dataset is part of the llm-jp-eval-mm benchmark suite. The original Japanese questions and answers have been translated to English while preserving the visual content.

      Translation Details
    

    Translation Model: pfnet/plamo-translate
    Fields Translated: question, choices, answer
    Original Language: Japanese
    Target… See the full description on the dataset page: https://huggingface.co/datasets/Silviase/MMMLU-JA-JP-EN.

  16. swahili-mmmlu

    • huggingface.co
    Updated Sep 23, 2024
    Cite
    NIONGOLO Chrys Fé-Marty (2024). swahili-mmmlu [Dataset]. https://huggingface.co/datasets/Svngoku/swahili-mmmlu
    Dataset updated
    Sep 23, 2024
    Authors
    NIONGOLO Chrys Fé-Marty
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Swahili MMMLU Dataset

    This dataset is a Swahili version of the Massive Multitask Language Understanding (MMLU) dataset. It is a multiple-choice question answering dataset that covers a wide range of topics and subjects, designed to evaluate the language understanding capabilities of models in Swahili.

    Dataset Structure: The dataset is structured as follows:

    question: The question posed in Swahili.
    options: A dictionary containing the multiple choice options, labeled A, B, C, and D.… See the full description on the dataset page: https://huggingface.co/datasets/Svngoku/swahili-mmmlu.

  17. nlptestrun

    • huggingface.co
    Cite
    EthanChuang, nlptestrun [Dataset]. https://huggingface.co/datasets/r13922a24/nlptestrun
    Authors
    EthanChuang
    Description

    MMMLU Order Sensitivity Dataset

    This dataset contains experimental results testing order sensitivity bias in LLMs using the MMMLU dataset.

      Overview
    

    Languages: English (en), French (fr)
    Models: gemini-2.0-flash, mistral-small-latest
    Formats: 5 input/output combinations (base/json/xml)
    Subtasks: 17 from MMMLU
    Questions: 100 per subtask × 4 permutations each
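
    The overview mentions 4 permutations per question but does not say how they are chosen. One plausible scheme for a four-option question is the set of cyclic rotations of the options, which places the correct answer in each position exactly once. The sketch below illustrates that assumption only; the function name and the rotation scheme are hypothetical, not the dataset's documented method.

```python
from typing import List, Tuple


def rotations(options: List[str], answer_index: int) -> List[Tuple[List[str], int]]:
    """Return every cyclic rotation of the options.

    Each entry pairs the reordered options with the new index of the
    correct answer, so scoring stays valid after reordering.
    """
    n = len(options)
    variants = []
    for shift in range(n):
        rotated = options[shift:] + options[:shift]
        new_index = (answer_index - shift) % n
        variants.append((rotated, new_index))
    return variants


# Hypothetical question with four options where "b" is correct.
variants = rotations(["a", "b", "c", "d"], answer_index=1)
```

    Comparing a model's accuracy across the rotated variants gives a simple measure of how much its answers depend on option order.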

      Dataset Structure
    

    Each record contains:

    question_id: Unique identifier
    subtask: MMMLU subtask name… See the full description on the dataset page: https://huggingface.co/datasets/r13922a24/nlptestrun.

  18. MMMLU-STEM-Ko

    • huggingface.co
    Updated Dec 5, 2024
    Cite
    ChuGyouk (2024). MMMLU-STEM-Ko [Dataset]. https://huggingface.co/datasets/ChuGyouk/MMMLU-STEM-Ko
    Dataset updated
    Dec 5, 2024
    Authors
    ChuGyouk
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Details

    This is a subset of openai/MMMLU. Only the STEM-related subjects were extracted from the Korean subset of MMMLU. The included subjects are:

    'abstract_algebra', 'anatomy', 'astronomy', 'college_biology', 'college_chemistry', 'college_computer_science', 'college_mathematics', 'college_physics', 'computer_security', 'conceptual_physics', 'electrical_engineering', 'elementary_mathematics', 'high_school_biology', 'high_school_chemistry'… See the full description on the dataset page: https://huggingface.co/datasets/ChuGyouk/MMMLU-STEM-Ko.
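
    Extracting a subject subset like this amounts to filtering rows on a subject field. The sketch below uses only the 14 subjects visible in the truncated list above, so it is incomplete by construction. The `Subject` key and the example rows are assumptions made for illustration.

```python
from typing import Dict, Iterable, List

# Only the subjects visible in the truncated description; the full list
# on the dataset page is longer.
STEM_SUBJECTS = frozenset({
    "abstract_algebra", "anatomy", "astronomy", "college_biology",
    "college_chemistry", "college_computer_science", "college_mathematics",
    "college_physics", "computer_security", "conceptual_physics",
    "electrical_engineering", "elementary_mathematics",
    "high_school_biology", "high_school_chemistry",
})


def filter_stem(rows: Iterable[Dict[str, str]],
                subject_key: str = "Subject") -> List[Dict[str, str]]:
    """Keep only rows whose subject is in STEM_SUBJECTS.

    `subject_key` is an assumed column name; adjust it to the real schema.
    """
    return [row for row in rows if row.get(subject_key) in STEM_SUBJECTS]


# Hypothetical rows for illustration.
rows = [
    {"Subject": "anatomy", "Question": "q1"},
    {"Subject": "world_religions", "Question": "q2"},
    {"Subject": "college_physics", "Question": "q3"},
]
kept = filter_stem(rows)
```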

  19. Speech-MMMLU

    • huggingface.co
    Updated Sep 17, 2024
    Cite
    Evan Lin (2024). Speech-MMMLU [Dataset]. https://huggingface.co/datasets/Evan-Lin/Speech-MMMLU
    Dataset updated
    Sep 17, 2024
    Authors
    Evan Lin
    Description

    Evan-Lin/Speech-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community

  20. mmmlu_lite

    • huggingface.co
    Updated Nov 1, 2024
    Cite
    OpenCompass (2024). mmmlu_lite [Dataset]. https://huggingface.co/datasets/opencompass/mmmlu_lite
    Dataset updated
    Nov 1, 2024
    Dataset authored and provided by
    OpenCompass
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    MMMLU-Lite

      Introduction
    

    A lite version of the MMMLU dataset, which is a community version of the MMMLU dataset by OpenCompass. Due to the large size of the original dataset (about 200k questions), we have created a lite version of the dataset to make it easier to use. We sample 25 examples from each language-subject pair in the original dataset with a fixed seed to ensure reproducibility. The lite version contains 19950 examples, which is about 10% of… See the full description on the dataset page: https://huggingface.co/datasets/opencompass/mmmlu_lite.
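
    The sampling described above can be sketched as a seeded, per-group draw. The count of 19950 is consistent with 25 × 57 × 14, that is, 25 examples for each of 57 subjects in each of 14 languages, if the lite set covers the same subjects and languages as the openai/MMMLU entry at the top of this listing. The seed value, the field names, and the function below are hypothetical; the description does not state them.

```python
import random
from collections import defaultdict
from typing import Dict, List


def sample_per_group(rows: List[Dict[str, str]], per_group: int = 25,
                     seed: int = 0) -> List[Dict[str, str]]:
    """Draw up to `per_group` rows from every (language, subject) group.

    A single seeded generator and a sorted group order make the result
    reproducible. The seed value 0 is an arbitrary assumption.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["language"], row["subject"])].append(row)
    rng = random.Random(seed)
    sampled: List[Dict[str, str]] = []
    for key in sorted(groups):
        members = groups[key]
        sampled.extend(rng.sample(members, min(per_group, len(members))))
    return sampled


# Toy data: 2 languages x 3 subjects x 40 questions each.
toy_rows = [
    {"language": lang, "subject": subj, "question": f"{lang}-{subj}-{i}"}
    for lang in ("fr", "ko")
    for subj in ("anatomy", "astronomy", "virology")
    for i in range(40)
]
lite = sample_per_group(toy_rows)
```

    With the toy data this yields 2 × 3 × 25 rows, and repeated calls with the same seed return the same rows in the same order.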
