MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Multilingual Massive Multitask Language Understanding (MMMLU)
The MMLU is a widely recognized benchmark of general knowledge attained by AI models. It spans 57 categories, ranging from elementary-level knowledge up to advanced professional subjects such as law, physics, history, and computer science. We translated the MMLU’s test set into 14 languages using professional human translators. Relying on human translators for this evaluation increases… See the full description on the dataset page: https://huggingface.co/datasets/openai/MMMLU.
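MMMLU items follow the MMLU multiple-choice format. A minimal sketch of rendering one record into an evaluation prompt and grading a reply by its leading answer letter; the field names (`Question`, `A`–`D`, `Answer`) follow the openai/MMMLU card and should be treated as assumptions here:

```python
# Sketch: format an MMMLU-style record into a prompt and grade a reply.
# Field names (Question, A-D, Answer) are assumed from the dataset card.

def format_prompt(record: dict) -> str:
    """Render one multiple-choice item as a plain-text prompt."""
    return (
        f"{record['Question']}\n"
        f"A. {record['A']}\n"
        f"B. {record['B']}\n"
        f"C. {record['C']}\n"
        f"D. {record['D']}\n"
        "Answer:"
    )

def grade(reply: str, record: dict) -> bool:
    """Accept the reply if its first non-space character is the gold letter."""
    return reply.strip()[:1].upper() == record["Answer"]

# Invented example record (French split style).
record = {
    "Question": "Quelle est la capitale de la France ?",
    "A": "Lyon", "B": "Paris", "C": "Marseille", "D": "Nice",
    "Answer": "B",
}
print(format_prompt(record))
print(grade("B. Paris", record))  # True
```

In practice the records would come from `datasets.load_dataset("openai/MMMLU", ...)`; only the formatting and grading logic is shown here.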
MIT License (https://opensource.org/licenses/MIT)
This new dataset is designed to solve this great NLP task and is crafted with a lot of care.
MIT License (https://opensource.org/licenses/MIT)
HLeiTR/R3-eval-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community
Comparison of Multilingual MMLU (MMMLU) and Multilingual Grade School Math (MGSM) scores by model
crystina-z/sample-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community
systemk/mmmlu-ja-annotated dataset hosted on Hugging Face and contributed by the HF Datasets community
dvilasuero/mmmlu-pro-eval-Llama-3.1-70B-Instruct-cot dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset
We fed the English and Korean MMLU/MMMLU data into the Qwen3-30B-A3B model and extracted the IDs of the top-8 experts selected by the gate.
Outputs were generated in both think and non-think modes.
Generation hyperparameters
max_prompt_tokens = 2048  # MMMLU max prompt tokens: 1500+
max_think_tokens = 1024
max_nonthink_tokens = 1024
temperature = 0.6
top_p = 0.95
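The routing statistic described above amounts to recording, per token, which experts the gate scores highest. A toy sketch with plain Python lists (a real MoE gate operates on tensors inside the model; the expert count of 128 with top-8 routing is reported for Qwen3-30B-A3B but should be checked against the model config):

```python
# Hedged sketch: select the top-8 expert IDs from per-token gate scores.
import heapq
import random

NUM_EXPERTS = 128  # assumed expert count per MoE layer for Qwen3-30B-A3B
TOP_K = 8

def top_k_expert_ids(gate_scores, k=TOP_K):
    """Return the IDs of the k highest-scoring experts, best first."""
    return [i for i, _ in heapq.nlargest(k, enumerate(gate_scores),
                                         key=lambda p: p[1])]

rng = random.Random(0)
scores = [rng.random() for _ in range(NUM_EXPERTS)]
ids = top_k_expert_ids(scores)
print(ids)  # 8 distinct expert IDs, highest-scoring first
```

Collecting these IDs over every token of a prompt or generation yields exactly the per-expert selection data the dataset stores.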
dvilasuero/mmmlu-pro-eval-Llama-3.1-8B-Instruct-cot dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
Dataset Card for MMMLU-fr
le-leadboard/MMMLU-fr is part of the OpenLLM French Leaderboard initiative, providing a French adaptation of the MMMLU (Multilingual Massive Multitask Language Understanding) benchmark originally developed by OpenAI. It is an exact clone of the French split of the MMMLU dataset.
Dataset Summary
MMMLU-fr is the French adaptation of the MMMLU benchmark, incorporating more complex, reasoning-focused questions with a choice… See the full description on the dataset page: https://huggingface.co/datasets/le-leadboard/MMMLU-fr.
Comparison of Multilingual MMLU (higher is better) and Multilingual GSM (higher is better) by model
Dataset Card for mmmlu-pro-eval-cot-70B
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B.
Dataset Card for mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot.
Expert statistics computed over the werty1248/MoE-Router-Dataset-MMMLU-Qwen3-30B-A3B data.
prompt_stat: counts of expert IDs selected while processing the prompt
output_stat: counts of expert IDs selected during token generation
first_stat: counts of expert IDs selected in the first 128 generated tokens
last_stat: counts of expert IDs selected in the last 128 generated tokens
total_stat: prompt_stat + output_stat
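The stats above are per-expert selection counts, and total_stat is the key-wise sum of prompt_stat and output_stat. `collections.Counter` models this directly; the counts below are invented, and the dataset's actual on-disk format is an assumption:

```python
# Sketch of total_stat = prompt_stat + output_stat as per-expert counts.
from collections import Counter

# Invented example counts: expert_id -> number of times selected.
prompt_stat = Counter({3: 120, 17: 95, 42: 60})
output_stat = Counter({3: 80, 42: 100, 99: 12})

total_stat = prompt_stat + output_stat  # counts add key-wise
print(total_stat[3], total_stat[42], total_stat[99])  # 200 160 12
```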
Dataset
We fed the English and Korean MMLU/MMMLU data into the Qwen3-30B-A3B model and extracted the IDs of the top-8 experts selected by the gate.
Outputs were generated in both think and non-think modes.
Generation hyperparameters
max_prompt_tokens = 2048  # MMMLU max prompt tokens: 1500+… See the full description on the dataset page: https://huggingface.co/datasets/werty1248/MoE-Router-Dataset-Statistic-MMMLU-Qwen3-30B-A3B.
MMMLU (English Translation)
This is an English translation of the openai/MMMLU dataset, translated using plamo-translate.
Dataset Description
This dataset is part of the llm-jp-eval-mm benchmark suite. The original Japanese questions and answers have been translated to English while preserving the visual content.
Translation Details
Translation Model: pfnet/plamo-translate Fields Translated: question, choices, answer Original Language: Japanese Target… See the full description on the dataset page: https://huggingface.co/datasets/Silviase/MMMLU-JA-JP-EN.
MIT License (https://opensource.org/licenses/MIT)
Swahili MMMLU Dataset
This dataset is a Swahili version of the Massive Multitask Language Understanding (MMLU) dataset. It is a multiple-choice question answering dataset that covers a wide range of topics and subjects, designed to evaluate the language understanding capabilities of models in Swahili. Dataset Structure: The dataset is structured as follows:
question: The question posed in Swahili. options: A dictionary containing the multiple choice options, labeled A, B, C, and D.… See the full description on the dataset page: https://huggingface.co/datasets/Svngoku/swahili-mmmlu.
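A record in the structure described above pairs a Swahili question with an options dictionary keyed A–D; looking up the gold option is a one-line dictionary access. The example values and the `answer` field name are invented for illustration, since the card's description is truncated:

```python
# Illustrative record matching the described structure; values are invented.
record = {
    "question": "Mji mkuu wa Kenya ni upi?",  # "What is the capital of Kenya?"
    "options": {"A": "Mombasa", "B": "Nairobi", "C": "Kisumu", "D": "Nakuru"},
    "answer": "B",  # assumed field name for the gold letter
}

# Resolve the gold letter to its option text.
gold_text = record["options"][record["answer"]]
print(gold_text)  # Nairobi
```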
MMMLU Order Sensitivity Dataset
This dataset contains experimental results testing order sensitivity bias in LLMs using the MMMLU dataset.
Overview
Languages: English (en), French (fr)
Models: gemini-2.0-flash, mistral-small-latest
Formats: 5 input/output combinations (base/json/xml)
Subtasks: 17 from MMMLU
Questions: 100 per subtask × 4 permutations each
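The card reports 4 answer-order permutations per question but does not say which 4; cyclic rotations of the option list are one natural choice for a 4-option item and are shown here purely as an assumption:

```python
# Hedged sketch of order-sensitivity probing: 4 cyclic rotations of the
# options, one per starting position. The dataset's actual permutation
# scheme is not specified in the card.
def rotations(options):
    """All cyclic rotations of an option list (one per starting position)."""
    return [options[i:] + options[:i] for i in range(len(options))]

opts = ["Paris", "Lyon", "Nice", "Lille"]
perms = rotations(opts)
print(len(perms))  # 4
print(perms[1])    # ['Lyon', 'Nice', 'Lille', 'Paris']
```

Scoring the model on all four orderings of the same question exposes any preference for a particular answer position.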
Dataset Structure
Each record contains:
question_id: Unique identifier subtask: MMMLU subtask name… See the full description on the dataset page: https://huggingface.co/datasets/r13922a24/nlptestrun.
MIT License (https://opensource.org/licenses/MIT)
Details
This is a subset of openai/MMMLU. Only the STEM-related subjects were extracted from the Korean split of MMMLU. The included subjects are
'abstract_algebra', 'anatomy', 'astronomy', 'college_biology', 'college_chemistry', 'college_computer_science', 'college_mathematics', 'college_physics', 'computer_security', 'conceptual_physics', 'electrical_engineering', 'elementary_mathematics', 'high_school_biology', 'high_school_chemistry'… See the full description on the dataset page: https://huggingface.co/datasets/ChuGyouk/MMMLU-STEM-Ko.
Evan-Lin/Speech-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
MMMLU-Lite
Introduction
A lite version of the MMMLU dataset, a community edition of the MMMLU dataset maintained by OpenCompass. Because the original dataset is large (about 200k questions), we created a lite version that is easier to use. We sampled 25 examples from each language-subject pair in the original dataset with a fixed seed to ensure reproducibility, yielding 19,950 examples in the lite version, which is about 10% of… See the full description on the dataset page: https://huggingface.co/datasets/opencompass/mmmlu_lite.
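The lite-set construction described above, fixed-seed sampling of 25 items per (language, subject) group, can be sketched as follows; the group keys and sizes are synthetic, and the exact seeding scheme OpenCompass used is an assumption:

```python
# Sketch: reproducibly sample n items per (language, subject) group.
import random

def sample_lite(groups, n=25, seed=42):
    """Draw n items from each group with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return {key: rng.sample(items, n) for key, items in groups.items()}

# Synthetic groups standing in for the real language-subject splits.
groups = {
    ("FR_FR", "anatomy"): list(range(100)),
    ("KO_KR", "astronomy"): list(range(100)),
}
lite = sample_lite(groups)
print(sum(len(v) for v in lite.values()))  # 50
```

Running the same seed twice yields the same draw, which is what makes the lite set reproducible.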