MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Multilingual Massive Multitask Language Understanding (MMMLU)
The MMLU is a widely recognized benchmark of general knowledge attained by AI models. It spans 57 categories, ranging from elementary-level knowledge up to advanced professional subjects such as law, physics, history, and computer science. We translated the MMLU’s test set into 14 languages using professional human translators. Relying on human translators for this evaluation increases… See the full description on the dataset page: https://huggingface.co/datasets/openai/MMMLU.
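MMMLU items follow the MMLU multiple-choice format. A minimal sketch of rendering one record into an evaluation prompt and grading a reply by its leading answer letter; the field names (`Question`, `A`–`D`, `Answer`) follow the openai/MMMLU card and should be treated as assumptions here:

```python
# Sketch: format an MMMLU-style record into a prompt and grade a reply.
# Field names (Question, A-D, Answer) are assumed from the dataset card.

def format_prompt(record: dict) -> str:
    """Render one multiple-choice item as a plain-text prompt."""
    return (
        f"{record['Question']}\n"
        f"A. {record['A']}\n"
        f"B. {record['B']}\n"
        f"C. {record['C']}\n"
        f"D. {record['D']}\n"
        "Answer:"
    )

def grade(reply: str, record: dict) -> bool:
    """Accept the reply if its first non-space character is the gold letter."""
    return reply.strip()[:1].upper() == record["Answer"]

# Invented example record (French split style).
record = {
    "Question": "Quelle est la capitale de la France ?",
    "A": "Lyon", "B": "Paris", "C": "Marseille", "D": "Nice",
    "Answer": "B",
}
print(format_prompt(record))
print(grade("B. Paris", record))  # True
```

In practice the records would come from `datasets.load_dataset("openai/MMMLU", ...)`; only the formatting and grading logic is shown here.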
MIT License (https://opensource.org/licenses/MIT)
This new dataset is designed to solve this great NLP task and is crafted with a lot of care.
MIT License (https://opensource.org/licenses/MIT)
HLeiTR/R3-eval-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community
Comparison of Multilingual MMLU (MMMLU) and Multilingual Grade School Math (MGSM) scores by model
crystina-z/sample-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community
systemk/mmmlu-ja-annotated dataset hosted on Hugging Face and contributed by the HF Datasets community
dvilasuero/mmmlu-pro-eval-Llama-3.1-70B-Instruct-cot dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset
We fed the English and Korean MMLU/MMMLU data into the Qwen3-30B-A3B model and extracted the IDs of the top-8 experts selected by the gate.
Outputs were generated in both think and non-think modes.
Generation hyperparameters
max_prompt_tokens = 2048  # MMMLU max prompt tokens: 1500+
max_think_tokens = 1024
max_nonthink_tokens = 1024
temperature = 0.6
top_p = 0.95
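The routing statistic described above amounts to recording, per token, which experts the gate scores highest. A toy sketch with plain Python lists (a real MoE gate operates on tensors inside the model; the expert count of 128 with top-8 routing is reported for Qwen3-30B-A3B but should be checked against the model config):

```python
# Hedged sketch: select the top-8 expert IDs from per-token gate scores.
import heapq
import random

NUM_EXPERTS = 128  # assumed expert count per MoE layer for Qwen3-30B-A3B
TOP_K = 8

def top_k_expert_ids(gate_scores, k=TOP_K):
    """Return the IDs of the k highest-scoring experts, best first."""
    return [i for i, _ in heapq.nlargest(k, enumerate(gate_scores),
                                         key=lambda p: p[1])]

rng = random.Random(0)
scores = [rng.random() for _ in range(NUM_EXPERTS)]
ids = top_k_expert_ids(scores)
print(ids)  # 8 distinct expert IDs, highest-scoring first
```

Collecting these IDs over every token of a prompt or generation yields exactly the per-expert selection data the dataset stores.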
dvilasuero/mmmlu-pro-eval-Llama-3.1-8B-Instruct-cot dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
Dataset Card for MMMLU-fr
le-leadboard/MMMLU-fr is part of the OpenLLM French Leaderboard initiative, providing a French adaptation of the MMMLU (Multilingual Massive Multitask Language Understanding) benchmark originally developed by OpenAI. It is an exact clone of the French split of the MMMLU dataset.
Dataset Summary
MMMLU-fr is the French adaptation of the MMMLU benchmark, incorporating more complex, reasoning-focused questions with a choice… See the full description on the dataset page: https://huggingface.co/datasets/le-leadboard/MMMLU-fr.
Comparison of Multilingual MMLU (higher is better) and Multilingual GSM (higher is better) by model
Dataset Card for mmmlu-pro-eval-cot-70B
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-cot-70B.
Dataset Card for mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/mmmlu-pro-eval-Qwen2.5-72B-Instruct-cot.
Expert statistics computed over the werty1248/MoE-Router-Dataset-MMMLU-Qwen3-30B-A3B data.
prompt_stat: counts of expert IDs selected while processing the prompt
output_stat: counts of expert IDs selected during token generation
first_stat: counts of expert IDs selected in the first 128 generated tokens
last_stat: counts of expert IDs selected in the last 128 generated tokens
total_stat: prompt_stat + output_stat
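The stats above are per-expert selection counts, and total_stat is the key-wise sum of prompt_stat and output_stat. `collections.Counter` models this directly; the counts below are invented, and the dataset's actual on-disk format is an assumption:

```python
# Sketch of total_stat = prompt_stat + output_stat as per-expert counts.
from collections import Counter

# Invented example counts: expert_id -> number of times selected.
prompt_stat = Counter({3: 120, 17: 95, 42: 60})
output_stat = Counter({3: 80, 42: 100, 99: 12})

total_stat = prompt_stat + output_stat  # counts add key-wise
print(total_stat[3], total_stat[42], total_stat[99])  # 200 160 12
```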
Dataset
We fed the English and Korean MMLU/MMMLU data into the Qwen3-30B-A3B model and extracted the IDs of the top-8 experts selected by the gate.
Outputs were generated in both think and non-think modes.
Generation hyperparameters
max_prompt_tokens = 2048  # MMMLU max prompt tokens: 1500+… See the full description on the dataset page: https://huggingface.co/datasets/werty1248/MoE-Router-Dataset-Statistic-MMMLU-Qwen3-30B-A3B.
MMMLU (English Translation)
This is an English translation of the openai/MMMLU dataset, translated using plamo-translate.
Dataset Description
This dataset is part of the llm-jp-eval-mm benchmark suite. The original Japanese questions and answers have been translated to English while preserving the visual content.
Translation Details
Translation Model: pfnet/plamo-translate Fields Translated: question, choices, answer Original Language: Japanese Target… See the full description on the dataset page: https://huggingface.co/datasets/Silviase/MMMLU-JA-JP-EN.
MIT License (https://opensource.org/licenses/MIT)
Swahili MMMLU Dataset
This dataset is a Swahili version of the Massive Multitask Language Understanding (MMLU) dataset. It is a multiple-choice question answering dataset that covers a wide range of topics and subjects, designed to evaluate the language understanding capabilities of models in Swahili. Dataset Structure: The dataset is structured as follows:
question: The question posed in Swahili. options: A dictionary containing the multiple choice options, labeled A, B, C, and D.… See the full description on the dataset page: https://huggingface.co/datasets/Svngoku/swahili-mmmlu.
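A record in the structure described above pairs a Swahili question with an options dictionary keyed A–D; looking up the gold option is a one-line dictionary access. The example values and the `answer` field name are invented for illustration, since the card's description is truncated:

```python
# Illustrative record matching the described structure; values are invented.
record = {
    "question": "Mji mkuu wa Kenya ni upi?",  # "What is the capital of Kenya?"
    "options": {"A": "Mombasa", "B": "Nairobi", "C": "Kisumu", "D": "Nakuru"},
    "answer": "B",  # assumed field name for the gold letter
}

# Resolve the gold letter to its option text.
gold_text = record["options"][record["answer"]]
print(gold_text)  # Nairobi
```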
MMMLU Order Sensitivity Dataset
This dataset contains experimental results testing order sensitivity bias in LLMs using the MMMLU dataset.
Overview
Languages: English (en), French (fr)
Models: gemini-2.0-flash, mistral-small-latest
Formats: 5 input/output combinations (base/json/xml)
Subtasks: 17 from MMMLU
Questions: 100 per subtask × 4 permutations each
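The card reports 4 answer-order permutations per question but does not say which 4; cyclic rotations of the option list are one natural choice for a 4-option item and are shown here purely as an assumption:

```python
# Hedged sketch of order-sensitivity probing: 4 cyclic rotations of the
# options, one per starting position. The dataset's actual permutation
# scheme is not specified in the card.
def rotations(options):
    """All cyclic rotations of an option list (one per starting position)."""
    return [options[i:] + options[:i] for i in range(len(options))]

opts = ["Paris", "Lyon", "Nice", "Lille"]
perms = rotations(opts)
print(len(perms))  # 4
print(perms[1])    # ['Lyon', 'Nice', 'Lille', 'Paris']
```

Scoring the model on all four orderings of the same question exposes any preference for a particular answer position.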
Dataset Structure
Each record contains:
question_id: Unique identifier subtask: MMMLU subtask name… See the full description on the dataset page: https://huggingface.co/datasets/r13922a24/nlptestrun.
MIT License (https://opensource.org/licenses/MIT)
Details
This is a subset of openai/MMMLU. Only the STEM-related subjects were extracted from the Korean split of MMMLU. The included subjects are
'abstract_algebra', 'anatomy', 'astronomy', 'college_biology', 'college_chemistry', 'college_computer_science', 'college_mathematics', 'college_physics', 'computer_security', 'conceptual_physics', 'electrical_engineering', 'elementary_mathematics', 'high_school_biology', 'high_school_chemistry'… See the full description on the dataset page: https://huggingface.co/datasets/ChuGyouk/MMMLU-STEM-Ko.
Evan-Lin/Speech-MMMLU dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
MMMLU-Lite
Introduction
A lite version of the MMMLU dataset, a community edition of the MMMLU dataset maintained by OpenCompass. Because the original dataset is large (about 200k questions), we created a lite version that is easier to use. We sampled 25 examples from each language-subject pair in the original dataset with a fixed seed to ensure reproducibility, yielding 19,950 examples in the lite version, which is about 10% of… See the full description on the dataset page: https://huggingface.co/datasets/opencompass/mmmlu_lite.
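The lite-set construction described above, fixed-seed sampling of 25 items per (language, subject) group, can be sketched as follows; the group keys and sizes are synthetic, and the exact seeding scheme OpenCompass used is an assumption:

```python
# Sketch: reproducibly sample n items per (language, subject) group.
import random

def sample_lite(groups, n=25, seed=42):
    """Draw n items from each group with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return {key: rng.sample(items, n) for key, items in groups.items()}

# Synthetic groups standing in for the real language-subject splits.
groups = {
    ("FR_FR", "anatomy"): list(range(100)),
    ("KO_KR", "astronomy"): list(range(100)),
}
lite = sample_lite(groups)
print(sum(len(v) for v in lite.values()))  # 50
```

Running the same seed twice yields the same draw, which is what makes the lite set reproducible.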