MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
MMLU-Pro Dataset
The MMLU-Pro dataset is a more robust and challenging massive multi-task understanding dataset designed to benchmark large language models' capabilities more rigorously. It contains 12K complex questions across various disciplines.
🚀 What's New
[2025.04.06] We corrected 15 answers in the medical domain based on the recommendations of medical professionals, thanks to Dr. Robert (Bob) Hoyt and the subspecialists… See the full description on the dataset page: https://huggingface.co/datasets/nezumikozo/MMLU-Pro.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
MMLU-Pro Dataset
The MMLU-Pro dataset is a more robust and challenging massive multi-task understanding dataset designed to benchmark large language models' capabilities more rigorously. It contains 12K complex questions across various disciplines.
🚀 What's New
[2025.04.06] We corrected 15 answers in the medical domain based on the recommendations of medical professionals, thanks to Dr. Robert (Bob) Hoyt and the subspecialists… See the full description on the dataset page: https://huggingface.co/datasets/TIGER-Lab/MMLU-Pro.