Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems [ACL 2024]
📖 arXiv | GitHub
Note: We have adjusted the image content in the multimodal portion of the dataset and fixed earlier issues where some images in the English physics subset were not displayed properly. If your usage involves images, please re-download the dataset (we recommend that all users download the latest version). Additionally, some entries… See the full description on the dataset page: https://huggingface.co/datasets/Hothan/OlympiadBench.
lmms-lab/OlympiadBench dataset hosted on Hugging Face and contributed by the HF Datasets community
math-ai/olympiadbench dataset hosted on Hugging Face and contributed by the HF Datasets community
OlympiadBench dataset used in the Putnam-AXIOM paper
The Putnam-AXIOM dataset is a benchmark for measuring advanced mathematical reasoning in large language models (LLMs). It includes challenging mathematical problems from the William Lowell Putnam Mathematical Competition, with both original problems and functional variations to address data contamination. The dataset aims to provide rigorous evaluations by requiring models to answer in boxed format, simplifying automatic answer… See the full description on the dataset page: https://huggingface.co/datasets/brando/olympiad-bench-imo-math-boxed-825-v2-21-08-2024.
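The boxed-answer convention mentioned above can be illustrated with a minimal sketch. This is not the Putnam-AXIOM grader; `extract_boxed` is a hypothetical helper that assumes the final answer is wrapped in a LaTeX `\boxed{...}` and that braces inside it are balanced.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None.

    Hypothetical helper for illustration only: it tracks brace depth so
    that nested braces such as \\frac{1}{2} are kept intact.
    """
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None  # no boxed answer present

    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)  # matching close brace found
        chars.append(ch)
    return None  # braces never balanced


if __name__ == "__main__":
    print(extract_boxed(r"So the answer is \boxed{\frac{1}{2}}."))
```

Comparing the extracted string with a reference answer would still need some normalization (whitespace, equivalent LaTeX forms), which this sketch leaves out.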
knoveleng/OlympiadBench dataset hosted on Hugging Face and contributed by the HF Datasets community
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
View details of Olympic Bench Incline imports shipment data in November with price, HS codes, major Indian ports, countries, importers, buyers in India, quantity and more.
View details of Olympic Bench Incline imports shipment data in December with price, HS codes, major Indian ports, countries, importers, buyers in India, quantity and more.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for "LiveMathBench"
Homepage: https://open-compass.github.io/GPassK/
Repository: https://github.com/open-compass/GPassK
Paper: Are Your LLMs Capable of Stable Reasoning?
Introduction
LiveMathBench is a mathematical dataset built from the latest challenging question sets of various mathematical competitions. It aims to avoid the data contamination issues found in existing LLMs and public math benchmarks.
Leaderboard
The Latest… See the full description on the dataset page: https://huggingface.co/datasets/opencompass/LiveMathBench.
View details of Olympic Bench Incline imports shipment data in May with price, HS codes, major Indian ports, countries, importers, buyers in India, quantity and more.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IMO-AnswerBench
Dataset Description
IMO-AnswerBench is a benchmark dataset for evaluating the mathematical reasoning capabilities of large language models. It consists of 400 challenging short-answer problems from the International Mathematical Olympiad (IMO) and other sources. This dataset is part of the IMO-Bench suite, released by Google DeepMind in conjunction with their 2025 IMO gold medal achievement.
Supported Tasks and Leaderboards
The primary task… See the full description on the dataset page: https://huggingface.co/datasets/Hwilner/imo-answerbench.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
IMO-ProofBench
Dataset Description
IMO-ProofBench is a benchmark dataset for evaluating the proof-writing capabilities of large language models. It consists of 60 challenging proof-based problems from the International Mathematical Olympiad (IMO) and other sources. This dataset is part of the IMO-Bench suite, released by Google DeepMind in conjunction with their 2025 IMO gold medal achievement.
Supported Tasks and Leaderboards
The primary task for this… See the full description on the dataset page: https://huggingface.co/datasets/Hwilner/imo-proofbench.