MiniF2F is a formal mathematics benchmark (translated across multiple formal systems) consisting of exercise statements from olympiads (AMC, AIME, IMO) as well as high-school and undergraduate maths classes. This dataset contains formal statements in Isabelle. Each statement is paired with an informal statement and an informal proof, as described in Draft, Sketch, Prove [Jiang et al., 2023]. The problems in this dataset are taken from the most recent facebookresearch/miniF2F commit as of July 3, 2023.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
MiniF2F is a dataset of formal Olympiad-level mathematics problem statements intended to provide a unified cross-system benchmark for neural theorem proving. The miniF2F benchmark currently targets Metamath, Lean, and Isabelle and consists of 488 problem statements drawn from the AIME, the AMC, and the International Mathematical Olympiad (IMO), as well as material from high-school and undergraduate mathematics courses.
tbetton/miniF2F-rocq-lean dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
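For readers unfamiliar with the benchmark format, the sketch below shows roughly what a formal problem statement of the kind described above could look like in Lean 4. It is an illustration only: the theorem name `hypothetical_linear_equation` and the problem itself are assumptions made for this example and are not copied from miniF2F, and the proof assumes Mathlib is available.

```lean
import Mathlib

-- Hypothetical miniF2F-style statement (illustrative, not taken from the dataset):
-- "If 2x + 3 = 7 for a real number x, then x = 2."
theorem hypothetical_linear_equation (x : ℝ) (h : 2 * x + 3 = 7) : x = 2 := by
  -- The goal follows from the hypothesis by linear arithmetic.
  linarith
```

In a benchmark setting only the statement (everything before `:= by`) would be given, and a prover would be asked to supply the proof.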
MiniF2F
Dataset Usage
The evaluation results of Kimina-Prover presented in our work are all based on this MiniF2F test set.
Improvements
We corrected several erroneous formalizations, since the original formal statements could not be proven. We list them in the following table. All our improvements are made based on the MiniF2F test set provided by DeepseekProverV1.5, which applies certain modifications to the original dataset to adapt it to the Lean 4.… See the full description on the dataset page: https://huggingface.co/datasets/AI-MO/minif2f_test.
Kevew/minif2f-kiminaprover8b-inferenced dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Dataset Card for MiniF2F-Solving
This benchmark is part of the official implementation of Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving. Our research focuses on:
What is problem-solving? Beyond proving known targets, how can process-verified problem-solving be conducted inside existing formal theorem proving (FTP) environments?
Contribution
A principled formulation of problem-solving as a deterministic Markov decision process;… See the full description on the dataset page: https://huggingface.co/datasets/purewhite42/minif2f_solving.
autores/DSP1.5RL-minif2f-sampling_4 dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
We release Lean-Github, a dataset of 29K theorems compiled from 100+ Lean 4 repositories, and InternLM2-Step-Prover, a 7B model fine-tuned on Lean-Github and Lean-Workbook that achieves state-of-the-art performance on MiniF2F-test (54.5%), ProofNet (18.1%), and Putnam (5 problems).
Citation and Tech Report
@misc{wu2024leangithubcompilinggithublean, title={LEAN-GitHub: Compiling GitHub LEAN repositories for a versatile LEAN prover}, author={Zijian Wu and Jiayu… See the full description on the dataset page: https://huggingface.co/datasets/internlm/Lean-Github.