6 datasets found

MiniF2F Dataset
paperswithcode.com
opendatalab.com
Updated Aug 14, 2024
Cite
Kunhao Zheng; Jesse Michael Han; Stanislas Polu (2024). MiniF2F Dataset [Dataset]. https://paperswithcode.com/dataset/minif2f
Dataset updated
Aug 14, 2024
Authors
Kunhao Zheng; Jesse Michael Han; Stanislas Polu
Description
MiniF2F is a dataset of formal Olympiad-level mathematics problem statements intended to provide a unified cross-system benchmark for neural theorem proving. The miniF2F benchmark currently targets Metamath, Lean, and Isabelle and consists of 488 problem statements drawn from the AIME, AMC, and the International Mathematical Olympiad (IMO), as well as material from high-school and undergraduate mathematics courses.
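As a rough illustration of what a formal problem statement in this style can look like, here is a minimal sketch in Lean 4, assuming Mathlib is available. The theorem name and the problem itself are hypothetical and are not taken from the dataset; the sketch only shows the general shape of a statement paired with a machine-checked proof.

```lean
import Mathlib

-- Hypothetical example, not an actual miniF2F entry:
-- a high-school algebra fact stated formally and proved by linear arithmetic.
theorem illustrative_linear_equation (x : ℝ) (h : 2 * x + 3 = 11) : x = 4 := by
  linarith
```

In a benchmark setting, a prover would be given only the statement (everything before `:= by`) and asked to produce the proof.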
minif2f_test
huggingface.co
Updated Apr 28, 2025
Cite
Project-Numina (2025). minif2f_test [Dataset]. https://huggingface.co/datasets/AI-MO/minif2f_test
Dataset updated
Apr 28, 2025
Dataset authored and provided by
Project-Numina
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
MiniF2F

Dataset Usage

The evaluation results of Kimina-Prover presented in our work are all based on this MiniF2F test set.

Improvements

We corrected several erroneous formalizations, since the original formal statements could not be proven. We list them in the following table. All our improvements are made based on the MiniF2F test set provided by DeepseekProverV1.5, which applies certain modifications to the original dataset to adapt it to the Lean 4.… See the full description on the dataset page: https://huggingface.co/datasets/AI-MO/minif2f_test.
minif2f_solving
huggingface.co
Updated May 8, 2025
Cite
Qi Liu (2025). minif2f_solving [Dataset]. https://huggingface.co/datasets/purewhite42/minif2f_solving
Dataset updated
May 8, 2025
Authors
Qi Liu
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset Card for MiniF2F-Solving

This benchmark is part of the official implementation of Beyond Theorem Proving: Formulation, Framework and Benchmark for Formal Problem-Solving. Our research focuses on:

What is problem-solving? Beyond proving known targets, how can process-verified problem-solving be conducted inside existing formal theorem proving (FTP) environments?

Contribution

A principled formulation of problem-solving as a deterministic Markov decision process;… See the full description on the dataset page: https://huggingface.co/datasets/purewhite42/minif2f_solving.
miniF2F-Graded
zenodo.org
bin, json, png
Updated Jan 31, 2025
Cite
Anonymous; Anonymous (2025). miniF2F-Graded [Dataset]. http://doi.org/10.5281/zenodo.14776138
Available download formats: png, json, bin
Unique identifier
https://doi.org/10.5281/zenodo.14776138
Dataset updated
Jan 31, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anonymous; Anonymous
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
MiniF2F-Graded (./miniF2F-Graded.json) builds upon miniF2F by introducing additional metrics for each theorem: Difficulty, Discrimination, and Difficulty Grading. These metrics are calculated from the actual performance of LLMs in proving the theorems, making them a more accurate reflection of difficulty from the perspective of LLMs.

Please refer to ./README.md for more information.
DSP1.5RL-minif2f-sampling_4
huggingface.co
Updated Nov 10, 2024
Cite
Autores (2024). DSP1.5RL-minif2f-sampling_4 [Dataset]. https://huggingface.co/datasets/autores/DSP1.5RL-minif2f-sampling_4
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Nov 10, 2024
Dataset authored and provided by
Autores
Description
autores/DSP1.5RL-minif2f-sampling_4 dataset hosted on Hugging Face and contributed by the HF Datasets community
dataset_0
huggingface.co
Updated Apr 13, 2024
Cite
XXX (2024). dataset_0 [Dataset]. https://huggingface.co/datasets/xyy888/dataset_0
Dataset updated
Apr 13, 2024
Authors
XXX
Description
MiniF2F is a formal mathematics benchmark (translated across multiple formal systems) consisting of exercise statements from olympiads (AMC, AIME, IMO) as well as high-school and undergraduate maths classes. This dataset contains formal statements in Isabelle. Each statement is paired with an informal statement and an informal proof, as described in Draft, Sketch, Prove [Jiang et al., 2023]. The problems in this dataset are taken from the most recent facebookresearch/miniF2F commit as of July 3, 2023.
248 scholarly articles cite the MiniF2F Dataset (View in Google Scholar)