This is a filtered and metadata-enriched version of open-thoughts/OpenThoughts-114k. While the original dataset is a valuable resource containing DeepSeek-R1 outputs, it has very little metadata (only 2 fields: system and conversations). It does not contain, for instance, the original solution label, which means that we cannot verify the model answers.
What we did
- filtered the dataset for math content (math questions were prefixed by "Return your final response within… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenThoughts-114k-math.
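The filtering step above can be sketched as a simple prefix check on the first user turn. The prefix string is quoted from the card; the conversation layout (a list of turns with "from"/"value" keys) is an assumption about the `conversations` field, not confirmed here.

```python
# Sketch of the math-filtering step: math questions in OpenThoughts-114k
# were prefixed with "Return your final response within", so a prefix
# check on the first human turn recovers the math subset.
# NOTE: the "from"/"value" turn keys are an assumed layout of the
# `conversations` field mentioned in the card.
MATH_PREFIX = "Return your final response within"

def is_math_example(conversations):
    """Return True if the first human turn carries the math prefix."""
    for turn in conversations:
        if turn.get("from") == "human":
            return turn.get("value", "").startswith(MATH_PREFIX)
    return False

# Hypothetical example conversations illustrating the check:
math_conv = [{"from": "human",
              "value": "Return your final response within \\boxed{}. Compute 2+2."}]
other_conv = [{"from": "human",
               "value": "Write a short story about a robot."}]
```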
gadkins/open-thoughts-math-dry-run dataset hosted on Hugging Face and contributed by the HF Datasets community
mlfoundations-dev/openthoughts-114k-no-special-template_eval_03-11-25_05-44-46_f912
Precomputed model outputs for evaluation.
Evaluation Results
GPQADiamond
Average Accuracy: 31.31% ± 4.97% (3 runs)
Run | Accuracy | Questions Solved | Total Questions
1 | 24.24% | 48 | 198
2 | 26.26% | 52 | 198
3 | 43.43% | 86 | 198
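The reported "31.31% ± 4.97%" can be reproduced from the per-run counts above. The ± appears to be the standard error of the mean computed with the population standard deviation; that this matches the eval harness is an assumption, but it agrees with the table's numbers.

```python
# Sketch: recomputing the GPQADiamond summary from the three runs above.
# Assumption: the reported "±" is population-std / sqrt(n_runs).
import math
from statistics import mean, pstdev

solved = [48, 52, 86]            # questions solved per run
total = 198                      # questions per run
accs = [s / total for s in solved]

avg = mean(accs)                         # 0.3131... -> 31.31%
sem = pstdev(accs) / math.sqrt(len(accs))  # 0.0497... -> 4.97%

print(f"{avg:.2%} ± {sem:.2%}")  # 31.31% ± 4.97%
```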
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
GeneralThought-195K
Note: a newer release with 323K traces is available.
Thought wants to be free
Open reasoning data from the General Reasoning resource for March 3, 2025. The dataset contains questions, reference answers, reasoning traces, final answers and other metadata from several popular reasoning models including DeepSeek-R1, DeepSeek-R1-Zero, OpenThoughts-32B, LIMO, deepseek-r1-distill-llama-70b, DeepHermes-3-Llama-3-8B-Preview and DeepScaleR-1.5B-Preview. We also include final… See the full description on the dataset page: https://huggingface.co/datasets/GeneralReasoning/GeneralThought-195K.
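Because this dataset carries reference answers alongside model final answers (unlike the original OpenThoughts card, which lacks solution labels), answers can be verified by normalized comparison. A minimal sketch, assuming hypothetical field names `model_answer` and `reference_answer` (see the dataset page for the actual schema):

```python
# Sketch: verifying a model's final answer against the reference answer.
# Field names "model_answer" / "reference_answer" are assumptions for
# illustration, not the confirmed schema.
def normalize(ans: str) -> str:
    """Crude normalization: strip whitespace, drop a \\boxed{} wrapper, lowercase."""
    ans = ans.strip()
    if ans.startswith("\\boxed{") and ans.endswith("}"):
        ans = ans[len("\\boxed{"):-1]
    return ans.strip().lower()

def is_correct(row: dict) -> bool:
    return normalize(row["model_answer"]) == normalize(row["reference_answer"])

# Hypothetical row illustrating the check:
row = {"model_answer": "\\boxed{42}", "reference_answer": "42"}
```

Real answer matching for math benchmarks usually needs more than string normalization (e.g. symbolic equivalence); this only illustrates why the reference label matters.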
mlfoundations-dev/DCFT-open-thoughts-subset-v1-etash_1742823125_eval_0981
Precomputed model outputs for evaluation.
Evaluation Results
Summary
Metric | AIME24 | AIME25 | AMC23 | MATH500 | GPQADiamond | LiveCodeBench
Accuracy | 22.7 | 16.7 | 63.5 | 80.4 | 22.4 | 31.5
AIME24
Average Accuracy: 22.67% ± 1.46% (5 runs)
Run | Accuracy | Questions Solved | Total Questions
1 | 23.33% | 7 | 30
2 | 23.33% | 7 | 30
3 | 23.33% | 7 | 30
4 | 26.67% | 8 | 30
5 | 16.67% | 5 | 30
… See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations-dev/DCFT-open-thoughts-subset-v1-etash_1742823125_eval_0981.
mlfoundations-dev/DCFT-open-thoughts-subset-claude-v1-etash_1742633651_eval_0981
Precomputed model outputs for evaluation.
Evaluation Results
Summary
Metric | AIME24 | AIME25 | AMC23 | MATH500 | GPQADiamond | LiveCodeBench
Accuracy | 22.0 | 22.0 | 61.0 | 81.8 | 23.6 | 31.4
AIME24
Average Accuracy: 22.00% ± 2.02% (5 runs)
Run | Accuracy | Questions Solved | Total Questions
1 | 23.33% | 7 | 30
2 | 20.00% | 6 | 30
3 | 20.00% | 6 | 30
4 | 30.00% | 9 | 30
5 | …
See the full description on the dataset page: https://huggingface.co/datasets/mlfoundations-dev/DCFT-open-thoughts-subset-claude-v1-etash_1742633651_eval_0981.
GeneralThought-430K
Thought wants to be free
Open reasoning data from the General Reasoning resource for March 14, 2025. The dataset contains questions, reference answers, reasoning traces, final answers and other metadata from several popular reasoning models including DeepSeek-R1, DeepSeek-R1-Zero, OpenThoughts-32B, LIMO, deepseek-r1-distill-llama-70b, DeepHermes-3-Llama-3-8B-Preview and DeepScaleR-1.5B-Preview. We also include final answers from o3-mini-2025-01-31… See the full description on the dataset page: https://huggingface.co/datasets/GeneralReasoning/GeneralThought-430K.