This is a filtered and metadata-enriched version of open-thoughts/OpenThoughts-114k. While the original dataset is a valuable resource containing DeepSeek-R1 outputs, it has very little metadata (only two fields: system and conversations). For instance, it does not contain the original solution label, which means that we cannot verify the model answers.
What we did
filtered the dataset for math content (math questions were prefixed by "Return your final response within… See the full description on the dataset page: https://huggingface.co/datasets/akahana/OpenThoughts-114k-math.
This is a filtered and metadata-enriched version of open-thoughts/OpenThoughts-114k. While the original dataset is a valuable resource containing DeepSeek-R1 outputs, it has very little metadata (only two fields: system and conversations). For instance, it does not contain the original solution label, which means that we cannot verify the model answers.
What we did
filtered the dataset for math content (math questions were prefixed by "Return your final response within… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/OpenThoughts-114k-math.
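The prefix-based filtering step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the record layout (a conversations list of turns with "from" and "value" keys) and the helper names are assumptions, and only the visible part of the prefix is used because the description is truncated in the source.

```python
from typing import Dict, List

# Only the visible part of the prefix; the source text truncates the rest.
MATH_PREFIX = "Return your final response within"


def first_user_message(record: Dict) -> str:
    """Return the first user turn of a record, or "" if there is none.

    Assumes (hypothetically) that each turn is a dict with "from" and
    "value" keys; adjust to the real schema if it differs.
    """
    for turn in record.get("conversations", []):
        if turn.get("from") == "user":
            return turn.get("value", "")
    return ""


def filter_math(records: List[Dict]) -> List[Dict]:
    """Keep only records whose first user message starts with the prefix."""
    return [
        r for r in records
        if first_user_message(r).lstrip().startswith(MATH_PREFIX)
    ]
```

Applied to a list of records, filter_math keeps the ones whose question carries the prefix and drops the rest. It says nothing about whether the model answers are correct, which is why the solution label matters.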