MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
DeepMath-103K
🔥 News
May 8, 2025: We found that 48 samples contained hints that revealed the answers. The relevant questions have now been revised to remove the leaked answers.
April 14, 2025: We release DeepMath-103K, a large-scale dataset featuring challenging, verifiable, and decontaminated math problems tailored for RL and SFT. We open source:…
See the full description on the dataset page: https://huggingface.co/datasets/zwhe99/DeepMath-103K.
Ayushnangia/bucket-deepmath-99k dataset hosted on Hugging Face and contributed by the HF Datasets community
pe-nlp/DeepMath-100K-filteredv2 dataset hosted on Hugging Face and contributed by the HF Datasets community
qingy2024/DeepMath-Reformatted dataset hosted on Hugging Face and contributed by the HF Datasets community
emilbiju/RL-MATH-DeepMath dataset hosted on Hugging Face and contributed by the HF Datasets community
qingyangzhang/DeepMath-103K-formatted dataset hosted on Hugging Face and contributed by the HF Datasets community
friendshipkim/DeepMath-103K dataset hosted on Hugging Face and contributed by the HF Datasets community
DeepMath-309K_scored - with OpenDataArena Scores
This dataset is a scored version of the original zwhe99/DeepMath-103K dataset. The scoring was performed using the OpenDataArena-Tool, a comprehensive suite of automated evaluation methods for assessing instruction-following datasets. This version of the dataset includes rich, multi-dimensional scores for both the instructions (questions) and the instruction-response pairs, allowing for highly granular data analysis and selection. All…
See the full description on the dataset page: https://huggingface.co/datasets/OpenDataArena/DeepMath-309K_scored.
wyhwhy/DeepMath-103K-Augmented dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Information
To make RLVR (e.g. GRPO) training more effective, this dataset is a carefully filtered version of zwhe99/DeepMath-103K, leaving 59,873 examples.
Filtering Procedure
Step 0. Remove rows with 'difficulty' < 1.0 and deduplicate
Initially, 4 questions have a 'difficulty' below 1.0. These 4 data points do not make sense at all. In addition to filtering out those four, we perform basic deduplication based on the "question" column. As a result, the…
See the full description on the dataset page: https://huggingface.co/datasets/ChuGyouk/DeepMath-Filtered-59.9K.
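Step 0 above can be sketched in plain Python. This is only an illustration, not the dataset author's actual script: the column names 'difficulty' and "question" come from the description, while the function name `filter_and_deduplicate`, the exact-string matching rule, and the keep-first policy are assumptions, since the description only says "basic deduplication".

```python
def filter_and_deduplicate(rows, min_difficulty=1.0):
    """Drop rows whose 'difficulty' is below min_difficulty, then keep
    only the first row seen for each distinct 'question' string.

    Assumptions (not stated in the dataset card): duplicates are detected
    by exact string equality, and the first occurrence is the one kept.
    """
    seen_questions = set()
    kept = []
    for row in rows:
        # Step 0a: discard rows with difficulty < min_difficulty.
        if row["difficulty"] < min_difficulty:
            continue
        # Step 0b: skip questions that were already kept once.
        question = row["question"]
        if question in seen_questions:
            continue
        seen_questions.add(question)
        kept.append(row)
    return kept


# Tiny made-up sample, purely for illustration.
sample_rows = [
    {"question": "Evaluate the limit A.", "difficulty": 0.5},
    {"question": "Evaluate the integral B.", "difficulty": 5.0},
    {"question": "Evaluate the integral B.", "difficulty": 5.0},
    {"question": "Find the sum C.", "difficulty": 1.0},
]
filtered_rows = filter_and_deduplicate(sample_rows)
```

On the sample, the first row is dropped by the difficulty threshold and the repeated question is kept only once. A row with 'difficulty' exactly 1.0 is retained, because the stated condition removes only values strictly below 1.0.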
MultiturnRL/DeepMath-Small dataset hosted on Hugging Face and contributed by the HF Datasets community
future7/DeepMath-Meta-Test dataset hosted on Hugging Face and contributed by the HF Datasets community
zyzshishui0627/DeepMath-103K-openai-format dataset hosted on Hugging Face and contributed by the HF Datasets community
pe-nlp/DeepMath-75K-80K-filteredv2-difficulty dataset hosted on Hugging Face and contributed by the HF Datasets community
pe-nlp/DeepMath-0.5K-filteredv2-difficulty dataset hosted on Hugging Face and contributed by the HF Datasets community
Nishi0923/DeepMath-103K-Bespoke-Filtered-Test-2k dataset hosted on Hugging Face and contributed by the HF Datasets community
pe-nlp/DeepMath-Magistral-stage1 dataset hosted on Hugging Face and contributed by the HF Datasets community
pe-nlp/DeepMath-20K-25K-filteredv2-difficulty dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
DeepMath-103K
📖 Overview
DeepMath-103K is meticulously curated to push the boundaries of mathematical reasoning in language models. Key features include:
1. Challenging Problems: DeepMath-103K has a strong focus on difficult mathematical problems (primarily Levels 5-9), significantly raising the complexity bar compared to many existing open datasets.
Difficulty…
See the full description on the dataset page: https://huggingface.co/datasets/swpdd/test.