Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for s1K
Dataset Summary
s1K is a dataset of 1,000 examples of diverse, high-quality & difficult questions with distilled reasoning traces & solutions from Gemini Thinking. Refer to the s1 paper for more details.
Usage
from datasets import load_dataset
ds = load_dataset("simplescaling/s1K")["train"]
ds[0]
Dataset Structure
Data Instances
An example looks as follows: { 'solution': '1. **Rewrite… See the full description on the dataset page: https://huggingface.co/datasets/simplescaling/s1K.
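Because each record carries long reasoning traces and solutions, printing one in full is unwieldy. The following is a minimal sketch of one way to preview a record; `preview_record` and `load_s1k_train` are hypothetical helpers introduced here for illustration, not part of the dataset card, and loading requires the `datasets` package and network access.

```python
from typing import Any, Mapping


def preview_record(record: Mapping[str, Any], max_chars: int = 60) -> dict:
    """Return a copy of `record` with long string fields truncated for display."""
    preview = {}
    for key, value in record.items():
        if isinstance(value, str) and len(value) > max_chars:
            # Keep only the first `max_chars` characters of long text fields.
            preview[key] = value[:max_chars] + "..."
        else:
            preview[key] = value
    return preview


def load_s1k_train():
    """Load the train split as in the Usage snippet (hypothetical wrapper)."""
    # Imported lazily so that preview_record can be used without `datasets`.
    from datasets import load_dataset

    return load_dataset("simplescaling/s1K")["train"]
```

With the dataset loaded, `preview_record(load_s1k_train()[0])` would show every field of the first example in shortened form, and `load_s1k_train().column_names` lists the available fields.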
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Citation Information
@misc{muennighoff2025s1simpletesttimescaling,
  title={s1: Simple test-time scaling},
  author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
  year={2025},
  eprint={2501.19393},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.19393},
}
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Summary:
The open-s1 dataset contains 18,615 mathematical reasoning problems, filtered from the s1K dataset. It’s part of the Open RS project, aimed at enhancing reasoning in small LLMs using reinforcement learning.
Original dataset: https://huggingface.co/datasets/knoveleng/open-s1
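The dataset page URL above contains the repository id that `load_dataset` expects. As a rough sketch, the id can be extracted from the URL and passed to the loader; `repo_id_from_url` and `load_from_url` are hypothetical helpers, and the split names of this dataset are not stated in the listing, so the loader returns whatever splits exist.

```python
from urllib.parse import urlparse


def repo_id_from_url(url: str) -> str:
    """Extract 'owner/name' from a Hugging Face dataset page URL."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected path shape: datasets/<owner>/<name>
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:3])


def load_from_url(url: str):
    """Load all splits of the dataset behind `url` (needs `datasets` and network)."""
    # Imported lazily so the URL helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id_from_url(url))
```

For example, `repo_id_from_url("https://huggingface.co/datasets/knoveleng/open-s1")` yields `knoveleng/open-s1`.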
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for s1K-1.1
Dataset Summary
s1K-1.1 consists of the same 1,000 questions as s1K, but with reasoning traces generated by DeepSeek R1 instead. We find that these traces lead to much better performance.
Usage
from datasets import load_dataset
ds = load_dataset("simplescaling/s1K-1.1")["train"]
ds[0]
Dataset Structure
Data Instances
An example looks as follows: { 'solution': '1. **Rewrite the function using… See the full description on the dataset page: https://huggingface.co/datasets/simplescaling/s1K-1.1.
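The claim that s1K-1.1 reuses the same questions as s1K can be checked by comparing the two question sets. The sketch below assumes both datasets expose a `question` column; that field name is an assumption, since the excerpt above only shows `solution`. `question_overlap` and `load_train` are hypothetical helpers.

```python
from typing import Any, Iterable, Mapping, Tuple


def question_overlap(
    a: Iterable[Mapping[str, Any]],
    b: Iterable[Mapping[str, Any]],
    key: str = "question",  # assumed column name, not confirmed by the card excerpt
) -> Tuple[int, int, int]:
    """Return (shared, only_in_a, only_in_b) counts of distinct `key` values."""
    qa = {record[key] for record in a}
    qb = {record[key] for record in b}
    return len(qa & qb), len(qa - qb), len(qb - qa)


def load_train(repo_id: str):
    """Load the train split of `repo_id` (needs `datasets` and network access)."""
    from datasets import load_dataset

    return load_dataset(repo_id)["train"]
```

If the description holds and the column name is right, `question_overlap(load_train("simplescaling/s1K"), load_train("simplescaling/s1K-1.1"))` should report no questions unique to either side.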
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The benchmark constructed in the paper "S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models".
Introduction
S1-Bench is a novel benchmark designed to evaluate Large Reasoning Models' performance on simple tasks that favor intuitive system 1 thinking rather than deliberative system 2 reasoning. S1-Bench comprises 422 question-answer pairs across four major categories and 28 subcategories, balanced with 220 English and 202 Chinese questions.… See the full description on the dataset page: https://huggingface.co/datasets/WYRipple/S1-Bench.
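As a quick sanity check on the figures quoted above, the per-language counts should sum to the stated total. The snippet uses only the numbers given in the description; the variable names are illustrative.

```python
# Counts taken from the S1-Bench description above.
language_counts = {"English": 220, "Chinese": 202}
stated_total = 422

total = sum(language_counts.values())
# Share of each language in the benchmark, as a fraction of all questions.
shares = {lang: count / total for lang, count in language_counts.items()}

print(total == stated_total)
for lang, share in shares.items():
    print(f"{lang}: {share:.1%}")
```

The counts are consistent with the stated total, and the split is close to even, with English slightly above half.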
presencesw/en_processed_open-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
ming0100/yt-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
Seanie-lee/ThinkSafe-0.6B-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
yiboowang/R1-Compress-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
donoway/s1K-s1.1-32B-rollouts dataset hosted on Hugging Face and contributed by the HF Datasets community
seosiju/BigEarthNet-S1 dataset hosted on Hugging Face and contributed by the HF Datasets community
ljvmiranda921/msde-S1-cs dataset hosted on Hugging Face and contributed by the HF Datasets community
mlfoundations-dev/s1-1.1-literally-1-example dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
jdp22/s1-tool-new dataset hosted on Hugging Face and contributed by the HF Datasets community
simplescaling/s1K-1.1_tokenized dataset hosted on Hugging Face and contributed by the HF Datasets community
zzzeeee/s1-tool-1k-latest dataset hosted on Hugging Face and contributed by the HF Datasets community
siyanzhao/s1-59k-minus-s1k dataset hosted on Hugging Face and contributed by the HF Datasets community