63 datasets found

s1K
huggingface.co
Updated Jan 14, 2025
simplescaling (2025). s1K [Dataset]. https://huggingface.co/datasets/simplescaling/s1K
Dataset updated
Jan 14, 2025
Dataset authored and provided by
simplescaling
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Dataset Card for s1K

Dataset Summary

s1K is a dataset of 1,000 diverse, high-quality, and difficult questions with reasoning traces and solutions distilled from Gemini Thinking. Refer to the s1 paper for more details.

Usage

pip install -q datasets

from datasets import load_dataset
ds = load_dataset("simplescaling/s1K")["train"]
ds[0]

Dataset Structure

Data Instances

An example looks as follows: { 'solution': '1. **Rewrite… See the full description on the dataset page: https://huggingface.co/datasets/simplescaling/s1K.
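Because each record carries long reasoning traces and solutions, printing `ds[0]` directly can be hard to read. The following is a minimal sketch of a helper that shortens every field for display. The names `preview_record` and `load_first_example` are hypothetical and not part of the dataset card, and the sketch makes no assumption about which fields exist beyond what the record itself contains.

```python
from typing import Any, Dict, Mapping


def preview_record(record: Mapping[str, Any], max_chars: int = 60) -> Dict[str, str]:
    """Return a shortened, printable view of one dataset record."""
    preview: Dict[str, str] = {}
    for key, value in record.items():
        # Strings are kept as-is; other values (lists, dicts, numbers) use repr().
        text = value if isinstance(value, str) else repr(value)
        if len(text) > max_chars:
            text = text[:max_chars] + "..."
        preview[key] = text
    return preview


def load_first_example() -> Mapping[str, Any]:
    """Download s1K (requires network access) and return its first record."""
    # Imported lazily so that preview_record can be used without the library installed.
    from datasets import load_dataset

    return load_dataset("simplescaling/s1K")["train"][0]
```

Calling `preview_record(load_first_example())` would then give one short string per field, which is convenient for a first look at the record layout.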
s1-teasers
huggingface.co
Updated Nov 27, 2024
simplescaling (2024). s1-teasers [Dataset]. https://huggingface.co/datasets/simplescaling/s1-teasers
Dataset updated
Nov 27, 2024
Dataset authored and provided by
simplescaling
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Citation Information

@misc{muennighoff2025s1simpletesttimescaling,
  title={s1: Simple test-time scaling},
  author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
  year={2025},
  eprint={2501.19393},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.19393},
}
knoveleng/open-s1
kaggle.com
zip
Updated Mar 31, 2025
Duvan Martínez (2025). knoveleng/open-s1 [Dataset]. https://www.kaggle.com/datasets/duvanjmb/knovelengopen-s1
Available download formats: zip (18411611 bytes)
Dataset updated
Mar 31, 2025
Authors
Duvan Martínez
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Summary:

The open-s1 dataset contains 18,615 mathematical reasoning problems, filtered from the s1K dataset. It’s part of the Open RS project, aimed at enhancing reasoning in small LLMs using reinforcement learning.

Original dataset: https://huggingface.co/datasets/knoveleng/open-s1
s1-prob
huggingface.co
Updated Nov 18, 2024
simplescaling (2024). s1-prob [Dataset]. https://huggingface.co/datasets/simplescaling/s1-prob
Dataset updated
Nov 18, 2024
Dataset authored and provided by
simplescaling
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Citation Information

@misc{muennighoff2025s1simpletesttimescaling,
  title={s1: Simple test-time scaling},
  author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
  year={2025},
  eprint={2501.19393},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.19393},
}
s1K-1.1
huggingface.co
Updated Feb 8, 2025
simplescaling (2025). s1K-1.1 [Dataset]. https://huggingface.co/datasets/simplescaling/s1K-1.1
Dataset updated
Feb 8, 2025
Dataset authored and provided by
simplescaling
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset Card for s1K

Dataset Summary

s1K-1.1 consists of the same 1,000 questions as s1K, but with reasoning traces generated by DeepSeek r1 instead. We find that these traces lead to much better performance.

Usage

pip install -q datasets

from datasets import load_dataset
ds = load_dataset("simplescaling/s1K-1.1")["train"]
ds[0]

Dataset Structure

Data Instances

An example looks as follows: { 'solution': '1. **Rewrite the function using… See the full description on the dataset page: https://huggingface.co/datasets/simplescaling/s1K-1.1.
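The summary states that s1K-1.1 reuses the same 1,000 questions as s1K. One way to check such a claim is to compare the question text of the two datasets. The sketch below is illustrative only: the helper names are hypothetical, and the field name `question` is an assumption, since the truncated example above shows only a `solution` field.

```python
from typing import Any, Dict, Iterable, Mapping, Set


def question_set(records: Iterable[Mapping[str, Any]], field: str = "question") -> Set[str]:
    """Collect the whitespace-normalized text of `field` from every record."""
    # The default field name "question" is an assumption, not taken from the card.
    return {" ".join(str(record[field]).split()) for record in records}


def compare_questions(
    first: Iterable[Mapping[str, Any]],
    second: Iterable[Mapping[str, Any]],
    field: str = "question",
) -> Dict[str, int]:
    """Count questions shared by both collections and unique to each one."""
    a, b = question_set(first, field), question_set(second, field)
    return {
        "shared": len(a & b),
        "only_first": len(a - b),
        "only_second": len(b - a),
    }
```

Passing the two `train` splits loaded with `load_dataset` to `compare_questions` would, if the description holds and the field name is right, report zero questions unique to either side.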
S1-Bench
huggingface.co
Updated Apr 13, 2025
Wenyuan Zhang (2025). S1-Bench [Dataset]. https://huggingface.co/datasets/WYRipple/S1-Bench
Dataset updated
Apr 13, 2025
Authors
Wenyuan Zhang
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
The benchmark constructed in the paper "S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models".

Introduction

S1-Bench is a novel benchmark designed to evaluate Large Reasoning Models' performance on simple tasks that favor intuitive system 1 thinking rather than deliberative system 2 reasoning. S1-Bench comprises 422 question-answer pairs across four major categories and 28 subcategories, balanced with 220 English and 202 Chinese questions.… See the full description on the dataset page: https://huggingface.co/datasets/WYRipple/S1-Bench.
en_processed_open-s1
huggingface.co
Updated Jun 1, 2025
Datnvt (2025). en_processed_open-s1 [Dataset]. https://huggingface.co/datasets/presencesw/en_processed_open-s1
Dataset updated
Jun 1, 2025
Authors
Datnvt
Description
presencesw/en_processed_open-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
yt-s1
huggingface.co
Updated Mar 7, 2025
ming hu (2025). yt-s1 [Dataset]. https://huggingface.co/datasets/ming0100/yt-s1
Dataset updated
Mar 7, 2025
Authors
ming hu
Description
ming0100/yt-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
ThinkSafe-0.6B-s1
huggingface.co
Updated Nov 26, 2025
Seanie Lee (2025). ThinkSafe-0.6B-s1 [Dataset]. https://huggingface.co/datasets/Seanie-lee/ThinkSafe-0.6B-s1
Dataset updated
Nov 26, 2025
Authors
Seanie Lee
Description
Seanie-lee/ThinkSafe-0.6B-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
R1-Compress-s1
huggingface.co
Updated Jul 1, 2025
Yibo Wang (2025). R1-Compress-s1 [Dataset]. https://huggingface.co/datasets/yiboowang/R1-Compress-s1
Dataset updated
Jul 1, 2025
Authors
Yibo Wang
Description
yiboowang/R1-Compress-s1 dataset hosted on Hugging Face and contributed by the HF Datasets community
s1K-s1.1-32B-rollouts
huggingface.co
Updated Apr 26, 2025
Elizabeth Donoway (2025). s1K-s1.1-32B-rollouts [Dataset]. https://huggingface.co/datasets/donoway/s1K-s1.1-32B-rollouts
Dataset updated
Apr 26, 2025
Authors
Elizabeth Donoway
Description
donoway/s1K-s1.1-32B-rollouts dataset hosted on Hugging Face and contributed by the HF Datasets community
BigEarthNet-S1
huggingface.co
Updated Nov 24, 2025
junseon (2025). BigEarthNet-S1 [Dataset]. https://huggingface.co/datasets/seosiju/BigEarthNet-S1
Dataset updated
Nov 24, 2025
Authors
junseon
Description
seosiju/BigEarthNet-S1 dataset hosted on Hugging Face and contributed by the HF Datasets community
msde-S1-cs
huggingface.co
Updated Nov 19, 2025
Lj V. Miranda (2025). msde-S1-cs [Dataset]. https://huggingface.co/datasets/ljvmiranda921/msde-S1-cs
Dataset updated
Nov 19, 2025
Authors
Lj V. Miranda
Description
ljvmiranda921/msde-S1-cs dataset hosted on Hugging Face and contributed by the HF Datasets community
s1-rd
huggingface.co
Updated Oct 15, 2025
Yihe Deng (2025). s1-rd [Dataset]. https://huggingface.co/datasets/ydeng9/s1-rd
Dataset updated
Oct 15, 2025
Authors
Yihe Deng
Description
ydeng9/s1-rd dataset hosted on Hugging Face and contributed by the HF Datasets community
s1-1.1-literally-1-example
huggingface.co
ML Foundations Development, s1-1.1-literally-1-example [Dataset]. https://huggingface.co/datasets/mlfoundations-dev/s1-1.1-literally-1-example
Dataset authored and provided by
ML Foundations Development
Description
mlfoundations-dev/s1-1.1-literally-1-example dataset hosted on Hugging Face and contributed by the HF Datasets community
results
huggingface.co
Updated Jan 19, 2025
simplescaling (2025). results [Dataset]. https://huggingface.co/datasets/simplescaling/results
Dataset updated
Jan 19, 2025
Dataset authored and provided by
simplescaling
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Citation Information

@misc{muennighoff2025s1simpletesttimescaling,
  title={s1: Simple test-time scaling},
  author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto},
  year={2025},
  eprint={2501.19393},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2501.19393},
}
s1-tool-new
huggingface.co
Updated Apr 26, 2025
Dapeng Jiang (2025). s1-tool-new [Dataset]. https://huggingface.co/datasets/jdp22/s1-tool-new
Dataset updated
Apr 26, 2025
Authors
Dapeng Jiang
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
jdp22/s1-tool-new dataset hosted on Hugging Face and contributed by the HF Datasets community
s1K-1.1_tokenized
huggingface.co
Updated Feb 8, 2025
simplescaling (2025). s1K-1.1_tokenized [Dataset]. https://huggingface.co/datasets/simplescaling/s1K-1.1_tokenized
Dataset updated
Feb 8, 2025
Dataset authored and provided by
simplescaling
Description
simplescaling/s1K-1.1_tokenized dataset hosted on Hugging Face and contributed by the HF Datasets community
s1-tool-1k-latest
huggingface.co
Updated Mar 2, 2025
Anonymous (2025). s1-tool-1k-latest [Dataset]. https://huggingface.co/datasets/zzzeeee/s1-tool-1k-latest
Dataset updated
Mar 2, 2025
Authors
Anonymous
Description
zzzeeee/s1-tool-1k-latest dataset hosted on Hugging Face and contributed by the HF Datasets community
s1-59k-minus-s1k
huggingface.co
Updated Apr 16, 2025
siyan zhao (2025). s1-59k-minus-s1k [Dataset]. https://huggingface.co/datasets/siyanzhao/s1-59k-minus-s1k
Dataset updated
Apr 16, 2025
Authors
siyan zhao
Description
siyanzhao/s1-59k-minus-s1k dataset hosted on Hugging Face and contributed by the HF Datasets community

4 scholarly articles cite the simplescaling/s1K dataset (per Google Scholar).