4 datasets found
  1. codeforces-cots

    • huggingface.co
    Updated Mar 13, 2025
    Cite
    Open R1 (2025). codeforces-cots [Dataset]. https://huggingface.co/datasets/open-r1/codeforces-cots
    Dataset updated
    Mar 13, 2025
    Dataset authored and provided by
    Open R1
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for CodeForces-CoTs

      Dataset description
    

    CodeForces-CoTs is a large-scale dataset for training reasoning models on competitive programming tasks. It consists of 10k CodeForces problems with up to five reasoning traces generated by DeepSeek R1. We did not filter the traces for correctness, but found that around 84% of the Python ones pass the public tests. The dataset consists of several subsets:

    solutions: we prompt R1 to solve the problem and produce code.… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/codeforces-cots.

  2. open-r1-codeforces-cot-kd-solutions-sample

    • huggingface.co
    Updated Mar 13, 2025
    + more versions
    Cite
    wing lian (2025). open-r1-codeforces-cot-kd-solutions-sample [Dataset]. https://huggingface.co/datasets/winglian/open-r1-codeforces-cot-kd-solutions-sample
    Dataset updated
    Mar 13, 2025
    Authors
    wing lian
    Description

    Dataset Card for open-r1-codeforces-cot-kd-solutions-sample

    This dataset has been created with distilabel.

      Dataset Summary
    

    This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/winglian/open-r1-codeforces-cot-kd-solutions-sample/raw/main/pipeline.yaml"

    or explore the configuration: distilabel pipeline info… See the full description on the dataset page: https://huggingface.co/datasets/winglian/open-r1-codeforces-cot-kd-solutions-sample.

  3. codeforces

    • huggingface.co
    Updated May 13, 2025
    + more versions
    Cite
    Open R1 (2025). codeforces [Dataset]. https://huggingface.co/datasets/open-r1/codeforces
    Dataset updated
    May 13, 2025
    Dataset authored and provided by
    Open R1
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for CodeForces

      Dataset description
    

    CodeForces is one of the most popular websites among competitive programmers, hosting regular contests where participants must solve challenging algorithmic optimization problems. The challenging nature of these problems makes them an interesting dataset to improve and test models’ code reasoning capabilities. This dataset includes more than 10k unique problems covering the very first contests all the way to 2025.… See the full description on the dataset page: https://huggingface.co/datasets/open-r1/codeforces.

  4. Astro-R1

    • huggingface.co
    Updated Apr 5, 2025
    Cite
    Lucidity AI (2025). Astro-R1 [Dataset]. https://huggingface.co/datasets/LucidityAI/Astro-R1
    Dataset updated
    Apr 5, 2025
    Dataset authored and provided by
    Lucidity AI
    Description

    Astro-R1

    Astro-R1 is designed to enhance AI models' reasoning capabilities. It blends datasets for math, coding, and conversational tasks, and has been shuffled. This dataset is a mix of the following (in different amounts): simplescaling/s1K-1.1, LucidityAI/QWQ-Distill, ServiceNow-AI/R1-Distill-SFT, open-r1/codeforces-cots, simplescaling/s1K-claude-3-7-sonnet.


codeforces-cots

open-r1/codeforces-cots

6 scholarly articles cite this dataset (View in Google Scholar)
