16 datasets found
  1. SWE-Gym

    • huggingface.co
    Cite
    SWE-Gym, SWE-Gym [Dataset]. https://huggingface.co/datasets/SWE-Gym/SWE-Gym
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset authored and provided by
    SWE-Gym
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    SWE-Gym contains 2,438 instances sourced from 11 Python repos, following the SWE-Bench data collection procedure. Get started at the project page: github.com/SWE-Gym/SWE-Gym

  2. SWE-Gym-logs

    • huggingface.co
    Cite
    SWE-Synth, SWE-Gym-logs [Dataset]. https://huggingface.co/datasets/swesynth/SWE-Gym-logs
    Authors
    SWE-Synth
    Description

    swesynth/SWE-Gym-logs dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. OpenHands-Sampled-Trajectories

    • huggingface.co
    Cite
    SWE-Gym (2025). OpenHands-Sampled-Trajectories [Dataset]. https://huggingface.co/datasets/SWE-Gym/OpenHands-Sampled-Trajectories
    Dataset updated
    Jan 5, 2025
    Dataset authored and provided by
    SWE-Gym
    Description

    SWE-Gym/OpenHands-Sampled-Trajectories dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. leader-training-swe-gym-rest

    • huggingface.co
    Cite
    Fundamental Research Labs, leader-training-swe-gym-rest [Dataset]. https://huggingface.co/datasets/FundamentalResearchLabs/leader-training-swe-gym-rest
    Dataset provided by
    Fundamental Research Labs, Inc
    Authors
    Fundamental Research Labs
    Description

    FundamentalResearchLabs/leader-training-swe-gym-rest dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. SWE-Bench-Verified

    • huggingface.co
    Cite
    R2E-Gym, SWE-Bench-Verified [Dataset]. https://huggingface.co/datasets/R2E-Gym/SWE-Bench-Verified
    Dataset authored and provided by
    R2E-Gym
    Description

    R2E-Gym/SWE-Bench-Verified dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. SWE-Gym-Small

    • huggingface.co
    Cite
    Multiturn RL, SWE-Gym-Small [Dataset]. https://huggingface.co/datasets/MultiturnRL/SWE-Gym-Small
    Dataset authored and provided by
    Multiturn RL
    Description

    MultiturnRL/SWE-Gym-Small dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. Selected_SWE-Gym

    • huggingface.co
    Cite
    YUE PAN, Selected_SWE-Gym [Dataset]. https://huggingface.co/datasets/dcloud347/Selected_SWE-Gym
    Authors
    YUE PAN
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    🔧 Selected SWE-Gym Subset

    A curated subset of 100 program repair instances from the SWE-Gym dataset, selected for lightweight evaluation and rapid prototyping.

      📦 Dataset Description
    

    This dataset contains 100 program repair tasks selected from the full SWE-Gym benchmark. Each instance represents a realistic software bug scenario, including the following fields:

    instance_id: Unique identifier
    repo: GitHub repository
    commit: Bug-inducing commit hash
    test_setup: Test setup…

    See the full description on the dataset page: https://huggingface.co/datasets/dcloud347/Selected_SWE-Gym.

  8. threshold-calib-sonnet-4-swe-gym-lite-13k

    • huggingface.co
    Cite
    Ryan Tran (2025). threshold-calib-sonnet-4-swe-gym-lite-13k [Dataset]. https://huggingface.co/datasets/ryanhoangt/threshold-calib-sonnet-4-swe-gym-lite-13k
    Dataset updated
    Jun 18, 2025
    Authors
    Ryan Tran
    Description

    ryanhoangt/threshold-calib-sonnet-4-swe-gym-lite-13k dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. Nano-SFT-SWE-Gym-gemini-2.5-flash

    • huggingface.co
    Cite
    ASSERT, Nano-SFT-SWE-Gym-gemini-2.5-flash [Dataset]. https://huggingface.co/datasets/ASSERT-KTH/Nano-SFT-SWE-Gym-gemini-2.5-flash
    Dataset authored and provided by
    ASSERT
    Description

    ASSERT-KTH/Nano-SFT-SWE-Gym-gemini-2.5-flash dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. SWE-Bench-Verified-R2E-Gym-100

    • huggingface.co
    Cite
    rasdani, SWE-Bench-Verified-R2E-Gym-100 [Dataset]. https://huggingface.co/datasets/rasdani/SWE-Bench-Verified-R2E-Gym-100
    Authors
    rasdani
    Description

    rasdani/SWE-Bench-Verified-R2E-Gym-100 dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. SWESwiss-Repair-RL-SWEGym-SWESmith-12K

    • huggingface.co
    Cite
    SWE-Swiss (2025). SWESwiss-Repair-RL-SWEGym-SWESmith-12K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-Repair-RL-SWEGym-SWESmith-12K
    Dataset updated
    Aug 4, 2025
    Dataset authored and provided by
    SWE-Swiss
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Overview

    RL dataset for training SWE-Swiss models on the repair task. The prompts are based on issues from SWE-Gym and SWE-smith. To create a challenging task, the code content in each prompt consists of two components: "oracle" files, which are the ground-truth files requiring a patch, and "distractor" files, which are plausible but incorrect files predicted by an LLM.

      Citation
    

    @misc{SWESwiss2025, title = {SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-Repair-RL-SWEGym-SWESmith-12K.
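
    The oracle/distractor construction described in this entry's overview can be illustrated with a minimal sketch. This is not the SWE-Swiss authors' code: the function name `build_repair_prompt`, the prompt layout, and the shuffling step are all assumptions made for illustration.

```python
import random


def build_repair_prompt(issue, oracle_files, distractor_files, seed=0):
    """Combine an issue with oracle and distractor files into one prompt.

    oracle_files and distractor_files are dicts mapping a file path to its
    content. The layout below is hypothetical, not the dataset's real format.
    """
    files = list(oracle_files.items()) + list(distractor_files.items())
    # Assumption: shuffle so that file order does not reveal which files
    # are the ground-truth (oracle) ones.
    random.Random(seed).shuffle(files)
    sections = [f"### {path}\n{content}" for path, content in files]
    return f"Issue:\n{issue}\n\nFiles:\n" + "\n\n".join(sections)


prompt = build_repair_prompt(
    "Fix crash on empty input",
    oracle_files={"pkg/parser.py": "def parse(s): return s[0]"},
    distractor_files={"pkg/utils.py": "def helper(): pass"},
)
print(prompt)
```

    A fixed seed keeps the file order reproducible, so the same instance always yields the same prompt.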

  12. swe-agent-lm-32b-r2e-gym-trajectories

    • huggingface.co
    Cite
    AxT-dev (2025). swe-agent-lm-32b-r2e-gym-trajectories [Dataset]. https://huggingface.co/datasets/AxT-dev/swe-agent-lm-32b-r2e-gym-trajectories
    Dataset updated
    Jul 6, 2025
    Dataset authored and provided by
    AxT-dev
    Description

    AxT-dev/swe-agent-lm-32b-r2e-gym-trajectories dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. SWESwiss-SFT-Repair-4K

    • huggingface.co
    Cite
    SWE-Swiss, SWESwiss-SFT-Repair-4K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Repair-4K
    Dataset authored and provided by
    SWE-Swiss
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Overview

    SFT dataset for training SWE-Swiss models on the repair task. The prompts are based on issues from SWE-Gym and SWE-smith. To create a challenging task, the code content in each prompt consists of two components: 'oracle' files, which are the ground-truth files requiring a patch, and 'distractor' files, which are plausible but incorrect files predicted by an LLM. The responses are generated by DeepSeek-R1-0528, and we filter out any data where the generated patch cannot pass… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Repair-4K.

  14. SWESwiss-SFT-Localization-5K

    • huggingface.co
    Cite
    SWE-Swiss, SWESwiss-SFT-Localization-5K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Localization-5K
    Dataset authored and provided by
    SWE-Swiss
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Overview

    SFT dataset for training SWE-Swiss models on the localization task. Prompts are constructed from a subset of issues in SWE-Gym-Raw and the SWE-bench training set. To prevent data leakage, we've filtered out any repositories that also appear in the SWE-bench test set. The responses are generated by DeepSeek-R1-0528. An instance is included in the final dataset only if the model's prediction meets two conditions: the number of predicted files is five or fewer, and the recall… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Localization-5K.

  15. SWESwiss-SFT-Unittest-1K

    • huggingface.co
    Cite
    SWE-Swiss (2025). SWESwiss-SFT-Unittest-1K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Unittest-1K
    Dataset updated
    Aug 6, 2025
    Dataset authored and provided by
    SWE-Swiss
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Overview

    SFT dataset for training SWE-Swiss models on the unit test generation task. The prompts contain issues sourced from SWE-Gym and SWE-smith, while the responses are generated by DeepSeek-R1-0528. To ensure quality, we filter out data where the generated unit tests do not perform as expected. A generated test is kept only if its execution results correctly distinguish between a set of correct and incorrect patches, mirroring the behavior of the repository's own test suite.… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Unittest-1K.
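
    The keep-or-drop rule for generated tests described in this entry can be sketched as follows. This is an illustration, not the authors' pipeline: the function name `keep_generated_test` is hypothetical, and it assumes "correctly distinguish" means the test passes on every correct patch and fails on every incorrect one.

```python
def keep_generated_test(passed_on_correct, passed_on_incorrect):
    """Decide whether a generated unit test is kept.

    Each argument is a list of booleans, one per patch, where True means
    the generated test passed when run against that patch.
    """
    # Assumption: with no patches on one side, the test cannot be shown to
    # distinguish anything, so it is dropped.
    if not passed_on_correct or not passed_on_incorrect:
        return False
    # Keep only tests that pass on all correct patches and fail on all
    # incorrect ones.
    return all(passed_on_correct) and not any(passed_on_incorrect)


print(keep_generated_test([True, True], [False, False]))
print(keep_generated_test([True, True], [True, False]))
```

    The first call describes a test that separates the two patch sets cleanly; the second describes one that also passes on an incorrect patch and is therefore dropped.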

  16. SWE-rebench-filtered

    • huggingface.co
    Cite
    Jiahao, SWE-rebench-filtered [Dataset]. https://huggingface.co/datasets/hubert233/SWE-rebench-filtered
    Authors
    Jiahao
    Description

    SWE-rebench-R2E (Filtered Dataset)

      Dataset Description
    

    This is a filtered version of the nebius/SWE-rebench dataset. The filtering process removes instances whose repositories overlap with those of other established SWE-bench datasets, to ensure uniqueness and reduce data contamination. You can therefore use it directly as training data alongside SWE-smith/R2E-Gym-Subset and evaluate on SWE-bench_Verified/Lite.

      Filtering Criteria
    

    The dataset was filtered using the following… See the full description on the dataset page: https://huggingface.co/datasets/hubert233/SWE-rebench-filtered.
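
    The repo-overlap filtering mentioned in this entry's description can be sketched as below. The actual criteria are truncated above and are not reproduced here; the function name `filter_repo_overlap`, the `repo` field name, and the case-insensitive comparison are assumptions for illustration only.

```python
def filter_repo_overlap(instances, excluded_repos):
    """Drop instances whose repository appears in another dataset.

    instances is a list of dicts with a "repo" key (assumed field name);
    excluded_repos is an iterable of repository names to exclude.
    """
    # Assumption: compare names case-insensitively, since GitHub
    # repository names are not case-sensitive.
    excluded = {name.lower() for name in excluded_repos}
    return [inst for inst in instances if inst["repo"].lower() not in excluded]


# Hypothetical instances and repository names, for illustration only.
instances = [
    {"instance_id": "alpha-1", "repo": "Org/Alpha"},
    {"instance_id": "beta-1", "repo": "org/beta"},
]
kept = filter_repo_overlap(instances, excluded_repos=["org/alpha"])
print([inst["instance_id"] for inst in kept])
```

    Filtering at the repository level rather than the instance level is the stricter choice: it drops every instance from a shared repository, not only the exact duplicates.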

SWE-Gym/SWE-Gym: 86 scholarly articles cite this dataset (View in Google Scholar)