16 datasets found

SWE-Gym
huggingface.co
SWE-Gym, SWE-Gym [Dataset]. https://huggingface.co/datasets/SWE-Gym/SWE-Gym
Dataset authored and provided by
SWE-Gym
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
SWE-Gym contains 2,438 instances sourced from 11 Python repositories, following the SWE-Bench data collection procedure. Get started at the project page: github.com/SWE-Gym/SWE-Gym
SWE-Gym-logs
huggingface.co
SWE-Synth, SWE-Gym-logs [Dataset]. https://huggingface.co/datasets/swesynth/SWE-Gym-logs
Authors
SWE-Synth
Description
swesynth/SWE-Gym-logs dataset hosted on Hugging Face and contributed by the HF Datasets community
OpenHands-Sampled-Trajectories
huggingface.co
Updated Jan 5, 2025
SWE-Gym (2025). OpenHands-Sampled-Trajectories [Dataset]. https://huggingface.co/datasets/SWE-Gym/OpenHands-Sampled-Trajectories
Dataset updated
Jan 5, 2025
Dataset authored and provided by
SWE-Gym
Description
SWE-Gym/OpenHands-Sampled-Trajectories dataset hosted on Hugging Face and contributed by the HF Datasets community
leader-training-swe-gym-rest
huggingface.co
Fundamental Research Labs, leader-training-swe-gym-rest [Dataset]. https://huggingface.co/datasets/FundamentalResearchLabs/leader-training-swe-gym-rest
Dataset provided by
Fundamental Research Labs, Inc
Authors
Fundamental Research Labs
Description
FundamentalResearchLabs/leader-training-swe-gym-rest dataset hosted on Hugging Face and contributed by the HF Datasets community
SWE-Bench-Verified
huggingface.co
R2E-Gym, SWE-Bench-Verified [Dataset]. https://huggingface.co/datasets/R2E-Gym/SWE-Bench-Verified
Dataset authored and provided by
R2E-Gym
Description
R2E-Gym/SWE-Bench-Verified dataset hosted on Hugging Face and contributed by the HF Datasets community
SWE-Gym-Small
huggingface.co
Multiturn RL, SWE-Gym-Small [Dataset]. https://huggingface.co/datasets/MultiturnRL/SWE-Gym-Small
Dataset authored and provided by
Multiturn RL
Description
MultiturnRL/SWE-Gym-Small dataset hosted on Hugging Face and contributed by the HF Datasets community
Selected_SWE-Gym
huggingface.co
YUE PAN, Selected_SWE-Gym [Dataset]. https://huggingface.co/datasets/dcloud347/Selected_SWE-Gym
Authors
YUE PAN
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
🔧 Selected SWE-Gym Subset

A curated subset of 100 program repair instances from the SWE-Gym dataset, selected for lightweight evaluation and rapid prototyping.

📦 Dataset Description

This dataset contains 100 program repair tasks selected from the full SWE-Gym benchmark. Each instance represents a realistic software bug scenario, including the following fields:

instance_id: Unique identifier
repo: GitHub repository
commit: Bug-inducing commit hash
test_setup: Test setup…

See the full description on the dataset page: https://huggingface.co/datasets/dcloud347/Selected_SWE-Gym.
threshold-calib-sonnet-4-swe-gym-lite-13k
huggingface.co
Updated Jun 18, 2025
Ryan Tran (2025). threshold-calib-sonnet-4-swe-gym-lite-13k [Dataset]. https://huggingface.co/datasets/ryanhoangt/threshold-calib-sonnet-4-swe-gym-lite-13k
Dataset updated
Jun 18, 2025
Authors
Ryan Tran
Description
ryanhoangt/threshold-calib-sonnet-4-swe-gym-lite-13k dataset hosted on Hugging Face and contributed by the HF Datasets community
Nano-SFT-SWE-Gym-gemini-2.5-flash
huggingface.co
ASSERT, Nano-SFT-SWE-Gym-gemini-2.5-flash [Dataset]. https://huggingface.co/datasets/ASSERT-KTH/Nano-SFT-SWE-Gym-gemini-2.5-flash
Dataset authored and provided by
ASSERT
Description
ASSERT-KTH/Nano-SFT-SWE-Gym-gemini-2.5-flash dataset hosted on Hugging Face and contributed by the HF Datasets community
SWE-Bench-Verified-R2E-Gym-100
huggingface.co
rasdani, SWE-Bench-Verified-R2E-Gym-100 [Dataset]. https://huggingface.co/datasets/rasdani/SWE-Bench-Verified-R2E-Gym-100
Authors
rasdani
Description
rasdani/SWE-Bench-Verified-R2E-Gym-100 dataset hosted on Hugging Face and contributed by the HF Datasets community
SWESwiss-Repair-RL-SWEGym-SWESmith-12K
huggingface.co
Updated Aug 4, 2025
SWE-Swiss (2025). SWESwiss-Repair-RL-SWEGym-SWESmith-12K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-Repair-RL-SWEGym-SWESmith-12K
Dataset updated
Aug 4, 2025
Dataset authored and provided by
SWE-Swiss
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Overview

RL dataset for training SWE-Swiss models on the repair task. The prompts are based on issues from SWE-Gym and SWE-smith. To create a challenging task, the code content in each prompt consists of two components: "oracle" files, which are the ground-truth files requiring a patch, and "distractor" files, which are plausible but incorrect files predicted by an LLM.

Citation

@misc{SWESwiss2025, title = {SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-Repair-RL-SWEGym-SWESmith-12K.
swe-agent-lm-32b-r2e-gym-trajectories
huggingface.co
Updated Jul 6, 2025
AxT-dev (2025). swe-agent-lm-32b-r2e-gym-trajectories [Dataset]. https://huggingface.co/datasets/AxT-dev/swe-agent-lm-32b-r2e-gym-trajectories
Dataset updated
Jul 6, 2025
Dataset authored and provided by
AxT-dev
Description
AxT-dev/swe-agent-lm-32b-r2e-gym-trajectories dataset hosted on Hugging Face and contributed by the HF Datasets community
SWESwiss-SFT-Repair-4K
huggingface.co
SWE-Swiss, SWESwiss-SFT-Repair-4K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Repair-4K
Dataset authored and provided by
SWE-Swiss
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Overview

SFT dataset for training SWE-Swiss models on the repair task. The prompts are based on issues from SWE-Gym and SWE-smith. To create a challenging task, the code content in each prompt consists of two components: 'oracle' files, which are the ground-truth files requiring a patch, and 'distractor' files, which are plausible but incorrect files predicted by an LLM. The responses are generated by DeepSeek-R1-0528, and we filter out any data where the generated patch cannot pass… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Repair-4K.
SWESwiss-SFT-Localization-5K
huggingface.co
SWE-Swiss, SWESwiss-SFT-Localization-5K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Localization-5K
Dataset authored and provided by
SWE-Swiss
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Overview

SFT dataset for training SWE-Swiss models on the localization task. Prompts are constructed from a subset of issues in SWE-Gym-Raw and the SWE-bench training set. To prevent data leakage, we've filtered out any repositories that also appear in the SWE-bench test set. The responses are generated by DeepSeek-R1-0528. An instance is included in the final dataset only if the model's prediction meets two conditions: the number of predicted files is five or fewer, and the recall… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Localization-5K.
SWESwiss-SFT-Unittest-1K
huggingface.co
Updated Aug 6, 2025
SWE-Swiss (2025). SWESwiss-SFT-Unittest-1K [Dataset]. https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Unittest-1K
Dataset updated
Aug 6, 2025
Dataset authored and provided by
SWE-Swiss
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Overview

SFT dataset for training SWE-Swiss models on the unit test generation task. The prompts contain issues sourced from SWE-Gym and SWE-smith, while the responses are generated by DeepSeek-R1-0528. To ensure quality, we filter out data where the generated unit tests do not perform as expected. A generated test is kept only if its execution results correctly distinguish between a set of correct and incorrect patches, mirroring the behavior of the repository's own test suite.… See the full description on the dataset page: https://huggingface.co/datasets/SWE-Swiss/SWESwiss-SFT-Unittest-1K.
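The keep-or-discard rule described above can be sketched as follows. This is only an illustration, not the SWE-Swiss implementation: the function name `keep_generated_test` and the boolean pass/fail representation are hypothetical, and "correctly distinguish" is assumed here to mean that the test passes on every correct patch and fails on every incorrect one.

```python
from typing import Sequence


def keep_generated_test(
    passes_on_correct: Sequence[bool],
    passes_on_incorrect: Sequence[bool],
) -> bool:
    """Return True if a generated test separates correct from incorrect patches.

    Assumed interpretation (not confirmed by the dataset card): the test must
    pass on every correct patch and fail on every incorrect patch. Empty
    inputs are rejected, since a test that was never run against both kinds
    of patch has not distinguished anything.
    """
    if not passes_on_correct or not passes_on_incorrect:
        return False
    return all(passes_on_correct) and not any(passes_on_incorrect)
```

A test that also passes on an incorrect patch is rejected under this reading, because it would not catch the bug that patch leaves in place.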
SWE-rebench-filtered
huggingface.co
Jiahao, SWE-rebench-filtered [Dataset]. https://huggingface.co/datasets/hubert233/SWE-rebench-filtered
Authors
Jiahao
Description
SWE-rebench-R2E (Filtered Dataset)

Dataset Description

This is a filtered version of the nebius/SWE-rebench dataset. The filtering process removes instances whose repositories overlap with other established SWE-bench datasets, to ensure uniqueness and reduce data contamination. You can therefore use it directly as training data alongside SWE-smith/R2E-Gym-Subset and evaluate on SWE-bench_Verified/Lite.

Filtering Criteria

The dataset was filtered using the following… See the full description on the dataset page: https://huggingface.co/datasets/hubert233/SWE-rebench-filtered.
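The repository-level exclusion described in this entry might look roughly like the sketch below. The exact filtering criteria are truncated in this listing, so everything here is an assumption: the function name `drop_overlapping_repos`, the `repo` field name, and the case-insensitive comparison are illustrative only.

```python
from typing import Iterable


def drop_overlapping_repos(
    instances: Iterable[dict], excluded_repos: Iterable[str]
) -> list[dict]:
    """Keep only instances whose repository is absent from the excluded set.

    Assumes each instance is a dict with a "repo" key such as "owner/name"
    (a hypothetical schema). Comparison is case-insensitive, which is an
    assumption of this sketch rather than a documented rule.
    """
    excluded = {repo.lower() for repo in excluded_repos}
    return [inst for inst in instances if inst["repo"].lower() not in excluded]
```

Filtering at the repository level, rather than per instance, is the more conservative choice: it drops every task from a repository that appears in an evaluation set, even tasks that are not themselves duplicated.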
SWE-Gym/SWE-Gym: 86 scholarly articles cite this dataset.
