Dataset Card for "winogrande"
Dataset Summary
WinoGrande is a new collection of 44k problems, inspired by the Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011), but adjusted to improve the scale and robustness against dataset-specific bias. Formulated as a fill-in-the-blank task with binary options, the goal is to choose the right option for a given sentence, which requires commonsense reasoning.
Supported Tasks and Leaderboards
More Information… See the full description on the dataset page: https://huggingface.co/datasets/allenai/winogrande.
WinoGrande is a large-scale dataset of 44k problems, inspired by the original Winograd Schema Challenge design but adjusted to improve both the scale and the hardness of the dataset.
To use this dataset:
import tensorflow_datasets as tfds

# Load the training split and print a few examples.
ds = tfds.load('winogrande', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
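If you prefer the Hugging Face datasets library, the card above points to allenai/winogrande; the minimal sketch below assumes the winogrande_xl configuration (other size variants such as winogrande_xs/s/m/l and winogrande_debiased exist) and the standard sentence/option1/option2/answer fields described on that page.

from datasets import load_dataset

# Load the XL training configuration from the Hugging Face Hub.
ds = load_dataset("allenai/winogrande", "winogrande_xl", split="train")
for ex in ds.select(range(4)):
    print(ex["sentence"], ex["option1"], ex["option2"], ex["answer"])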
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Version 1.1 (Sep 16th, 2020)
./data/
├── train_[xs,s,m,l,xl].jsonl # training set with different sizes
├── train_[xs,s,m,l,xl]-labels.lst # answer labels for training sets
├── train_debiased.jsonl # debiased training set
├── train_debiased-labels.lst # answer labels for debiased training set
├── dev.jsonl # development set
├── dev-labels.lst # answer labels for development set
├── test.jsonl # test set
├── sample-submissions-labels.lst # example submission file for leaderboard
└── eval.py # evaluation script
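As a quick illustration of the layout above, here is a minimal sketch that pairs a training .jsonl file with its labels file; the field names (sentence, option1, option2) are assumed from the public release and the paths are illustrative.

import json

# Pair the XL training examples with their gold labels (one label per line).
with open("data/train_xl.jsonl") as f:
    examples = [json.loads(line) for line in f]
with open("data/train_xl-labels.lst") as f:
    labels = [line.strip() for line in f]

assert len(examples) == len(labels)
ex = examples[0]
# Each example gives a sentence containing a blank ("_") and two candidate fillers.
print(ex["sentence"], "|", ex["option1"], "vs.", ex["option2"], "| gold:", labels[0])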
You can use train_*.jsonl for training models and dev for validation.
Please note that labels are not included in test.jsonl. To evaluate your models on the test set, make a submission to our leaderboard.
You can use eval.py for evaluation on the dev split, which yields metrics.json.
e.g., python eval.py --preds_file ./YOUR_PREDICTIONS.lst --labels_file ./dev-labels.lst
In the prediction file, each line consists of the predictions (1 or 2) by the 5 training sets (ordered by xs, s, m, l, xl, separated by commas) for each evaluation set question.
2,1,1,1,1
1,1,2,2,2
1,1,1,1,1
.........
.........
Namely, the first column contains the predictions of a model trained/fine-tuned on train_xs.jsonl, followed by the predictions of a model trained on train_s.jsonl, ..., and the last (fifth) column contains the predictions of a model trained on train_xl.jsonl.
Please check out the sample submission file (sample-submissions-labels.lst) for reference.
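Before submitting, you can sanity-check predictions in this format locally against dev-labels.lst. The sketch below is a minimal per-column accuracy check under that assumption; eval.py remains the reference script, and the file paths are illustrative.

# Quick local check of a 5-column predictions file against the dev labels.
with open("YOUR_PREDICTIONS.lst") as f:
    preds = [line.strip().split(",") for line in f]
with open("data/dev-labels.lst") as f:
    golds = [line.strip() for line in f]

for i, size in enumerate(["xs", "s", "m", "l", "xl"]):
    acc = sum(p[i] == g for p, g in zip(preds, golds)) / len(golds)
    print(f"train_{size}: dev accuracy = {acc:.4f}")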
You can submit your predictions on the test set to the leaderboard. The submission file must be named predictions.lst. The format is the same as above.
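A minimal sketch for assembling predictions.lst from five per-model prediction lists follows; the placeholder predictions are illustrative and should be replaced with your models' actual outputs.

# Write predictions.lst: one line per test question with five comma-separated
# predictions ordered xs, s, m, l, xl. The placeholder lists below are
# illustrative; replace them with your models' "1"/"2" predictions in test order.
sizes = ["xs", "s", "m", "l", "xl"]
preds_by_size = {size: ["1", "2", "1"] for size in sizes}

with open("predictions.lst", "w") as f:
    for row in zip(*(preds_by_size[size] for size in sizes)):
        f.write(",".join(row) + "\n")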
If you use this dataset, please cite the following paper:
@article{sakaguchi2019winogrande,
title={WinoGrande: An Adversarial Winograd Schema Challenge at Scale},
author={Sakaguchi, Keisuke and Bras, Ronan Le and Bhagavatula, Chandra and Choi, Yejin},
journal={arXiv preprint arXiv:1907.10641},
year={2019}
}
The WinoGrande dataset is licensed under CC BY 2.0.
You may ask us questions at our Google group.
Email: keisukes[at]allenai.org
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
Note: Evaluation code for each benchmark dataset is under preparation and will be released soon to support standardized model assessment.
Dataset Card for Ko-WinoGrande
Dataset Summary
Ko-WinoGrande is a Korean adaptation of the WinoGrande dataset, which tests language models' commonsense reasoning through pronoun resolution tasks. Each item is a fill-in-the-blank sentence with two possible antecedents. Models must determine which choice best fits the blank given the… See the full description on the dataset page: https://huggingface.co/datasets/thunder-research-group/SNU_Ko-WinoGrande.
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
WinoGrande v1.1
Dataset Summary
WinoGrande is a new collection of 44k problems, inspired by the Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011), but adjusted to improve the scale and robustness against dataset-specific bias. Formulated as a fill-in-the-blank task with binary options, the goal is to choose the right option for a given sentence, which requires commonsense reasoning.
Data Fields
The data fields are the same among all splits.… See the full description on the dataset page: https://huggingface.co/datasets/coref-data/winogrande_raw.
License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
Rainbow is a multi-task benchmark for commonsense reasoning that draws on several existing QA datasets: aNLI, Cosmos QA, HellaSWAG, Physical IQa, Social IQa, and WinoGrande.
License: MIT (https://opensource.org/licenses/MIT)
This dataset is a deduplicated subset of the XL train split of WinoGrande, as used in the paper How Much Can We Forget about Data Contamination?. The deduplication was performed using this script. The data fields are the same as in https://huggingface.co/datasets/allenai/winogrande, with the additional "split-id" column that can be used to partition the benchmark questions into different subsets. The dataset can be used as a plug-in replacement if you want to work with the deduplicated… See the full description on the dataset page: https://huggingface.co/datasets/sbordt/forgetting-contamination-winogrande.
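A minimal sketch for loading this deduplicated subset and partitioning questions by the extra "split-id" column follows; the "train" split name is an assumption, so check the dataset page for the actual configuration.

from collections import defaultdict
from datasets import load_dataset

# Load the deduplicated WinoGrande XL subset; the "train" split name is assumed.
ds = load_dataset("sbordt/forgetting-contamination-winogrande", split="train")

# Partition questions by the additional "split-id" column described above.
by_split = defaultdict(list)
for ex in ds:
    by_split[ex["split-id"]].append(ex)
print({k: len(v) for k, v in by_split.items()})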
License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
WinoGrande Human Translated Sample for Basque
A subset of 250 samples manually translated to Basque from the WinoGrande dataset (Sakaguchi et al., 2019).
Dataset Creation
Source Data
A subset of 250 samples manually translated to Basque from the WinoGrande dataset (Sakaguchi et al., 2019).
Annotations
Annotation process
A subset of 250 samples manually translated to Basque from the WinoGrande dataset (Sakaguchi et al., 2019). A cultural… See the full description on the dataset page: https://huggingface.co/datasets/orai-nlp/WinoGrande_HT_eu_sample.