Dataset Card for "winogrande"
Dataset Summary
WinoGrande is a new collection of 44k problems, inspired by the Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011) but adjusted to improve both scale and robustness against dataset-specific bias. Formulated as a fill-in-the-blank task with binary options, the goal is to choose the right option for a given sentence, which requires commonsense reasoning.
Supported Tasks and Leaderboards
More Information… See the full description on the dataset page: https://huggingface.co/datasets/allenai/winogrande.
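To make the fill-in-the-blank format concrete, the short sketch below loads one WinoGrande configuration with the Hugging Face datasets library and prints the fields of a single record. The config name winogrande_xl and the exact loading call are assumptions based on the upstream card; check the dataset page if they have changed.

# Minimal sketch of inspecting the WinoGrande fill-in-the-blank format.
# Assumes the "winogrande_xl" config and the sentence/option1/option2/answer
# fields documented on the dataset page; adjust if the card has changed.
from datasets import load_dataset

ds = load_dataset("allenai/winogrande", "winogrande_xl", split="validation")
example = ds[0]
print(example["sentence"])                     # sentence containing a "_" blank
print(example["option1"], example["option2"])  # the two candidate fillers
print(example["answer"])                       # "1" or "2": which option is correct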
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
WinoGrande v1.1
Dataset Summary
WinoGrande is a new collection of 44k problems, inspired by the Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011) but adjusted to improve both scale and robustness against dataset-specific bias. Formulated as a fill-in-the-blank task with binary options, the goal is to choose the right option for a given sentence, which requires commonsense reasoning.
Data Fields
The data fields are the same among all splits.… See the full description on the dataset page: https://huggingface.co/datasets/coref-data/winogrande_raw.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Natural Instructions v2 Winogrande Tasks
Project: https://github.com/allenai/natural-instructions
Data source: DataProvenanceInitiative/niv2_submix_original
Details
This dataset contains all Winogrande examples that were included in the Flan 2022 collection and were originally published in Super-Natural-Instructions. The data is copied from the preprocessed Natural Instructions v2 dataset at DataProvenanceInitiative/niv2_submix_original. These tasks are:… See the full description on the dataset page: https://huggingface.co/datasets/coref-data/niv2_winogrande_raw.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: Evaluation code for each benchmark dataset is under preparation and will be released soon to support standardized model assessment.
Dataset Card for Ko-WinoGrande
Dataset Summary
Ko-WinoGrande is a Korean adaptation of the WinoGrande dataset, which tests language models' commonsense reasoning through pronoun resolution tasks. Each item is a fill-in-the-blank sentence with two possible antecedents. Models must determine which choice best fits the blank given the… See the full description on the dataset page: https://huggingface.co/datasets/thunder-research-group/SNU_Ko-WinoGrande.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
WinoGrande Human Translated Sample for Basque
A subset of 250 samples manually translated to Basque from the WinoGrande dataset (Sakaguchi et al., 2019).
Dataset Creation
Source Data
A subset of 250 samples manually translated to Basque from the WinoGrande dataset (Sakaguchi et al., 2019).
Annotations
Annotation process
A subset of 250 samples manually translated to Basque from the WinoGrande dataset (Sakaguchi et al., 2019). A cultural… See the full description on the dataset page: https://huggingface.co/datasets/orai-nlp/WinoGrande_HT_eu_sample.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is a deduplicated subset of the XL train split of WinoGrande, as used in the paper How Much Can We Forget about Data Contamination?. The deduplication was performed using this script. The data fields are the same as in https://huggingface.co/datasets/allenai/winogrande, with the additional "split-id" column that can be used to partition the benchmark questions into different subsets. The dataset can be used as a plug-in replacement if you want to work with the deduplicated… See the full description on the dataset page: https://huggingface.co/datasets/sbordt/forgetting-contamination-winogrande.
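As a rough, hypothetical sketch of how the extra "split-id" column could be used to partition the deduplicated questions into subsets, something like the following might work; the split name and the set of values taken by "split-id" are assumptions, not details given in this excerpt.

# Hypothetical sketch: partitioning the deduplicated WinoGrande questions by
# the extra "split-id" column. The split name and the split-id values are
# assumptions; check the dataset card for the actual layout.
from collections import defaultdict
from datasets import load_dataset

ds = load_dataset("sbordt/forgetting-contamination-winogrande", split="train")

subsets = defaultdict(list)
for row in ds:
    subsets[row["split-id"]].append(row)  # group questions by their split-id value

for split_id, rows in sorted(subsets.items()):
    print(f"split-id {split_id}: {len(rows)} questions")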
Dataset Card for "malhajar/winogrande-tr-v0.2"
This dataset is part of a series of datasets aimed at advancing Turkish LLM development by establishing rigorous Turkish benchmarks to evaluate the performance of LLMs produced for the Turkish language. malhajar/winogrande-tr-v0.2 is a version of WinoGrande translated with GPT-4, intended specifically for use in the OpenLLMTurkishLeaderboard_v0.2. Translated by: Mohamad Alhajar
Dataset Summary
WinoGrande is a new… See the full description on the dataset page: https://huggingface.co/datasets/malhajar/winogrande-tr-v0.2.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Description
Winogrande builds on the Winograd Schema Challenge, a set of 273 expert-crafted pronoun resolution problems originally designed to be unsolvable for statistical models that rely on selectional preferences or word associations. The benchmark has been translated into Lithuanian using GPT-4. This dataset is utilized as a benchmark and forms part of the evaluation protocol for Lithuanian language models, as outlined in the technical report OPEN LLAMA2 MODEL FOR THE LITHUANIAN LANGUAGE (Nakvosas et al.… See the full description on the dataset page: https://huggingface.co/datasets/neurotechnology/lt_winogrande.
Unknown license: https://choosealicense.com/licenses/unknown/
WinoGrande: an MTEB dataset (Massive Text Embedding Benchmark)
Measuring the ability to retrieve the ground-truth answers to reasoning-task queries on WinoGrande.
Task category: t2t
Domains: Encyclopaedic, Written
Reference: https://winogrande.allenai.org/
Source datasets:
mteb/AlloprofRetrieval
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:
import mteb
task = mteb.get_task("WinoGrande")
evaluator… See the full description on the dataset page: https://huggingface.co/datasets/mteb/WinoGrande.
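The snippet above is cut off; a plausible completion, following the usual MTEB pattern of wrapping the task in an evaluator and running an embedding model, is sketched below. The model name is only a placeholder, and minor API details can differ between mteb versions.

# Sketch of evaluating an embedding model on the WinoGrande retrieval task with MTEB.
import mteb

task = mteb.get_task("WinoGrande")
evaluator = mteb.MTEB(tasks=[task])                                # wrap the single task in an evaluator
model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")   # placeholder embedding model
results = evaluator.run(model, output_folder="results/WinoGrande") # scores are also written to disk
print(results)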
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Icelandic WinoGrande dataset
This is the Icelandic WinoGrande dataset described in the IceBERT paper (https://aclanthology.org/2022.lrec-1.464.pdf).
Translation and localization
The records were manually translated and localized from English (skipped if localization was not possible). For the examples which were singlets instead of sentence pairs, we added a corresponding sentence. The "translations per se" are not exact, since accurately preserving the original semantics is… See the full description on the dataset page: https://huggingface.co/datasets/mideind/icelandic-winogrande.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Description
Winogrande builds on the Winograd Schema Challenge, a set of 273 expert-crafted pronoun resolution problems originally designed to be unsolvable for statistical models that rely on selectional preferences or word associations. Here we provide the Romanian translation of the Winogrande benchmark, translated with Systran. This dataset is used as a benchmark and is part of the evaluation protocol for Romanian LLMs proposed in "Vorbeşti Româneşte?" A Recipe to Train Powerful Romanian LLMs with English… See the full description on the dataset page: https://huggingface.co/datasets/OpenLLM-Ro/ro_winogrande.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
WinoGrande Recast as Coreference Resolution
Dataset Summary
WinoGrande train and development sets recast as coreference resolution, as described in Investigating Failures to Generalize for Coreference Resolution Models. CoNLL-U columns are parsed using Stanza.
Data Fields
{ "id": str, # example id "text": str, # untokenized example text "sentences": [ { "id": int, # sentence index "text": str, # untokenized sentence text "speaker": None… See the full description on the dataset page: https://huggingface.co/datasets/coref-data/winogrande_coref.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Bridging the Gap: Enhancing LLM Performance for Low-Resource African Languages with New Benchmarks, Fine-Tuning, and Cultural Adjustments
Authors: Tuka Alhanai (tuka@ghamut.com), Adam Kasumovic (adam.kasumovic@ghamut.com), Mohammad Ghassemi (ghassemi@ghamut.com), Aven Zitzelberger (aven.zitzelberger@ghamut.com), Jessica Lundin (jessica.lundin@gatesfoundation.org), Guillaume Chabot-Couture (Guillaume.Chabot-Couture@gatesfoundation.org)
This HuggingFace Dataset contains the human-translated… See the full description on the dataset page: https://huggingface.co/datasets/Institute-Disease-Modeling/mmlu-winogrande-afr.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Winogrande - Italian (IT)
This dataset is an Italian translation of Winogrande. Winogrande is a large-scale dataset for coreference resolution, commonsense reasoning, and world knowledge. It is based on the original Winograd Schema Challenge dataset.
Dataset Details
The dataset consists of almost 40K examples, each containing a sentence with a blank and two possible fill-in-the-blank options. The task is to choose the option that correctly fills in the blank based… See the full description on the dataset page: https://huggingface.co/datasets/sapienzanlp/winogrande_italian.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs
Overview
The AraDiCE dataset is designed to evaluate dialectal and cultural capabilities in large language models (LLMs). The dataset consists of post-edited versions of various benchmark datasets, curated for validation in cultural and dialectal contexts relevant to Arabic. In this repository we show the winogrande split of the data.
Evaluation
We have used the lm-harness eval framework to… See the full description on the dataset page: https://huggingface.co/datasets/QCRI/AraDiCE-WinoGrande.
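The AraDiCE-specific task names are not listed in this excerpt, but as a rough illustration, a standard lm-evaluation-harness run of a WinoGrande-style task through its Python API might look like the sketch below; the checkpoint and task name are placeholders to be replaced with the AraDiCE variants.

# Rough illustration of running a WinoGrande-style task with the
# lm-evaluation-harness Python API. The model checkpoint and the task name
# are placeholders; look up the AraDiCE-specific task names in the project's
# evaluation configs.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=gpt2",  # placeholder checkpoint
    tasks=["winogrande"],          # replace with the AraDiCE task name
    num_fewshot=0,
)
print(results["results"])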
This is the dataset accompanying the paper "WinoWhat: A Parallel Corpus of Paraphrased WinoGrande Sentences with Common Sense Categorization", presented and published at CoNLL 2025: https://aclanthology.org/2025.conll-1.5/. In this work, we evaluate LLMs' performance on Winograd Schema Challenges by paraphrasing the validation set of WinoGrande. We provide each instance with common sense category annotations. The dataset structure is as follows: sentence: the original text as it appears in… See the full description on the dataset page: https://huggingface.co/datasets/IneG/WinoWhat.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Dataset Card for Winogrande Greek
The Winogrande Greek dataset is a set of 41665 pairs of sentences from the WinoGrande dataset, machine-translated into Greek. The original dataset is formulated as a fill-in-a-blank task with binary options, and the goal is to choose the right option for a given sentence which requires commonsense reasoning. In Winogrande Greek the task is formulated as a pair of sentences, from which a model is to choose the most plausible sentence.… See the full description on the dataset page: https://huggingface.co/datasets/ilsp/winogrande_greek.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Flan V2
Dataset Summary
This is a processed version of the Flan V2 dataset. I'm not affiliated with the creators; I'm just releasing the files in an easier-to-access format after processing. The authors of the Flan Collection recommend experimenting with different mixing ratios of tasks to get optimal results downstream.
Setup Instructions
Here are the steps I followed to get everything working:
Build AESLC and WinoGrande datasets… See the full description on the dataset page: https://huggingface.co/datasets/SirNeural/flan_v2.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for jwinogrande
Dataset Information
This dataset consists of 97 samples randomly drawn from WinoGrande and translated into Japanese. File size: 26.9 kB. An example record looks like the following: { "sentence": "マイケルはマシューとは違って、仕事のために中国語を学ぶ必要がありました。なぜなら_はイギリスで働いていたためです。", "option1": "マイケル", "option2": "マシュー", "answer": "2" }
License Information
apache-2.0
Citation
@InProceedings{ai2:winogrande, title = {WinoGrande: An Adversarial Winograd Schema Challenge at Scale}, authors={Keisuke, Sakaguchi and Ronan, Le Bras and… See the full description on the dataset page: https://huggingface.co/datasets/weblab-GENIAC/jwinogrande.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Changelog
10.09.2025: Added citation info.
22.08.2025: Added train and dev splits to the machine_translated subset for compatibility with EuroEval. As a result, the answer column in the test split of this subset now contains empty strings. The examples were translated with the same GPT-4o model for consistency.
Description
winogrande_et includes the test set of the winogrande dataset that was manually translated and culturally adapted to the Estonian language. The… See the full description on the dataset page: https://huggingface.co/datasets/tartuNLP/winogrande_et.
Dataset Card for "winogrande"
Dataset Summary
WinoGrande is a new collection of 44k problems, inspired by Winograd Schema Challenge (Levesque, Davis, and Morgenstern 2011), but adjusted to improve the scale and robustness against the dataset-specific bias. Formulated as a fill-in-a-blank task with binary options, the goal is to choose the right option for a given sentence which requires commonsense reasoning.
Supported Tasks and Leaderboards
More Information… See the full description on the dataset page: https://huggingface.co/datasets/allenai/winogrande.