28 datasets found
  1. glue

    • tensorflow.org
    • tensorflow.google.cn
    • +1 more
    Updated Apr 3, 2019
    Cite
    (2019). glue [Dataset]. https://www.tensorflow.org/datasets/catalog/glue
    Dataset updated
    Apr 3, 2019
    Description

    GLUE, the General Language Understanding Evaluation benchmark (https://gluebenchmark.com/) is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

    To use this dataset:

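    import tensorflow_datasets as tfds

    # Load the training split (data is downloaded and prepared on first use).
    ds = tfds.load('glue', split='train')
    # Print the first four examples.
    for ex in ds.take(4):
        print(ex)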
    

    See the guide for more information on tensorflow_datasets.
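
GLUE bundles several tasks, and tensorflow_datasets addresses each one as a builder config named `glue/<task>`. The sketch below shows how one task might be selected. It assumes the `mnli` config and its `validation_matched` split, neither of which is named in the listing above; `glue_name` and `peek` are hypothetical helpers introduced only for illustration.

```python
def glue_name(task: str) -> str:
    """Build a dataset name of the form 'glue/<task>' (illustrative helper)."""
    return f"glue/{task.strip().lower()}"


def peek(task: str = "mnli", split: str = "validation_matched", n: int = 4) -> None:
    """Print the first n examples; needs tensorflow_datasets and network access."""
    # Imported lazily so that glue_name can be used without TensorFlow installed.
    import tensorflow_datasets as tfds

    ds = tfds.load(glue_name(task), split=split)
    for ex in ds.take(n):
        print(ex)


print(glue_name("MNLI"))
```

Calling `peek()` would download the data on first use, so it is left uncalled here.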

  2. mnli

    • huggingface.co
    Updated Jul 19, 2024
    Cite
    SeaEval (2024). mnli [Dataset]. https://huggingface.co/datasets/SeaEval/mnli
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 19, 2024
    Dataset authored and provided by
    SeaEval
    Description

    SeaEval/mnli dataset hosted on Hugging Face and contributed by the HF Datasets community

  3. mnli-amr

    • huggingface.co
    + more versions
    Cite
    Simon Tang, mnli-amr [Dataset]. https://huggingface.co/datasets/Tverous/mnli-amr
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Authors
    Simon Tang
    Description

    Dataset Card for "mnli-amr"

    More Information needed

  4. mnli-fluency

    • kaggle.com
    zip
    Updated Apr 15, 2024
    Cite
    Khương Trần (2024). mnli-fluency [Dataset]. https://www.kaggle.com/datasets/khuongtran1209/mnli-fluency/suggestions?status=pending&yourSuggestions=true
    Explore at:
    Available download formats: zip (578380 bytes)
    Dataset updated
    Apr 15, 2024
    Authors
    Khương Trần
    Description

    Dataset

    This dataset was created by Khương Trần

  5. Data from: BERTs of a feather do not generalize together: Large variability...

    • zenodo.org
    zip
    Updated Jan 11, 2021
    Cite
    R. Thomas McCoy; Junghyun Min; Tal Linzen (2021). BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance [Dataset]. http://doi.org/10.5281/zenodo.4110593
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 11, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    R. Thomas McCoy; Junghyun Min; Tal Linzen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Zenodo repository contains 100 copies of the model BERT fine-tuned on the MNLI dataset, created for the paper "BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance." Please see the project GitHub page for more details about using these models and how to cite any such usage: https://github.com/tommccoy1/hans/tree/master/berts_of_a_feather

  6. xnli

    • tensorflow.org
    • huggingface.co
    Updated Dec 6, 2022
    + more versions
    Cite
    (2022). xnli [Dataset]. https://www.tensorflow.org/datasets/catalog/xnli
    Dataset updated
    Dec 6, 2022
    Description

    XNLI is a subset of a few thousand examples from MNLI that have been translated into 14 different languages (some of them fairly low-resource). As with MNLI, the goal is to predict textual entailment (does sentence A imply, contradict, or neither imply nor contradict sentence B?), framed as a classification task: given two sentences, predict one of three labels.

    To use this dataset:

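    import tensorflow_datasets as tfds

    # Load the training split (data is downloaded and prepared on first use).
    ds = tfds.load('xnli', split='train')
    # Print the first four examples.
    for ex in ds.take(4):
        print(ex)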
    

    See the guide for more information on tensorflow_datasets.
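
The three-way classification described for XNLI and MNLI can be pictured as a sentence pair plus an integer label. The following is a minimal sketch with a made-up example pair; the label order (entailment, neutral, contradiction) is an assumption that should be checked against the dataset's own metadata, and `NLIExample` is a hypothetical container, not part of any library.

```python
from dataclasses import dataclass

# Assumed label order; verify against the dataset's metadata before relying on it.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class NLIExample:
    premise: str      # sentence A
    hypothesis: str   # sentence B
    label: int        # index into LABELS

    def label_name(self) -> str:
        """Return the human-readable name of the integer label."""
        return LABELS[self.label]


# Illustrative pair written for this sketch, not taken from the dataset.
ex = NLIExample("A man is playing a guitar.", "A person is making music.", 0)
print(ex.label_name())
```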

  7. D MNLI

    • kaggle.com
    zip
    Updated Jul 23, 2023
    + more versions
    Cite
    Udbhav Bamba (2023). D MNLI [Dataset]. https://www.kaggle.com/datasets/ubamba98/dlbmnli
    Explore at:
    Available download formats: zip (8082477814 bytes)
    Dataset updated
    Jul 23, 2023
    Authors
    Udbhav Bamba
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Udbhav Bamba

    Released under CC0: Public Domain

  8. autoeval-staging-eval-glue-mnli-026a6e-14686017

    • huggingface.co
    Updated Sep 15, 1996
    + more versions
    Cite
    Evaluation on the Hub (1996). autoeval-staging-eval-glue-mnli-026a6e-14686017 [Dataset]. https://huggingface.co/datasets/autoevaluate/autoeval-staging-eval-glue-mnli-026a6e-14686017
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 15, 1996
    Dataset authored and provided by
    Evaluation on the Hub
    Description

    Dataset Card for AutoTrain Evaluator

    This repository contains model predictions generated by AutoTrain for the following task and dataset:

    • Task: Natural Language Inference
    • Model: Jiva/xlm-roberta-large-it-mnli
    • Dataset: glue
    • Config: mnli
    • Split: validation_matched

    To run new evaluation jobs, visit Hugging Face's automatic model evaluator.

    Contributions

    Thanks to @lewtun for evaluating this model.

  9. bart-mnli

    • kaggle.com
    zip
    Updated Jun 25, 2021
    Cite
    Equilan (2021). bart-mnli [Dataset]. https://www.kaggle.com/datasets/sriramanathan/bartmnli/suggestions
    Explore at:
    Available download formats: zip (989090569 bytes)
    Dataset updated
    Jun 25, 2021
    Authors
    Equilan
    Description

    Dataset

    This dataset was created by Equilan

  10. multi_nli

    • tensorflow.org
    • huggingface.co
    • +1 more
    Updated Dec 6, 2022
    + more versions
    Cite
    (2022). multi_nli [Dataset]. https://www.tensorflow.org/datasets/catalog/multi_nli
    Dataset updated
    Dec 6, 2022
    Description

    The Multi-Genre Natural Language Inference (MultiNLI) corpus is a crowd-sourced collection of 433k sentence pairs annotated with textual entailment information. The corpus is modeled on the SNLI corpus, but differs in that it covers a range of genres of spoken and written text, and supports a distinctive cross-genre generalization evaluation. The corpus served as the basis for the shared task of the RepEval 2017 Workshop at EMNLP in Copenhagen.

    To use this dataset:

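    import tensorflow_datasets as tfds

    # Load the training split (data is downloaded and prepared on first use).
    ds = tfds.load('multi_nli', split='train')
    # Print the first four examples.
    for ex in ds.take(4):
        print(ex)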
    

    See the guide for more information on tensorflow_datasets.
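
One way to picture the cross-genre evaluation mentioned above is to separate evaluation examples whose genre also occurs in the training data from those whose genre does not. The sketch below does this on toy records. The `genre` field name, the genre values, and the `split_by_genre` helper are assumptions made for illustration, not the corpus's actual schema.

```python
def split_by_genre(train, evaluation):
    """Partition evaluation examples by whether their genre occurs in training."""
    seen = {ex["genre"] for ex in train}
    matched = [ex for ex in evaluation if ex["genre"] in seen]
    mismatched = [ex for ex in evaluation if ex["genre"] not in seen]
    return matched, mismatched


# Toy records with made-up genre assignments, for illustration only.
train = [{"genre": "fiction"}, {"genre": "travel"}]
evaluation = [{"genre": "fiction"}, {"genre": "letters"}, {"genre": "travel"}]

matched, mismatched = split_by_genre(train, evaluation)
print(len(matched), len(mismatched))
```

Scoring a model separately on the two partitions then gives an in-genre and an out-of-genre figure.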

  11. mnli_matched

    • huggingface.co
    Updated Sep 5, 2022
    Cite
    mnli_matched [Dataset]. https://huggingface.co/datasets/westphal-jan/mnli_matched
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 5, 2022
    Authors
    Jan Westphal
    Description

    Dataset Description

    This dataset makes the original MNLI dataset easier to access. We randomly choose 10% of the original validation_matched split and use it as the validation split. The remaining 90% is used as the test split. The train split remains unchanged.
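
The 10%/90% re-split described above can be sketched with a seeded shuffle. This is an illustration only: the description does not say how the random choice was made, so the seed, the rounding rule, and the `split_validation` helper are all assumptions.

```python
import random


def split_validation(examples, val_fraction=0.1, seed=0):
    """Shuffle examples reproducibly and cut them into (validation, test)."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[:n_val], shuffled[n_val:]


# Stand-in for the original validation_matched split: 1000 example indices.
validation, test = split_validation(range(1000))
print(len(validation), len(test))
```

The two parts are disjoint and together cover every original example, which is the property a re-split like this needs.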

  12. autoeval-staging-eval-glue-mnli-026a6e-14686016

    • huggingface.co
    Updated Sep 23, 2022
    + more versions
    Cite
    Evaluation on the Hub (2022). autoeval-staging-eval-glue-mnli-026a6e-14686016 [Dataset]. https://huggingface.co/datasets/autoevaluate/autoeval-staging-eval-glue-mnli-026a6e-14686016
    Dataset updated
    Sep 23, 2022
    Dataset authored and provided by
    Evaluation on the Hub
    Description

    autoevaluate/autoeval-staging-eval-glue-mnli-026a6e-14686016 dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. mnli-all

    • huggingface.co
    Updated Mar 4, 2025
    Cite
    mnli-all [Dataset]. https://huggingface.co/datasets/BurgerTruck/mnli-all
    Dataset updated
    Mar 4, 2025
    Authors
    Jaime Perez
    Description

    BurgerTruck/mnli-all dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. MNLI_3000to5000_latest

    • kaggle.com
    zip
    Updated Apr 30, 2021
    + more versions
    Cite
    Neranjhana (2021). MNLI_3000to5000_latest [Dataset]. https://www.kaggle.com/datasets/neranjhana/mnli-3000to5000-latest
    Explore at:
    Available download formats: zip (220003 bytes)
    Dataset updated
    Apr 30, 2021
    Authors
    Neranjhana
    Description

    Dataset

    This dataset was created by Neranjhana

  15. OCNLI Dataset

    • paperswithcode.com
    Updated Nov 16, 2021
    + more versions
    Cite
    OCNLI Dataset [Dataset]. https://paperswithcode.com/dataset/ocnli
    Dataset updated
    Nov 16, 2021
    Authors
    Hai Hu; Kyle Richardson; Liang Xu; Lu Li; Sandra Kuebler; Lawrence S. Moss
    Description

    OCNLI stands for Original Chinese Natural Language Inference. It is a corpus for Chinese natural language inference, collected by closely following the procedures of MNLI but with enhanced strategies that aim for more challenging inference pairs. No human or machine translation was used in creating the dataset, so the Chinese texts are original rather than translated.

    OCNLI has roughly 50k pairs for training, 3k for development and 3k for test. The test data is released, but its labels are not.

    OCNLI is part of the CLUE benchmark.

  16. mnli_indonesia_translated

    • kaggle.com
    zip
    Updated Jun 26, 2021
    Cite
    Ahmad Zidan (2021). mnli_indonesia_translated [Dataset]. https://www.kaggle.com/datasets/lan666as/mnli-indonesia-translated
    Explore at:
    Available download formats: zip (17210372 bytes)
    Dataset updated
    Jun 26, 2021
    Authors
    Ahmad Zidan
    Description

    Dataset

    This dataset was created by Ahmad Zidan

  17. Translated-MNLI-2

    • huggingface.co
    Updated Aug 20, 2024
    + more versions
    Cite
    Llama Adapt (2024). Translated-MNLI-2 [Dataset]. https://huggingface.co/datasets/llama-lang-adapt/Translated-MNLI-2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 20, 2024
    Dataset authored and provided by
    Llama Adapt
    Description

    llama-lang-adapt/Translated-MNLI-2 dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. OntoLAMA: LAnguage Model Analysis for Ontology Subsumption Inference

    • zenodo.org
    zip
    Updated Aug 7, 2023
    + more versions
    Cite
    Anonymous (2023). OntoLAMA: LAnguage Model Analysis for Ontology Subsumption Inference [Dataset]. http://doi.org/10.5281/zenodo.7699244
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 7, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    OntoLAMA Datasets (anonymised access because the paper is under review)

    This pre-release version is missing the complex SI dataset constructed from GO.

    Dataset Source   #Concepts   #EquivAxioms   #Datasets (Train/Dev/Test)
    Schema.org       894         N/A            Atomic SI: 808/404/2,830
    DOID             11,157      N/A            Atomic SI: 90,500/11,312/11,314
    FoodOn           30,995      2,383          Atomic SI: 768,486/96,060/96,062
                                                Complex SI: 3,754/1,850/13,080
    GO               43,303      11,456         Atomic SI: 772,870/96,608/96,610
                                                Complex SI: ...
    MNLI             N/A         N/A            biMNLI: 235,622/26,180/12,906

  19. deberta_large_mnli_cv_08483

    • kaggle.com
    zip
    Updated Jun 2, 2022
    + more versions
    Cite
    Sandiago (2022). deberta_large_mnli_cv_08483 [Dataset]. https://www.kaggle.com/datasets/sandiago21/deberta-large-mnli-cv-08483/suggestions
    Explore at:
    Available download formats: zip (7508068133 bytes)
    Dataset updated
    Jun 2, 2022
    Authors
    Sandiago
    Description

    Dataset

    This dataset was created by Sandiago

  20. FB3_deberta_v3_small_mnli

    • kaggle.com
    zip
    Updated Sep 6, 2022
    Cite
    Splend1dChan(燦爛) (2022). FB3_deberta_v3_small_mnli [Dataset]. https://www.kaggle.com/datasets/a24998667/fb3-deberta-v3-small-mnli
    Explore at:
    Available download formats: zip (1936504738 bytes)
    Dataset updated
    Sep 6, 2022
    Authors
    Splend1dChan(燦爛)
    Description

    Dataset

    This dataset was created by Splend1dChan(燦爛)

Cite
(2019). glue [Dataset]. https://www.tensorflow.org/datasets/catalog/glue

glue

Explore at:
7 scholarly articles cite this dataset (View in Google Scholar)
