3 datasets found
  1. amazon_counterfactual

    • huggingface.co
    Updated Feb 9, 2023
    Cite
    Massive Text Embedding Benchmark, amazon_counterfactual [Dataset]. https://huggingface.co/datasets/mteb/amazon_counterfactual
    Dataset updated
    Feb 9, 2023
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    AmazonCounterfactualClassification: an MTEB (Massive Text Embedding Benchmark) dataset

    A collection of Amazon customer reviews annotated for counterfactual detection pair classification.

    Task category: t2c

    Domains: Reviews, Written

    Reference: https://arxiv.org/abs/2104.06893

    How to evaluate on this task

    You can evaluate an embedding model on this dataset using the following code:

        import mteb
        task = mteb.get_tasks(["AmazonCounterfactualClassification"])…

    See the full description on the dataset page: https://huggingface.co/datasets/mteb/amazon_counterfactual.

  2. amazon_polarity

    • huggingface.co
    Updated Oct 27, 2024
    Cite
    Massive Text Embedding Benchmark (2024). amazon_polarity [Dataset]. https://huggingface.co/datasets/mteb/amazon_polarity
    Dataset updated
    Oct 27, 2024
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    AmazonPolarityClassification: an MTEB (Massive Text Embedding Benchmark) dataset

    Amazon Polarity Classification Dataset.

    Task category: t2c

    Domains: Reviews, Written

    Reference: https://huggingface.co/datasets/amazon_polarity

    How to evaluate on this task

    You can evaluate an embedding model on this dataset using the following code:

        import mteb
        task = mteb.get_tasks(["AmazonPolarityClassification"])
        evaluator = mteb.MTEB(task)
        model =…

    See the full description on the dataset page: https://huggingface.co/datasets/mteb/amazon_polarity.

  3. language-identification

    • huggingface.co
    • opendatalab.com
    Updated Dec 19, 2021
    Cite
    Luca Papariello (2021). language-identification [Dataset]. https://huggingface.co/datasets/papluca/language-identification
    Dataset updated
    Dec 19, 2021
    Authors
    Luca Papariello
    License

    https://choosealicense.com/licenses/undefined/

    Description

    Dataset Card for Language Identification dataset

    Dataset Summary

    The Language Identification dataset is a collection of 90k samples, each consisting of a text passage and its corresponding language label. It was created by collecting data from three sources: Multilingual Amazon Reviews Corpus, XNLI, and STSb Multi MT.

    Supported Tasks and Leaderboards

    The dataset can be used to train a model for language identification, which is a multi-class text classification…

    See the full description on the dataset page: https://huggingface.co/datasets/papluca/language-identification.

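The language-identification entry above describes its task as multi-class text classification. As a rough illustration of what that means in practice, the sketch below labels a passage by character-trigram overlap. It is a toy, not the dataset author's method: the three training sentences are invented for this example and are not drawn from the dataset, and the label strings "en", "de", and "fr" are assumed for illustration.

```python
from collections import Counter

# Invented toy sentences for illustration only; NOT taken from the dataset.
TOY_SAMPLES = [
    ("the weather is nice today and the sun is shining", "en"),
    ("das wetter ist heute schön und die sonne scheint", "de"),
    ("le temps est beau aujourd'hui et le soleil brille", "fr"),
]


def char_trigrams(text):
    """Return every overlapping 3-character window of the lowercased text."""
    text = text.lower()
    return [text[i:i + 3] for i in range(len(text) - 2)]


def train(samples):
    """Build one trigram-count profile per language label."""
    profiles = {}
    for text, label in samples:
        profiles.setdefault(label, Counter()).update(char_trigrams(text))
    return profiles


def predict(profiles, text):
    """Pick the label whose profile shares the most distinct trigrams with the text."""
    grams = set(char_trigrams(text))
    scores = {
        label: sum(1 for gram in grams if gram in profile)
        for label, profile in profiles.items()
    }
    return max(scores, key=scores.get)


profiles = train(TOY_SAMPLES)
print(predict(profiles, "the sun is nice"))
```

A real model would be trained on the dataset's own splits. The point here is only that each passage receives exactly one label from a fixed set of languages.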


amazon_counterfactual

mteb/amazon_counterfactual

5 scholarly articles cite this dataset (View in Google Scholar)
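The listing cuts off the evaluation snippets quoted in the dataset descriptions above. The sketch below shows one way such an evaluation might be wrapped. It assumes an `mteb` release that still provides the `MTEB` evaluator class alongside `mteb.get_tasks` and `mteb.get_model`. The names `TASK_TO_REPO`, `dataset_page_url`, and `evaluate_on_task` are hypothetical helpers introduced here, and the code is not a reconstruction of the truncated originals.

```python
from typing import Any

# Task names and repo ids as they appear in the listing above.
TASK_TO_REPO = {
    "AmazonCounterfactualClassification": "mteb/amazon_counterfactual",
    "AmazonPolarityClassification": "mteb/amazon_polarity",
}


def dataset_page_url(task_name: str) -> str:
    """Return the Hugging Face dataset page for a task named in the listing."""
    return f"https://huggingface.co/datasets/{TASK_TO_REPO[task_name]}"


def evaluate_on_task(task_name: str, model_name: str) -> Any:
    """Run one MTEB task against one embedding model (hypothetical wrapper)."""
    # Imported lazily so the helpers above work without mteb installed.
    import mteb

    tasks = mteb.get_tasks(tasks=[task_name])
    evaluator = mteb.MTEB(tasks=tasks)
    # Assumption: model_name is a model identifier that mteb.get_model can load.
    model = mteb.get_model(model_name)
    return evaluator.run(model)


print(dataset_page_url("AmazonPolarityClassification"))
```

Calling `evaluate_on_task` downloads both the model and the dataset, so it needs network access and may take a while. The model name is left to the caller because the listing does not name one.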
