Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
AmazonCounterfactualClassification An MTEB dataset Massive Text Embedding Benchmark
A collection of Amazon customer reviews annotated for counterfactual detection pair classification.
Task category: t2c
Domains: Reviews, Written
Reference: https://arxiv.org/abs/2104.06893
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:
import mteb
task = mteb.get_tasks(["AmazonCounterfactualClassification"])…
See the full description on the dataset page: https://huggingface.co/datasets/mteb/amazon_counterfactual.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
AmazonPolarityClassification An MTEB dataset Massive Text Embedding Benchmark
Amazon Polarity Classification Dataset.
Task category: t2c
Domains: Reviews, Written
Reference: https://huggingface.co/datasets/amazon_polarity
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:
import mteb
task = mteb.get_tasks(["AmazonPolarityClassification"])
evaluator = mteb.MTEB(task)
model =…
See the full description on the dataset page: https://huggingface.co/datasets/mteb/amazon_polarity.
https://choosealicense.com/licenses/undefined/
Dataset Card for Language Identification dataset
Dataset Summary
The Language Identification dataset is a collection of 90k samples, each consisting of a text passage and its corresponding language label. It was created by collecting data from three sources: Multilingual Amazon Reviews Corpus, XNLI, and STSb Multi MT.
Supported Tasks and Leaderboards
The dataset can be used to train a model for language identification, which is a multi-class text classification…
See the full description on the dataset page: https://huggingface.co/datasets/papluca/language-identification.