License: Other (https://choosealicense.com/licenses/other/)
MSMARCO: an MTEB (Massive Text Embedding Benchmark) dataset
MS MARCO is a collection of datasets focused on deep learning in search.
Task category: t2t
Domains: Encyclopaedic, Academic, Blog, News, Medical, Government, Reviews, Non-fiction, Social, Web
Reference: https://microsoft.github.io/msmarco/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["MSMARCO"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/msmarco.
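A complete run might look like the sketch below; the model checkpoint and output folder are illustrative assumptions, not part of the card:

import mteb

# Load the MSMARCO task and wrap it in an evaluator.
tasks = mteb.get_tasks(tasks=["MSMARCO"])
evaluator = mteb.MTEB(tasks=tasks)

# Any model supported by mteb.get_model can be used here;
# this checkpoint is only an illustrative choice.
model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")
results = evaluator.run(model, output_folder="results")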
License: Unknown (https://choosealicense.com/licenses/unknown/)
TwitterURLCorpus: an MTEB (Massive Text Embedding Benchmark) dataset
Paraphrase pairs of tweets.
Task category: t2t
Domains: Social, Written
Reference: https://languagenet.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["TwitterURLCorpus"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

To learn more about how to run models on mteb tasks, see the mteb GitHub repository. See the full description on the dataset page: https://huggingface.co/datasets/mteb/twitterurlcorpus-pairclassification.
License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
ArmenianParaphrasePC: an MTEB (Massive Text Embedding Benchmark) dataset
Source dataset: asparius/Armenian-Paraphrase-PC
Task category: t2t
Domains: News, Written
Reference: https://github.com/ivannikov-lab/arpa-paraphrase-corpus
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["ArmenianParaphrasePC"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/ArmenianParaphrasePC.
SciFact-PL: an MTEB (Massive Text Embedding Benchmark) dataset
SciFact verifies scientific claims using evidence from the research literature; the corpus consists of scientific paper abstracts.
Task category: t2t
Domains: Academic, Medical, Written
Reference: https://github.com/allenai/scifact
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["SciFact-PL"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/SciFact-PL.
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
NanoMSMARCORetrieval: an MTEB (Massive Text Embedding Benchmark) dataset
NanoMSMARCORetrieval is a smaller subset of MS MARCO, a collection of datasets focused on deep learning in search.
Task category: t2t
Domains: Web
Reference: https://microsoft.github.io/msmarco/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["NanoMSMARCORetrieval"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/NanoMSMARCORetrieval.
License: CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)
HotpotQAHardNegatives: an MTEB (Massive Text Embedding Benchmark) dataset
HotpotQA is a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting facts to enable more explainable question answering systems. The hard negative version has been created by pooling the 250 top documents per query from BM25, e5-multilingual-large and e5-mistral-instruct.
Task category: t2t
Domains: Web, Written
Reference: https://hotpotqa.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["HotpotQAHardNegatives"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/HotpotQA_test_top_250_only_w_correct-v2.
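The pooling step described above amounts to, for each query, taking the union of each retriever's top-250 results and retaining the gold documents. A minimal sketch, assuming hypothetical retriever objects with a search(query, top_k) method that returns document ids:

# Illustrative only: `retrievers` stands in for BM25 and the two e5 models.
def pool_candidates(query, retrievers, gold_ids, k=250):
    pooled = set()
    for retriever in retrievers:
        # Union of each system's top-k documents for this query.
        pooled.update(retriever.search(query, top_k=k))
    # Always keep the correct (gold) documents.
    return pooled | set(gold_ids)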
License: Other (https://choosealicense.com/licenses/other/)
StatcanDialogueDatasetRetrieval: an MTEB (Massive Text Embedding Benchmark) dataset
A dataset for retrieving data tables through conversations with genuine intents, available in English and French.
Task category: t2t
Domains: Government, Web, Written
Reference: https://mcgill-nlp.github.io/statcan-dialogue-dataset/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["StatcanDialogueDatasetRetrieval"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/StatcanDialogueDatasetRetrieval.
License: MIT (https://opensource.org/licenses/MIT)
GerDaLIRSmall: an MTEB (Massive Text Embedding Benchmark) dataset
The dataset consists of documents, passages, and relevance labels in German. In contrast to the original dataset, only documents that have corresponding queries in the query set are included, yielding a smaller corpus for evaluation purposes.
Task category: t2t
Domains: Legal, Written
Reference: https://github.com/lavis-nlp/GerDaLIR
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["GerDaLIRSmall"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/GerDaLIRSmall.
License: MIT (https://opensource.org/licenses/MIT)
DBPedia-PL: an MTEB (Massive Text Embedding Benchmark) dataset
DBpedia-Entity is a standard test collection for entity search over the DBpedia knowledge base.
Task category: t2t
Domains: Written, Encyclopaedic
Reference: https://github.com/iai-group/DBpedia-Entity/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["DBPedia-PL"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/DBPedia-PL.
MTEB Toxic Conversations 50k Triplets Dataset
This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed/blob/main/scripts/create_classification_dataset.py.
Citation

@article{solatorio2024gistembed,
  title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
  author={Aivin V. Solatorio},
  journal={arXiv preprint arXiv:2402.16829},
  year={2024}
}

See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/mteb-toxic_conversations_50k-avs_triplets.
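For a quick look at the data itself, the triplets can be loaded directly with the datasets library; the split and column names are not listed on this card, so inspect the loaded object:

from datasets import load_dataset

# Load the triplets dataset from the Hugging Face Hub.
ds = load_dataset("avsolatorio/mteb-toxic_conversations_50k-avs_triplets")
print(ds)  # shows the available splits and columns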
License: MIT (https://opensource.org/licenses/MIT)
LeCaRDv2: an MTEB (Massive Text Embedding Benchmark) dataset
The task involves identifying and retrieving the case document that best matches or is most relevant to the scenario described in each of the provided queries.
Task category: t2t
Domains: Legal, Written
Reference: https://github.com/THUIR/LeCaRDv2
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["LeCaRDv2"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/LeCaRDv2.
CMedQAv2-reranking: an MTEB (Massive Text Embedding Benchmark) dataset
Chinese community medical question answering.
Task category: t2t
Domains: Medical, Written
Reference: https://github.com/zhangsheng93/cMedQA2
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["CMedQAv2-reranking"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/CMedQAv2-reranking.
License: CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)
HotpotQA-PL: an MTEB (Massive Text Embedding Benchmark) dataset
HotpotQA is a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting facts to enable more explainable question answering systems.
Task category: t2t
Domains: Web, Written
Reference: https://hotpotqa.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["HotpotQA-PL"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/HotpotQA-PL.
License: MIT (https://opensource.org/licenses/MIT)
SWEPolyBenchRR: an MTEB (Massive Text Embedding Benchmark) dataset
Multilingual Software Issue Localization.
Task category: t2t
Domains: Programming, Written
Reference: https://amazon-science.github.io/SWE-PolyBench/
Source datasets:
tarsur909/mteb-swe-bench-poly-reranking
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_task("SWEPolyBenchRR")
evaluator = mteb.MTEB([task])
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/SWEPolyBenchRR.
License: MIT (https://opensource.org/licenses/MIT)
MultiSWEbenchRR: an MTEB (Massive Text Embedding Benchmark) dataset
Multilingual Software Issue Localization.
Task category: t2t
Domains: Programming, Written
Reference: https://multi-swe-bench.github.io/#/
Source datasets:
tarsur909/mteb-swe-bench-multi-reranking
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_task("MultiSWEbenchRR")
evaluator = mteb.MTEB([task])
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/MultiSWEbenchRR.
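Both SWE-bench entries above load a single task via mteb.get_task; related tasks can also be discovered programmatically, since mteb.get_tasks accepts metadata filters. A minimal sketch, assuming these tasks are registered under the Reranking task type and the Programming domain:

import mteb

# Filter the task registry by task type and domain; the exact metadata
# values used here are assumptions, so inspect the returned list.
tasks = mteb.get_tasks(task_types=["Reranking"], domains=["Programming"])
print([task.metadata.name for task in tasks])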
License: CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
TempReasonL1: an MTEB (Massive Text Embedding Benchmark) dataset
Measuring the ability to retrieve the ground-truth answers to reasoning task queries on TempReason l1.
Task category: t2t
Domains: Encyclopaedic, Written
Reference: https://github.com/DAMO-NLP-SG/TempReason
Source datasets:
RAR-b/TempReason-l1
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["TempReasonL1"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/TempReasonL1.
MTEB Amazon Massive Intent Triplets Dataset
This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed/blob/main/scripts/create_classification_dataset.py.
Citation

@article{solatorio2024gistembed,
  title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
  author={Aivin V. Solatorio},
  journal={arXiv preprint arXiv:2402.16829},
  year={2024}
}

See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/mteb-amazon_massive_intent-avs_triplets.
License: Other (https://choosealicense.com/licenses/other/)
MSMARCOHardNegatives: an MTEB (Massive Text Embedding Benchmark) dataset
MS MARCO is a collection of datasets focused on deep learning in search. The hard negative version has been created by pooling the 250 top documents per query from BM25, e5-multilingual-large and e5-mistral-instruct.
Task category: t2t
Domains: Encyclopaedic, Academic, Blog, News, Medical, Government, Reviews, Non-fiction, Social, Web
Reference: https://microsoft.github.io/msmarco/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["MSMARCOHardNegatives"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/MSMARCO_test_top_250_only_w_correct-v2.
License: ODC-BY (https://choosealicense.com/licenses/odc-by/)
NeuCLIR2023Retrieval: an MTEB (Massive Text Embedding Benchmark) dataset
The task involves identifying and retrieving the documents that are relevant to the queries.
Task category: t2t
Domains: News, Written
Reference: https://neuclir.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["NeuCLIR2023Retrieval"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/neuclir-2023.
License: MIT (https://opensource.org/licenses/MIT)
SummEvalSummarization.v2: an MTEB (Massive Text Embedding Benchmark) dataset
News Article Summary Semantic Similarity Estimation. This version fixes a bug in the evaluation script that caused the main score to be computed incorrectly.
Task category: t2t
Domains: News, Written
Reference: https://github.com/Yale-LILY/SummEval
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["SummEvalSummarization.v2"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/summeval.