License: Other (https://choosealicense.com/licenses/other/)
MSMARCO: an MTEB (Massive Text Embedding Benchmark) dataset
MS MARCO is a collection of datasets focused on deep learning in search.
Task category: t2t
Domains: Encyclopaedic, Academic, Blog, News, Medical, Government, Reviews, Non-fiction, Social, Web
Reference: https://microsoft.github.io/msmarco/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["MSMARCO"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/msmarco.
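A complete run might look like the sketch below; the model checkpoint and output folder are illustrative assumptions, not part of the card:

import mteb

# Load the MSMARCO task and wrap it in an evaluator.
tasks = mteb.get_tasks(tasks=["MSMARCO"])
evaluator = mteb.MTEB(tasks=tasks)

# Any model supported by mteb.get_model can be used here;
# this checkpoint is only an illustrative choice.
model = mteb.get_model("sentence-transformers/all-MiniLM-L6-v2")
results = evaluator.run(model, output_folder="results")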
License: Unknown (https://choosealicense.com/licenses/unknown/)
TwitterURLCorpus: an MTEB (Massive Text Embedding Benchmark) dataset
Paraphrase pairs of tweets.
Task category: t2t
Domains: Social, Written
Reference: https://languagenet.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["TwitterURLCorpus"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

To learn more about how to run models on mteb tasks, see the mteb GitHub repository. See the full description on the dataset page: https://huggingface.co/datasets/mteb/twitterurlcorpus-pairclassification.
License: Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0)
ArmenianParaphrasePC: an MTEB (Massive Text Embedding Benchmark) dataset
Source dataset: asparius/Armenian-Paraphrase-PC
Task category: t2t
Domains: News, Written
Reference: https://github.com/ivannikov-lab/arpa-paraphrase-corpus
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["ArmenianParaphrasePC"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/ArmenianParaphrasePC.
SciFact-PL: an MTEB (Massive Text Embedding Benchmark) dataset
SciFact verifies scientific claims using evidence from the research literature; the corpus consists of scientific paper abstracts.
Task category: t2t
Domains: Academic, Medical, Written
Reference: https://github.com/allenai/scifact
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["SciFact-PL"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/SciFact-PL.
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
NanoMSMARCORetrieval: an MTEB (Massive Text Embedding Benchmark) dataset
NanoMSMARCORetrieval is a smaller subset of MS MARCO, a collection of datasets focused on deep learning in search.
Task category: t2t
Domains: Web
Reference: https://microsoft.github.io/msmarco/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["NanoMSMARCORetrieval"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/NanoMSMARCORetrieval.
License: CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)
HotpotQAHardNegatives: an MTEB (Massive Text Embedding Benchmark) dataset
HotpotQA is a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting facts to enable more explainable question answering systems. The hard negative version has been created by pooling the 250 top documents per query from BM25, e5-multilingual-large and e5-mistral-instruct.
Task category: t2t
Domains: Web, Written
Reference: https://hotpotqa.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["HotpotQAHardNegatives"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/HotpotQA_test_top_250_only_w_correct-v2.
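The pooling step described above amounts to, for each query, taking the union of each retriever's top-250 results and retaining the gold documents. A minimal sketch, assuming hypothetical retriever objects with a search(query, top_k) method that returns document ids:

# Illustrative only: `retrievers` stands in for BM25 and the two e5 models.
def pool_candidates(query, retrievers, gold_ids, k=250):
    pooled = set()
    for retriever in retrievers:
        # Union of each system's top-k documents for this query.
        pooled.update(retriever.search(query, top_k=k))
    # Always keep the correct (gold) documents.
    return pooled | set(gold_ids)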
License: Other (https://choosealicense.com/licenses/other/)
StatcanDialogueDatasetRetrieval: an MTEB (Massive Text Embedding Benchmark) dataset
A dataset for retrieving data tables through conversations with genuine intents, available in English and French.
Task category: t2t
Domains: Government, Web, Written
Reference: https://mcgill-nlp.github.io/statcan-dialogue-dataset/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["StatcanDialogueDatasetRetrieval"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/StatcanDialogueDatasetRetrieval.
License: MIT (https://opensource.org/licenses/MIT)
GerDaLIRSmall: an MTEB (Massive Text Embedding Benchmark) dataset
The dataset consists of documents, passages, and relevance labels in German. In contrast to the original dataset, only documents that have corresponding queries in the query set are included, yielding a smaller corpus for evaluation purposes.
Task category: t2t
Domains: Legal, Written
Reference: https://github.com/lavis-nlp/GerDaLIR
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["GerDaLIRSmall"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/GerDaLIRSmall.
License: MIT (https://opensource.org/licenses/MIT)
DBPedia-PL: an MTEB (Massive Text Embedding Benchmark) dataset
DBpedia-Entity is a standard test collection for entity search over the DBpedia knowledge base.
Task category: t2t
Domains: Written, Encyclopaedic
Reference: https://github.com/iai-group/DBpedia-Entity/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["DBPedia-PL"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/DBPedia-PL.
MTEB Toxic Conversations 50k Triplets Dataset
This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed/blob/main/scripts/create_classification_dataset.py.
Citation

@article{solatorio2024gistembed,
  title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
  author={Aivin V. Solatorio},
  journal={arXiv preprint arXiv:2402.16829},
  year={2024}
}

See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/mteb-toxic_conversations_50k-avs_triplets.
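For a quick look at the data itself, the triplets can be loaded directly with the datasets library; the split and column names are not listed on this card, so inspect the loaded object:

from datasets import load_dataset

# Load the triplets dataset from the Hugging Face Hub.
ds = load_dataset("avsolatorio/mteb-toxic_conversations_50k-avs_triplets")
print(ds)  # shows the available splits and columns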
License: MIT (https://opensource.org/licenses/MIT)
LeCaRDv2: an MTEB (Massive Text Embedding Benchmark) dataset
The task involves identifying and retrieving the case document that best matches or is most relevant to the scenario described in each of the provided queries.
Task category: t2t
Domains: Legal, Written
Reference: https://github.com/THUIR/LeCaRDv2
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["LeCaRDv2"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/LeCaRDv2.
CMedQAv2-reranking: an MTEB (Massive Text Embedding Benchmark) dataset
Chinese community medical question answering.
Task category: t2t
Domains: Medical, Written
Reference: https://github.com/zhangsheng93/cMedQA2
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["CMedQAv2-reranking"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/CMedQAv2-reranking.
License: CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0/)
HotpotQA-PL: an MTEB (Massive Text Embedding Benchmark) dataset
HotpotQA is a question answering dataset featuring natural, multi-hop questions, with strong supervision for supporting facts to enable more explainable question answering systems.
Task category: t2t
Domains: Web, Written
Reference: https://hotpotqa.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["HotpotQA-PL"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/HotpotQA-PL.
License: MIT (https://opensource.org/licenses/MIT)
SWEPolyBenchRR: an MTEB (Massive Text Embedding Benchmark) dataset
Multilingual Software Issue Localization.
Task category: t2t
Domains: Programming, Written
Reference: https://amazon-science.github.io/SWE-PolyBench/
Source datasets:
tarsur909/mteb-swe-bench-poly-reranking
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_task("SWEPolyBenchRR")
evaluator = mteb.MTEB([task])
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/SWEPolyBenchRR.
License: MIT (https://opensource.org/licenses/MIT)
MultiSWEbenchRR: an MTEB (Massive Text Embedding Benchmark) dataset
Multilingual Software Issue Localization.
Task category: t2t
Domains: Programming, Written
Reference: https://multi-swe-bench.github.io/#/
Source datasets:
tarsur909/mteb-swe-bench-multi-reranking
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_task("MultiSWEbenchRR")
evaluator = mteb.MTEB([task])
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/MultiSWEbenchRR.
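Both SWE-bench entries above load a single task via mteb.get_task; related tasks can also be discovered programmatically, since mteb.get_tasks accepts metadata filters. A minimal sketch, assuming these tasks are registered under the Reranking task type and the Programming domain:

import mteb

# Filter the task registry by task type and domain; the exact metadata
# values used here are assumptions, so inspect the returned list.
tasks = mteb.get_tasks(task_types=["Reranking"], domains=["Programming"])
print([task.metadata.name for task in tasks])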
License: CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0/)
TempReasonL1: an MTEB (Massive Text Embedding Benchmark) dataset
Measuring the ability to retrieve the ground-truth answers to reasoning task queries on TempReason l1.
Task category: t2t
Domains: Encyclopaedic, Written
Reference: https://github.com/DAMO-NLP-SG/TempReason
Source datasets:
RAR-b/TempReason-l1
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["TempReasonL1"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/TempReasonL1.
MTEB Amazon Massive Intent Triplets Dataset
This dataset was used in the paper GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning. Refer to https://arxiv.org/abs/2402.16829 for details. The code for generating the data is available at https://github.com/avsolatorio/GISTEmbed/blob/main/scripts/create_classification_dataset.py.
Citation

@article{solatorio2024gistembed,
  title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
  author={Aivin V. Solatorio},
  journal={arXiv preprint arXiv:2402.16829},
  year={2024}
}

See the full description on the dataset page: https://huggingface.co/datasets/avsolatorio/mteb-amazon_massive_intent-avs_triplets.
License: Other (https://choosealicense.com/licenses/other/)
MSMARCOHardNegatives: an MTEB (Massive Text Embedding Benchmark) dataset
MS MARCO is a collection of datasets focused on deep learning in search. The hard negative version has been created by pooling the 250 top documents per query from BM25, e5-multilingual-large and e5-mistral-instruct.
Task category: t2t
Domains: Encyclopaedic, Academic, Blog, News, Medical, Government, Reviews, Non-fiction, Social, Web
Reference: https://microsoft.github.io/msmarco/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["MSMARCOHardNegatives"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/MSMARCO_test_top_250_only_w_correct-v2.
License: ODC-BY (https://choosealicense.com/licenses/odc-by/)
NeuCLIR2023Retrieval: an MTEB (Massive Text Embedding Benchmark) dataset
The task involves identifying and retrieving the documents that are relevant to the queries.
Task category: t2t
Domains: News, Written
Reference: https://neuclir.github.io/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["NeuCLIR2023Retrieval"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/neuclir-2023.
License: MIT (https://opensource.org/licenses/MIT)
SummEvalSummarization.v2: an MTEB (Massive Text Embedding Benchmark) dataset
News Article Summary Semantic Similarity Estimation. This version fixes a bug in the evaluation script that caused the main score to be computed incorrectly.
Task category: t2t
Domains: News, Written
Reference: https://github.com/Yale-LILY/SummEval
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb
task = mteb.get_tasks(["SummEvalSummarization.v2"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)

See the full description on the dataset page: https://huggingface.co/datasets/mteb/summeval.