https://choosealicense.com/licenses/other/
Dataset Card for STSb Multi MT
Dataset Summary
STS Benchmark comprises a selection of the English datasets used in the STS tasks organized in the context of SemEval between 2012 and 2017. The selection of datasets includes text from image captions, news headlines and user forums. (source)
These are different multilingual translations and the English original of the STSbenchmark dataset. The translations were produced with deepl.com. It can be used to train sentence embeddings… See the full description on the dataset page: https://huggingface.co/datasets/PhilipMay/stsb_multi_mt.
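A minimal loading sketch using the Hugging Face datasets library; the "en" config name and the row layout noted in the comments are assumptions taken from the dataset card, not guaranteed by this summary:

from datasets import load_dataset

# The config name selects the translation; "en" (the English original) is assumed here.
ds = load_dataset("PhilipMay/stsb_multi_mt", name="en", split="train")
print(ds[0])  # a sentence pair plus its similarity score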
Attribution-NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0) https://creativecommons.org/licenses/by-nc-nd/3.0/
License information was derived automatically
We introduce GiCCS, the first conversational STS evaluation benchmark for German. We collected the similarity annotations for GiCCS using best-worst scaling, presenting the target items in context in order to obtain highly reliable, context-dependent similarity scores. In our paper, we present benchmarking experiments for evaluating LMs on capturing the similarity of utterances. Results suggest that pretraining LMs on conversational data and providing conversational context can be useful for capturing the similarity of utterances in dialogues. GiCCS will be publicly available to encourage benchmarking of conversational LMs.
STS Benchmark comprises a selection of the English datasets used in the STS tasks organized in the context of SemEval between 2012 and 2017. The selection of datasets includes text from image captions, news headlines and user forums.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The RO-STS (Romanian Semantic Textual Similarity) dataset contains 8628 sentence pairs with their similarity scores. It is a high-quality translation of the STS benchmark dataset.
Dataset Card for STSB
The Semantic Textual Similarity Benchmark (Cer et al., 2017) is a collection of sentence pairs drawn from news headlines, video and image captions, and natural language inference data. Each pair is human-annotated with a similarity score from 1 to 5. However, for this variant the similarity scores are normalized to the range 0 to 1.
Dataset Details
Columns: "sentence1", "sentence2", "score"
Column types: str, str, float
Examples: { 'sentence1': 'A… See the full description on the dataset page: https://huggingface.co/datasets/sentence-transformers/stsb.
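A minimal loading sketch, assuming the column layout described above; load_dataset and the split name are standard Hugging Face datasets usage:

from datasets import load_dataset

# Each row has "sentence1", "sentence2", and a "score" already normalized to 0-1.
ds = load_dataset("sentence-transformers/stsb", split="train")
print(ds[0]["score"])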
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for STS-ca
Dataset Summary
The STS-ca corpus is a benchmark for evaluating Semantic Text Similarity in Catalan. This dataset was developed by BSC TeMU as part of Projecte AINA, to enrich the Catalan Language Understanding Benchmark (CLUB). This work is licensed under an Attribution-ShareAlike 4.0 International License.
Supported Tasks and Leaderboards
This dataset can be used to build and score semantic similarity models in Catalan.… See the full description on the dataset page: https://huggingface.co/datasets/projecte-aina/sts-ca.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The STS corpus is a benchmark for evaluating Semantic Text Similarity in Catalan. It consists of 3079 sentence pairs annotated with the semantic similarity between them, on a scale from 0 (no similarity at all) to 5 (semantic equivalence). Annotation was done manually by four people following our guidelines, which are based on previous work from the SemEval challenges (https://www.aclweb.org/anthology/S13-1004.pdf). This dataset was developed by BSC TeMU as part of the AINA project.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
SICK-R, an MTEB (Massive Text Embedding Benchmark) dataset
Semantic Textual Similarity SICK-R dataset
Task category t2t
Domains Web, Written
Reference https://aclanthology.org/L14-1314/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb

task = mteb.get_tasks(["SICK-R"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)
To learn more about how to run models… See the full description on the dataset page: https://huggingface.co/datasets/mteb/sickr-sts.
https://choosealicense.com/licenses/unknown/
STS17, an MTEB (Massive Text Embedding Benchmark) dataset
SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation
Task category t2t
Domains News, Web, Written
Reference https://alt.qcri.org/semeval2017/task1/
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb

task = mteb.get_tasks(["STS17"])
evaluator = mteb.MTEB(task)
model =… See the full description on the dataset page: https://huggingface.co/datasets/mteb/sts17-crosslingual-sts.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is a Nepali version of the Semantic Textual Similarity Benchmark (STS-B) derived from the STS-B Multi-MT corpus. It consists of sentence pairs annotated with similarity scores, indicating how semantically similar the two sentences are. The dataset serves as a valuable resource for developing and evaluating natural language processing (NLP) models focused on understanding and measuring sentence similarity in Nepali. Each sentence pair is assigned a similarity score ranging from 0 to 5, where 0 indicates no similarity and 5 indicates complete semantic equivalence. This dataset is useful for various NLP applications, including machine translation, paraphrase detection, and semantic search, enabling the advancement of language technologies in the Nepali language.
GLUE, the General Language Understanding Evaluation benchmark (https://gluebenchmark.com/), is a collection of resources for training, evaluating, and analyzing natural language understanding systems.
To use this dataset:
import tensorflow_datasets as tfds

# GLUE is a multi-config dataset in TFDS, so a config such as 'glue/stsb'
# (the STS Benchmark task) must be selected explicitly.
ds = tfds.load('glue/stsb', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
The appletea2333/ST-Align-Benchmark dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Model Weights
This repository provides weights of the models from the benchmarking study conducted in "Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework" (https://arxiv.org/abs/2006.13365), which have been upgraded to be compatible with PyKEEN 1.9.
The weights are organized as zipfiles, named by the dataset and interaction-function configuration. For each of these combinations, we chose the best model according to validation Hits@10 to include in this repository. For each model, we have three files:
configuration.json
contains the (pipeline) configuration used to train the model. It can be loaded as
import pathlib
import json
configuration = json.loads(pathlib.Path("configuration.json").read_text())
Since the configuration is intended for the pipeline, we need some custom code to re-create the model without re-training it.
from pykeen.datasets import get_dataset
from pykeen.models import ERModel, model_resolver

configuration = configuration["pipeline"]

# load the triples factory
dataset = get_dataset(
    dataset=configuration["dataset"],
    dataset_kwargs=configuration.get("dataset_kwargs", None),
)
model: ERModel = model_resolver.make(
    configuration["model"],
    configuration["model_kwargs"],
    triples_factory=dataset.training,
)
Note that this only creates the model instance; it does not load the weights yet.
state_dict.pt
contains the weights, stored via torch.save. We can load these weights into the model by using Module.load_state_dict. Note that we set strict=False, since the exported weights do not contain the regularizers' state, while the re-instantiated models may have regularizers.
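A minimal sketch of that loading step, assuming the model instance created above and a local state_dict.pt; map_location="cpu" is added here only so the weights load without a GPU:

import torch

# strict=False: the exported weights lack the regularizers' state
# that the re-instantiated model may expect.
state_dict = torch.load("state_dict.pt", map_location="cpu")
model.load_state_dict(state_dict, strict=False)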
results.json
contains the results obtained by the original runs. It can be read in the same way as configuration.json. Note that some of the recently added metrics are not available in those results.
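For completeness, a sketch mirroring the configuration.json snippet above; the file name is the one given in this section:

import json
import pathlib

results = json.loads(pathlib.Path("results.json").read_text())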
A subset of the Semantic Textual Similarity reference data (STS Benchmark).
This is a dataset of vertical benchmarks collected by the St. Johns River Water Management District (SJRWMD). Maintained by Survey staff; this layer should only be updated with their direct permission.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The RO-STS-Parallel dataset (a parallel Romanian-English dataset, a translation of the Semantic Textual Similarity benchmark) contains 17256 sentences in Romanian and English. It is a high-quality translation of the English STS benchmark dataset into Romanian.
Semantic Textual Similarity reference data (STS Benchmark).
https://choosealicense.com/licenses/unknown/
BIOSSES, an MTEB (Massive Text Embedding Benchmark) dataset
Biomedical Semantic Similarity Estimation.
Task category t2t
Domains Medical
Reference https://tabilab.cmpe.boun.edu.tr/BIOSSES/DataSet.html
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:

import mteb

task = mteb.get_tasks(["BIOSSES"])
evaluator = mteb.MTEB(task)
model = mteb.get_model(YOUR_MODEL)
evaluator.run(model)
To learn more… See the full description on the dataset page: https://huggingface.co/datasets/mteb/biosses-sts.
Dataset Summary
FarSick STS is a Persian (Farsi) dataset designed for the Semantic Textual Similarity (STS) task. It is a part of the FaMTEB (Farsi Massive Text Embedding Benchmark). The dataset was developed by translating and adapting the English SICK (Sentences Involving Compositional Knowledge) dataset, and it features Persian sentence pairs annotated for their degree of semantic relatedness.
Language(s): Persian (Farsi)
Task(s): Semantic Textual Similarity (STS)
Source:… See the full description on the dataset page: https://huggingface.co/datasets/MCINext/farsick-sts.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Control results for the four benchmark systems.