2 datasets found
  1. Data from: LSE-Health-UVigo

    • zenodo.org
    • portalcientifico.uvigo.gal
    Updated Sep 5, 2024
    Cite
    Jose L. Alba-Castro; Manuel Vázquez-Enríquez; Ania Pérez-Pérez; Flora Mariño-Pérez; Manuel L Lema-Álvarez; Carmen Cabeza-Pereiro; Eduardo Rodríguez-Banga; Laura Docío-Fernández; Soledad Torres-Guijarro; Alba Caderno-Fernández; Sol Cid-Álvarez (2024). LSE-Health-UVigo [Dataset]. http://doi.org/10.5281/zenodo.10234465
    Dataset updated: Sep 5, 2024
    Dataset provided by: Zenodo
    Authors: Jose L. Alba-Castro; Manuel Vázquez-Enríquez; Ania Pérez-Pérez; Flora Mariño-Pérez; Manuel L Lema-Álvarez; Carmen Cabeza-Pereiro; Eduardo Rodríguez-Banga; Laura Docío-Fernández; Soledad Torres-Guijarro; Alba Caderno-Fernández; Sol Cid-Álvarez
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    LSE-Health-UVigo Dataset

    The LSE-Health-UVigo dataset is a collection of 273 videos focused on health-related topics, presented in Spanish Sign Language (Lengua de Signos Española, LSE). The dataset offers comprehensive annotations and alignments for various linguistic elements within the videos.

    Overview

    • Total Videos: 273
    • Total Duration: ~11 hours

    The dataset was acquired in studio conditions with a blue chroma key, no shadow effects and uniform illumination, at 25 fps and FHD resolution. The added value of the dataset is its rich and rigorous hand-made annotations. Expert interpreters and deaf people were in charge of annotating the dataset, following the strict criteria explained below. A previous version of this dataset, with fewer videos and annotations, was distributed for the 2022 Sign Spotting Challenge at ECCV. The description of the former dataset, LSE_eSaude_UVIGO (ECCV'22), can be found here, including the downloadable train/val/test splits for the two organized tracks (MSSL, multiple-shot supervised learning, and OSLWL, one-shot learning and weak labels). The challenge results, together with the dataset description, protocols and baseline models, and a discussion of the top-winning solutions and future directions on the topic, can be found in the ECCV 2022 paper.

    Annotations

    1. Translation from LSE to Spanish

    • Description: Each video is translated into Spanish, segmented and aligned into sentences and smaller segments.
    • Total Segments: 7,738

    2. Sign Annotations

    • Description: 105 distinct signs annotated across all 273 videos.
    • Total Instances: 15,098

    3. Fingerspelling Annotations

    • Description: Accurate location and annotation of all fingerspelled words.
    • Total Instances: 1,029

    Signers

    • Total Signers: 10 (7 women, 3 men)
      • Deaf Signers: 7 (4 women, 3 men)
      • Hearing Signers: 3 (all women)

    Usage

    Researchers and practitioners in machine translation, linguistics, healthcare, and sign language interpretation may find this dataset valuable for:

    • Training/testing machine learning models for sign language translation, sign spotting, isolated sign language recognition and fingerspelling detection/recognition.
    • Studying health-related sign language communication
    • Analyzing linguistic patterns and structures within sign language

    Distribution

    • Excel file: includes the links to download the videos from YouTube, the metadata, and all the annotations (segments, glosses and fingerspelled words); a loading sketch follows this list.
    • 273 annotation files created with the ELAN program: these files contain the 3 Tiers explained below.
    • 273 video files: spanning ~11 hours of topics related to diseases, symptoms, treatment, care, etc.
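
    As a rough illustration, the Excel file can be inspected with pandas before any downstream processing; the filename and column layout here are assumptions, so check the real headers first.

    import pandas as pd

    # Hypothetical filename; use the actual Excel file from the distribution.
    meta = pd.read_excel("LSE-Health-UVigo.xlsx")

    # Inspect the real column names (video links, metadata, annotations)
    # before writing any code that depends on specific columns.
    print(meta.columns.tolist())
    print(meta.head())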

    Annotation tiers:

    LSE-Health-UVigo has been annotated with the ELAN program. Annotators used three Tiers:

    • Tier 'M_Glosa' for the location of the 105 selected glosses.
    • Associated Tier 'Var' for annotating variants of the signed gloss. 3-letter codes have been defined to note 8 types of sign variations. These variations are:
      • linguistic, such as slight modifications of the sign due to relaxed execution (coded LAX), a slight change of location in space (LOC), abnormal use of the non-dominant hand (MAN), and morphology changes as in distributed plurals (MPH); or
      • non-linguistic, such as a very short sign due to speed and large coarticulation (SHO), partial occlusion of the sign by other body parts (OCC) or by the sign leaving the frame (OUT), and another gloss with a similar signed appearance to the selected gloss (SIM). For the sake of completeness, these variations are not filtered out of the distribution, just flagged.
    • Tier 'Trad' for the translation into Spanish and the annotation of the start and end of each sentence or partial sentence. Translating a visual language into a written one is not a trivial task. Two-letter codes have been defined to note 8 types of special events regarding visual-to-written translation. The subsection below explains the strict annotation criteria, and a reading sketch for these tiers follows this list.
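
    The .eaf annotation files can be read programmatically. The following is a minimal sketch using the pympi-ling package, assuming a local file; the filename is illustrative, and annotation tuples are taken as (start_ms, end_ms, value).

    from pympi.Elan import Eaf

    eaf = Eaf("video_001.eaf")     # hypothetical filename
    print(eaf.get_tier_names())    # expect 'M_Glosa', 'Var' and 'Trad'

    # Each annotation carries its interval (in ms) and its value.
    for ann in eaf.get_annotation_data_for_tier("M_Glosa"):
        start_ms, end_ms, gloss = ann[0], ann[1], ann[2]
        print(start_ms, end_ms, gloss)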

    Annotation criteria:

    The annotation criteria shared by all four annotators were as follows:

    For Glosses and fingerspelled words:

    • The begin_timestamp for a sign is set as soon as the parameters (hand configuration, palm orientation, and movement and/or location) correspond to that sign more than to the transition from the previous one.
    • The end_timestamp for a sign is set as soon as those parameters start to change toward the transition to the next sign.
    • As far as possible, transitions are not included in the annotated intervals.
    • A star '*' prefix indicates a slight drift from the normal realization, or an out-of-vocabulary sign that is visually very similar. The reasons can be linguistic (MAN, LOC, MPH, LAX) or non-linguistic (SHO, OCC, OUT, SIM).
    • The annotation of plurals deserves highlighting. In Spanish Sign Language, plurals are usually performed by sign repetition. Plurals are coded as a single interval comprising all the concatenated repetitions, provided no other parameter is modified. The only special case corresponds to the gloss PERSON and its plural PERSON(M-RE), which are coded as different gloss classes because the second repetition tends to be smaller and more relaxed, like a rebound without changing the hand configuration. A small helper for the variation codes follows this list.
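
    A small helper, purely for convenience, that maps the 3-letter codes of the 'Var' tier to the linguistic/non-linguistic categories described above:

    # Variation codes from the 'Var' tier, grouped as in the description.
    LINGUISTIC = {"LAX", "LOC", "MAN", "MPH"}
    NON_LINGUISTIC = {"SHO", "OCC", "OUT", "SIM"}

    def variation_category(code: str) -> str:
        """Classify a 'Var' code; unknown codes are reported as such."""
        if code in LINGUISTIC:
            return "linguistic"
        if code in NON_LINGUISTIC:
            return "non-linguistic"
        return "unknown"

    # A starred gloss (illustrative: '*HOSPITAL') signals a deviating
    # realization; the reason is given by the associated 'Var' code.
    print(variation_category("LAX"))   # -> linguistic
    print(variation_category("OCC"))   # -> non-linguistic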

    For Translation: The general criterion for segmentation (performed by a professional interpreter who was also involved in the recordings) was to adapt the text in OL (oral language) to resemble the signed LSE (Spanish Sign Language) as closely as possible, and to segment complete phrases, or smaller units when that results in more semantically coherent segments. Additionally, due to discursive and grammatical differences between OL and LSE, 8 specific types of annotations were defined and marked in brackets (an extraction sketch follows this list):

    • If the text in OL has a visual but not literal correspondence in LSE, it is marked with the code "V" (visual) in brackets: [V:original text].
    • When discursive markers are used in LSE that do not usually appear in the text, [MD:type-MD] is inserted in the corresponding text: [MD:one], [MD:two], [MD:three], etc.; [MD:theme], [MD:alternative], [MD:section], [MD:next], etc.
    • When the phrase in LSE is affected by bimodal signing (restricted by the grammatical order of the text), it is marked as [B: followed by the text signed bimodally]. This phenomenon is very typical in sign language when the person is reading and translating in real time, as in broadcasting.
    • When fingerspelling is used and/or the corresponding sign is given, it is translated literally. For example, in tier 'Trad': "The disease called [DL:GALACTORREA], its sign is [GL:GALACTORREA]" is signed in LSE as (gloss-type notation): DISEASE NAME G-A-L-A-C-T-O-RR-E-A (fingerspelled), SIGN cl:fluid-chest (visually explained).
    • If a semantic confusion occurs when signing a specific sign, the substitution code [S: "Spanish text - GLOSS of the sign actually performed"] is used. For example, the text says "There are many species:" and the signed video uses the sign for culinary spices, so [S: species - SPICES] is inserted.
    • When a deictic replaces part of the text because the signer has previously placed what they are mentioning at that location, and it is necessary for the sentence to make sense: [VD:"text replaced by the deictic"].
    • When a sign is temporarily defined for that discourse context, [T: signed word] is used.
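
    A sketch for pulling these bracketed codes out of a 'Trad' segment with a regular expression; the example string is taken from the fingerspelling bullet above.

    import re

    # Matches the bracketed translation codes defined in the list above.
    CODE_RE = re.compile(r"\[(V|MD|B|DL|GL|S|VD|T):([^\]]*)\]")

    segment = "The disease called [DL:GALACTORREA], its sign is [GL:GALACTORREA]"

    for code, payload in CODE_RE.findall(segment):
        print(code, "->", payload)   # DL -> GALACTORREA, GL -> GALACTORREA

    # Plain reading: drop the codes but keep their payload text.
    plain = CODE_RE.sub(lambda m: m.group(2), segment)
    print(plain)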

    Acknowledgments

    This dataset is a collaborative effort of the following research groups and entities:

    Gratitude is extended to them for their contributions and support.

  2. Snapture - A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition: TERAIS Data

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Feb 22, 2024
    Cite
    Hassan Ali; Doreen Jirak; Stefan Wermter (2024). Snapture - A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition: TERAIS Data [Dataset]. http://doi.org/10.5281/zenodo.10693817
    Available download formats: csv
    Dataset updated: Feb 22, 2024
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Hassan Ali; Doreen Jirak; Stefan Wermter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository lists the datasets used for developing the experiments of the paper titled: Snapture - A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition (Open access) by Hassan Ali, Doreen Jirak and Stefan Wermter. The study was conducted in the Knowledge Technology (WTM) group at the University of Hamburg.

    GRIT Robot Commands Dataset

    This dataset was recorded at the Knowledge Technology (WTM) group at the University of Hamburg and can be requested here. The dataset is public and was not collected as part of this study.

    Montalbano Co-Speech Dataset

    This dataset was recorded as part of the ChaLearn Looking at People Challenge and can be downloaded from here. The dataset is public and was not collected as part of this study. The attached montalbano_segments.csv file can be used to create gesture segmentations of the test subset of this dataset; a small inspection sketch follows.
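
    A quick way to inspect the attached segmentation file, assuming only that it is a regular CSV; its exact column layout is not documented here, so check it before relying on specific fields.

    import pandas as pd

    segments = pd.read_csv("montalbano_segments.csv")
    print(segments.columns.tolist())  # check the actual column names
    print(segments.head())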

    Acknowledgements

    This work was partially supported by the DFG under project CML (TRR 169), by the BMWK under project KI-SIGS, and by the EU under project TERAIS.



    Citation

    To cite our paper, you can copy the following into your .bib file:

    @Article{Ali2023,
      author  = {Ali, Hassan and Jirak, Doreen and Wermter, Stefan},
      title   = {Snapture---a Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition},
      journal = {Cognitive Computation},
      year    = {2023},
      month   = {Jul},
      day     = {17},
      issn    = {1866-9964},
      doi     = {10.1007/s12559-023-10174-z},
      url     = {https://doi.org/10.1007/s12559-023-10174-z}
    }