2 datasets found
  1. Granary

    • huggingface.co
    Updated Dec 22, 2008
    Cite
    NVIDIA (2008). Granary [Dataset]. https://huggingface.co/datasets/nvidia/Granary
    Dataset updated
    Dec 22, 2008
    Dataset provided by
    NVIDIA (http://nvidia.com/)
    Authors
    NVIDIA
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    Granary: Speech Recognition and Translation Dataset in 25 European Languages

    Granary is a large-scale, open-source multilingual speech dataset covering 25 European languages for Automatic Speech Recognition (ASR) and Automatic Speech Translation (AST) tasks.

    Overview

    Granary addresses the scarcity of high-quality speech data for low-resource languages by consolidating multiple datasets under a unified framework: 🗣️ ~1M hours of high-quality… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/Granary.

  2. yodas-granary

    • huggingface.co
    Updated Jun 16, 2025
    Cite
    ESPnet (2025). yodas-granary [Dataset]. https://huggingface.co/datasets/espnet/yodas-granary
    Dataset updated
    Jun 16, 2025
    Dataset authored and provided by
    ESPnet
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    Dataset Card for YODAS-Granary

    Repository: NeMo-speech-data-processor: Granary
    Paper: Granary: Speech Recognition and Translation Dataset in 25 European Languages
    Shared by: ESPnet

    Dataset Description

    YODAS-Granary is a curated subset of the larger nvidia/Granary dataset, focusing on high-quality pseudo-labeled speech data for Automatic Speech Recognition (ASR) and Automatic Speech Translation (AST) across 23 European languages.

      Overview… See the full description on the dataset page: https://huggingface.co/datasets/espnet/yodas-granary.
    
