Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
GigaSpeech is an evolving, multi-domain English speech recognition corpus with 10,000 hours of high-quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training. Around 40,000 hours of transcribed audio was first collected from audiobooks, podcasts, and YouTube, covering both read and spontaneous speaking styles and a variety of topics such as arts, science, and sports. A new forced-alignment and segmentation pipeline is proposed to create sentence segments suitable for speech recognition training and to filter out segments with low-quality transcriptions. For system training, GigaSpeech provides five subsets of different sizes: 10h, 250h, 1,000h, 2,500h, and 10,000h. For our 10,000-hour XL training subset, we cap the word error rate at 4% during the filtering/validation stage; for all our other, smaller training subsets, we cap it at 0%. The DEV and TEST evaluation sets, on the other hand, are re-processed by professional human transcribers to ensure high transcription quality.
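For quick experimentation, the corpus can be pulled through the 🤗 Datasets library. Below is a minimal sketch; the repo id speechcolab/gigaspeech and the subset config names ("xs", "s", "m", "l", "xl") are assumptions about the commonly used Hub mirror, and the dataset is gated, so an authorized Hugging Face token is required.

```python
from datasets import load_dataset

# Sketch only: "speechcolab/gigaspeech" and the "xs" subset name are
# assumptions about the Hub mirror; access is gated, so you must have
# accepted the dataset terms with your Hugging Face account.
gs = load_dataset("speechcolab/gigaspeech", "xs", split="train", streaming=True)
sample = next(iter(gs))
print(sample["text"])                     # transcript
print(sample["audio"]["sampling_rate"])   # decoded audio metadata
```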
fixie-ai/gigaspeech dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Gigaspeech Part 2
This is Part 2 of 8 of a large-scale speech dataset, split to accommodate Hugging Face's repository size limits.
Multi-Part Dataset
This dataset is split across multiple repositories:
Part 1: shahdsaf/gigaspeech-part-1
Part 2 (current): shahdsaf/gigaspeech-part-2
Part 3: shahdsaf/gigaspeech-part-3
Part 4: shahdsaf/gigaspeech-part-4
Part 5: shahdsaf/gigaspeech-part-5
Part 6: shahdsaf/gigaspeech-part-6
Part 7: shahdsaf/gigaspeech-part-7
Part 8: …
See the full description on the dataset page: https://huggingface.co/datasets/shahdsaf/gigaspeech-part-2.
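Once all parts are accessible, they can be recombined with 🤗 Datasets. This is a sketch under the assumption that every part exposes a "train" split with an identical schema:

```python
from datasets import load_dataset, concatenate_datasets

# Assumption: each shahdsaf/gigaspeech-part-{i} repo has a "train" split
# sharing one schema, so the parts can simply be concatenated.
parts = [
    load_dataset(f"shahdsaf/gigaspeech-part-{i}", split="train")
    for i in range(1, 9)
]
full = concatenate_datasets(parts)
print(full)
```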
westbrook/gigaspeech-tiny-stage4 dataset hosted on Hugging Face and contributed by the HF Datasets community
westbrook/gigaspeech-tiny-2 dataset hosted on Hugging Face and contributed by the HF Datasets community
westbrook/gigaspeech-tiny-3 dataset hosted on Hugging Face and contributed by the HF Datasets community
westbrook/gigaspeech-processed dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset contains transcribed audio data for Indonesian. It consists of audio files plus a CSV file that maps each audio ID to the transcription of the corresponding audio file.
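Since only the layout (audio files plus an ID-to-transcription CSV) is described, the following sketch pairs the two; the file name "transcripts.csv", the column names, and the "audio/" directory are all hypothetical placeholders:

```python
import csv
from pathlib import Path

# All names here are hypothetical placeholders for the layout described above.
CSV_PATH = "transcripts.csv"   # columns assumed: audio_id, transcription
AUDIO_DIR = Path("audio")      # assumed directory of .wav files

with open(CSV_PATH, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        wav_path = AUDIO_DIR / f"{row['audio_id']}.wav"
        print(wav_path, "->", row["transcription"])
```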
patricklifixie/gigaspeech-seed-context-continuation-noise dataset hosted on Hugging Face and contributed by the HF Datasets community
hoanganhpham/gigaspeech-vi dataset hosted on Hugging Face and contributed by the HF Datasets community
westbrook/gigaspeech-tiny-0-train dataset hosted on Hugging Face and contributed by the HF Datasets community
ddamianos/gigaspeech-l_multi_prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
anilkeshwani/gigaspeech-hubert_large_ll60k-layer_22 dataset hosted on Hugging Face and contributed by the HF Datasets community
yfyeung/gigaspeech-icefall-data dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Speech Recognition Alignment Dataset
This dataset is a variation of several widely used ASR datasets, encompassing LibriSpeech, MuST-C, TED-LIUM, VoxPopuli, Common Voice, and GigaSpeech. The difference is that this dataset includes:
- Precise alignment between audio and text.
- Text that has been punctuated and made case-sensitive.
- Identification of named entities in the text.
Usage
First, install the latest version of the 🤗 Datasets package:
pip install --upgrade pip
pip… See the full description on the dataset page: https://huggingface.co/datasets/nguyenvulebinh/asr-alignment.
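After installation, loading one of the aligned subsets would look roughly like the sketch below; the configuration name "librispeech" and the split name are assumptions, so check the dataset page for the actual list:

```python
from datasets import load_dataset

# "librispeech" is an assumed config name for one of the source corpora
# listed above; see the dataset page for the real configuration names.
ds = load_dataset("nguyenvulebinh/asr-alignment", "librispeech",
                  split="train", streaming=True)
example = next(iter(ds))
print(example.keys())  # expect audio, cased/punctuated text, alignment, entities
```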
GigaSpeech 2
This is the official repository of the GigaSpeech 2 dataset. For details of how we created the dataset, please refer to our arXiv preprint. GigaSpeech 2 version: 2.0 (2024/06/19)
Download
The dataset is available on Hugging Face and ModelScope. Pre-trained models are available for Thai and Vietnamese.
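A hedged sketch for fetching part of the corpus from the Hub follows; the repo id "speechcolab/gigaspeech2" and the per-language directory layout are assumptions, so verify both on the dataset page:

```python
from huggingface_hub import snapshot_download

# The repo id and the "data/th/*" layout are assumptions for illustration only.
local_dir = snapshot_download(
    repo_id="speechcolab/gigaspeech2",
    repo_type="dataset",
    allow_patterns=["data/th/*"],  # e.g. download only the Thai portion
)
print("Downloaded to", local_dir)
```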
Leaderboard
|Contributor|Toolkit|Train Recipe|Train Data|Inference|Test CER/WER|
|---|---|---|---|---|---|
|Baseline|Icefall|…|
See the full description on the dataset page: https://huggingface.co/datasets/vanmanhnew/dataset.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
CapSpeech-GigaSpeech Audio
Dataset used for the paper: CapSpeech: Enabling Downstream Applications in Style-Captioned Text-to-Speech. Please refer to 🤗 CapSpeech for the whole dataset and the 🚀 CapSpeech repo for more details.
Overview
🔥 CapSpeech is a new benchmark designed for style-captioned TTS (CapTTS) tasks, including style-captioned text-to-speech synthesis with sound effects (CapTTS-SE), accent-captioned TTS (AccCapTTS), emotion-captioned TTS (EmoCapTTS) and… See the full description on the dataset page: https://huggingface.co/datasets/OpenSound/CapSpeech_GigaSpeech.
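Loading the audio portion hosted in this repo might look like the sketch below; the split name and field layout are assumptions, so consult the dataset page:

```python
from datasets import load_dataset

# Repo id comes from the dataset page above; the "train" split name is assumed.
cap = load_dataset("OpenSound/CapSpeech_GigaSpeech", split="train", streaming=True)
item = next(iter(cap))
print(item.keys())  # expect audio plus style-caption fields
```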
This model was trained by Shinji Watanabe using the gigaspeech recipe in ESPnet.

Python API

See https://github.com/espnet/espnet_model_zoo

Evaluate in the recipe

```bash
git clone https://github.com/espnet/espnet
cd espnet
git checkout dcb5bdb2ffa34a9f44255c0b073759c5b9b3f86e
pip install -e .
cd egs2/gigaspeech/asr1
# The model name contains a space, so it must be quoted for the shell.
./run.sh --skip_data_prep false --skip_train true --download_model "Shinji Watanabe/gigaspeech_asr_train_asr_raw_en_bpe5000_valid.acc.ave"
```

Results

# RESULTS
## Environments
- date: Tue Mar 23 10:03:49 EDT 2021
- python version: 3.8.5 (default, Sep 4 2020, 07:30:14) [GCC 7.3.0]
- espnet version: espnet 0.9.8
- pytorch version: pytorch 1.7.1
- Git hash: dcb5bdb2ffa34a9f44255c0b073759c5b9b3f86e
- Commit date: Sat Mar 13 10:16:16 2021 -0500

## asr_train_asr_raw_en_bpe5000
### WER
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave/dev|2043|51075|92.9|4.5|2.6|2.1|9.2|65.6|
|decode_asr_asr_model_valid.acc.ave/test|9627|175116|90.5|7.0|2.5|6.1|15.6|69.3|

### CER
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave/dev|2043|271188|97.5|0.9|1.6|1.7|4.2|65.6|
|decode_asr_asr_model_valid.acc.ave/test|9627|909930|96.5|1.6|1.9|5.6|9.0|69.3|

### TER
|dataset|Snt|Wrd|Corr|Sub|Del|Ins|Err|S.Err|
|---|---|---|---|---|---|---|---|---|
|decode_asr_asr_model_valid.acc.ave/dev|2043|63598|93.3|3.9|2.8|2.1|8.8|65.6|
|decode_asr_asr_model_valid.acc.ave/test|9627|218851|90.8|6.1|3.1|7.0|16.2|69.3|

ASR config

```yaml
config: conf/train_asr.yaml
print_config: false
log_level: INFO
dry_run: false
iterator_type: sequence
output_dir: exp/asr_train_asr_raw_en_bpe5000
ngpu: 1
seed: 0
num_workers: 1
num_att_plot: 3
dist_backend: nccl
dist_init_method: env://
dist_world_size: 4
dist_rank: 0
local_rank: 0
dist_master_addr: localhost
dist_master_port: 37831
dist_launcher: null
multiprocessing_distributed: true
unused_parameters: false
sharded_ddp: false
cudnn_enabled: true
cudnn_benchmark: false
cudnn_deterministic: true
collect_stats: false
write_collected_feats: false
max_epoch: 20
patience: null
val_scheduler_criterion:
- valid
- loss
early_stopping_criterion:
- valid
- loss
- min
best_model_criterion:
- - valid
  - acc
  - max
keep_nbest_models: 10
grad_clip: 5.0
grad_clip_type: 2.0
grad_noise: false
accum_grad: 4
no_forward_run: false
resume: true
train_dtype: float32
use_amp: false
log_interval: null
use_tensorboard: true
use_wandb: false
wandb_project: null
wandb_id: null
detect_anomaly: false
pretrain_path: null
init_param: []
freeze_param: []
num_iters_per_epoch: null
batch_size: 20
valid_batch_size: null
batch_bins: 35000000
valid_batch_bins: null
train_shape_file:
- exp/asr_stats_raw_en_bpe5000/train/speech_shape
- exp/asr_stats_raw_en_bpe5000/train/text_shape.bpe
valid_shape_file:
- exp/asr_stats_raw_en_bpe5000/valid/speech_shape
- exp/asr_stats_raw_en_bpe5000/valid/text_shape.bpe
batch_type: numel
valid_batch_type: null
fold_length:
- 80000
- 150
sort_in_batch: descending
sort_batch: descending
multiple_iterator: false
chunk_length: 500
chunk_shift_ratio: 0.5
num_cache_chunks: 1024
train_data_path_and_name_and_type:
- - dump/raw/train/wav.scp
  - speech
  - kaldi_ark
- - dump/raw/train/text
  - text
  - text
valid_data_path_and_name_and_type:
- - dump/raw/dev/wav.scp
  - speech
  - kaldi_ark
- - dump/raw/dev/text
  - text
  - text
allow_variable_data_keys: false
max_cache_size: 0.0
max_cache_fd: 32
valid_max_cache_size: null
optim: adam
optim_conf:
  lr: 0.0015
scheduler: warmuplr
scheduler_conf:
  warmup_steps: 25000
token_list:
- <blank>
- <unk>
- S
- ▁THE
- ▁TO
- ▁OF
- ▁A
- ▁AND
- ''''
- ▁THAT
- ▁IN
- ▁YOU
- ▁I
- ▁IT
- T
- ▁IS
- ▁WAS
- ED
- ▁WE
- ▁FOR
- ING
- ▁THIS
- D
- ▁ON
- ▁BE
- ▁WITH
- ▁HAVE
- ▁SO
- ▁HE
- RE
- ▁THEY
- ▁ARE
- ▁NOT
- ▁AS
- ▁LIKE
- ▁AT
- ▁KNOW
- ▁WHAT
- LY
- ▁CAN
- ▁DO
- ▁ABOUT
- ▁ALL
- ▁HIS
- M
- ▁HAD
- '-'
- ▁ONE
- ▁OR
- ▁FROM
- ▁THERE
- ▁ME
- ▁MY
- ▁BUT
- ▁JUST
- ▁YOUR
- ▁AN
- ▁BY
- Y
- ▁IF
- ▁OUT
- ▁PEOPLE
- ▁UP
- ▁HER
- ER
- ▁WERE
- ▁THINK
- E
- N
- ▁WOULD
- ▁SHE
- ▁THEIR
- ▁WHO
- ▁MORE
- ▁OUR
- ▁THEM
- ▁WHEN
- ▁WHICH
- ▁VERY
- ▁WILL
- ▁SOME
- ▁TIME
- ▁BEEN
- R
- ▁GET
- ▁HAS
- ▁GOING
- ▁HIM
- VE
- ▁REALLY
- ▁HOW
- ▁DON
- ▁NO
- ▁THEN
- LL
- ▁GO
- ▁BECAUSE
- ▁NOW
- AL
- ▁INTO
- ▁THESE
- ▁OTHER
- ▁RIGHT
- ▁SEE
- ▁SAID
- ▁HERE
- ▁WAY
- ▁TWO
- ▁US
- ▁WANT
- ▁COULD
- ▁S
- ▁SAY
- ▁OVER
- ▁AH
- ES
- ▁WHERE
- ▁BACK
- ▁ALSO
- ▁THOSE
- ▁THINGS
- ▁MAKE
- ▁KIND
- ▁MUCH
- IN
- ▁WELL
- ▁GOOD
- ▁DID
- L
- ▁FIRST
- ▁THAN
- ▁LITTLE
- ▁RE
- C
- ▁NEW
- ▁WORK
- ▁ANY
- A
- P
- ▁LOT
- ▁DOWN
- ▁SOMETHING
- ▁THING
- OR
- LE
- ▁MAN
- ▁GOT
- B
- ▁COME
- ▁ONLY
- G
- ▁BEING
- ▁ACTUALLY
- ▁LOOK
- O
- ▁TAKE
- ▁EVEN
- ▁NEED
- ▁THROUGH
- W
- ▁GREAT
- ▁WORLD
- ▁MANY
- ▁SHOULD
- ▁YEARS
- ATION
- ▁UM
- ▁MOST
- ▁DAY
- ▁YEAH
- ▁LIFE
- ▁BEFORE
- ▁THREE
- ▁UN
- ION
- ▁DIFFERENT
- ▁DE
- ▁MIGHT
- ▁LET
- ▁MADE
- ▁MEAN
- ▁PART
- IC
- ▁AGAIN
- TH
- ▁AFTER
- ▁OWN
- ▁USE
- ITY
- ABLE
- ▁LONG
- ▁STILL
- ▁MAY
- F
- ▁OFF
- ▁NEVER
- ▁PUT
- ▁C
- ▁SAME
- ...
```
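The Python API mentioned above is documented in espnet_model_zoo; a minimal inference sketch follows, assuming a local 16 kHz mono WAV file named example.wav (a placeholder):

```python
import soundfile as sf
from espnet_model_zoo.downloader import ModelDownloader
from espnet2.bin.asr_inference import Speech2Text

# Download and unpack the pretrained model, then build the recognizer.
d = ModelDownloader()
speech2text = Speech2Text(
    **d.download_and_unpack(
        "Shinji Watanabe/gigaspeech_asr_train_asr_raw_en_bpe5000_valid.acc.ave"
    )
)

# "example.wav" is a placeholder; any 16 kHz mono recording should work.
speech, rate = sf.read("example.wav")
nbests = speech2text(speech)
text, tokens, token_ids, hyp = nbests[0]
print(text)
```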
@article{chen2021gigaspeech,
  title={GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio},
  author={Chen, Guoguo and Chai, Shuzhou and Wang, Guanbo and Du, Jiayu and Zhang, Wei-Qiang and Weng, Chao and Su, Dan and Povey, Daniel and Trmal, Jan and Zhang, Junbo and others},
  journal={arXiv preprint arXiv:2106.06909},
  year={2021}
}
@article{wang2024audiobench,
  title={AudioBench: A Universal Benchmark for Audio Large Language Models},
  author={Wang, Bin…
See the full description on the dataset page: https://huggingface.co/datasets/AudioLLMs/gigaspeech_test.
@article{yang2024gigaspeech,
  title={GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement},
  author={Yang, Yifan and Song, Zheshu and Zhuo, Jianheng and Cui, Mingyu and Li, Jinpeng and Yang, Bo and Du, Yexing and Ma, Ziyang and Liu, Xunying and Wang, Ziyuan and others},
  journal={arXiv preprint arXiv:2406.11546},
  year={2024}
}
@article{wang2024audiobench,
  title={AudioBench: A Universal…
See the full description on the dataset page: https://huggingface.co/datasets/AudioLLMs/gigaspeech2-test.
Not seeing a result you expected?
Learn how you can add new datasets to our index.