36 datasets found

libritts
huggingface.co
Updated Feb 9, 2024
Cite
Mythic Infinity (2024). libritts [Dataset]. https://huggingface.co/datasets/mythicinfinity/libritts
Dataset updated
Feb 9, 2024
Dataset authored and provided by
Mythic Infinity
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for LibriTTS

LibriTTS is a multi-speaker English corpus of approximately 585 hours of read English speech at 24kHz sampling rate, prepared by Heiga Zen with the assistance of Google Speech and Google Brain team members. The LibriTTS corpus is designed for TTS research. It is derived from the original materials (mp3 audio files from LibriVox and text files from Project Gutenberg) of the LibriSpeech corpus.
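To give a feel for what "approximately 585 hours at 24kHz" means in practice, the following is a rough back-of-the-envelope sketch. The duration and sampling rate come from the description above; the 16-bit mono PCM sample format is an assumption made only for this estimate, and the function name is hypothetical.

```python
# Rough size estimate for the corpus as uncompressed PCM audio.
HOURS = 585               # approximate duration, from the dataset description
SAMPLE_RATE_HZ = 24_000   # 24 kHz, from the dataset description
BYTES_PER_SAMPLE = 2      # ASSUMPTION: 16-bit mono PCM


def pcm_size_bytes(hours: int, sample_rate_hz: int, bytes_per_sample: int) -> int:
    """Return the uncompressed size in bytes for the given duration."""
    seconds = hours * 3600
    return seconds * sample_rate_hz * bytes_per_sample


total_samples = HOURS * 3600 * SAMPLE_RATE_HZ
total_bytes = pcm_size_bytes(HOURS, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
print(f"{total_samples:,} samples, roughly {total_bytes / 1e9:.1f} GB uncompressed")
```

Compressed or differently encoded copies will be smaller, so this should be read as an order-of-magnitude figure only.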

Overview

This is the LibriTTS dataset, adapted… See the full description on the dataset page: https://huggingface.co/datasets/mythicinfinity/libritts.
libritts-aligned
huggingface.co
Updated Mar 9, 2024
Cite
Christoph Minixhofer (2024). libritts-aligned [Dataset]. https://huggingface.co/datasets/cdminix/libritts-aligned
Dataset updated
Mar 9, 2024
Authors
Christoph Minixhofer
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset used for loading TTS spectrograms and waveform audio with alignments and a number of configurable "measures", which are extracted from the raw audio.
libritts_r
huggingface.co
Updated Feb 2, 2024
Cite
Mythic Infinity (2024). libritts_r [Dataset]. https://huggingface.co/datasets/mythicinfinity/libritts_r
Dataset updated
Feb 2, 2024
Dataset authored and provided by
Mythic Infinity
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for LibriTTS-R

LibriTTS-R [1] is a sound quality improved version of the LibriTTS corpus (http://www.openslr.org/60/) which is a multi-speaker English corpus of approximately 585 hours of read English speech at 24kHz sampling rate, published in 2019.

Overview

This is the LibriTTS-R dataset, adapted for the datasets library.

Usage Splits

There are 7 splits (dots replace dashes from the original dataset, to comply with hf naming… See the full description on the dataset page: https://huggingface.co/datasets/mythicinfinity/libritts_r.
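The naming rule mentioned above (dots replace dashes) can be sketched as a simple string mapping. The seven names below are the original LibriTTS subset names; the helper `to_hub_split_name` is hypothetical, and the resulting names are an assumption that should be checked against the dataset page before use.

```python
# The seven subsets of the original LibriTTS release.
ORIGINAL_SPLITS = [
    "dev-clean",
    "dev-other",
    "test-clean",
    "test-other",
    "train-clean-100",
    "train-clean-360",
    "train-other-500",
]


def to_hub_split_name(original: str) -> str:
    """Apply the stated rule: replace every dash with a dot."""
    return original.replace("-", ".")


# ASSUMPTION: the hub split names follow the rule with no other changes.
HUB_SPLITS = [to_hub_split_name(name) for name in ORIGINAL_SPLITS]

for original, hub in zip(ORIGINAL_SPLITS, HUB_SPLITS):
    print(f"{original} -> {hub}")
```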
Libri TTS dev
kaggle.com
Updated Nov 13, 2020
Cite
Luiz Felipe de Barros Jordão Costa (2020). Libri TTS dev [Dataset]. https://www.kaggle.com/luizfelipebjcosta/libri-tts-dev/code
Dataset updated
Nov 13, 2020
Dataset provided by
Kaggle, http://kaggle.com/
Authors
Luiz Felipe de Barros Jordão Costa
Description
This dataset is a subset of a minimal version of Google's LibriTTS dataset; for more information on the LibriTTS dataset, see this article. It is a minimal version because it contains only the text and audio files, that is, the basics you need to train a text-to-speech model. It is also only a subset because Kaggle has a size limit for datasets. To access the "full minimal dataset", see the list below:
1. Libri TTS train clean 100 (from the file train-clean-100 of the dataset)
2. Libri TTS train clean 360 part 1 (from the first half of the file train-clean-360)
3. Libri TTS train clean 360 part 2 (from the second part of the same file)
4. Libri TTS train other 500 part 1 (from the first part of the file train-other-500)
5. Libri TTS train other 500 part 2 (from the same file)
6. Libri TTS test (from the files test-clean and test-other)
7. Libri TTS dev (this dataset, from the files dev-clean and dev-other)
LibriTTS
kaggle.com
zip
Updated May 16, 2025
Cite
Prateek Narain (2025). LibriTTS [Dataset]. https://www.kaggle.com/datasets/prateeknarain/libritts
Available download formats: zip (15443581764 bytes)
Dataset updated
May 16, 2025
Authors
Prateek Narain
License
Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by Prateek Narain

Released under Apache 2.0

Contents
LibriTTS-raw
huggingface.co
Updated Apr 18, 2024
Cite
Ahmed Zain (2024). LibriTTS-raw [Dataset]. https://huggingface.co/datasets/azain/LibriTTS-raw
Dataset updated
Apr 18, 2024
Authors
Ahmed Zain
Description
azain/LibriTTS-raw dataset hosted on Hugging Face and contributed by the HF Datasets community
libritts_r_tags_tagged_10k_generated
huggingface.co
Updated Apr 10, 2024
Cite
Parler TTS (2024). libritts_r_tags_tagged_10k_generated [Dataset]. https://huggingface.co/datasets/parler-tts/libritts_r_tags_tagged_10k_generated
Dataset updated
Apr 10, 2024
Dataset authored and provided by
Parler TTS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for Annotated LibriTTS-R

This dataset is an annotated version of LibriTTS-R [1]. LibriTTS-R [1] is a sound-quality-improved version of the LibriTTS corpus, a multi-speaker English corpus of approximately 585 hours of read English speech at a 24kHz sampling rate, published in 2019. In the text_description column, it provides natural language annotations of the characteristics of speakers and utterances, which were generated using the Data-Speech repository.… See the full description on the dataset page: https://huggingface.co/datasets/parler-tts/libritts_r_tags_tagged_10k_generated.
voices-libritts
huggingface.co
Updated Aug 2, 2025
Cite
SDialog (2025). voices-libritts [Dataset]. https://huggingface.co/datasets/sdialog/voices-libritts
Dataset updated
Aug 2, 2025
Dataset authored and provided by
SDialog
License
Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
LibriTTS Speaker Voices & Embeddings

Dataset Description

This dataset provides a collection of speaker voice samples from the LibriTTS corpus. For each speaker, a single 30-second audio clip is provided, created by concatenating their speech segments. The dataset is designed for tasks such as speaker identification, speaker verification, and as a voice bank for Text-to-Speech (TTS) models, particularly for voice cloning. In addition to the audio files and their metadata… See the full description on the dataset page: https://huggingface.co/datasets/sdialog/voices-libritts.
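The construction described above (one 30-second clip per speaker, built by concatenating speech segments) might look roughly like the sketch below. This is not the dataset authors' code: the function name `build_voice_clip`, the 24 kHz rate, and the synthetic constant-valued segments are all assumptions used purely for illustration.

```python
import numpy as np


def build_voice_clip(segments, sample_rate, clip_seconds=30.0):
    """Concatenate a speaker's segments and trim to at most clip_seconds.

    If the segments total less than clip_seconds, the shorter audio is
    returned unchanged (no padding is applied in this sketch).
    """
    if not segments:
        raise ValueError("at least one segment is required")
    target_len = int(round(clip_seconds * sample_rate))
    audio = np.concatenate([np.asarray(s, dtype=np.float32) for s in segments])
    return audio[:target_len]


# ASSUMPTION: 24 kHz mono audio; four synthetic 12-second "segments",
# each filled with its own index so the boundaries are easy to inspect.
SAMPLE_RATE = 24_000
segments = [np.full(12 * SAMPLE_RATE, i, dtype=np.float32) for i in range(4)]
clip = build_voice_clip(segments, SAMPLE_RATE)
print(clip.shape, clip.dtype)
```

A real pipeline would likely also handle resampling and loudness differences between segments, which this sketch leaves out.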
LibriTTS-Enhanced
huggingface.co
Updated Jul 15, 2025
Cite
huanpm (2025). LibriTTS-Enhanced [Dataset]. https://huggingface.co/datasets/tong0/LibriTTS-Enhanced
Dataset updated
Jul 15, 2025
Authors
huanpm
Description
LibriTTS Enhanced Dataset

Enhanced version of LibriTTS dataset for speech enhancement research.
ESPnet2 pretrained model,...
explore.openaire.eu
Updated Sep 22, 2021
Cite
Kan-Bayashi (2021). ESPnet2 pretrained model, kan-bayashi/libritts_tts_train_xvector_vits_raw_phn_tacotron_g2p_en_no_space_train.total_count.ave, fs=22050, lang=en [Dataset]. http://doi.org/10.5281/zenodo.5521416
Unique identifier
https://doi.org/10.5281/zenodo.5521416
Dataset updated
Sep 22, 2021
Authors
Kan-Bayashi
Description
This model was trained by kan-bayashi using libritts/tts1 recipe in espnet.

Python API

See https://github.com/espnet/espnet_model_zoo

Evaluate in the recipe

    git clone https://github.com/espnet/espnet
    cd espnet
    git checkout 628b46282537ce532d613d6bafb75e826e8455de
    pip install -e .
    cd egs2/libritts/tts1
    # Download the model file here
    ./run.sh --skip_data_prep false --skip_train true --download_model kan-bayashi/libritts_tts_train_xvector_vits_raw_phn_tacotron_g2p_en_no_space_train.total_count.ave

Config

    config: ./conf/tuning/train_xvector_vits.yaml
    print_config: false
    log_level: INFO
    dry_run: false
    iterator_type: sequence
    output_dir: exp/tts_train_xvector_vits_raw_phn_tacotron_g2p_en_no_space
    ngpu: 1
    seed: 777
    num_workers: 4
    num_att_plot: 3
    dist_backend: nccl
    dist_init_method: env://
    dist_world_size: 4
    dist_rank: 0
    local_rank: 0
    dist_master_addr: localhost
    dist_master_port: 60056
    dist_launcher: null
    multiprocessing_distributed: true
    unused_parameters: true
    sharded_ddp: false
    cudnn_enabled: true
    cudnn_benchmark: false
    cudnn_deterministic: false
    collect_stats: false
    write_collected_feats: false
    max_epoch: 100
    patience: null
    val_scheduler_criterion:
    - valid
    - loss
    early_stopping_criterion:
    - valid
    - loss
    - min
    best_model_criterion:
    -   - train
        - total_count
        - max
    keep_nbest_models: 10
    grad_clip: -1
    grad_clip_type: 2.0
    grad_noise: false
    accum_grad: 1
    no_forward_run: false
    resume: true
    train_dtype: float32
    use_amp: false
    log_interval: 50
    use_tensorboard: true
    use_wandb: false
    wandb_project: null
    wandb_id: null
    wandb_entity: null
    wandb_name: null
    wandb_model_log_interval: -1
    detect_anomaly: false
    pretrain_path: null
    init_param: []
    ignore_init_mismatch: false
    freeze_param: []
    num_iters_per_epoch: 10000
    batch_size: 20
    valid_batch_size: null
    batch_bins: 5000000
    valid_batch_bins: null
    train_shape_file:
    - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/train/text_shape.phn
    - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/train/speech_shape
    valid_shape_file:
    - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/valid/text_shape.phn
    - exp/tts_stats_raw_linear_spectrogram_phn_tacotron_g2p_en_no_space/valid/speech_shape
    batch_type: numel
    valid_batch_type: null
    fold_length:
    - 150
    - 204800
    sort_in_batch: descending
    sort_batch: descending
    multiple_iterator: false
    chunk_length: 500
    chunk_shift_ratio: 0.5
    num_cache_chunks: 1024
    train_data_path_and_name_and_type:
    -   - dump/22k/raw/train-clean-460/text
        - text
        - text
    -   - dump/22k/raw/train-clean-460/wav.scp
        - speech
        - sound
    -   - dump/22k/xvector/train-clean-460/xvector.scp
        - spembs
        - kaldi_ark
    valid_data_path_and_name_and_type:
    -   - dump/22k/raw/dev-clean/text
        - text
        - text
    -   - dump/22k/raw/dev-clean/wav.scp
        - speech
        - sound
    -   - dump/22k/xvector/dev-clean/xvector.scp
        - spembs
        - kaldi_ark
    allow_variable_data_keys: false
    max_cache_size: 0.0
    max_cache_fd: 32
    valid_max_cache_size: null
    optim: adamw
    optim_conf:
        lr: 0.0002
        betas:
        - 0.8
        - 0.99
        eps: 1.0e-09
        weight_decay: 0.0
    scheduler: exponentiallr
    scheduler_conf:
        gamma: 0.999875
    optim2: adamw
    optim2_conf:
        lr: 0.0002
        betas:
        - 0.8
        - 0.99
        eps: 1.0e-09
        weight_decay: 0.0
    scheduler2: exponentiallr
    scheduler2_conf:
        gamma: 0.999875
    generator_first: false
    token_list:
    -
    -
    - AH0
    - T
    - N
    - D
    - S
    - R
    - L
    - IH1
    - DH
    - M
    - K
    - Z
    - EH1
    - AE1
    - IH0
    - AH1
    - W
    - ','
    - HH
    - ER0
    - P
    - IY1
    - V
    - F
    - B
    - UW1
    - AA1
    - AY1
    - AO1
    - .
    - EY1
    - IY0
    - OW1
    - NG
    - G
    - SH
    - Y
    - AW1
    - CH
    - ER1
    - UH1
    - TH
    - JH
    - ''''
    - '?'
    - OW0
    - EH2
    - '!'
    - IH2
    - OY1
    - EY2
    - AY2
    - EH0
    - UW0
    - AA2
    - AE2
    - OW2
    - AO2
    - AE0
    - AH2
    - ZH
    - AA0
    - UW2
    - IY2
    - AY0
    - AO0
    - AW2
    - EY0
    - UH2
    - ER2
    - AW0
    - '...'
    - UH0
    - OY2
    - . . .
    - OY0
    - . . . .
    - ..
    - . ...
    - . .
    - . . . . .
    - .. ..
    - '... .'
    -
    odim: null
    model_conf: {}
    use_preprocessor: true
    token_type: phn
    bpemodel: null
    non_linguistic_symbols: null
    cleaner: tacotron
    g2p: g2p_en_no_space
    feats_extract: linear_spectrogram
    feats_extract_conf:
        n_fft: 1024
        hop_length: 256
        win_length: null
    normalize: null
    normalize_conf: {}
    tts: vits
    tts_conf:
        generator_type: vits_generator
        generator_params:
            hidden_channels: 192
            spks: -1
            spk_embed_dim: 512
            global_channels: 256
            segment_size: 32
            text_encoder_attention_heads: 2
            text_encoder_ffn_expand: 4
            text_encoder_blocks: 6
            text_encoder_positionwise_layer_type: conv1d
            text_encoder_positionwise_conv_kernel_size: 3
            text_encoder_positional_encoding_layer_type: rel_pos
            text_encoder_self_attention_layer_type: rel_selfattn
            text_encoder_activation_type: swish
            text_encoder_normalize_before: true
            text_encoder_dropout_rate: 0.1
            text_encoder_positional_dropout_rate: 0.0
            text_encoder_attention_dropout_rate: 0.1
            use_macaron_style_in_text_encoder: true
            use_conformer_conv_in_text_encoder: false
            text_encoder_conformer_kernel_size: -1
            decoder_kernel_size: 7
            decoder_channels: 512
            decoder_upsample_scales:
            - 8
            - 8
            - 2
            - 2
            decoder_upsample_kernel_sizes:
            - 16
            - 16
            - 4
            - 4
            decoder_resblock_kernel_sizes:
            - 3
            - 7
            - 11
            decoder_resblock_dilations:
            -   - 1
                - 3
                - 5
            -   - 1
                - 3
                - 5
            -   - 1
                - 3
                - 5
            use_weight...
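A few of the numbers in this configuration are related to each other, and a short sketch can make those relations explicit. The values below are copied from the config and the model title (fs=22050); the interpretation is an assumption, in particular that `segment_size` counts spectrogram frames and that the learning rate is multiplied by `gamma` once per scheduler step. The helper `lr_after` is hypothetical.

```python
import math

# Values copied from the config and model title above.
FS = 22050                      # sampling rate in Hz
N_FFT = 1024                    # feats_extract_conf.n_fft
HOP_LENGTH = 256                # feats_extract_conf.hop_length
UPSAMPLE_SCALES = [8, 8, 2, 2]  # decoder_upsample_scales
SEGMENT_SIZE = 32               # ASSUMPTION: measured in spectrogram frames
BASE_LR = 0.0002                # optim_conf.lr
GAMMA = 0.999875                # scheduler_conf.gamma

# The decoder turns one frame into prod(scales) waveform samples, so the
# product is expected to match the hop length of the feature extractor.
total_upsampling = math.prod(UPSAMPLE_SCALES)

frames_per_second = FS / HOP_LENGTH          # spectrogram frame rate
segment_samples = SEGMENT_SIZE * HOP_LENGTH  # waveform samples per segment
spectrogram_bins = N_FFT // 2 + 1            # one-sided linear spectrogram


def lr_after(steps: int) -> float:
    """Exponential decay: the rate is multiplied by GAMMA at every step."""
    return BASE_LR * GAMMA ** steps


print("total upsampling:", total_upsampling)
print("frames per second:", frames_per_second)
print("segment length (samples):", segment_samples)
print("spectrogram bins:", spectrogram_bins)
print("learning rate after 100 scheduler steps:", lr_after(100))
```

The check that the upsampling product equals the hop length is the kind of consistency test worth running before changing either value in a config like this one.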
LibriTTS-358-samples
huggingface.co
Updated Sep 12, 2024
Cite
Ahmed Zain (2024). LibriTTS-358-samples [Dataset]. https://huggingface.co/datasets/azain/LibriTTS-358-samples
Dataset updated
Sep 12, 2024
Authors
Ahmed Zain
Description
azain/LibriTTS-358-samples dataset hosted on Hugging Face and contributed by the HF Datasets community
libritts-r-text-tags-v4
huggingface.co
Updated Feb 14, 2024
Cite
Yoach Lacombe (2024). libritts-r-text-tags-v4 [Dataset]. https://huggingface.co/datasets/ylacombe/libritts-r-text-tags-v4
Dataset updated
Feb 14, 2024
Authors
Yoach Lacombe
Description
ylacombe/libritts-r-text-tags-v4 dataset hosted on Hugging Face and contributed by the HF Datasets community
libritts
huggingface.co
Updated Sep 25, 2024
Cite
meraki (2024). libritts [Dataset]. https://huggingface.co/datasets/cmeraki/libritts
Dataset updated
Sep 25, 2024
Authors
meraki
Description
cmeraki/libritts dataset hosted on Hugging Face and contributed by the HF Datasets community
3-LibriTTS-sample
huggingface.co
Updated Sep 12, 2024
Cite
Nikhil Kumar Sharma (2024). 3-LibriTTS-sample [Dataset]. https://huggingface.co/datasets/Nikhil20Sharma/3-LibriTTS-sample
Dataset updated
Sep 12, 2024
Authors
Nikhil Kumar Sharma
Description
Nikhil20Sharma/3-LibriTTS-sample dataset hosted on Hugging Face and contributed by the HF Datasets community
libritts-r-mimi
huggingface.co
Updated Dec 31, 2024
Cite
Jacob Keisling (2024). libritts-r-mimi [Dataset]. https://huggingface.co/datasets/jkeisling/libritts-r-mimi
Dataset updated
Dec 31, 2024
Authors
Jacob Keisling
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
LibriTTS-R Mimi encoding

This dataset converts all audio in the dev.clean, test.clean, train.100 and train.360 splits of the LibriTTS-R dataset from waveforms to tokens in Kyutai's Mimi neural codec. These tokens are intended as targets for DualAR audio models, but also allow you to simply download all audio in ~50-100x less space, if you're comfortable decoding later on with rustymimi or Transformers. This does NOT contain the original audio; please use the regular LibriTTS-R for… See the full description on the dataset page: https://huggingface.co/datasets/jkeisling/libritts-r-mimi.
200-dialogues-voices-libritts
huggingface.co
Updated Jul 30, 2025
Cite
SDialog (2025). 200-dialogues-voices-libritts [Dataset]. https://huggingface.co/datasets/sdialog/200-dialogues-voices-libritts
Dataset updated
Jul 30, 2025
Dataset authored and provided by
SDialog
Description
200 dialogues generated using SDialog:

ExpO0O5 > DoPaCo > 001 001: both roles use gemma3:27b-it-qat as LLM only doctor gets truncated '?'

Split without persona overlap:
train set: doc 0-59, pat 0-119
dev set: doc 60-79, pat 120-139
test set: doc 80-99, pat 140-199
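The "no persona overlap" property of this split can be checked mechanically. The sketch below encodes the ranges listed above, reading them as inclusive ID ranges (an assumption), and verifies that no doctor or patient ID appears in more than one split. The names `SPLITS` and `overlapping_ids` are hypothetical.

```python
from itertools import combinations

# ID ranges from the description, read as inclusive (ASSUMPTION), so
# "doc 0-59" becomes range(0, 60).
SPLITS = {
    "train": {"doc": range(0, 60), "pat": range(0, 120)},
    "dev": {"doc": range(60, 80), "pat": range(120, 140)},
    "test": {"doc": range(80, 100), "pat": range(140, 200)},
}


def overlapping_ids(splits, role):
    """Return {(split_a, split_b): shared_ids} for every overlapping pair."""
    overlaps = {}
    for a, b in combinations(splits, 2):
        shared = set(splits[a][role]) & set(splits[b][role])
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


for role in ("doc", "pat"):
    total = sum(len(split[role]) for split in SPLITS.values())
    print(role, "overlaps:", overlapping_ids(SPLITS, role), "total ids:", total)
```

Under this reading there are 100 doctor IDs and 200 patient IDs in total, the latter matching the 200 dialogues in the dataset title.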

Audio Setup:

Database of voices built from the LibriTTS dataset
IndexTTS model for utterance generation
dScaper for channel and metadata creation
PyRoomAcoustics for spatialization of the audio
libritts-r-filtered-speaker-descriptions
huggingface.co
Cite
Ragman Teodora, libritts-r-filtered-speaker-descriptions [Dataset]. https://huggingface.co/datasets/TeodoraR/libritts-r-filtered-speaker-descriptions
Authors
Ragman Teodora
Description
TeodoraR/libritts-r-filtered-speaker-descriptions dataset hosted on Hugging Face and contributed by the HF Datasets community
LibriTTS-dev-clean-16khz-mono-loudnorm-100-random-samples-2024-04-18-17-34-39-similarities...
huggingface.co
Updated Apr 18, 2024
Cite
Ahmed Zain (2024). LibriTTS-dev-clean-16khz-mono-loudnorm-100-random-samples-2024-04-18-17-34-39-similarities [Dataset]. https://huggingface.co/datasets/azain/LibriTTS-dev-clean-16khz-mono-loudnorm-100-random-samples-2024-04-18-17-34-39-similarities
Dataset updated
Apr 18, 2024
Authors
Ahmed Zain
Description
azain/LibriTTS-dev-clean-16khz-mono-loudnorm-100-random-samples-2024-04-18-17-34-39-similarities dataset hosted on Hugging Face and contributed by the HF Datasets community
libritts-r-mhubert-2000units
huggingface.co
Cite
Ryota Komatsu, libritts-r-mhubert-2000units [Dataset]. https://huggingface.co/datasets/ryota-komatsu/libritts-r-mhubert-2000units
Authors
Ryota Komatsu
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ryota-komatsu/libritts-r-mhubert-2000units dataset hosted on Hugging Face and contributed by the HF Datasets community
libritts-r-test-clean
huggingface.co
Cite
Jaehoon Kang, libritts-r-test-clean [Dataset]. https://huggingface.co/datasets/morateng/libritts-r-test-clean
Authors
Jaehoon Kang
Description
morateng/libritts-r-test-clean dataset hosted on Hugging Face and contributed by the HF Datasets community
