2 datasets found

ClArTTS
huggingface.co
Updated Apr 17, 2024
Cite
Atharva Kulkarni (2024). ClArTTS [Dataset]. https://huggingface.co/datasets/AtharvA7k/ClArTTS
Dataset updated
Apr 17, 2024
Authors
Atharva Kulkarni
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for ClArTTS

A speech corpus for Classical Arabic Text-to-Speech (ClArTTS) to support the development of end-to-end TTS systems for Arabic. The speech is extracted from a LibriVox audiobook, which is then processed, segmented, and manually transcribed and annotated.

Dataset Details

The ClArTTS corpus contains about 12 hours of speech from a single male speaker, sampled at 40,100 Hz (40.1 kHz).

Dataset Description

At present, text-to-speech (TTS) systems that are… See the full description on the dataset page: https://huggingface.co/datasets/AtharvA7k/ClArTTS.
ClArTTS
huggingface.co
Updated Apr 18, 2024
Cite
Mohamed Bin Zayed University of Artificial Intelligence (2024). ClArTTS [Dataset]. https://huggingface.co/datasets/MBZUAI/ClArTTS
Dataset updated
Apr 18, 2024
Dataset authored and provided by
Mohamed Bin Zayed University of Artificial Intelligence
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Summary

We present a speech corpus for Classical Arabic Text-to-Speech (ClArTTS) to support the development of end-to-end TTS systems for Arabic. The speech is extracted from a LibriVox audiobook, which is then processed, segmented, and manually transcribed and annotated. The final ClArTTS corpus contains about 12 hours of speech from a single male speaker, sampled at 40,100 Hz (40.1 kHz).
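
As a rough sense of scale, the duration and sampling rate in the summary can be turned into a sample count and an uncompressed size. This is only a back-of-the-envelope sketch: it assumes the rate is 40,100 samples per second, the audio is mono, samples are 16-bit PCM, and the duration is exactly 12 hours (the summary says "about 12 hours"). The function names are hypothetical and are not part of the dataset or any library.

```python
# Rough size estimate for a single-speaker speech corpus.
# Assumptions (not stated in the listing): mono audio, 16-bit PCM samples,
# and exactly 12 hours of speech at 40,100 samples per second.

def total_samples(hours: float, sample_rate_hz: int) -> int:
    """Number of audio samples in `hours` of mono audio."""
    return round(hours * 3600 * sample_rate_hz)


def pcm_size_bytes(hours: float, sample_rate_hz: int, bytes_per_sample: int = 2) -> int:
    """Uncompressed PCM size in bytes (default: 2 bytes per 16-bit sample)."""
    return total_samples(hours, sample_rate_hz) * bytes_per_sample


if __name__ == "__main__":
    samples = total_samples(12, 40_100)
    size_gb = pcm_size_bytes(12, 40_100) / 1e9
    print(f"{samples:,} samples, about {size_gb:.2f} GB uncompressed")
```

Under these assumptions the corpus comes to roughly 1.7 billion samples, or about 3.5 GB of raw audio; the actual download size depends on the file format used on the dataset page.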

Dataset Structure

A typical data point comprises the name of the audio file, called… See the full description on the dataset page: https://huggingface.co/datasets/MBZUAI/ClArTTS.

ClArTTS

AtharvA7k/ClArTTS

18 scholarly articles cite this dataset (View in Google Scholar)
