This dataset was created by Meher Deepak-2005
This dataset was created by XuChenLong
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Heqing_HappyStar
Released under CC0: Public Domain
This dataset was created by Bruno G. do Amaral
This dataset was created by LYLyyds
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Camaro
Released under CC0: Public Domain
This dataset was created by Vinicius Suaiden
This dataset contains the official pretrained CLIP weights released by OpenAI.
This is a public domain speech dataset consisting of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books. A transcription is provided for each clip. Clips vary in length from 1 to 10 seconds and have a total length of approximately 24 hours.
The texts were published between 1884 and 1964, and are in the public domain. The audio was recorded in 2016-17 by the LibriVox project and is also in the public domain.
Metadata is provided in transcripts.csv. This file consists of one record per line, delimited by the pipe character (0x7c). The fields are:
* ID: the name of the corresponding .wav file
* Transcription: words spoken by the reader (UTF-8)
* Normalized Transcription: transcription with numbers, ordinals, and monetary units expanded into full words (UTF-8)
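The record layout described above can be read with Python's standard `csv` module. The following is a minimal sketch, not code from the dataset authors: the function name `read_transcripts` and the two sample records (including their IDs) are made up for illustration. Quote handling is disabled on the assumption that transcriptions may contain literal double quotes.

```python
import csv
import io


def read_transcripts(lines):
    """Parse pipe-delimited records into dicts (id, text, normalized)."""
    # QUOTE_NONE treats double quotes as ordinary characters (assumption:
    # the file does not use CSV-style quoting).
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    records = []
    for row in reader:
        if len(row) != 3:
            continue  # skip lines that do not have exactly three fields
        clip_id, text, normalized = row
        records.append({"id": clip_id, "text": text, "normalized": normalized})
    return records


# Hypothetical sample records, invented for this example only.
sample = io.StringIO(
    'EX001-0001|He paid $5 in 1890.|He paid five dollars in eighteen ninety.\n'
    'EX001-0002|She said "no" 2 times.|She said "no" two times.\n'
)
records = read_transcripts(sample)
for record in records:
    print(record["id"], "->", record["normalized"])
```

To read the real file, the same function could be given an open file handle, for example `open("transcripts.csv", encoding="utf-8", newline="")`.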
Each audio file is a single-channel 16-bit PCM WAV with a sample rate of 22050 Hz (about 22 kHz).
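The stated audio format (single channel, 16-bit PCM, 22050 Hz) can be checked with the standard-library `wave` module. This is a sketch under stated assumptions: the helper names `describe_wav` and `matches_expected` are hypothetical, and the example builds one second of silence in memory rather than reading a clip from the dataset.

```python
import io
import wave

# Format stated in the dataset description.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate": 22050}


def describe_wav(fileobj):
    """Return basic header fields and the duration of a WAV file."""
    with wave.open(fileobj, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),  # 2 bytes = 16 bits
            "sample_rate": rate,
            "duration_s": frames / rate,
        }


def matches_expected(info):
    """True if the header fields match the stated format."""
    return all(info[key] == value for key, value in EXPECTED.items())


# Stand-in for a real clip: one second of 16-bit mono silence at 22050 Hz.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(22050)
    out.writeframes(b"\x00\x00" * 22050)
buf.seek(0)

info = describe_wav(buf)
print(info, matches_expected(info))
```

`describe_wav` also accepts a file path, so it could be pointed at any clip in the dataset to confirm the header and measure the clip length.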
The audio clips range in length from approximately 1 second to 10 seconds. They were segmented automatically based on silences in the recording. Clip boundaries generally align with sentence or clause boundaries, but not always.
The text was matched to the audio manually, and a QA pass was done to ensure that the text accurately matched the words spoken in the audio.
The original LibriVox recordings were distributed as 128 kbps MP3 files. As a result, they may contain artifacts introduced by the MP3 encoding.
The following abbreviations appear in the text. They may be expanded as follows:
Abbreviation Expansion
Mr. Mister
Mrs. Misess (*)
Dr. Doctor
No. Number
St. Saint
Co. Company
Jr. Junior
Maj. Major
Gen. General
Drs. Doctors
Rev. Reverend
Lt. Lieutenant
Hon. Honorable
Sgt. Sergeant
Capt. Captain
Esq. Esquire
Ltd. Limited
Col. Colonel
Ft. Fort
(*) There is no standard expansion for "Mrs."
19 of the transcriptions contain non-ASCII characters (for example, LJ016-0257 contains "raison d'être").
Example code using this dataset to train a speech synthesis model can be found at: github.com/keithito/tacotron.
For more information or to report errors, please email kito@kito.us.
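The abbreviation table above can be applied to a transcription with a small regular-expression substitution. This is an illustrative sketch, not the preprocessing used by the dataset authors or by the linked repository: the function name `expand_abbreviations` is hypothetical, and the sample sentence is invented. The expansions, including the spelling given for "Mrs.", are copied from the table as-is.

```python
import re

# Expansions copied from the abbreviation table in the description.
ABBREVIATIONS = {
    "Mr.": "Mister", "Mrs.": "Misess", "Dr.": "Doctor", "No.": "Number",
    "St.": "Saint", "Co.": "Company", "Jr.": "Junior", "Maj.": "Major",
    "Gen.": "General", "Drs.": "Doctors", "Rev.": "Reverend",
    "Lt.": "Lieutenant", "Hon.": "Honorable", "Sgt.": "Sergeant",
    "Capt.": "Captain", "Esq.": "Esquire", "Ltd.": "Limited",
    "Col.": "Colonel", "Ft.": "Fort",
}

# Match any abbreviation stem at a word boundary, followed by its period.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(key[:-1]) for key in ABBREVIATIONS) + r")\."
)


def expand_abbreviations(text):
    """Replace each listed abbreviation with its expansion (case-sensitive)."""
    return _PATTERN.sub(lambda m: ABBREVIATIONS[m.group(1) + "."], text)


# Invented sample sentence, not taken from the dataset.
sample = "Mr. and Mrs. Smith met Col. Brown at St. Paul's."
print(expand_abbreviations(sample))
```

The substitution is purely textual, so it cannot tell an abbreviation from an ordinary word that ends a sentence (for example, a sentence ending in "No."). Context-sensitive cases would need extra handling.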
This dataset was created by The Devastator
This dataset was created by Leonid Kulyk
This dataset was created by Kevin(Zeming) Wang
This dataset was created by smnd
This dataset was created by SeaLeopard
This dataset was created by ForcewithMe
This dataset was created by kirito174
This dataset was created by prakash
Released under Data files © Original Authors
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by JiangShaoYin
Released under CC0: Public Domain
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset was created by Mathurin Aché
Released under CC BY-NC-SA 4.0
This dataset was created by Meher Deepak-2005