5 datasets found

AISHELL-3 Dataset
paperswithcode.com
opendatalab.com
Updated Oct 22, 2024
Cite
(2024). AISHELL-3 Dataset [Dataset]. https://paperswithcode.com/dataset/aishell-3
Description
AISHELL-3 is a large-scale, high-fidelity multi-speaker Mandarin speech corpus that can be used to train multi-speaker text-to-speech (TTS) systems. The corpus contains roughly 85 hours of emotion-neutral recordings from 218 native Mandarin Chinese speakers, totalling 88,035 utterances. Auxiliary speaker attributes such as gender, age group, and native accent are explicitly annotated and provided in the corpus. Transcripts at both the Chinese character level and the pinyin level are provided along with the recordings. The word and tone transcription accuracy is above 98%, achieved through professional speech annotation and strict quality inspection for tone and prosody.
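As a rough sanity check on the figures quoted in the description, the short Python sketch below derives a few averages from them. The function name `corpus_averages` is hypothetical, and because the description only says "roughly 85 hours", the results are approximations rather than official corpus statistics.

```python
# Figures quoted in the AISHELL-3 description.
TOTAL_HOURS = 85         # "roughly 85 hours", so every result is approximate
NUM_SPEAKERS = 218
NUM_UTTERANCES = 88035


def corpus_averages(hours, speakers, utterances):
    """Return (seconds per utterance, utterances per speaker, minutes per speaker)."""
    total_seconds = hours * 3600
    return (
        total_seconds / utterances,   # mean utterance length in seconds
        utterances / speakers,        # mean number of utterances per speaker
        hours * 60 / speakers,        # mean recorded minutes per speaker
    )


sec_per_utt, utts_per_spk, min_per_spk = corpus_averages(
    TOTAL_HOURS, NUM_SPEAKERS, NUM_UTTERANCES
)
print(f"about {sec_per_utt:.2f} s per utterance")
print(f"about {utts_per_spk:.1f} utterances per speaker")
print(f"about {min_per_spk:.1f} minutes per speaker")
```

These are plain means over the whole corpus; the actual distribution across speakers is not stated in the description.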
AISHELL-3
huggingface.co
Updated Feb 20, 2025
Cite
沈云航 Yunhang Shen (2025). AISHELL-3 [Dataset]. https://huggingface.co/datasets/shenyunhang/AISHELL-3
Authors
沈云航 Yunhang Shen
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
AISHELL-3

Identifier: SLR93

Summary: Mandarin data, provided by Beijing Shell Shell Technology Co., Ltd.

Category: Speech

License: Apache License v.2.0

Downloads (use a mirror closer to you): data_aishell3.tgz (19G)

Mirrors: [US] [EU] [CN]

About this resource: AISHELL-3 is a large-scale and high-fidelity multi-speaker Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd. It can be used to train… See the full description on the dataset page: https://huggingface.co/datasets/shenyunhang/AISHELL-3.
aishell3
huggingface.co
Updated Dec 21, 2023
Cite
Phil Shen (2023). aishell3 [Dataset]. https://huggingface.co/datasets/shenberg1/aishell3
Authors
Phil Shen
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
shenberg1/aishell3 dataset hosted on Hugging Face and contributed by the HF Datasets community
AISHELL-3
huggingface.co
Updated Jun 12, 2025
Cite
Matrix工作室 (2025). AISHELL-3 [Dataset]. https://huggingface.co/datasets/MatrixStudio/AISHELL-3
Dataset authored and provided by
Matrix工作室
Description
MatrixStudio/AISHELL-3 dataset hosted on Hugging Face and contributed by the HF Datasets community
voxbox
huggingface.co
Updated Jul 10, 2025
Cite
Spark Audio (2025). voxbox [Dataset]. https://huggingface.co/datasets/SparkAudio/voxbox
Dataset authored and provided by
Spark Audio
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
VoxBox

This dataset is a curated collection of bilingual speech corpora annotated with clean transcriptions and rich metadata, including age, gender, and emotion.

Dataset Structure

.
├── audios/
│   └── aishell-3/              # Audio files (organised by sub-corpus)
│       └── ...
└── metadata/
    ├── aishell-3.jsonl
    ├── casia.jsonl
    ├── commonvoice_cn.jsonl
    ├── ...
    └── wenetspeech4tts.jsonl   # JSONL metadata files

Each JSONL file corresponds to a… See the full description on the dataset page: https://huggingface.co/datasets/SparkAudio/voxbox.
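Since the metadata is distributed as JSONL (one JSON object per line), a file such as `metadata/aishell-3.jsonl` could be read with a sketch like the one below. The record schema is not shown in the truncated description, so the helper names `iter_jsonl` and `count_by_key` and any field name passed to them (for example `"gender"`) are assumptions for illustration, not the dataset's documented API.

```python
import json
from pathlib import Path


def iter_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                yield json.loads(line)
            except json.JSONDecodeError as err:
                raise ValueError(f"{path}:{line_number}: invalid JSON") from err


def count_by_key(records, key):
    """Count records per value of `key` (assumes hashable values such as strings)."""
    counts = {}
    for record in records:
        value = record.get(key, "<missing>")
        counts[value] = counts.get(value, 0) + 1
    return counts
```

For example, `count_by_key(iter_jsonl("metadata/aishell-3.jsonl"), "gender")` would tally records per gender label, assuming the file has been downloaded locally and that a `gender` field exists under that name.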
287 scholarly articles cite the AISHELL-3 Dataset (paperswithcode.com entry), according to Google Scholar.