5 datasets found
  1. AISHELL-3 Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Oct 22, 2024
    (2024). AISHELL-3 Dataset [Dataset]. https://paperswithcode.com/dataset/aishell-3
    Description

    AISHELL-3 is a large-scale, high-fidelity multi-speaker Mandarin speech corpus that can be used to train multi-speaker Text-to-Speech (TTS) systems. The corpus contains roughly 85 hours of emotion-neutral recordings from 218 native Mandarin Chinese speakers, totaling 88,035 utterances. Speaker attributes such as gender, age group, and native accent are explicitly labeled and provided in the corpus. Transcripts at both the Chinese character level and the pinyin level accompany the recordings. Professional speech annotation and strict quality inspection for tone and prosody keep the word and tone transcription accuracy above 98%.

  2. AISHELL-3

    • huggingface.co
    Updated Feb 20, 2025
    沈云航 Yunhang Shen (2025). AISHELL-3 [Dataset]. https://huggingface.co/datasets/shenyunhang/AISHELL-3
    Authors
    沈云航 Yunhang Shen
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    AISHELL-3

    Identifier: SLR93

    Summary: Mandarin data, provided by Beijing Shell Shell Technology Co., Ltd.

    Category: Speech

    License: Apache License v.2.0

    Downloads (use a mirror closer to you): data_aishell3.tgz (19G). Mirrors: [US] [EU] [CN]

    About this resource: AISHELL-3 is a large-scale and high-fidelity multi-speaker Mandarin speech corpus published by Beijing Shell Shell Technology Co., Ltd. It can be used to train… See the full description on the dataset page: https://huggingface.co/datasets/shenyunhang/AISHELL-3.

  3. aishell3

    • huggingface.co
    Updated Dec 21, 2023
    Phil Shen (2023). aishell3 [Dataset]. https://huggingface.co/datasets/shenberg1/aishell3
    Authors
    Phil Shen
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    shenberg1/aishell3 dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. AISHELL-3

    • huggingface.co
    Updated Jun 12, 2025
    Matrix工作室 (2025). AISHELL-3 [Dataset]. https://huggingface.co/datasets/MatrixStudio/AISHELL-3
    Dataset authored and provided by
    Matrix工作室
    Description

    MatrixStudio/AISHELL-3 dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. voxbox

    • huggingface.co
    Updated Jul 10, 2025
    Spark Audio (2025). voxbox [Dataset]. https://huggingface.co/datasets/SparkAudio/voxbox
    Dataset authored and provided by
    Spark Audio
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    VoxBox

    This dataset is a curated collection of bilingual speech corpora annotated with clean transcriptions and rich metadata including age, gender, and emotion.

    Dataset Structure

    .
    ├── audios/
    │   └── aishell-3/               # Audio files (organised by sub-corpus)
    │   └── ...
    └── metadata/
        ├── aishell-3.jsonl
        ├── casia.jsonl
        ├── commonvoice_cn.jsonl
        ├── ...
        └── wenetspeech4tts.jsonl    # JSONL metadata files

    Each JSONL file corresponds to a… See the full description on the dataset page: https://huggingface.co/datasets/SparkAudio/voxbox.
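
    As a rough illustration of working with per-corpus JSONL metadata like the files listed above, the sketch below reads one file record by record. The record fields are not shown in this truncated description, so the code makes no assumption about key names and only prints whatever keys it finds. The helper name iter_jsonl and the local path metadata/aishell-3.jsonl are hypothetical, chosen to mirror the layout above.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file."""
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the dataset was downloaded.
    metadata_file = Path("metadata/aishell-3.jsonl")
    if metadata_file.exists():
        for index, record in enumerate(iter_jsonl(metadata_file)):
            # The field names are unknown here, so just list them.
            print(sorted(record.keys()))
            if index >= 2:  # look at the first three records only
                break
```

    Reading lazily with a generator avoids loading a whole metadata file into memory, which may matter for the larger sub-corpora.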


AISHELL-3 Dataset

287 scholarly articles cite this dataset (View in Google Scholar)