8 datasets found
  1. MUSDB18 lyrics extension

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    text/x-python, txt +1
    Updated Jun 25, 2021
    Kilian Schulze-Forster; Clement S. J. Doire; Gaël Richard; Roland Badeau (2021). MUSDB18 lyrics extension [Dataset]. http://doi.org/10.5281/zenodo.3989267
    Available download formats: zip, txt, text/x-python
    Dataset updated
    Jun 25, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kilian Schulze-Forster; Clement S. J. Doire; Gaël Richard; Roland Badeau
    Description

    This is a set of annotated lyrics transcripts for songs belonging to the MUSDB18 dataset. The set comprises lyrics of all songs which have English lyrics, i.e. 96 out of 100 songs for the training set and 45 out of 50 songs for the test set. MUSDB18 is a dataset for music source separation and provides the following separated tracks for each song: vocals, bass, drums, other (rest of the accompaniment), mixture.

    The lyrics transcripts, together with the audio files of MUSDB18, are a valuable resource for research on tasks such as text-informed singing voice separation, automatic lyrics alignment, automatic lyrics transcription, and singing voice synthesis and analysis. The provided data should be used for research purposes only.

    Disclaimer

    The lyrics were transcribed manually by the authors who are not native English speakers. It is likely that the transcriptions are not 100% correct. The composers of the songs are the copyright holders of the original lyrics.

    The songs were divided into sections of lengths between 3 and 12 seconds. The priority when choosing the section boundaries was that they correspond to natural pauses and do not cut vocal sounds. The sections do not necessarily correspond to lyrically meaningful lines. Most of the sections do not overlap, some have an overlap of 1 second. In some difficult cases, e.g. shouting in metal songs or mumbled words, where the words are barely intelligible, we made an effort to make the transcriptions as accurate as possible phonetically and did not prioritize semantically meaningful phrases.

    Citation

    The dataset was built for the paper

    Schulze-Forster, K., Doire, C., Richard, G., & Badeau, R. "Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation." IEEE/ACM Transactions on Audio, Speech and Language Processing (2021).

    If you use the data for your research, please cite the corresponding paper:

    @article{schulze2021phoneme,
     title={Phoneme Level Lyrics Alignment and Text-Informed Singing Voice Separation},
     author={Schulze-Forster, Kilian and Doire, Clement and Richard, Ga{\"e}l and Badeau, Roland},
     journal={IEEE/ACM Transactions on Audio, Speech and Language Processing},
     year={2021},
     publisher={IEEE}
    }

    Annotations

    For each section, the annotations comprise: the start and end time, the corresponding lyrics, and a label indicating one of the following four properties:

    (a) only one person is singing
    (b) several singers are pronouncing the same phonemes at the same time (possibly singing different notes)
    (c) several singers are pronouncing different phonemes simultaneously (possibly singing different notes)
    (d) no singing

    Segments labelled with property (b) or (c) do not necessarily have this property over the whole segment duration. Label (b) was assigned as soon as several singers are present anywhere in a segment; label (c) was assigned as soon as they sang different phonemes at the same time anywhere in the segment. Properties (a) and (d) are valid for the entire segment. Furthermore, segments with property (c) can contain either some (lead) singer(s) singing some words in the presence of background singers singing long vowels such as 'ah' or 'oh', or multiple singers who sing different words at the same time. In the latter case, it was very difficult to recognise the sung words and to decide in which order to transcribe words or phrases sung simultaneously. These segments are marked with a '*' and it is recommended to reject them for most use cases.

    The annotations have the following format:

    Example:
    00:18 00:23 a i know the reasons why --> starts at 18 sec., ends at 23 sec., vocals type (a), lyrics: i know the reasons why
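The example line above can be read programmatically. The following is a minimal sketch, not part of the dataset's own tooling: it assumes every annotation line follows the pattern of the single example shown (start and end as mm:ss, one label letter, then the lyrics). The names `Segment` and `parse_annotation` are hypothetical. The description does not say where the '*' marker appears in a line, so the sketch does not handle it and raises an error on any line it does not recognise.

```python
import re
from typing import NamedTuple


class Segment(NamedTuple):
    start_s: int      # segment start, in seconds
    end_s: int        # segment end, in seconds
    vocals_type: str  # one of "a", "b", "c", "d"
    lyrics: str       # transcribed text; empty if the line has none


# Assumed layout, inferred from the one documented example:
#   mm:ss mm:ss <label> <lyrics>
_LINE = re.compile(r"^(\d+):(\d{2})\s+(\d+):(\d{2})\s+([abcd])(?:\s+(.*))?$")


def parse_annotation(line: str) -> Segment:
    """Parse one annotation line into a Segment, or raise ValueError."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised annotation line: {line!r}")
    m1, s1, m2, s2, label, lyrics = match.groups()
    return Segment(
        start_s=int(m1) * 60 + int(s1),
        end_s=int(m2) * 60 + int(s2),
        vocals_type=label,
        lyrics=(lyrics or "").strip(),
    )


segment = parse_annotation("00:18 00:23 a i know the reasons why")
print(segment)
```

Converting the times to whole seconds at parse time keeps later steps, such as filtering by vocals type or cutting audio, free of string handling.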

    The Python script musdb_lyrics_cut_audio.py is provided to automatically cut the MUSDB songs into the annotated segments. The script requires the musdb and soundfile packages. The user needs to update the paths and select the desired sources and vocals types in lines 19-26 of the script. The script saves a wav file for each selected source for each annotated segment, as well as the corresponding lyrics as a txt file. The MUSDB training partition is divided into a training and a validation set. The tracks for the validation set can be changed below line 29 of the script.
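To illustrate the idea of cutting a track into annotated segments, here is a small sketch of the index arithmetic involved. It is not the provided musdb_lyrics_cut_audio.py script and makes no claim about how that script works. The function names are hypothetical, and the only fact taken from the listing is the 44.1 kHz sample rate stated for MUSDB18. The audio is represented as any sliceable sequence of frames (a plain list here), so the sketch needs no audio library.

```python
SAMPLE_RATE = 44100  # MUSDB18 signals are stated to be encoded at 44.1 kHz


def segment_to_sample_range(start_s, end_s, sample_rate=SAMPLE_RATE):
    """Convert segment times in seconds to a (first, last) frame index pair."""
    if end_s <= start_s:
        raise ValueError("segment end must be after segment start")
    return int(round(start_s * sample_rate)), int(round(end_s * sample_rate))


def cut_segment(frames, start_s, end_s, sample_rate=SAMPLE_RATE):
    """Return the frames between start_s and end_s (end exclusive)."""
    first, last = segment_to_sample_range(start_s, end_s, sample_rate)
    return frames[first:last]


# Toy demonstration with a 4 Hz "signal" so the numbers stay readable.
toy_frames = list(range(40))              # 10 seconds at 4 frames per second
toy_cut = cut_segment(toy_frames, 2, 5, sample_rate=4)
print(len(toy_cut), toy_cut[0], toy_cut[-1])

# The documented example segment (00:18 to 00:23) at the real sample rate.
print(segment_to_sample_range(18, 23))
```

Because the slice acts on the first axis only, the same function would also work on a two-dimensional array of stereo frames, assuming time is the first axis.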

    The file words_and_phonemes.txt contains a list of all words and their decomposition into phonemes. The phonemes are written in 2-letter ARPABET style and obtained with the LOGIOS Lexicon Tool.

    License

    The data is licensed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, read the provided LICENSE.txt file, visit https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode or send a letter to Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.

    The creators of MUSDB18 lyrics extension and their corresponding affiliation institutes are not liable for, and expressly exclude, all liability for loss or damage however and whenever caused to anyone by any use of MUSDB18 lyrics extension or any part of it.

    Acknowledgment

    The authors would like to thank Olumide Okubadejo and Sinead Namur for their help with transcribing and correcting part of the lyrics.

  2. MUSDB18 Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 15, 2021
    Rafii (2021). MUSDB18 Dataset [Dataset]. https://paperswithcode.com/dataset/musdb18
    Dataset updated
    Feb 15, 2021
    Authors
    Rafii
    Description

    MUSDB18 is a dataset of 150 full-length music tracks (~10h duration) of different genres, along with their isolated drums, bass, vocals and other stems.

    The dataset is split into training and test sets with 100 and 50 songs, respectively. All signals are stereophonic and encoded at 44.1kHz.

  3. MUSDB18 - a corpus for music separation

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 4, 2022
    Fabian-Robert Stöter (2022). MUSDB18 - a corpus for music separation [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1117371
    Dataset updated
    Apr 4, 2022
    Dataset provided by
    Rafii, Zafar
    Fabian-Robert Stöter
    Mimilakis, Stylianos Ioannis
    Liutkus, Antoine
    Bittner, Rachel
    Description

    The sigsep musdb18 data set consists of a total of 150 full-track songs of different styles and includes both the stereo mixtures and the original sources, divided between a training subset and a test subset.

    Its purpose is to serve as a reference database for the design and the evaluation of source separation algorithms. The objective of such signal processing methods is to estimate one or more sources from a set of mixtures, e.g. for karaoke applications. It has been used as the official dataset in the professionally-produced music recordings task for SiSEC 2018, which is the international campaign for the evaluation of source separation algorithms.

    musdb18 contains two folders, a folder with a training set: “train”, composed of 100 songs, and a folder with a test set: “test”, composed of 50 songs. Supervised approaches should be trained on the training set and tested on both sets.

    All files from the musdb18 dataset are encoded in the Native Instruments stems format (.mp4). It is a multitrack format composed of 5 stereo streams, each one encoded in AAC @256kbps. These signals correspond to:

    0 - The mixture,

    1 - The drums,

    2 - The bass,

    3 - The rest of the accompaniment,

    4 - The vocals.

    For each file, the mixture corresponds to the sum of all the signals. All signals are stereophonic and encoded at 44.1kHz.
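The stream layout and the mixture relation described above can be written down as a short sketch. The stream indices come from the list above; the names `Stem` and `mix_sources` and the toy sample values are illustrative assumptions, and each signal is simplified to a mono list of floats rather than decoded stereo audio.

```python
from enum import IntEnum


class Stem(IntEnum):
    """Stream indices of a MUSDB18 stems file, as listed in the description."""
    MIXTURE = 0
    DRUMS = 1
    BASS = 2
    OTHER = 3   # the rest of the accompaniment
    VOCALS = 4


def mix_sources(stems):
    """Sum the four source signals sample by sample.

    `stems` is a sequence indexed by Stem; each entry is a list of samples.
    """
    sources = [stems[s] for s in Stem if s is not Stem.MIXTURE]
    return [sum(samples) for samples in zip(*sources)]


# Toy signals, two samples each, chosen to be exactly representable floats.
toy_stems = [
    [0.875, 0.75],   # mixture
    [0.5, 0.0],      # drums
    [0.25, 0.25],    # bass
    [0.125, 0.0],    # other
    [0.0, 0.5],      # vocals
]
print(mix_sources(toy_stems) == toy_stems[Stem.MIXTURE])
```

On real files each stream is lossily encoded, so a comparison between the decoded mixture and the summed sources would probably need a tolerance rather than exact equality.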

    As MUSDB18 is encoded as STEMS, it relies on ffmpeg to read the multi-stream files. We provide a Python wrapper called stempeg that makes it easy to parse the dataset and decode the stem tracks on the fly.

    License

    MUSDB18 is provided for educational purposes only and the material contained in them should not be used for any commercial purpose without the express permission of the copyright holders:

    100 tracks were derived from The ‘Mixing Secrets’ Free Multitrack Download Library. Please refer to this original resource for any question regarding your rights on your use of the DSD100 data.

    46 tracks are taken from the MedleyDB licensed under Creative Commons (BY-NC-SA 4.0).

    2 tracks were kindly provided by Native Instruments originally part of their stems pack.

    2 tracks are from the Canadian rock band The Easton Ellises as part of the heise stems remix competition, licensed under Creative Commons (BY-NC-SA 3.0).

    References

    If you use the MUSDB dataset for your research - Cite the MUSDB18 Dataset

    @misc{MUSDB18,
     author = {Rafii, Zafar and Liutkus, Antoine and Fabian-Robert St{\"o}ter and Mimilakis, Stylianos Ioannis and Bittner, Rachel},
     title = {The {MUSDB18} corpus for music separation},
     month = dec,
     year = 2017,
     doi = {10.5281/zenodo.1117372},
     url = {https://doi.org/10.5281/zenodo.1117372}
    }

    If you compare your results with SiSEC 2018 Participants - Cite the SiSEC 2018 LVA/ICA Paper

    @inproceedings{SiSEC18,
     author="St{\"o}ter, Fabian-Robert and Liutkus, Antoine and Ito, Nobutaka",
     title="The 2018 Signal Separation Evaluation Campaign",
     booktitle="Latent Variable Analysis and Signal Separation: 14th International Conference, LVA/ICA 2018, Surrey, UK",
     year="2018",
     pages="293--305"
    }

  4. musdb-alt

    • huggingface.co
    Updated Jun 27, 2025
    Jaza Syed (2025). musdb-alt [Dataset]. https://huggingface.co/datasets/jazasyed/musdb-alt
    Dataset updated
    Jun 27, 2025
    Authors
    Jaza Syed
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for MUSDB-ALT

    This dataset contains long-form lyric transcripts following the Jam-ALT guidelines for the test set of the dataset MUSDB18, with line-level timings. There are two versions of each transcript at the song and line level - text contains the normal transcript, and text_tagged contains the transcript with non-lexical vocables enclosed in tags .

      Dataset Details
    

    The dataset was constructed manually, based on the MUSDB18 lyrics… See the full description on the dataset page: https://huggingface.co/datasets/jazasyed/musdb-alt.

  5. MUSDB18-HQ Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 19, 2021
    (2021). MUSDB18-HQ Dataset [Dataset]. https://paperswithcode.com/dataset/musdb18-hq
    Dataset updated
    Feb 19, 2021
    Description

    MUSDB18-HQ is a high-quality version of the MUSDB18 music tracks dataset. The high-quality dataset consists of the same 150 songs, but instead of MP4 files (compressed with Advanced Audio Coding encoder at 256kbps, with bandwidth limited to 16kHz), the songs are provided as raw WAV files.

  6. MUSDB18

    • huggingface.co
    Artemis Webster, MUSDB18 [Dataset]. https://huggingface.co/datasets/artemisweb/MUSDB18
    Authors
    Artemis Webster
    Description

    MUSDB 2018 Dataset

    The musdb18 consists of 150 songs of different styles along with the images of their constitutive objects. musdb18 contains two folders, a folder with a training set: "train", composed of 100 songs, and a folder with a test set: "test", composed of 50 songs. Supervised approaches should be trained on the training set and tested on both sets. All files from the musdb18 dataset are encoded in the Native Instruments stems format (.mp4). It is a multitrack format… See the full description on the dataset page: https://huggingface.co/datasets/artemisweb/MUSDB18.

  7. MLPSepSynthdata

    • huggingface.co
    Updated Feb 3, 2024
    vul traz (2024). MLPSepSynthdata [Dataset]. https://huggingface.co/datasets/therealvul/MLPSepSynthdata
    Dataset updated
    Feb 3, 2024
    Authors
    vul traz
    Description

    Synthetic separation data in musdb format for MLP episodes, constructed from SFX, music, and isolated dialogue lines using scripts here: https://github.com/effusiveperiscope/PPPDataset/blob/main/sfx.py

      language:
    
    • en
  8. CrossNet-Open-Unmix for Music Source Separation (X-UMX)

    • zenodo.org
    • explore.openaire.eu
    • +1 more
    bin
    Updated Oct 28, 2023
    + more versions
    Ryosuke Sawata; Stefan Uhlich; Shusuke Takahashi; Yuki Mitsufuji (2023). CrossNet-Open-Unmix for Music Source Separation (X-UMX) [Dataset]. http://doi.org/10.5281/zenodo.4704231
    Available download formats: bin
    Dataset updated
    Oct 28, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ryosuke Sawata; Stefan Uhlich; Shusuke Takahashi; Yuki Mitsufuji
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Weights of CrossNet-Open-Unmix (X-UMX) trained on MUSDB18. The weights can be used with X-UMX on Asteroid (PyTorch). The details of X-UMX are described here.

