8 datasets found
  1. Slakh2100

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Mar 15, 2021
    Cite
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux (2021). Slakh2100 [Dataset]. http://doi.org/10.5281/zenodo.4599666
    68 scholarly articles cite this dataset (view in Google Scholar)
    Available download formats
    application/gzip
    Dataset updated
    Mar 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction:

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

    At a Glance:

    • The dataset comes as a series of directories named like TrackXXXXX, where XXXXX is a number between 00001 and 02100; this number is the ID of the track. Each Track directory contains exactly 1 mixture, a variable number of audio files for the sources that make up the mixture, and the MIDI files used to synthesize each source. The directory structure is shown here (a walk-through sketch also follows this list).
    • All audio in Slakh2100 is distributed in the .flac format. Scripts to batch convert are here.
    • All audio is mono and was rendered at 44.1 kHz, 16-bit (CD quality) before being converted to .flac.
    • Slakh2100 is a 105 GB download. Unzipped and converted to .wav, Slakh2100 is almost 500 GB. Please plan accordingly.
    • Each mixture has a variable number of sources, with a minimum of 4 sources per mix.
    • Every mix has at least 1 instance of each of the following instrument types: Piano, Guitar, Drums, Bass.
    • metadata.yaml has detailed information about each source. Details about the metadata are here.
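
    As a rough illustration of this layout, here is a minimal Python sketch that walks the Track directories and reads each metadata.yaml. The per-track file and folder names (mix.flac, stems/, MIDI/) and the local root path are assumptions based on the description above, not an authoritative schema:

    import pathlib
    import yaml  # pip install pyyaml

    root = pathlib.Path("slakh2100_flac_redux/train")  # hypothetical local path
    for track_dir in sorted(root.glob("Track*")):
        # metadata.yaml is documented above; its internal keys are not assumed here
        meta = yaml.safe_load((track_dir / "metadata.yaml").read_text())
        stems = sorted((track_dir / "stems").glob("*.flac"))  # per-source audio
        midi = sorted((track_dir / "MIDI").glob("*.mid"))     # per-source MIDI
        print(track_dir.name, len(stems), "stems,", len(midi),
              "MIDI files,", len(meta), "metadata keys")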

    Helpful Links:

    For more information, see www.slakh.com.

    Support code for Slakh: Available here.

    Code to render Slakh data: Available in this repo.

    See the dataset at a glance, and info about metadata.yaml.

    A tiny subset of Slakh2100, called BabySlakh, is also available for prototyping and debugging.

    Important Info about Splits:

    The original release of Slakh2100 was found to have many duplicate MIDI files, some of which are present in more than one of the train/test/validation splits. Even though each song is rendered with a random set of synthesizers, the same versions of the songs appear more than once: same MIDI, different audio files. This can be an issue for some uses of Slakh2100, e.g., automatic transcription.

    The version of Slakh hosted here on Zenodo contains a directory called omitted, into which the tracks with duplicated MIDI files have been moved. We recommend that you do not use track directories in the omitted directory if you plan on training an automatic transcription system. This version of Slakh is called Slakh2100-redux. For information about how to create other splits from this version, see https://github.com/ethman/slakh-utils/tree/master/splits
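
    In practice, honoring this recommendation just means never globbing the omitted directory. A minimal sketch, assuming the redux archive places train/validation/test and omitted side by side at the top level (directory names are assumptions based on the description above):

    import pathlib

    root = pathlib.Path("slakh2100_flac_redux")   # hypothetical local path
    keep_splits = ["train", "validation", "test"]  # skip "omitted" entirely
    tracks = [t for split in keep_splits
              for t in sorted((root / split).glob("Track*"))]
    print(len(tracks), "tracks collected; omitted directory excluded")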

    Citing Slakh:

    If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }

  2. Slakh2100-16k

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Mar 13, 2023
    Cite
    Sungkyun Chang (2023). Slakh2100-16k [Dataset]. http://doi.org/10.5281/zenodo.7708270
    Available download formats
    application/gzip
    Dataset updated
    Mar 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sungkyun Chang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction:

    This is a variant of the Slakh2100 dataset (Manilow et al., 2019), resampled to 16 kHz mono 16-bit FLAC. The `omitted` directory of the original dataset has been removed. A license for redistribution is attached. Please see the link below for more details.
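
    The target format is straightforward to reproduce from the original 44.1 kHz FLAC files. A minimal sketch using librosa and soundfile (one possible tool choice, not necessarily what the dataset author used; paths are placeholders):

    import librosa        # pip install librosa
    import soundfile as sf

    # resample to 16 kHz and downmix to mono in one call
    y, sr = librosa.load("Track00001/mix.flac", sr=16000, mono=True)
    # write 16 kHz mono 16-bit FLAC, matching this variant's format
    sf.write("mix_16k.flac", y, sr, subtype="PCM_16")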

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

    Citing Slakh:

    If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }
  3. Slakh2100 (Synthesized Lakh Dataset)

    • opendatalab.com
    zip
    Updated Oct 20, 2019
    Cite
    Mitsubishi Electric Research Laboratories (2019). Slakh2100 (Synthesized Lakh Dataset) [Dataset]. https://opendatalab.com/OpenDataLab/Slakh2100
    Available download formats
    zip (107,879,377,011 bytes, ~108 GB)
    Dataset updated
    Oct 20, 2019
    Dataset provided by
    Mitsubishi Electric Research Laboratories (http://www.merl.com/)
    Interactive Audio Lab at Northwestern University
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Synthesized Lakh (Slakh) Dataset is a dataset for audio source separation that is synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments. This first release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying MIDI files synthesized using a professional-grade sampling engine. The tracks in Slakh2100 are split into training (1500 tracks), validation (375 tracks), and test (225 tracks) subsets, totaling 145 hours of mixtures.

  4. Slakh2100-FLAC-Redux-Reduced

    • huggingface.co
    Updated Feb 17, 2024
    Cite
    Nguyễn Thế Hoàng (2024). Slakh2100-FLAC-Redux-Reduced [Dataset]. https://huggingface.co/datasets/DreamyWanderer/Slakh2100-FLAC-Redux-Reduced
    Available download formats
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 17, 2024
    Authors
    Nguyễn Thế Hoàng
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The DreamyWanderer/Slakh2100-FLAC-Redux-Reduced dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
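
    Since the record above gives no loading instructions, a hedged way to fetch the raw files is to mirror the dataset repo with huggingface_hub; whether the repo also works with datasets.load_dataset depends on how it is packaged, which the stub description does not say:

    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    local_dir = snapshot_download(
        repo_id="DreamyWanderer/Slakh2100-FLAC-Redux-Reduced",
        repo_type="dataset",  # dataset repos need an explicit repo_type
    )
    print(local_dir)  # path of the local mirror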

  5. Slakh2100-16k for YourMT3

    • zenodo.org
    application/gzip, bin +1
    Updated Mar 15, 2023
    Cite
    Sungkyun Chang; Simon Dixon; Emmanouil Benetos (2023). Slakh2100-16k for YourMT3 [Dataset]. http://doi.org/10.5281/zenodo.7717249
    Available download formats
    bin, json, application/gzip
    Dataset updated
    Mar 15, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sungkyun Chang; Simon Dixon; Emmanouil Benetos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    About this version:

    This is a variant of the Slakh2100 dataset (Manilow, 2019), resampled to 16 kHz mono 16-bit WAV. We redistribute this data as part of the YourMT3 project; the license for redistribution is attached. Note that the 'omitted' directory of the Slakh2100-redux version has been removed in this version, leaving three splits: train, validation, and test.

    MIRData integration:

    Like the previous version of Slakh, this version will also be integrated into the MIRData (Bittner, 2019) project for convenient use. For this, we provide an index file in JSON format. The code for the customized MIRData is included in our YourMT3 project.
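
    For orientation, a minimal sketch of the stock mirdata workflow follows. The "slakh" loader name exists in standard mirdata, but pointing it at this 16 kHz variant requires the customized JSON index from the YourMT3 repo, as described above; treat this as an assumption-laden sketch rather than the project's own loading code:

    import mirdata  # pip install mirdata

    slakh = mirdata.initialize("slakh")
    slakh.validate()              # verify local files against the index
    tracks = slakh.load_tracks()  # dict mapping track_id -> Track object
    print(len(tracks), "tracks loaded")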

    Citing YourMT3:

    @misc{sungkyun_chang_2022_7470191,
     author = {Sungkyun Chang and Simon Dixon and Emmanouil Benetos},
     title = {{YourMT3: a toolkit for training multi-task and multi-track music transcription model for everyone}},
     month = dec,
     year = 2022,
     note = {{(Poster) Presented at DMRN+17: Digital Music Research Network One-day Workshop 2022}},
     publisher = {Zenodo},
     doi = {10.5281/zenodo.7470191},
     url = {https://doi.org/10.5281/zenodo.7470191}
    }

    About Slakh2100:

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

    Citing Slakh & MIRData:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }
    @inproceedings{bittner_fuentes_2019,
     title={mirdata: Software for Reproducible Usage of Datasets},
     author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
     booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
     year={2019}
    }

    Acknowledgement:

    We thank the Zenodo team for allowing us additional storage.

  6. BabySlakh

    • zenodo.org
    zip
    Updated Mar 15, 2021
    Cite
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux (2021). BabySlakh [Dataset]. http://doi.org/10.5281/zenodo.4603844
    Available download formats
    zip
    Dataset updated
    Mar 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    BabySlakh is a tiny version of Slakh2100 (see the Zenodo record above) that is useful for debugging. It consists of the first 20 tracks of Slakh2100 (i.e., Track00001 through Track00020). All of the audio is in the .wav format and has a sample rate of 16 kHz. BabySlakh is ready to go once it's unzipped.
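
    A quick sanity check after unzipping might look like this; the archive folder name and the mix.wav layout are assumptions based on the Slakh conventions described above:

    import soundfile as sf  # pip install soundfile

    y, sr = sf.read("babyslakh_16k/Track00001/mix.wav")
    assert sr == 16000, "BabySlakh audio is documented as 16 kHz"
    print(f"{len(y) / sr:.1f} s of audio at {sr} Hz")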

    About Slakh

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 patches categorized into 34 classes.

    Citing Slakh

    If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }

  7. melodySim

    • huggingface.co
    Updated Jun 2, 2025
    Cite
    AMAAI Lab (2025). melodySim [Dataset]. https://huggingface.co/datasets/amaai-lab/melodySim
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    AMAAI Lab
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection

    GitHub | Model | Paper

    The MelodySim dataset contains 1,710 valid synthesized pieces originating from the Slakh2100 dataset, each in 4 different versions (produced through various augmentation settings), with a total duration of 419 hours. This dataset may help research in:

    • Music similarity learning
    • Music plagiarism detection

    Dataset Details

    The MelodySim dataset contains three splits: train… See the full description on the dataset page: https://huggingface.co/datasets/amaai-lab/melodySim.
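
    A hedged loading sketch with the Hugging Face datasets library; the split name "train" comes from the (truncated) description above, and any further configuration is an assumption:

    from datasets import load_dataset  # pip install datasets

    ds = load_dataset("amaai-lab/melodySim", split="train")
    print(ds)  # column names and row count of the train split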

  8. Librispeech Slakh Unmix (LSX)

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 4, 2023
    Cite
    Darius Petermann; Gordon Wichern; Jonathan Le Roux (2023). Librispeech Slakh Unmix (LSX) [Dataset]. http://doi.org/10.5281/zenodo.7765140
    Available download formats
    zip
    Dataset updated
    Apr 4, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Darius Petermann; Gordon Wichern; Jonathan Le Roux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    Librispeech Slakh Unmix (LSX) is a proof-of-concept source separation dataset for training and testing algorithms that separate a monaural audio signal using hyperbolic embeddings for hierarchical separation. The dataset is composed of artificial mixtures using audio from the LibriSpeech (clean subset) and Slakh2100 datasets. The dataset was introduced in our paper Hyperbolic Audio Source Separation.

    At a Glance

    • The size of the unzipped dataset is ~28 GB.
    • Each mixture is 60 s in length and consists of the first 60 s of the bass, drums, and guitar stems of the associated Slakh2100 track.
    • Audio is encoded as 16-bit wav files at a sampling rate of 16 kHz.
    • The data is split into training tr (1390 mixtures), validation cv (348 mixtures), and testing tt (209 mixtures) subsets.
    • The directory for each mixture contains eight wav files (a sanity-check sketch follows this list):
      • mix.wav: the overall mixture from the five child sources
      • music_mix.wav: the music submix containing guitar, bass, and drums
      • speech_mix.wav: the speech submix containing both male and female speech signals
      • bass.wav: original bass submix from the Slakh track
      • drums.wav: original drums submix from the Slakh track
      • guitar.wav: original guitar submix from the Slakh track
      • speech_male.wav: concatenated male speech utterances filling the length of the song
      • speech_female.wav: concatenated female speech utterances filling the length of the song
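
    A minimal check of this hierarchy, assuming the tr/cv/tt directories each hold one folder per mixture (the directory layout below is a placeholder, not documented above):

    import numpy as np
    import soundfile as sf
    from pathlib import Path

    d = Path("lsx/tr/00001")  # hypothetical mixture directory
    mix, sr = sf.read(d / "mix.wav")
    music, _ = sf.read(d / "music_mix.wav")
    speech, _ = sf.read(d / "speech_mix.wav")
    print(sr)  # expected: 16000
    # residual should be near zero if the two submixes sum to the mixture
    print(np.abs(mix - (music + speech)).max())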

    Other Resources

    PyTorch code for training models, along with our hyperbolic separation interface, is available here.

    Citation

    If you use LSX in your research, please cite our paper:

    @InProceedings{Petermann2023ICASSP_hyper,
     author =  {Petermann, Darius and Wichern, Gordon and Subramanian, Aswin and {Le Roux}, Jonathan},
     title =  {Hyperbolic Audio Source Separation},
     booktitle =  {Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
     year =   2023,
     month =  jun
    }

    Copyright and License

    The LSX dataset is released under CC-BY-4.0 license.

    All data:

    Created by Mitsubishi Electric Research Laboratories (MERL), 2022-2023
     
    SPDX-License-Identifier: CC-BY-4.0
