Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction:
The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.
At a Glance:
- Each track directory is named TrackXXXXX, where XXXXX is a number between 00001 and 02100; this number is the ID of the track. Each Track directory contains exactly 1 mixture, a variable number of audio files for each source that made the mixture, and the MIDI files that were used to synthesize each source. The directory structure is shown here.
- All audio is distributed in the .flac format. Scripts to batch convert are here.
- Uncompressed from .flac to .wav, Slakh2100 is almost 500 GB. Please plan accordingly.
- Every track contains at least the instrument categories Piano, Guitar, Drums, and Bass.
- metadata.yaml has detailed information about each source. Details about the metadata are here.
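As a rough sketch of how one might walk this layout in Python (the mix.flac, stems/, MIDI/, and metadata.yaml names are taken from the description above; verify them against your copy of the dataset before relying on this):

```python
from pathlib import Path
import tempfile

def list_track_contents(track_dir: Path) -> dict:
    """Collect the per-track files described above: one mixture,
    per-source audio stems, per-source MIDI, and the metadata file."""
    return {
        "mix": track_dir / "mix.flac",
        "stems": sorted((track_dir / "stems").glob("*.flac")),
        "midi": sorted((track_dir / "MIDI").glob("*.mid")),
        "metadata": track_dir / "metadata.yaml",
    }

# Demo on a synthetic track directory (no dataset download needed).
root = Path(tempfile.mkdtemp())
track = root / "Track00001"
(track / "stems").mkdir(parents=True)
(track / "MIDI").mkdir()
(track / "mix.flac").touch()
for sid in ("S00", "S01"):
    (track / "stems" / f"{sid}.flac").touch()
    (track / "MIDI" / f"{sid}.mid").touch()
(track / "metadata.yaml").touch()

contents = list_track_contents(track)
print(len(contents["stems"]), len(contents["midi"]))  # 2 2
```

The demo builds a throwaway directory so the sketch runs anywhere; pointed at a real TrackXXXXX directory, the same function lists the actual stems and MIDI files.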
Helpful Links:
For more information, see www.slakh.com.
Support code for Slakh: Available here.
Code to render Slakh data: Available in this repo.
See the dataset at a glance, and info about metadata.yaml.
A tiny subset of Slakh2100, called BabySlakh, is also available for prototyping and debugging.
Important Info about Splits:
The original release of Slakh2100 was found to contain many duplicate MIDI files, some of which are present in more than one of the train/test/validation splits. Even though each song is rendered with a random set of synthesizers, the same versions of the songs appear more than once: same MIDI, different audio files. This can be an issue for some uses of Slakh2100, e.g., automatic transcription.
The version of Slakh hosted here on Zenodo contains a directory called omitted, into which the tracks with duplicated MIDI files have been moved. We recommend that you do not use track directories in the omitted directory if you plan on training an automatic transcription system. This version of Slakh is called Slakh2100-redux. For information about how to create other splits from this version, see https://github.com/ethman/slakh-utils/tree/master/splits
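If you need to audit a copy of the dataset yourself, one simple way (a sketch, not an official tool) to flag byte-identical MIDI duplicates is to group files by content hash:

```python
import hashlib
import tempfile
from pathlib import Path

def md5_of(path: Path) -> str:
    """Content hash of a file; identical MIDI bytes hash identically."""
    return hashlib.md5(path.read_bytes()).hexdigest()

def find_duplicate_groups(midi_paths):
    """Return groups of paths whose file contents are byte-identical."""
    groups = {}
    for p in midi_paths:
        groups.setdefault(md5_of(p), []).append(p)
    return [ps for ps in groups.values() if len(ps) > 1]

# Demo with synthetic files: two identical payloads and one unique one.
root = Path(tempfile.mkdtemp())
(root / "a.mid").write_bytes(b"same")
(root / "b.mid").write_bytes(b"same")
(root / "c.mid").write_bytes(b"other")

dups = find_duplicate_groups(sorted(root.glob("*.mid")))
print(len(dups), len(dups[0]))  # 1 2
```

Run over the all_src.mid files of every track, any group with more than one path marks tracks rendered from the same MIDI; note this only catches exact byte duplicates, not near-duplicate MIDI.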
Citing Slakh:
If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:
@inproceedings{manilow2019cutting,
title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
year={2019},
organization={IEEE}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction:
This is a variant of the Slakh2100 dataset (Manilow et al., 2019) resampled to 16 kHz mono 16-bit FLAC. The `omitted` directory of the original dataset has been removed. A license for redistribution is attached. Please see the link below for more details.
The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.
Citing Slakh:
If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:
@inproceedings{manilow2019cutting,
title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
year={2019},
organization={IEEE}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Synthesized Lakh (Slakh) Dataset is a dataset for audio source separation that is synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments. This first release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying MIDI files synthesized using a professional-grade sampling engine. The tracks in Slakh2100 are split into training (1500 tracks), validation (375 tracks), and test (225 tracks) subsets, totaling 145 hours of mixtures.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
DreamyWanderer/Slakh2100-FLAC-Redux-Reduced dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
About this version:
This is a variant of the Slakh2100 dataset (Manilow, 2019) resampled to 16 kHz mono 16-bit WAV format. We redistribute this data as part of the YourMT3 project. The license for redistribution is attached. Note that the 'omitted' directory of the slakh2100-redux version is removed in this version; thus, we have three splits: train, validation, and test.
MIRData integration:
Like the previous version of Slakh, this version will also be integrated into the MIRData (Bittner, 2019) project for convenient use. For this, we provide an index file in JSON format. The code for the customized MIRData is included in our YourMT3 project.
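The exact schema of the index file is not described here; purely as an illustration, a mirdata-style index is essentially a JSON mapping from track IDs to relative file paths (the entries and paths below are hypothetical, and real indexes typically also carry checksums):

```python
import json

# Hypothetical index entries; field names and paths are assumptions
# for illustration, not the actual YourMT3/mirdata index format.
index = {
    "Track00001": {"mix": "train/Track00001/mix.wav",
                   "midi": "train/Track00001/all_src.mid"},
    "Track01501": {"mix": "validation/Track01501/mix.wav",
                   "midi": "validation/Track01501/all_src.mid"},
}

# Serialize deterministically, then load it back the way a loader would.
blob = json.dumps(index, indent=2, sort_keys=True)
loaded = json.loads(blob)
print(len(loaded))  # 2
```

A loader then only needs the dataset root plus this index to resolve every file, which is what makes the mirdata integration convenient.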
Citing YourMT3:
@misc{sungkyun_chang_2022_7470191,
author = {Sungkyun Chang and Simon Dixon and Emmanouil Benetos},
title = {{YourMT3: a toolkit for training multi-task and multi-track music transcription model for everyone}},
month = dec,
year = 2022,
note = {{(Poster) Presented at DMRN+17: Digital Music Research Network One-day Workshop 2022}},
publisher = {Zenodo},
doi = {10.5281/zenodo.7470191},
url = {https://doi.org/10.5281/zenodo.7470191}
}
About Slakh2100:
The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.
Citing Slakh & MIRData:
@inproceedings{manilow2019cutting,
title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
year={2019},
organization={IEEE}
}
@inproceedings{
bittner_fuentes_2019,
title={mirdata: Software for Reproducible Usage of Datasets},
author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
year={2019}
}
Acknowledgement:
We thank the Zenodo team for allowing us additional storage.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
BabySlakh is a tiny version of Slakh2100 (zenodo link) that is useful for debugging. It consists of the first 20 tracks of Slakh2100 (i.e., Track00001 through Track00020). All of the audio is in the wav format and has a sample rate of 16 kHz. BabySlakh is ready to go once it's unzipped.
About Slakh
The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 patches categorized into 34 classes.
Citing Slakh
If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:
@inproceedings{manilow2019cutting,
title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
year={2019},
organization={IEEE}
}
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection
Github | Model | Paper
The MelodySim dataset contains 1,710 valid synthesized pieces originating from the Slakh2100 dataset, each with 4 different versions (through various augmentation settings), for a total duration of 419 hours. This dataset may help research in:
- Music similarity learning
- Music plagiarism detection
Dataset Details
The MelodySim dataset contains three splits: train… See the full description on the dataset page: https://huggingface.co/datasets/amaai-lab/melodySim.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
Librispeech Slakh Unmix (LSX) is a proof-of-concept source separation dataset for training and testing algorithms that separate a monaural audio signal using hyperbolic embeddings for hierarchical separation. The dataset is composed of artificial mixtures using audio from the LibriSpeech (clean subset) and Slakh2100 datasets. The dataset was introduced in our paper Hyperbolic Audio Source Separation.
At a Glance
LSX is split into training tr (1390 mixtures), validation cv (348 mixtures), and testing tt (209 mixtures) subsets. Each mixture directory contains:
- mix.wav: the overall mixture from the five child sources
- music_mix.wav: the music submix containing guitar, bass, and drums
- speech_mix.wav: the speech submix containing both male and female speech signals
- bass.wav: the original bass submix from the Slakh track
- drums.wav: the original drums submix from the Slakh track
- guitar.wav: the original guitar submix from the Slakh track
- speech_male.wav: concatenated male speech utterances filling the length of the song
- speech_female.wav: concatenated female speech utterances filling the length of the song
Other Resources
PyTorch code for training models, along with our hyperbolic separation interface, is available here.
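The submix hierarchy above (mix.wav as the sum of music_mix.wav and speech_mix.wav, each of which sums its own child sources) can be sketched with plain Python lists standing in for audio sample arrays:

```python
def mix(*sources):
    """Sample-wise sum; assumes equal length and a shared sample rate."""
    assert len({len(s) for s in sources}) == 1, "sources must be equal length"
    return [sum(samples) for samples in zip(*sources)]

# Toy 4-sample "signals" standing in for the real .wav files.
bass, drums, guitar = [0.1] * 4, [0.2] * 4, [0.3] * 4
speech_male, speech_female = [0.05] * 4, [0.15] * 4

music_mix = mix(bass, drums, guitar)          # music_mix.wav
speech_mix = mix(speech_male, speech_female)  # speech_mix.wav
overall = mix(music_mix, speech_mix)          # mix.wav (five child sources)

print(round(overall[0], 2))  # 0.8
```

This two-level structure is exactly what hierarchical separation exploits: a model can target the coarse music/speech split or descend to the five leaf sources.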
Citation
If you use LSX in your research, please cite our paper:
@InProceedings{Petermann2023ICASSP_hyper,
author = {Petermann, Darius and Wichern, Gordon and Subramanian, Aswin and {Le Roux}, Jonathan},
title = {Hyperbolic Audio Source Separation},
booktitle = {Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year = 2023,
month = jun
}
Copyright and License
The LSX dataset is released under the CC-BY-4.0 license.
All data:
Created by Mitsubishi Electric Research Laboratories (MERL), 2022-2023
SPDX-License-Identifier: CC-BY-4.0