8 datasets found
  1. Slakh2100

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Mar 15, 2021
    Cite
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux (2021). Slakh2100 [Dataset]. http://doi.org/10.5281/zenodo.4599666
    68 scholarly articles cite this dataset (view in Google Scholar)
    Available download formats
    application/gzip
    Dataset updated
    Mar 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction:

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

    At a Glance:

    • The dataset comes as a series of directories named like TrackXXXXX, where XXXXX is a number between 00001 and 02100; this number is the ID of the track. Each Track directory contains exactly 1 mixture, a variable number of audio files for the sources that make up the mixture, and the MIDI files used to synthesize each source. The directory structure is shown here (a walk-through sketch also follows this list).
    • All audio in Slakh2100 is distributed in the .flac format. Scripts to batch convert are here.
    • All audio is mono and was rendered at 44.1 kHz, 16-bit (CD quality) before being converted to .flac.
    • Slakh2100 is a 105 GB download. Unzipped and converted to .wav, Slakh2100 is almost 500 GB. Please plan accordingly.
    • Each mixture has a variable number of sources, with a minimum of 4 sources per mix.
    • Every mix has at least 1 instance of each of the following instrument types: Piano, Guitar, Drums, Bass.
    • metadata.yaml has detailed information about each source. Details about the metadata are here.
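
    As a rough illustration of this layout, here is a minimal Python sketch that walks the Track directories and reads each metadata.yaml. The per-track file and folder names (mix.flac, stems/, MIDI/) and the local root path are assumptions based on the description above, not an authoritative schema:

    import pathlib
    import yaml  # pip install pyyaml

    root = pathlib.Path("slakh2100_flac_redux/train")  # hypothetical local path
    for track_dir in sorted(root.glob("Track*")):
        # metadata.yaml is documented above; its internal keys are not assumed here
        meta = yaml.safe_load((track_dir / "metadata.yaml").read_text())
        stems = sorted((track_dir / "stems").glob("*.flac"))  # per-source audio
        midi = sorted((track_dir / "MIDI").glob("*.mid"))     # per-source MIDI
        print(track_dir.name, len(stems), "stems,", len(midi),
              "MIDI files,", len(meta), "metadata keys")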

    Helpful Links:

    For more information, see www.slakh.com.

    Support code for Slakh: Available here.

    Code to render Slakh data: Available in this repo.

    See the dataset at a glance, and info about metadata.yaml.

    A tiny subset of Slakh2100, called BabySlakh, is also available for prototyping and debugging.

    Important Info about Splits:

    The original release of Slakh2100 was found to have many duplicate MIDI files, some of which are present in more than one of the train/test/validation splits. Even though each song is rendered with a random set of synthesizers, the same versions of the songs appear more than once: same MIDI, different audio files. This can be an issue for some uses of Slakh2100, e.g., automatic transcription.

    The version of Slakh hosted here on Zenodo contains a directory called omitted, into which the tracks with duplicated MIDI files have been moved. We recommend that you do not use track directories in the omitted directory if you plan on training an automatic transcription system. This version of Slakh is called Slakh2100-redux. For information about how to create other splits from this version, see https://github.com/ethman/slakh-utils/tree/master/splits
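
    In practice, honoring this recommendation just means never globbing the omitted directory. A minimal sketch, assuming the redux archive places train/validation/test and omitted side by side at the top level (directory names are assumptions based on the description above):

    import pathlib

    root = pathlib.Path("slakh2100_flac_redux")   # hypothetical local path
    keep_splits = ["train", "validation", "test"]  # skip "omitted" entirely
    tracks = [t for split in keep_splits
              for t in sorted((root / split).glob("Track*"))]
    print(len(tracks), "tracks collected; omitted directory excluded")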

    Citing Slakh:

    If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }

  2. Slakh2100-16k

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Mar 13, 2023
    Cite
    Sungkyun Chang (2023). Slakh2100-16k [Dataset]. http://doi.org/10.5281/zenodo.7708270
    Available download formats
    application/gzip
    Dataset updated
    Mar 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sungkyun Chang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction:

    This is a variant of the Slakh2100 dataset (Manilow et al., 2019), resampled to 16 kHz mono 16-bit FLAC. The `omitted` directory of the original dataset has been removed. A license for redistribution is attached. Please see the link below for more details.
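
    The target format is straightforward to reproduce from the original 44.1 kHz FLAC files. A minimal sketch using librosa and soundfile (one possible tool choice, not necessarily what the dataset author used; paths are placeholders):

    import librosa        # pip install librosa
    import soundfile as sf

    # resample to 16 kHz and downmix to mono in one call
    y, sr = librosa.load("Track00001/mix.flac", sr=16000, mono=True)
    # write 16 kHz mono 16-bit FLAC, matching this variant's format
    sf.write("mix_16k.flac", y, sr, subtype="PCM_16")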

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

    Citing Slakh:

    If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }
  3. Slakh2100 (Synthesized Lakh Dataset)

    • opendatalab.com
    zip
    Updated Oct 20, 2019
    Cite
    Mitsubishi Electric Research Laboratories (2019). Slakh2100 (Synthesized Lakh Dataset) [Dataset]. https://opendatalab.com/OpenDataLab/Slakh2100
    Available download formats
    zip (107,879,377,011 bytes, ~108 GB)
    Dataset updated
    Oct 20, 2019
    Dataset provided by
    Mitsubishi Electric Research Laboratories (http://www.merl.com/)
    Interactive Audio Lab at Northwestern University
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Synthesized Lakh (Slakh) Dataset is a dataset for audio source separation that is synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments. This first release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying MIDI files synthesized using a professional-grade sampling engine. The tracks in Slakh2100 are split into training (1500 tracks), validation (375 tracks), and test (225 tracks) subsets, totaling 145 hours of mixtures.

  4. Slakh2100-FLAC-Redux-Reduced

    • huggingface.co
    Updated Feb 17, 2024
    Cite
    Nguyễn Thế Hoàng (2024). Slakh2100-FLAC-Redux-Reduced [Dataset]. https://huggingface.co/datasets/DreamyWanderer/Slakh2100-FLAC-Redux-Reduced
    Available download formats
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 17, 2024
    Authors
    Nguyễn Thế Hoàng
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The DreamyWanderer/Slakh2100-FLAC-Redux-Reduced dataset is hosted on Hugging Face and was contributed by the HF Datasets community.
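
    Since the record above gives no loading instructions, a hedged way to fetch the raw files is to mirror the dataset repo with huggingface_hub; whether the repo also works with datasets.load_dataset depends on how it is packaged, which the stub description does not say:

    from huggingface_hub import snapshot_download  # pip install huggingface_hub

    local_dir = snapshot_download(
        repo_id="DreamyWanderer/Slakh2100-FLAC-Redux-Reduced",
        repo_type="dataset",  # dataset repos need an explicit repo_type
    )
    print(local_dir)  # path of the local mirror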

  5. Slakh2100-16k for YourMT3

    • zenodo.org
    application/gzip, bin +1
    Updated Mar 15, 2023
    Cite
    Sungkyun Chang; Simon Dixon; Emmanouil Benetos (2023). Slakh2100-16k for YourMT3 [Dataset]. http://doi.org/10.5281/zenodo.7717249
    Available download formats
    bin, json, application/gzip
    Dataset updated
    Mar 15, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sungkyun Chang; Simon Dixon; Emmanouil Benetos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    About this version:

    This is a variant of the Slakh2100 dataset (Manilow, 2019), resampled to 16 kHz mono 16-bit WAV. We redistribute this data as part of the YourMT3 project; the license for redistribution is attached. Note that the 'omitted' directory of the Slakh2100-redux version has been removed in this version, leaving three splits: train, validation, and test.

    MIRData integration:

    Like the previous version of Slakh, this version will also be integrated into the MIRData (Bittner, 2019) project for convenient use. For this, we provide an index file in JSON format. The code for the customized MIRData is included in our YourMT3 project.
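
    For orientation, a minimal sketch of the stock mirdata workflow follows. The "slakh" loader name exists in standard mirdata, but pointing it at this 16 kHz variant requires the customized JSON index from the YourMT3 repo, as described above; treat this as an assumption-laden sketch rather than the project's own loading code:

    import mirdata  # pip install mirdata

    slakh = mirdata.initialize("slakh")
    slakh.validate()              # verify local files against the index
    tracks = slakh.load_tracks()  # dict mapping track_id -> Track object
    print(len(tracks), "tracks loaded")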

    Citing YourMT3:

    @misc{sungkyun_chang_2022_7470191,
     author = {Sungkyun Chang and Simon Dixon and Emmanouil Benetos},
     title = {{YourMT3: a toolkit for training multi-task and multi-track music transcription model for everyone}},
     month = dec,
     year = 2022,
     note = {{(Poster) Presented at DMRN+17: Digital Music Research Network One-day Workshop 2022}},
     publisher = {Zenodo},
     doi = {10.5281/zenodo.7470191},
     url = {https://doi.org/10.5281/zenodo.7470191}
    }

    About Slakh2100:

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 instrument patches categorized into 34 classes, totaling 145 hours of mixture data.

    Citing Slakh & MIRData:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }
    @inproceedings{bittner_fuentes_2019,
     title={mirdata: Software for Reproducible Usage of Datasets},
     author={Bittner, Rachel M and Fuentes, Magdalena and Rubinstein, David and Jansson, Andreas and Choi, Keunwoo and Kell, Thor},
     booktitle={International Society for Music Information Retrieval (ISMIR) Conference},
     year={2019}
    }

    Acknowledgement:

    We thank the Zenodo team for allowing us additional storage.

  6. BabySlakh

    • zenodo.org
    zip
    Updated Mar 15, 2021
    Cite
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux (2021). BabySlakh [Dataset]. http://doi.org/10.5281/zenodo.4603844
    Available download formats
    zip
    Dataset updated
    Mar 15, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ethan Manilow; Gordon Wichern; Prem Seetharaman; Jonathan Le Roux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    BabySlakh is a tiny version of Slakh2100 (see the Zenodo record above) that is useful for debugging. It consists of the first 20 tracks of Slakh2100 (i.e., Track00001 through Track00020). All of the audio is in the .wav format and has a sample rate of 16 kHz. BabySlakh is ready to go once it's unzipped.
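
    A quick sanity check after unzipping might look like this; the archive folder name and the mix.wav layout are assumptions based on the Slakh conventions described above:

    import soundfile as sf  # pip install soundfile

    y, sr = sf.read("babyslakh_16k/Track00001/mix.wav")
    assert sr == 16000, "BabySlakh audio is documented as 16 kHz"
    print(f"{len(y) / sr:.1f} s of audio at {sr} Hz")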

    About Slakh

    The Synthesized Lakh (Slakh) Dataset is a dataset of multi-track audio and aligned MIDI for music source separation and multi-instrument automatic transcription. Individual MIDI tracks are synthesized from the Lakh MIDI Dataset v0.1 using professional-grade sample-based virtual instruments, and the resulting audio is mixed together to make musical mixtures. This release of Slakh, called Slakh2100, contains 2100 automatically mixed tracks and accompanying, aligned MIDI files, synthesized from 187 patches categorized into 34 classes.

    Citing Slakh

    If you use Slakh2100 or generate data using the same method, we ask that you cite it using the following BibTeX entry:

    @inproceedings{manilow2019cutting,
     title={Cutting Music Source Separation Some {Slakh}: A Dataset to Study the Impact of Training Data Quality and Quantity},
     author={Manilow, Ethan and Wichern, Gordon and Seetharaman, Prem and Le Roux, Jonathan},
     booktitle={Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
     year={2019},
     organization={IEEE}
    }

  7. melodySim

    • huggingface.co
    Updated Jun 2, 2025
    Cite
    AMAAI Lab (2025). melodySim [Dataset]. https://huggingface.co/datasets/amaai-lab/melodySim
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    AMAAI Lab
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    MelodySim: Measuring Melody-aware Music Similarity for Plagiarism Detection

    GitHub | Model | Paper

    The MelodySim dataset contains 1,710 valid synthesized pieces originating from the Slakh2100 dataset, each in 4 different versions (produced through various augmentation settings), with a total duration of 419 hours. This dataset may help research in:

    • Music similarity learning
    • Music plagiarism detection

    Dataset Details

    The MelodySim dataset contains three splits: train… See the full description on the dataset page: https://huggingface.co/datasets/amaai-lab/melodySim.
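
    A hedged loading sketch with the Hugging Face datasets library; the split name "train" comes from the (truncated) description above, and any further configuration is an assumption:

    from datasets import load_dataset  # pip install datasets

    ds = load_dataset("amaai-lab/melodySim", split="train")
    print(ds)  # column names and row count of the train split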

  8. Librispeech Slakh Unmix (LSX)

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 4, 2023
    Cite
    Darius Petermann; Gordon Wichern; Jonathan Le Roux (2023). Librispeech Slakh Unmix (LSX) [Dataset]. http://doi.org/10.5281/zenodo.7765140
    Available download formats
    zip
    Dataset updated
    Apr 4, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Darius Petermann; Gordon Wichern; Jonathan Le Roux
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    Librispeech Slakh Unmix (LSX) is a proof-of-concept source separation dataset for training and testing algorithms that separate a monaural audio signal using hyperbolic embeddings for hierarchical separation. The dataset is composed of artificial mixtures using audio from the LibriSpeech (clean subset) and Slakh2100 datasets. The dataset was introduced in our paper Hyperbolic Audio Source Separation.

    At a Glance

    • The size of the unzipped dataset is ~28 GB.
    • Each mixture is 60 s in length and consists of the first 60 s of the bass, drums, and guitar stems of the associated Slakh2100 track.
    • Audio is encoded as 16-bit wav files at a sampling rate of 16 kHz.
    • The data is split into training tr (1390 mixtures), validation cv (348 mixtures), and testing tt (209 mixtures) subsets.
    • The directory for each mixture contains eight wav files (a sanity-check sketch follows this list):
      • mix.wav: the overall mixture from the five child sources
      • music_mix.wav: the music submix containing guitar, bass, and drums
      • speech_mix.wav: the speech submix containing both male and female speech signals
      • bass.wav: original bass submix from the Slakh track
      • drums.wav: original drums submix from the Slakh track
      • guitar.wav: original guitar submix from the Slakh track
      • speech_male.wav: concatenated male speech utterances filling the length of the song
      • speech_female.wav: concatenated female speech utterances filling the length of the song
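
    A minimal check of this hierarchy, assuming the tr/cv/tt directories each hold one folder per mixture (the directory layout below is a placeholder, not documented above):

    import numpy as np
    import soundfile as sf
    from pathlib import Path

    d = Path("lsx/tr/00001")  # hypothetical mixture directory
    mix, sr = sf.read(d / "mix.wav")
    music, _ = sf.read(d / "music_mix.wav")
    speech, _ = sf.read(d / "speech_mix.wav")
    print(sr)  # expected: 16000
    # residual should be near zero if the two submixes sum to the mixture
    print(np.abs(mix - (music + speech)).max())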

    Other Resources

    PyTorch code for training models, along with our hyperbolic separation interface, is available here.

    Citation

    If you use LSX in your research, please cite our paper:

    @InProceedings{Petermann2023ICASSP_hyper,
     author =  {Petermann, Darius and Wichern, Gordon and Subramanian, Aswin and {Le Roux}, Jonathan},
     title =  {Hyperbolic Audio Source Separation},
     booktitle =  {Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
     year =   2023,
     month =  jun
    }

    Copyright and License

    The LSX dataset is released under CC-BY-4.0 license.

    All data:

    Created by Mitsubishi Electric Research Laboratories (MERL), 2022-2023
     
    SPDX-License-Identifier: CC-BY-4.0
