39 datasets found

MusicNet
zenodo.org
opendatalab.com
application/gzip, csv
Updated Jul 22, 2021
Cite
John Thickstun; Zaid Harchaoui; Sham M. Kakade (2021). MusicNet [Dataset]. http://doi.org/10.5281/zenodo.5120004
Available download formats
application/gzip, csv
Unique identifier
https://doi.org/10.5281/zenodo.5120004
Dataset updated
Jul 22, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
John Thickstun; Zaid Harchaoui; Sham M. Kakade
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
MusicNet is a collection of 330 freely-licensed classical music recordings, together with over 1 million annotated labels indicating the precise time of each note in every recording, the instrument that plays each note, and the note's position in the metrical structure of the composition. The labels are acquired from musical scores aligned to recordings by dynamic time warping. The labels are verified by trained musicians; we estimate a labeling error rate of 4%. We offer the MusicNet labels to the machine learning and music communities as a resource for training models and a common benchmark for comparing results. This dataset was introduced in the paper "Learning Features of Music from Scratch." [1]
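To make the alignment step concrete, here is a minimal sketch of classic dynamic time warping on two short numeric sequences. It is only an illustration of the general technique, not the authors' alignment pipeline; the function name `dtw`, the absolute-difference cost, and the toy inputs are assumptions made for this example.

```python
def dtw(a, b):
    """Classic dynamic time warping between two numeric sequences.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    that aligns a[i] with b[j]. The local cost is the absolute difference,
    which is an illustrative choice, not the one used to build MusicNet.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    # Walk back from the end to recover one optimal alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# A "score" with three notes aligned to a "performance" that holds the middle one.
total, alignment = dtw([1, 2, 3], [1, 2, 2, 3])
print(total, alignment)  # 0.0 [(0, 0), (1, 1), (1, 2), (2, 3)]
```

In a real score-to-audio setting the sequences would be feature frames rather than single numbers, but the recurrence and the backtracking step have the same shape.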

This repository consists of 3 top-level files:

musicnet.tar.gz - This file contains the MusicNet dataset itself, consisting of PCM-encoded audio wave files (.wav) and corresponding CSV-encoded note label files (.csv). The data is organized according to the train/test split described and used in "Invariances and Data Augmentation for Supervised Music Transcription". [2]

musicnet_metadata.csv - This file contains track-level information about recordings contained in MusicNet. The data and label files are named with MusicNet ids, which you can use to cross-index the data and labels with this metadata file.

musicnet_midis.tar.gz - This file contains the reference MIDI files used to construct the MusicNet labels.
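The cross-indexing described above can be sketched with the standard library alone. The column names (`id`, `composer`, `composition`), the directory name `train_labels`, and the sample rows below are assumptions made for illustration; check them against the actual `musicnet_metadata.csv` header and archive layout before relying on them.

```python
import csv
import io
from pathlib import PurePosixPath


def musicnet_id(path):
    """Return the integer MusicNet id encoded in a file name such as '2303.wav'."""
    return int(PurePosixPath(path).stem)


def index_metadata(csv_file):
    """Map MusicNet id -> metadata row, assuming the file has an 'id' column."""
    return {int(row["id"]): row for row in csv.DictReader(csv_file)}


# Placeholder rows standing in for musicnet_metadata.csv (values are made up).
sample_metadata = io.StringIO(
    "id,composer,composition\n"
    "1819,Composer A,Piece A\n"
    "2303,Composer B,Piece B\n"
)
metadata = index_metadata(sample_metadata)

# Hypothetical path inside the extracted archive.
label_path = "train_labels/2303.csv"
row = metadata[musicnet_id(label_path)]
print(row["composer"], "/", row["composition"])  # Composer B / Piece B
```

With the real metadata file, `index_metadata(open("musicnet_metadata.csv", newline=""))` would be the equivalent call, under the same column-name assumption.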

A PyTorch interface for accessing the MusicNet dataset is available on GitHub. For an audio/visual introduction and summary of this dataset, see the MusicNet inspector, created by Jong Wook Kim. The audio recordings in MusicNet consist of Creative Commons licensed and Public Domain performances, sourced from the Isabella Stewart Gardner Museum, the European Archive Foundation, and Musopen. The provenance of specific recordings and midis are described in the metadata file.

[1] Learning Features of Music from Scratch. John Thickstun, Zaid Harchaoui, and Sham M. Kakade. In International Conference on Learning Representations (ICLR), 2017. ArXiv Report.

@inproceedings{thickstun2017learning,
  title={Learning Features of Music from Scratch},
  author = {John Thickstun and Zaid Harchaoui and Sham M. Kakade},
  year={2017},
  booktitle = {International Conference on Learning Representations (ICLR)}
}

[2] Invariances and Data Augmentation for Supervised Music Transcription. John Thickstun, Zaid Harchaoui, Dean P. Foster, and Sham M. Kakade. In International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018. ArXiv Report.

@inproceedings{thickstun2018invariances,
  title={Invariances and Data Augmentation for Supervised Music Transcription},
  author = {John Thickstun and Zaid Harchaoui and Dean P. Foster and Sham M. Kakade},
  year={2018},
  booktitle = {International Conference on Acoustics, Speech, and Signal Processing (ICASSP)}
}
YourMT3 dataset (Part 1)
data.niaid.nih.gov
zenodo.org
Updated Oct 18, 2023
Cite
Benetos, Emmanouil (2023). YourMT3 dataset (Part 1) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7793341
Dataset updated
Oct 18, 2023
Dataset provided by
Dixon, Simon
Benetos, Emmanouil
Sungkyun Chang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
< UNDER CONSTRUCTION >

About this version:

This variant of the MusicNet dataset has been resampled to 16 kHz, mono, 16-bit WAV, which makes it more suitable for audio processing tasks that work at lower sampling rates. We redistribute this data as part of the YourMT3 project. The license for redistribution is attached.
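A quick way to confirm that a file matches the stated format is to read its WAV header. The sketch below uses only Python's standard `wave` module; the helper names `is_16k_mono_16bit` and `make_silent_wav` are hypothetical, and the in-memory silent clip merely stands in for a real recording.

```python
import io
import wave


def is_16k_mono_16bit(wav_file):
    """Return True if the WAV header reports 16 kHz, 1 channel, 2 bytes per sample."""
    with wave.open(wav_file, "rb") as w:
        return (w.getframerate(), w.getnchannels(), w.getsampwidth()) == (16000, 1, 2)


def make_silent_wav(rate, channels, sample_width, n_frames=160):
    """Build a short silent PCM WAV in memory, used here only as test input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(sample_width)
        w.setframerate(rate)
        w.writeframes(b"\x00" * (n_frames * channels * sample_width))
    buf.seek(0)
    return buf


print(is_16k_mono_16bit(make_silent_wav(16000, 1, 2)))  # True
print(is_16k_mono_16bit(make_silent_wav(44100, 2, 2)))  # False
```

The same check accepts a file path in place of the in-memory buffer.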

This version of the dataset also includes several split options, derived from previous work on automatic music transcription, as a Python dictionary (see README.md). The available split options are listed below:

MUSICNET_SPLIT_INFO = {
    'train_mt3': [],  # the first 300 songs are synth dataset, while the remaining 300 songs are acoustic dataset.
    'train_mt3_synth': [],  # Note: this is not the synthetic dataset of EM (MIDI Pop 80K) nor pitch-augmented. Just recording of MusicNet MIDI, split by MT3 author's split. But not sure if they used this (maybe not).
    'train_mt3_acoustic': [],
    'validation_mt3': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],
    'validation_mt3_synth': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],
    'validation_mt3_acoustic': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],
    'test_mt3_acoustic': [1729, 1776, 1813, 1893, 2118, 2186, 2296, 2431, 2432, 2487, 2497, 2501, 2507, 2537, 2621],
    'train_thickstun': [],  # the first 327 songs are synth dataset, while the remaining 327 songs are acoustic dataset.
    'test_thickstun': [1819, 2303, 2382],
    'train_mt3_em': [],  # 293 tracks. MT3 train set - 7 missing tracks[2194, 2211, 2227, 2230, 2292, 2305, 2310], ours
    'validation_mt3_em': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],  # ours
    'test_mt3_em': [1729, 1776, 1813, 1893, 2118, 2186, 2296, 2431, 2432, 2487, 2497, 2501, 2507, 2537, 2621],  # ours
    'train_em_table2': [],  # 317 tracks. Whole set - 7 missing tracks[2194, 2211, 2227, 2230, 2292, 2305, 2310] - 6 test_em
    'test_em_table2': [2191, 2628, 2106, 2298, 1819, 2416],  # strings and winds from Cheuk's split, using EM annotations
    'test_cheuk_table2': [2191, 2628, 2106, 2298, 1819, 2416],  # strings and winds from Cheuk's split, using Thickstun's annotations
}
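The track counts quoted in the split comments can be re-derived from the listed ids. The sketch below is an illustrative consistency check rather than part of the YourMT3 code: it assumes the full set has 330 recordings (the figure given in the original MusicNet description), and the constant and function names are invented for this example.

```python
# Total number of recordings, taken from the original MusicNet description.
TOTAL_RECORDINGS = 330

VALIDATION_MT3 = [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300,
                  2308, 2315, 2336, 2466, 2477, 2504, 2611]
TEST_MT3 = [1729, 1776, 1813, 1893, 2118, 2186, 2296, 2431,
            2432, 2487, 2497, 2501, 2507, 2537, 2621]
TEST_THICKSTUN = [1819, 2303, 2382]
TEST_EM_TABLE2 = [2191, 2628, 2106, 2298, 1819, 2416]
MISSING_EM = [2194, 2211, 2227, 2230, 2292, 2305, 2310]


def derived_train_sizes():
    """Recompute the training-set sizes quoted in the split comments."""
    held_out_mt3 = set(VALIDATION_MT3) | set(TEST_MT3)
    # Validation and test must not share ids, and the tracks missing from
    # the EM labels must not belong to any held-out set.
    assert not set(VALIDATION_MT3) & set(TEST_MT3)
    assert not set(MISSING_EM) & held_out_mt3
    assert not set(MISSING_EM) & set(TEST_EM_TABLE2)
    train_mt3 = TOTAL_RECORDINGS - len(held_out_mt3)
    return {
        "train_mt3": train_mt3,
        "train_mt3_em": train_mt3 - len(MISSING_EM),
        "train_thickstun": TOTAL_RECORDINGS - len(TEST_THICKSTUN),
        "train_em_table2": TOTAL_RECORDINGS - len(MISSING_EM) - len(TEST_EM_TABLE2),
    }


print(derived_train_sizes())
# {'train_mt3': 300, 'train_mt3_em': 293, 'train_thickstun': 327, 'train_em_table2': 317}
```

The results agree with the 300, 293, 327, and 317 figures in the comments, which suggests that the empty lists are placeholders for "every remaining track".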

About MusicNet:

The MusicNet dataset was originally released in 2016 by Thickstun et al. in "Learning Features of Music from Scratch". It is a collection of music recordings annotated with labels for various tasks, such as automatic music transcription, instrument recognition, and genre classification. The original dataset contains 330 recordings, sourced from various public domain recordings of classical music, and is labeled with instrument activations and note-wise annotations.

About MusicNet EM:

MusicNetEM is a set of refined labels for the MusicNet dataset, in the form of MIDI files. The labels are aligned with the recordings, with onset timing within 32 ms. They were created using an EM process similar to the one described in Ben Maman and Amit H. Bermano, "Unaligned Supervision for Automatic Music Transcription in The Wild". Their split (Table 2 of that paper) is derived from another paper, Kin Wai Cheuk et al., "ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data".
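To put the 32 ms figure in context, it can be converted to samples at the 16 kHz rate of this variant. The sketch below also shows a simple tolerance check; reading "within 32 ms" as a maximum absolute deviation is an assumption, and both helper names are made up for this example.

```python
def ms_to_samples(ms, sample_rate):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * sample_rate / 1000)


def onset_within(reference_s, estimate_s, tolerance_ms=32):
    """True if two onset times (in seconds) differ by at most tolerance_ms."""
    return abs(reference_s - estimate_s) * 1000 <= tolerance_ms


print(ms_to_samples(32, 16000))   # 512
print(onset_within(1.000, 1.030))  # True
print(onset_within(1.000, 1.050))  # False
```

At 16 kHz, a 32 ms window therefore spans 512 samples.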

License:

CC-BY-4.0
MusicNet-16k + EM for YourMT3
explore.openaire.eu
zenodo.org
Updated Apr 2, 2023
Cite
Sungkyun Chang; Simon Dixon; Emmanouil Benetos (2023). MusicNet-16k + EM for YourMT3 [Dataset]. http://doi.org/10.5281/zenodo.7928301
Unique identifier
https://doi.org/10.5281/zenodo.7928301
Dataset updated
Apr 2, 2023
Authors
Sungkyun Chang; Simon Dixon; Emmanouil Benetos
Description
< UNDER CONSTRUCTION >

About this version:

This variant of the MusicNet dataset has been resampled to 16 kHz, mono, 16-bit WAV, which makes it more suitable for audio processing tasks that work at lower sampling rates. We redistribute this data as part of the YourMT3 project. The license for redistribution is attached.

This version of the dataset also includes several split options, derived from previous work on automatic music transcription, as a Python dictionary (see README.md). The available split options are listed below:

MUSICNET_SPLIT_INFO = {
    'train_mt3': [],  # the first 300 songs are synth dataset, while the remaining 300 songs are acoustic dataset.
    'train_mt3_synth': [],  # Note: this is not the synthetic dataset of EM (MIDI Pop 80K) nor pitch-augmented. Just recording of MusicNet MIDI, split by MT3 author's split. But not sure if they used this (maybe not).
    'train_mt3_acoustic': [],
    'validation_mt3': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],
    'validation_mt3_synth': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],
    'validation_mt3_acoustic': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],
    'test_mt3_acoustic': [1729, 1776, 1813, 1893, 2118, 2186, 2296, 2431, 2432, 2487, 2497, 2501, 2507, 2537, 2621],
    'train_thickstun': [],  # the first 327 songs are synth dataset, while the remaining 327 songs are acoustic dataset.
    'test_thickstun': [1819, 2303, 2382],
    'train_mt3_em': [],  # 293 tracks. MT3 train set - 7 missing tracks[2194, 2211, 2227, 2230, 2292, 2305, 2310], ours
    'validation_mt3_em': [1733, 1765, 1790, 1818, 2160, 2198, 2289, 2300, 2308, 2315, 2336, 2466, 2477, 2504, 2611],  # ours
    'test_mt3_em': [1729, 1776, 1813, 1893, 2118, 2186, 2296, 2431, 2432, 2487, 2497, 2501, 2507, 2537, 2621],  # ours
    'train_em_table2': [],  # 317 tracks. Whole set - 7 missing tracks[2194, 2211, 2227, 2230, 2292, 2305, 2310] - 6 test_em
    'test_em_table2': [2191, 2628, 2106, 2298, 1819, 2416],  # strings and winds from Cheuk's split, using EM annotations
    'test_cheuk_table2': [2191, 2628, 2106, 2298, 1819, 2416],  # strings and winds from Cheuk's split, using Thickstun's annotations
}

About MusicNet:

The MusicNet dataset was originally released in 2016 by Thickstun et al. in "Learning Features of Music from Scratch". It is a collection of music recordings annotated with labels for various tasks, such as automatic music transcription, instrument recognition, and genre classification. The original dataset contains 330 recordings, sourced from various public domain recordings of classical music, and is labeled with instrument activations and note-wise annotations.

About MusicNet EM:

MusicNetEM is a set of refined labels for the MusicNet dataset, in the form of MIDI files. The labels are aligned with the recordings, with onset timing within 32 ms. They were created using an EM process similar to the one described in Ben Maman and Amit H. Bermano, "Unaligned Supervision for Automatic Music Transcription in The Wild". Their split (Table 2 of that paper) is derived from another paper, Kin Wai Cheuk et al., "ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data".

License:

CC-BY-4.0
musicnet_jukebox_embeddings
huggingface.co
Updated Oct 26, 2024
Cite
Jon Flynn (2024). musicnet_jukebox_embeddings [Dataset]. https://huggingface.co/datasets/jonflynn/musicnet_jukebox_embeddings
Dataset updated
Oct 26, 2024
Authors
Jon Flynn
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Jukebox Embeddings for MusicNet Dataset

Repo with Colab notebook used to extract the embeddings.

Overview

This dataset extends the MusicNet Dataset by providing embeddings for each audio file.

Original MusicNet Dataset

Link to original dataset

Jukebox Embeddings

Embeddings are derived from OpenAI's Jukebox model, following the approach described in Castellon et al. (2021), with some modifications that follow Spotify's Llark paper:

Source: Output of… See the full description on the dataset page: https://huggingface.co/datasets/jonflynn/musicnet_jukebox_embeddings.
MusicNet
huggingface.co
Updated Jun 17, 2025
Cite
Fabricio Rivera (2025). MusicNet [Dataset]. https://huggingface.co/datasets/Ftsos/MusicNet
Dataset updated
Jun 17, 2025
Authors
Fabricio Rivera
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ftsos/MusicNet dataset hosted on Hugging Face and contributed by the HF Datasets community
MusicNet Codex
data.wu.ac.at
api/sparql, html
Updated Oct 10, 2013
Cite
UK Discovery (2013). MusicNet Codex [Dataset]. https://data.wu.ac.at/schema/datahub_io/MTQ3NWJhMGQtYTY0Yy00Mjc4LWFiNmQtOWY0OWJmY2IyZjgw
Available download formats
html, api/sparql
Dataset updated
Oct 10, 2013
Dataset provided by
UK Discovery
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The MusicNet Codex provides canonical linked data references, also known as “minted” URIs, for classical music composers. These URIs consolidate the entries for each composer in recognized musicology reference sources, such as COPAC, RISM, Grove, and the British Library, into standard representative pointers.

Basic biographical data (dates of birth and death) is also available.

Data is available as Linked Data (RDF) in a number of ways (including a dump of the entire dataset) detailed in the documentation. There is also a standard web interface to search and browse the data.
MusicNet midi files
kaggle.com
Updated Apr 5, 2021
Cite
Rohit Singh (2021). MusicNet midi files [Dataset]. https://www.kaggle.com/rohitsingh0210/musicnet-midi-files/discussion
Dataset updated
Apr 5, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Rohit Singh
Description
Dataset

This dataset was created by Rohit Singh

cmd-musicnet-metadata
huggingface.co
Updated May 20, 2025
Cite
cmd-musicnet-metadata [Dataset]. https://huggingface.co/datasets/seungheondoh/cmd-musicnet-metadata
Dataset updated
May 20, 2025
Authors
seungheon.doh
Description
seungheondoh/cmd-musicnet-metadata dataset hosted on Hugging Face and contributed by the HF Datasets community
music-net
huggingface.co
Cite
EPR Labs, music-net [Dataset]. https://huggingface.co/datasets/epr-labs/music-net
Dataset authored and provided by
EPR Labs
Description
epr-labs/music-net dataset hosted on Hugging Face and contributed by the HF Datasets community
Music classification tree MusicNet and the data associated with top levels...
figshare.com
xls
Updated Jun 3, 2023
Cite
Gerardo Febres; Klaus Jaffe (2023). Music classification tree MusicNet and the data associated with top levels of the tree. [Dataset]. http://doi.org/10.1371/journal.pone.0185757.t002
Available download formats
xls
Unique identifier
https://doi.org/10.1371/journal.pone.0185757.t002
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Gerardo Febres; Klaus Jaffe
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Music classification tree MusicNet and the data associated with top levels of the tree.
free-piano-sheet-music.net - Historical whois Lookup
whoisdatacenter.com
csv
Updated Sep 18, 2023
Cite
AllHeart Web Inc (2023). free-piano-sheet-music.net - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/free-piano-sheet-music.net/
Available download formats
csv
Dataset updated
Sep 18, 2023
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jul 28, 2025
Description
Explore the historical Whois records related to free-piano-sheet-music.net (Domain). Get insights into ownership history and changes over time.
Tencent Music's net profit 2016-2024
statista.com
Updated Jun 30, 2025
Cite
Statista (2025). Tencent Music's net profit 2016-2024 [Dataset]. https://www.statista.com/statistics/933990/china-tencent-music-profit/
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
China
Description
Tencent Music Entertainment, the music division of the Chinese tech giant Tencent, has demonstrated sustained profitability in recent years. In 2024, the company reported **** billion yuan in net profit with a robust growth of online music-paying users.
pick-music.net - Historical whois Lookup
whoisdatacenter.com
csv
Cite
AllHeart Web Inc, pick-music.net - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/pick-music.net/
Available download formats
csv
Dataset provided by
AllHeart Web
Authors
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jul 18, 2025
Description
Explore the historical Whois records related to pick-music.net (Domain). Get insights into ownership history and changes over time.
Indigo Books & Music net cash 2018-2022
statista.com
Updated Nov 1, 2024
Cite
Statista (2024). Indigo Books & Music net cash 2018-2022 [Dataset]. https://www.statista.com/statistics/1534219/indigo-books-music-net-cash/
Dataset updated
Nov 1, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Canada
Description
The net cash of Indigo Books & Music, headquartered in Canada, amounted to 77.8 million Canadian dollars in 2022. The reported fiscal year ends on March 27. Compared with the earliest depicted value, from 2018, this is a total increase of approximately 80.59 million Canadian dollars. The trend from 2018 to 2022 shows, however, that this increase was not continuous.
Indigo Books & Music net income 2018-2022
statista.com
Updated Oct 29, 2024
Cite
Statista (2024). Indigo Books & Music net income 2018-2022 [Dataset]. https://www.statista.com/statistics/1534308/indigo-books-music-net-income/
Dataset updated
Oct 29, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Canada
Description
The net income of Indigo Books & Music, headquartered in Canada, amounted to ****** million Canadian dollars in 2022. The reported fiscal year ends on March 27. Compared with the earliest depicted value, from 2018, this is a total decrease of approximately ***** million Canadian dollars. The trend from 2018 to 2022 shows, however, that this decrease was not continuous.
YourMT3 Dataset Dataset
library.toponeai.link
Updated Apr 29, 2025
Cite
Sungkyun Chang; Emmanouil Benetos; Holger Kirchhoff; Simon Dixon (2025). YourMT3 Dataset Dataset [Dataset]. https://library.toponeai.link/dataset/yourmt3-dataset
Dataset updated
Apr 29, 2025
Authors
Sungkyun Chang; Emmanouil Benetos; Holger Kirchhoff; Simon Dixon
Description
We redistribute a suite of datasets as part of the YourMT3 project. The license for redistribution is attached.

YourMT3 Dataset Includes:

Slakh
MusicNet (original and EM)
MAPS (not used for training)
Maestro
GuitarSet
ENST-drums
EGMD
MIR-ST500 (Restricted Access)
CMedia (Restricted Access)
RWC-Pop (Bass and Full) (Restricted Access)
URMP
IDMT-SMT-Bass
royalty-free-classical-music.net - Historical whois Lookup
whoisdatacenter.com
csv
Cite
AllHeart Web Inc, royalty-free-classical-music.net - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/royalty-free-classical-music.net/
Available download formats
csv
Dataset provided by
AllHeart Web
Authors
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jul 31, 2025
Description
Explore the historical Whois records related to royalty-free-classical-music.net (Domain). Get insights into ownership history and changes over time.
athena-music.net - Historical whois Lookup
whoisdatacenter.com
csv
Cite
AllHeart Web Inc, athena-music.net - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/athena-music.net/
Available download formats
csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jul 26, 2025
Description
Explore the historical Whois records related to athena-music.net (Domain). Get insights into ownership history and changes over time.
rs-music.net - Historical whois Lookup
whoisdatacenter.com
csv
Cite
AllHeart Web Inc, rs-music.net - Historical whois Lookup [Dataset]. https://whoisdatacenter.com/domain/rs-music.net/
Available download formats
csv
Dataset authored and provided by
AllHeart Web Inc
License
https://whoisdatacenter.com/terms-of-use/
Time period covered
Mar 15, 1985 - Jun 4, 2025
Description
Explore the historical Whois records related to rs-music.net (Domain). Get insights into ownership history and changes over time.
Net income Source Music 2020-2024
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Net income Source Music 2020-2024 [Dataset]. https://www.statista.com/statistics/1386757/source-music-net-income/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
South Korea
Description
In 2024, South Korean music label Source Music posted a net profit of around **** billion South Korean won. While this represents a decrease from the previous year, it was also the second year of profit. Their only signed artist for the past few years, K-pop girl group GFRIEND, disbanded in May of 2021, leaving the label without a direct income stream for the rest of the year. Having been acquired by HYBE Corporation in 2019, the label debuted a new girl group, LE SSERAFIM, in May of 2022.
