GuitarSet is a dataset of high-quality guitar recordings and rich annotations. It contains 360 excerpts, each 30 seconds in length. The 360 excerpts are the result of the following combinations:
6 players; 2 versions (comping and soloing); 5 styles (Rock, Singer-Songwriter, Bossa Nova, Jazz, and Funk); 3 progressions (12 Bar Blues, Autumn Leaves, and Pachelbel Canon); 2 tempi (slow and fast).
Each excerpt is annotated with 6 pitch contour and MIDI note annotations (one per string), 2 chord annotations (instructed and performed), and beat and tempo annotations.
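The excerpt count follows from the factors listed above: 6 x 2 x 5 x 3 x 2 = 360. A minimal Python sketch that enumerates the combinations is shown here. The label strings are illustrative only and are not meant to reproduce the dataset's actual file-naming scheme.

```python
from itertools import product

# Factor labels are illustrative; they do not reproduce GuitarSet's file names.
players = range(6)
versions = ["comping", "soloing"]
styles = ["Rock", "Singer-Songwriter", "Bossa Nova", "Jazz", "Funk"]
progressions = ["12 Bar Blues", "Autumn Leaves", "Pachelbel Canon"]
tempi = ["slow", "fast"]

# One tuple per excerpt: (player, version, style, progression, tempo).
excerpts = list(product(players, versions, styles, progressions, tempi))
print(len(excerpts))  # 360
```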
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Accompanying website and repository are here.
If you make use of GuitarSet for academic purposes, please cite the following publication:
Q. Xi, R. Bittner, J. Pauwels, X. Ye, and J. P. Bello, "GuitarSet: A Dataset for Guitar Transcription", in 19th International Society for Music Information Retrieval Conference, Paris, France, Sept. 2018.
This project was led by Qingyang Xi at NYU's Music and Audio Research Lab, along with Rachel Bittner, Xuzhou Ye, and Juan Pablo Bello from the same lab, as well as Johan Pauwels at the Centre for Digital Music at Queen Mary University of London.
We present GuitarSet, a dataset that provides high-quality guitar recordings alongside rich annotations and metadata.
In particular, by recording guitars using a hexaphonic pickup, we are able to not only provide recordings of the individual strings but also to largely automate the expensive annotation process, therefore providing rich annotation.
The dataset contains recordings of a variety of musical excerpts played on an acoustic guitar, along with time-aligned annotations of pitch contours, string and fret positions, chords, beats, downbeats, and playing style.
N.B. Known Errors:
Duplicated note in one MIDI annotation (thanks to @maxpv)
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Jukebox Embeddings for the GuitarSet Dataset
Repo with Colab notebook used to extract the embeddings.
Overview
This dataset extends the GuitarSet Dataset by providing embeddings for each audio file.
Original GuitarSet Dataset
Link to official site
GuitarSet is a dataset that provides high-quality guitar recordings alongside rich annotations and metadata. By recording guitars using a hexaphonic pickup, it provides recordings of individual strings and… See the full description on the dataset page: https://huggingface.co/datasets/jonflynn/guitarset_jukebox_embeddings.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This collection contains mel spectrograms and annotations of 16 datasets for beat and downbeat tracking. All datasets have been used in "Beat This! Accurate beat tracking without DBN postprocessing" (Foscarin/Schlüter/Widmer, ISMIR 2024) and prior publications by other authors, but for many of these datasets, audio data is not publicly available. By publishing the spectrograms, we invite other researchers to improve the state of the art in beat and downbeat tracking.
Spectrograms for the following datasets are included in the collection:
If given, links in the above list point to locations for obtaining the original audio.
The corresponding annotations are available at https://github.com/CPJKU/beat_this_annotations. A snapshot of v1.0 is included in this collection as beat_this_annotations.zip, but you may want to use a later release.
Spectrograms are computed from monophonic audio at a sample rate of 22050 Hz with a window size of 1024 and hop size of 441 samples (yielding 50 frames per second), processed with a mel filterbank of 128 bands from 30 Hz to 11 kHz, and magnitudes scaled with ln(1+1000x). They are provided in half-precision floating-point format. Spectrograms can be reproduced with torchaudio 2.3.1 from a 22050 Hz waveform tensor (resampled with soxr.resample(), if needed) via:
melspect = torchaudio.transforms.MelSpectrogram(sample_rate=22050, n_fft=1024, hop_length=441, f_min=30, f_max=11000, n_mels=128, mel_scale='slaney', normalized='frame_length', power=1)(waveform).mul(1000).log1p()
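The figures quoted in this description can be checked with a little arithmetic. The sketch below uses only the Python standard library. The helper names log_scale and frame_to_seconds are hypothetical and not part of the published code, and the time conversion assumes that frame i is centred at i x hop samples, which is an assumption about the framing rather than something stated in the description.

```python
import math

SAMPLE_RATE = 22050  # Hz, as stated in the description
HOP_LENGTH = 441     # samples between consecutive frames

# 22050 / 441 = 50 frames per second.
frames_per_second = SAMPLE_RATE / HOP_LENGTH


def log_scale(magnitude):
    """Apply ln(1 + 1000 * x), the scaling done by .mul(1000).log1p()."""
    return math.log1p(1000.0 * magnitude)


def frame_to_seconds(frame_index):
    """Convert a frame index to seconds (assumes frame i sits at i * hop)."""
    return frame_index * HOP_LENGTH / SAMPLE_RATE


print(frames_per_second)      # 50.0
print(frame_to_seconds(50))   # 1.0
```

A beat annotated at t seconds would, under the same assumption, fall near frame round(t * 50).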
For each dataset, a compressed .zip file is provided, which in turn holds an uncompressed .npz file. The .npz file holds a set of numpy arrays in subdirectories named after the annotations. Each subdirectory contains a spectrogram of the original audio file ("track.npy"), 11 pitch-shifted versions from -5 to +6 semitones ("track_ps-5.npy" to "track_ps6.npy") and 10 time-stretched versions from -20% to +20% ("track_ts-20.npy" to "track_ts20.npy"), except for gtzan.npz, which is designated for testing and only holds the original audio files. The .npz files can be loaded in numpy via np.load(), or unzipped into a set of .npy files that can again be loaded via np.load(). We also provide code to load .npz files as memory maps for more efficiency.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
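The file naming described above can be sketched as a small generator. This is a hedged illustration, not the authors' code: the function name augmentation_names is hypothetical, and two details are assumptions inferred from the stated counts, namely that the 11 pitch shifts are the integers from -5 to +6 without 0, and that the 10 time stretches run in 4% steps from -20 to +20 without 0.

```python
def augmentation_names(stem="track"):
    """List the .npy names expected in one subdirectory (hypothetical helper)."""
    # Assumption: integer semitone shifts -5..+6, skipping 0 (11 values).
    pitch_shifts = [s for s in range(-5, 7) if s != 0]
    # Assumption: 4% steps from -20 to +20, skipping 0 (10 values).
    time_stretches = [t for t in range(-20, 21, 4) if t != 0]

    names = [f"{stem}.npy"]  # spectrogram of the original audio
    names += [f"{stem}_ps{s}.npy" for s in pitch_shifts]
    names += [f"{stem}_ts{t}.npy" for t in time_stretches]
    return names


names = augmentation_names()
print(len(names))  # 22
```

The resulting list can be compared against the contents of an unzipped subdirectory to confirm whether the assumed step sizes match the published files.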
License information was derived automatically
A dataset of guitar images with the necks annotated. The images are similar to what would be expected from a video guitar tutorial.
We redistribute a suite of datasets as part of the YourMT3 project. The license for redistribution is attached.
YourMT3 Dataset Includes:
Slakh; MusicNet (original and EM); MAPS (not used for training); Maestro; GuitarSet; ENST-drums; EGMD; MIR-ST500, restricted access; CMedia, restricted access; RWC-Pop (Bass and Full), restricted access; URMP; IDMT-SMT-Bass