66 datasets found
  1. song-describer-dataset

    • huggingface.co
    • data.niaid.nih.gov
    Updated Feb 2, 2024
    Cite
    Renumics (2024). song-describer-dataset [Dataset]. https://huggingface.co/datasets/renumics/song-describer-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 2, 2024
    Dataset authored and provided by
    Renumics
    Description

    This is a mirror of the example dataset from the paper "The Song Describer Dataset: a Corpus of Audio Captions for Music-and-Language Evaluation" by Manco et al. Project page on GitHub: https://github.com/mulab-mir/song-describer-dataset. Dataset on Zenodo: https://zenodo.org/records/10072001. Explore the dataset on your local machine:

        import datasets
        from renumics import spotlight

        ds = datasets.load_dataset('renumics/song-describer-dataset')
        spotlight.show(ds)

  2. Song Describer Dataset

    • paperswithcode.com
    Updated Oct 16, 2024
    Cite
    Ilaria Manco; Benno Weck; Seungheon Doh; Minz Won; Yixiao Zhang; Dmitry Bogdanov; Yusong Wu; Ke Chen; Philip Tovstogan; Emmanouil Benetos; Elio Quinton; György Fazekas; Juhan Nam (2024). Song Describer Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/song-describer-dataset
    Explore at:
    Dataset updated
    Oct 16, 2024
    Authors
    Ilaria Manco; Benno Weck; Seungheon Doh; Minz Won; Yixiao Zhang; Dmitry Bogdanov; Yusong Wu; Ke Chen; Philip Tovstogan; Emmanouil Benetos; Elio Quinton; György Fazekas; Juhan Nam
    Description

    The Song Describer Dataset (SDD) contains ~1.1k captions for 706 permissively licensed music recordings. It is designed for use in evaluation of models that address music-and-language (M&L) tasks such as music captioning, text-to-music generation and music-language retrieval.

  3. muchomusic

    • huggingface.co
    Updated Oct 16, 2024
    + more versions
    Cite
    LMMs-Lab (2024). muchomusic [Dataset]. https://huggingface.co/datasets/lmms-lab/muchomusic
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Oct 16, 2024
    Dataset authored and provided by
    LMMs-Lab
    Description

    Dataset Summary

    MuChoMusic is a benchmark designed to evaluate music understanding in multimodal audio-language models (Audio LLMs). The dataset comprises 1,187 multiple-choice questions created from 644 music tracks, sourced from two publicly available music datasets: MusicCaps and the Song Describer Dataset (SDD). The questions test knowledge and reasoning abilities across dimensions such as music theory, cultural context, and functional applications. All questions and answers have been… See the full description on the dataset page: https://huggingface.co/datasets/lmms-lab/muchomusic.
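
    For a quick look, the benchmark can be pulled with the Hugging Face datasets library (a minimal sketch; the split name "test" is an assumption and may differ on the dataset page):

        import datasets

        # Load MuChoMusic from the Hugging Face Hub; the split name is assumed.
        ds = datasets.load_dataset("lmms-lab/muchomusic", split="test")
        print(ds[0])  # one multiple-choice question with its candidate answers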

  4. Spotify and Youtube

    • data.niaid.nih.gov
    Updated Dec 4, 2023
    Cite
    Guarisco, Marco (2023). Spotify and Youtube [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10253414
    Explore at:
    Dataset updated
    Dec 4, 2023
    Dataset provided by
    Sallustio, Marco
    Guarisco, Marco
    Rastelli, Salvatore
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    These are the statistics for the top 10 songs of various Spotify artists and their YouTube videos. The creators listed above generated the data and uploaded it to Kaggle on February 6-7, 2023. The license for this data is "CC0: Public Domain", allowing the data to be copied, modified, distributed, and worked on without having to ask permission. The data is in numerical and textual CSV format as attached. This dataset contains the statistics and attributes of the top 10 songs of various artists in the world. As described by the creators, it includes 26 variables for each of the songs collected from Spotify. These variables are briefly described next:

    Track: name of the song, as visible on the Spotify platform.
    Artist: name of the artist.
    Url_spotify: the URL of the artist.
    Album: the album in which the song is contained on Spotify.
    Album_type: indicates whether the song is released on Spotify as a single or contained in an album.
    Uri: a Spotify link used to find the song through the API.
    Danceability: describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
    Energy: a measure from 0.0 to 1.0 representing a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
    Key: the key the track is in. Integers map to pitches using standard Pitch Class notation. E.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
    Loudness: the overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
    Speechiness: detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
    Acousticness: a confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
    Instrumentalness: predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
    Liveness: detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides a strong likelihood that the track is live.
    Valence: a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
    Tempo: the overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
    Duration_ms: the duration of the track in milliseconds.
    Stream: number of streams of the song on Spotify.
    Url_youtube: URL of the video linked to the song on YouTube, if it has any.
    Title: title of the video clip on YouTube.
    Channel: name of the channel that published the video.
    Views: number of views.
    Likes: number of likes.
    Comments: number of comments.
    Description: description of the video on YouTube.
    Licensed: indicates whether the video represents licensed content, meaning the content was uploaded to a channel linked to a YouTube content partner and then claimed by that partner.
    official_video: boolean value indicating whether the video found is the official video of the song.

    The data was last updated on February 7, 2023.
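
    A quick exploration with pandas might look like this (a minimal sketch; the CSV filename "Spotify_Youtube.csv" is an assumption about the Kaggle download):

        import pandas as pd

        # Hypothetical filename; adjust to the CSV actually downloaded from Kaggle.
        df = pd.read_csv("Spotify_Youtube.csv")

        # Compare average audio features of the 100 most-streamed tracks.
        top = df.nlargest(100, "Stream")
        print(top[["Danceability", "Energy", "Valence"]].mean())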

  5. The Digitised Dataset of Slovenian Folk Song Ballads

    • entrepot.recherche.data.gouv.fr
    zip
    Updated Dec 20, 2024
    Cite
    Vanessa Nina Borsan; Mojca Kovačič; Mathieu Giraud; Marjeta Pisk; Matevž Pesek; Matija Marolt (2024). The Digitised Dataset of Slovenian Folk Song Ballads [Dataset]. http://doi.org/10.57745/SINZFK
    Explore at:
    Available download formats: zip (23346092 bytes)
    Dataset updated
    Dec 20, 2024
    Dataset provided by
    Recherche Data Gouv
    Authors
    Vanessa Nina Borsan; Mojca Kovačič; Mathieu Giraud; Marjeta Pisk; Matevž Pesek; Matija Marolt
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.57745/SINZFK

    Time period covered
    1819 - 1995
    Area covered
    Slovenia
    Description

    Slovenian Folk Song Ballads

    404 Slovenian folk song ballads with family themes, collected between the years 1819 and 1995. This dataset is an archive of the "Slovenian Folk Song Ballads" corpus (scores, measure maps, analyses, recordings, synchronizations, metadata). It provides both raw data and data for integration with the Dezrann music web platform: https://www.dezrann.net/explore/slovenian-folk-song-ballads.

    Family songs are folk song ballads, i.e., songs with relatively short but repetitive melodic sections through which the singer narrates a longer story in the lyrics. The melody and lyrics are known not to be fully dependent on each other: the same melody can be adapted to other lyrics and vice versa. Thematically, they fall into the category of songs about family destinies, including several motifs from Slovenian, Slavic, and broader European themes. The content often focuses on describing the natural course of family life, from courtship to childbirth. The themes and motifs of family ballads (feudal and rural environments, the time of Turkish invasions, pilgrimages, etc.) revolve around socio-legal relations and both immediate and broader family matters in the historical periods from which individual ballads originate.

    The collection contains transcribed field material collected by Slovenian ethnologists, folklorists, ethnomusicologists, and various colleagues of Glasbenonarodopisni inštitut ZRC SAZU, spanning the years 1819 to 1995. Categorized thematically as family ballads, this collection features 404 folk songs and includes the initial verse of the lyrics, extensive metadata, and musical analysis encompassing contours, harmony, and song structure (melody and lyrics) (see (Borsan et al., 2023) and (Borsan et al., under submission)). 23 of these songs have historical recordings.

    License: CC-BY-NC-SA-4.0
    Maintainers: Vanessa Nina Borsan vanessa@algomus.fr, Mathieu Giraud mathieu@algomus.fr
    References: (Borsan et al., submitted), https://dx.doi.org/10.57745/SINZFK, https://www.algomus.fr/data

    Dataset content:
    - slovenian-folk-song-ballads.json: main archive catalog, with metadata, as described on https://doc.dezrann.net/metadata
    - score/: scores
    - measure-map/: measure maps as defined by https://doi.org/10.1145/3625135.3625136
    - analysis/: analyses, in .dez format, as described on https://doc.dezrann.net/dez-format
    - audio/: audio recordings
    - synchro/: synchronizations between musical time and audio time, as described on https://doc.dezrann.net/synchro
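
    The main JSON catalog can be inspected with the Python standard library (a minimal sketch; the catalog structure is assumed to follow the Dezrann metadata conventions linked above):

        import json

        # Load the main archive catalog shipped with the dataset.
        with open("slovenian-folk-song-ballads.json", encoding="utf-8") as f:
            catalog = json.load(f)

        # Peek at the structure before iterating over the 404 ballads.
        if isinstance(catalog, dict):
            print(list(catalog)[:10])
        else:
            print(len(catalog), "entries")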

  6. ‘Song Popularity Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Song Popularity Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-song-popularity-dataset-71e4/a729ea15/
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Song Popularity Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yasserh/song-popularity-dataset on 28 January 2022.

    --- Dataset description provided by original source is as follows ---


    Description:

    Humans have long associated themselves with songs and music. Music can improve mood, decrease pain and anxiety, and facilitate opportunities for emotional expression. Research suggests that music can benefit our physical and mental health in numerous ways.

    Lately, multiple studies have been carried out to understand songs and their popularity based on certain factors. Song samples are broken down and their parameters recorded in tabular form. The main aim is to predict song popularity.

    The project is simple yet challenging: predict song popularity based on energy, acousticness, instrumentalness, liveness, danceability, etc. The dataset is large, and its complexity arises from strong multicollinearity. Can you overcome these obstacles and build a decent predictive model?

    Acknowledgement:

    The dataset is referred from Kaggle.

    Objective:

    • Understand the Dataset & cleanup (if required).
    • Build Regression models to predict the song popularity.
    • Evaluate the models and compare their respective scores (R2, RMSE, etc.); a starter modeling sketch follows below.
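
    A starter sketch along these lines (assuming the Kaggle CSV is saved as "song_data.csv" with a "song_popularity" target column; both names are assumptions):

        import numpy as np
        import pandas as pd
        from sklearn.linear_model import Ridge
        from sklearn.metrics import mean_squared_error, r2_score
        from sklearn.model_selection import train_test_split

        # Hypothetical filename and column names; adjust to the actual CSV.
        df = pd.read_csv("song_data.csv")
        y = df["song_popularity"]
        X = df.drop(columns=["song_popularity"]).select_dtypes("number")

        X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

        # Ridge regression mitigates the strong multicollinearity noted above.
        model = Ridge(alpha=1.0).fit(X_train, y_train)
        pred = model.predict(X_test)
        print("R2:", r2_score(y_test, pred))
        print("RMSE:", np.sqrt(mean_squared_error(y_test, pred)))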

    --- Original source retains full ownership of the source dataset ---

  7. autonlp-data-song-lyrics

    • huggingface.co
    Updated Mar 1, 2022
    + more versions
    Cite
    Julien Simon (2022). autonlp-data-song-lyrics [Dataset]. https://huggingface.co/datasets/juliensimon/autonlp-data-song-lyrics
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 1, 2022
    Authors
    Julien Simon
    Description

    AutoNLP Dataset for project: song-lyrics

    Table of contents

        Dataset Description
            Languages
        Dataset Structure
            Data Instances
            Data Fields
            Data Splits

    Dataset Description
    

    This dataset has been automatically processed by AutoNLP for project song-lyrics.

      Languages
    

    The BCP-47 code for the dataset's language is en.

    Dataset Structure

    Data Instances
    

    A sample from this dataset looks as follows: [ {… See the full description on the dataset page: https://huggingface.co/datasets/juliensimon/autonlp-data-song-lyrics.

  8. One Direction All Songs With Lyrics

    • kaggle.com
    Updated Nov 25, 2024
    Cite
    Saksham Nanda (2024). One Direction All Songs With Lyrics [Dataset]. https://www.kaggle.com/datasets/mllion/one-direction-all-songs-with-lyrics/discussion
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    Kaggle
    Authors
    Saksham Nanda
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    In Loving Memory 💐 of Liam Payne (1993-2024)


    Dataset Description:

    1. S.No.
       - Description: Serial number representing the index of each song entry in the dataset.
       - Usability: Acts as a unique identifier for each row but does not provide analytical value for exploration or modeling.
    2. Song
       - Description: Name of the song by One Direction.
       - Usability: Essential for song-level analysis, such as identifying trends or filtering data by specific songs.
    3. Artist(s)
       - Description: The performing artist(s) for the song.
       - Usability: Useful for verifying the artist's contributions if the dataset expands to include collaborations with other artists. Can be used for grouping or filtering data.
    4. Writer(s)
       - Description: Names of the writers who contributed to the song.
       - Usability: Provides insight into creative contributors, allowing for analysis of recurring writers or studying patterns in songwriting styles.
    5. Album(s)
       - Description: Album(s) in which the song was included.
       - Usability: Useful for album-level aggregation, release patterns, or analyzing thematic or stylistic evolution across albums.
    6. Year
       - Description: The year the song was released.
       - Usability: Critical for temporal analysis, such as studying trends over time, release frequency, or the evolution of lyrics and themes.
    7. Lyrics
       - Description: Full text of the song's lyrics.
       - Usability: Valuable for natural language processing tasks like sentiment analysis, thematic exploration, keyword extraction, or lyrical comparison across songs (a starter sketch follows this list).
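
    For example, keyword extraction over the Lyrics column might start like this (a minimal sketch; the CSV filename and exact column names are assumptions):

        import numpy as np
        import pandas as pd
        from sklearn.feature_extraction.text import TfidfVectorizer

        # Hypothetical filename; adjust to the actual Kaggle CSV.
        df = pd.read_csv("one_direction_songs.csv")

        vec = TfidfVectorizer(stop_words="english", max_features=5000)
        tfidf = vec.fit_transform(df["Lyrics"].fillna(""))

        # Top 10 TF-IDF keywords for the first song.
        row = tfidf[0].toarray().ravel()
        top = np.argsort(row)[::-1][:10]
        print(df["Song"].iloc[0], [vec.get_feature_names_out()[i] for i in top])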

    Acknowledgements: Wikipedia, ChatGPT, genius.com

    OPEN FOR COLLABORATION: I am open to collaborating with anyone who wants to add features to this dataset or knows how to collect data using APIs (for instance, the Spotify API for developers).

  9. Spotify Tracks Attributes and Popularity

    • kaggle.com
    Updated Jul 9, 2025
    Cite
    Melissa Monfared (2025). Spotify Tracks Attributes and Popularity [Dataset]. https://www.kaggle.com/datasets/melissamonfared/spotify-tracks-attributes-and-popularity
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Melissa Monfared
    Description

    About Dataset

    Overview:

    This dataset provides detailed metadata and audio analysis for a wide collection of Spotify music tracks across various genres. It includes track-level information such as popularity, tempo, energy, danceability, and other musical features that can be used for music recommendation systems, genre classification, or trend analysis. The dataset is a rich source for exploring music consumption patterns and user preferences based on song characteristics.

    Dataset Details:

    This dataset contains rows of individual music tracks, each described by both metadata (such as track name, artist, album, and genre) and quantitative audio features. These features reflect different musical attributes such as energy, acousticness, instrumentalness, valence, and more, making it ideal for audio machine learning projects and exploratory data analysis.

    Schema and Column Descriptions:

    index: Unique index for each track (can be ignored for analysis)
    track_id: Spotify's unique identifier for the track
    artists: Name of the performing artist(s)
    album_name: Title of the album the track belongs to
    track_name: Title of the track
    popularity: Popularity score on Spotify (0-100 scale)
    duration_ms: Duration of the track in milliseconds
    explicit: Indicates whether the track contains explicit content
    danceability: How suitable the track is for dancing (0.0 to 1.0)
    energy: Intensity and activity level of the track (0.0 to 1.0)
    key: Musical key (0 = C, 1 = C♯/D♭, …, 11 = B)
    loudness: Overall loudness of the track in decibels (dB)
    mode: Modality (major = 1, minor = 0)
    speechiness: Presence of spoken words in the track (0.0 to 1.0)
    acousticness: Confidence measure of whether the track is acoustic (0.0 to 1.0)
    instrumentalness: Predicts whether the track contains no vocals (0.0 to 1.0)
    liveness: Presence of an audience in the recording (0.0 to 1.0)
    valence: Musical positivity conveyed (0.0 = sad, 1.0 = happy)
    tempo: Estimated tempo in beats per minute (BPM)
    time_signature: Time signature of the track (e.g., 4 = 4/4)
    track_genre: Assigned genre label for the track

    Key Features:

    • Comprehensive Track Data: Metadata combined with detailed audio analysis.
    • Genre Diversity: Includes tracks from various music genres.
    • Audio Feature Rich: Suitable for audio classification, recommendation engines, or clustering.
    • Machine Learning Friendly: Clean and numerical format ideal for ML models.

    Usage:

    This dataset is valuable for:

    • 🎵 Music Recommendation Systems: Building collaborative or content-based recommenders.
    • 📊 Data Visualization & Dashboards: Analyzing genre or mood trends over time.
    • 🤖 Machine Learning Projects: Predicting song popularity or clustering similar tracks.
    • 🧠 Music Psychology & Behavioral Studies: Exploring how music features relate to emotions or behavior.

    Additional Notes:

    • This dataset can be enhanced by merging it with user listening behavior data, lyrics datasets, or chart positions for more advanced analysis.
    • Some columns like key, mode, and explicit may need to be mapped for better readability in visualization; a small mapping sketch follows below.
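
For instance (a minimal sketch; the CSV filename is an assumption):

    import pandas as pd

    # Hypothetical filename; adjust to the actual Kaggle CSV.
    df = pd.read_csv("spotify_tracks.csv")

    # Map numeric key/mode codes to readable labels for visualization.
    pitch_classes = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
                     "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]
    df["key_name"] = df["key"].map(lambda k: pitch_classes[k] if 0 <= k < 12 else "unknown")
    df["mode_name"] = df["mode"].map({1: "major", 0: "minor"})
    df["explicit"] = df["explicit"].astype(bool)
    print(df[["track_name", "key_name", "mode_name", "explicit"]].head())
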
  10. RV1-269A251109 - Tag questions; Song and song description

    • researchdata.edu.au
    Updated Feb 26, 2020
    Cite
    PARADISEC (2020). RV1-269A251109 - Tag questions; Song and song description [Dataset]. http://doi.org/10.26278/5E56880554D2D
    Explore at:
    Dataset updated
    Feb 26, 2020
    Dataset provided by
    PARADISEC
    Description

    (a) The consultants take turns uttering sentences containing tag questions and offering possible answers. (b) A song is sung, then the consultants talk about it. Language as given: Blanga.

  11. Data from: Sound and music recommendation with knowledge graphs [dataset]

    • dataverse.csuc.cat
    txt, zip
    Updated Oct 9, 2023
    Cite
    Sergio Oramas; Vito Claudio Ostuni; Gabriel Vigliensoni (2023). Sound and music recommendation with knowledge graphs [dataset] [Dataset]. http://doi.org/10.34810/data444
    Explore at:
    Available download formats: txt (3751 bytes), zip (56553416 bytes)
    Dataset updated
    Oct 9, 2023
    Dataset provided by
    CORA.Repositori de Dades de Recerca
    Authors
    Sergio Oramas; Vito Claudio Ostuni; Gabriel Vigliensoni
    License

    https://dataverse.csuc.cat/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.34810/data444

    Description

    Music Recommendation Dataset (KGRec-music). Number of items: 8,640. Number of users: 5,199. Number of item-user interactions: 751,531. All the data comes from the songfacts.com and last.fm websites. Items are songs, which are described in terms of a textual description extracted from songfacts.com and tags from last.fm. Files and folders in the dataset:

    /descriptions: one file per item with the textual description of the item. The name of the file is the id of the item plus the ".txt" extension.
    /tags: one file per item with the tags of the item separated by spaces. Multiword tags are separated by "-". The name of the file is the id of the item plus the ".txt" extension. Not all items have tags; there are 401 items without tags.
    implicit_lf_dataset.txt: the interactions between users and items. There is one line per interaction, with tab-separated fields in the format: user_id \t sound_id \t 1.

    Sound Recommendation Dataset (KGRec-sound). Number of items: 21,552. Number of users: 20,000. Number of item-user interactions: 2,117,698. All the data comes from Freesound.org. Items are sounds, which are described in terms of the textual description and tags created by the sound creator at upload time. Files and folders in the dataset:

    /descriptions: one file per item with the textual description of the item. The name of the file is the id of the item plus the ".txt" extension.
    /tags: one file per item with the tags of the item separated by spaces. The name of the file is the id of the item plus the ".txt" extension.
    downloads_fs_dataset.txt: the interactions between users and items. There is one line per interaction (a user that downloaded a sound in this case), with tab-separated fields in the format: user_id \t sound_id \t 1.

    Two datasets with users, items, implicit feedback interactions between users and items, item tags, and item text descriptions are provided: one for Music Recommendation (KGRec-music) and the other for Sound Recommendation (KGRec-sound).
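
    Parsing either interaction file reduces to splitting tab-separated triples (a minimal sketch):

        # Parse implicit-feedback interactions (user_id \t item_id \t 1 per line).
        interactions = []
        with open("implicit_lf_dataset.txt", encoding="utf-8") as f:
            for line in f:
                user_id, item_id, value = line.rstrip("\n").split("\t")
                interactions.append((user_id, item_id, int(value)))

        print(len(interactions), "interactions loaded")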

  12. The DALI dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Aug 2, 2020
    Cite
    Meseguer Brocal, Gabriel (2020). The DALI dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2577914
    Explore at:
    Dataset updated
    Aug 2, 2020
    Dataset authored and provided by
    Meseguer Brocal, Gabriel
    Description

    The DALI dataset is a large dataset of synchronised audio, lyrics and notes: for each full-duration audio track it provides time-aligned lyrics and time-aligned notes (of the vocal melody). Lyrics are described at four levels of granularity: notes (and the textual information underlying a given note), words, lines and paragraphs. For each song, we also provide additional multimodal information such as genre, language, musician, album covers or links to video clips.

    Go to https://github.com/gabolsgabs/DALI where you can find all the tools to work with the DALI dataset and a detailed description of how to use it.

    For this version cite the article:

    @article{meseguer2020creating,
      title={Creating DALI, a Large Dataset of Synchronized Audio, Lyrics, and Notes},
      author={Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy},
      journal={Transactions of the International Society for Music Information Retrieval},
      volume={3},
      number={1},
      year={2020},
      publisher={Ubiquity Press}
    }

    and the original paper:

    @inproceedings{meseguer2019dali,
      title={Dali: A large dataset of synchronized audio, lyrics and notes, automatically created using teacher-student machine learning paradigm},
      author={Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy},
      journal={arXiv preprint arXiv:1906.10606},
      year={2019}
    }

    This research has received funding from the French National Research Agency under the contract ANR-16-CE23-0017-01 (WASABI project)

  13. Data from: Towards transformational creation of novel songs

    • tandf.figshare.com
    mpga
    Updated May 30, 2023
    Cite
    Jukka M. Toivanen; Matti Järvisalo; Olli Alm; Dan Ventura; Martti Vainio; Hannu Toivonen (2023). Towards transformational creation of novel songs [Dataset]. http://doi.org/10.6084/m9.figshare.5951038.v2
    Explore at:
    Available download formats: mpga
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Jukka M. Toivanen; Matti Järvisalo; Olli Alm; Dan Ventura; Martti Vainio; Hannu Toivonen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We study transformational computational creativity in the context of writing songs and describe an implemented system that is able to modify its own goals and operation. With this, we contribute to three aspects of computational creativity and song generation: (1) Application-wise, songs are an interesting and challenging target for creativity, as they require the production of complementary music and lyrics. (2) Technically, we approach the problem of creativity and song generation using constraint programming. We show how constraints can be used declaratively to define a search space of songs so that a standard constraint solver can then be used to generate songs. (3) Conceptually, we describe a concrete architecture for transformational creativity where the creative (song writing) system has some responsibility for setting its own search space and goals. In the proposed architecture, a meta-level control component does this transparently by manipulating the constraints at runtime based on self-reflection of the system. Empirical experiments suggest the system is able to create songs according to its own taste.

  14. The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 19, 2024
    + more versions
    Cite
    Livingstone, Steven R. (2024). The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1188975
    Explore at:
    Dataset updated
    Oct 19, 2024
    Dataset provided by
    Russo, Frank A.
    Livingstone, Steven R.
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) contains 7356 files (total size: 24.8 GB). The dataset contains 24 professional actors (12 female, 12 male), vocalizing two lexically-matched statements in a neutral North American accent. Speech includes calm, happy, sad, angry, fearful, surprise, and disgust expressions, and song contains calm, happy, sad, angry, and fearful emotions. Each expression is produced at two levels of emotional intensity (normal, strong), with an additional neutral expression. All conditions are available in three modality formats: Audio-only (16bit, 48kHz .wav), Audio-Video (720p H.264, AAC 48kHz, .mp4), and Video-only (no sound). Note, there are no song files for Actor_18.

    The RAVDESS was developed by Dr Steven R. Livingstone, who now leads the Affective Data Science Lab, and Dr Frank A. Russo who leads the SMART Lab.

    Citing the RAVDESS

    The RAVDESS is released under a Creative Commons Attribution license, so please cite the RAVDESS if it is used in your work in any form. Published academic papers should use the academic paper citation for our PLoS1 paper. Personal works, such as machine learning projects/blog posts, should provide a URL to this Zenodo page, though a reference to our PLoS1 paper would also be appreciated.

    Academic paper citation

    Livingstone SR, Russo FA (2018) The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS): A dynamic, multimodal set of facial and vocal expressions in North American English. PLoS ONE 13(5): e0196391. https://doi.org/10.1371/journal.pone.0196391.

    Personal use citation

    Include a link to this Zenodo page - https://zenodo.org/record/1188976

    Commercial Licenses

    Commercial licenses for the RAVDESS can be purchased. For more information, please visit our license fee page, or contact us at ravdess@gmail.com.

    Contact Information

    If you would like further information about the RAVDESS, to purchase a commercial license, or if you experience any issues downloading files, please contact us at ravdess@gmail.com.

    Example Videos

    Watch a sample of the RAVDESS speech and song videos.

    Emotion Classification Users

    If you're interested in using machine learning to classify emotional expressions with the RAVDESS, please see our new RAVDESS Facial Landmark Tracking data set [Zenodo project page].

    Construction and Validation

    Full details on the construction and perceptual validation of the RAVDESS are described in our PLoS ONE paper - https://doi.org/10.1371/journal.pone.0196391.

    The RAVDESS contains 7356 files. Each file was rated 10 times on emotional validity, intensity, and genuineness. Ratings were provided by 247 individuals who were characteristic of untrained adult research participants from North America. A further set of 72 participants provided test-retest data. High levels of emotional validity, interrater reliability, and test-retest intrarater reliability were reported. Validation data is open-access, and can be downloaded along with our paper from PLoS ONE.

    Contents

    Audio-only files

    Audio-only files of all actors (01-24) are available as two separate zip files (~200 MB each):

    Speech file (Audio_Speech_Actors_01-24.zip, 215 MB) contains 1440 files: 60 trials per actor x 24 actors = 1440.

    Song file (Audio_Song_Actors_01-24.zip, 198 MB) contains 1012 files: 44 trials per actor x 23 actors = 1012.

    Audio-Visual and Video-only files

    Video files are provided as separate zip downloads for each actor (01-24, ~500 MB each), and are split into separate speech and song downloads:

    Speech files (Video_Speech_Actor_01.zip to Video_Speech_Actor_24.zip) collectively contain 2880 files: 60 trials per actor x 2 modalities (AV, VO) x 24 actors = 2880.

    Song files (Video_Song_Actor_01.zip to Video_Song_Actor_24.zip) collectively contain 2024 files: 44 trials per actor x 2 modalities (AV, VO) x 23 actors = 2024.

    File Summary

    In total, the RAVDESS collection includes 7356 files (2880+2024+1440+1012 files).

    File naming convention

    Each of the 7356 RAVDESS files has a unique filename. The filename consists of a 7-part numerical identifier (e.g., 02-01-06-01-02-01-12.mp4). These identifiers define the stimulus characteristics: Filename identifiers

    Modality (01 = full-AV, 02 = video-only, 03 = audio-only).

    Vocal channel (01 = speech, 02 = song).

    Emotion (01 = neutral, 02 = calm, 03 = happy, 04 = sad, 05 = angry, 06 = fearful, 07 = disgust, 08 = surprised).

    Emotional intensity (01 = normal, 02 = strong). NOTE: There is no strong intensity for the 'neutral' emotion.

    Statement (01 = "Kids are talking by the door", 02 = "Dogs are sitting by the door").

    Repetition (01 = 1st repetition, 02 = 2nd repetition).

    Actor (01 to 24. Odd numbered actors are male, even numbered actors are female).

    Filename example: 02-01-06-01-02-01-12.mp4

    Video-only (02)

    Speech (01)

    Fearful (06)

    Normal intensity (01)

    Statement "dogs" (02)

    1st Repetition (01)

    12th Actor (12)

    Female, as the actor ID number is even. (A small parser for this naming convention is sketched below.)
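
    Decoding a filename programmatically follows directly from the identifier tables above (a minimal sketch):

        # Decode a RAVDESS filename into its seven stimulus identifiers.
        MODALITY = {"01": "full-AV", "02": "video-only", "03": "audio-only"}
        CHANNEL = {"01": "speech", "02": "song"}
        EMOTION = {"01": "neutral", "02": "calm", "03": "happy", "04": "sad",
                   "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised"}
        INTENSITY = {"01": "normal", "02": "strong"}
        STATEMENT = {"01": "Kids are talking by the door",
                     "02": "Dogs are sitting by the door"}

        def decode(filename):
            stem = filename.rsplit(".", 1)[0]
            mod, ch, emo, inten, stmt, rep, actor = stem.split("-")
            return {
                "modality": MODALITY[mod],
                "vocal_channel": CHANNEL[ch],
                "emotion": EMOTION[emo],
                "intensity": INTENSITY[inten],
                "statement": STATEMENT[stmt],
                "repetition": int(rep),
                "actor": int(actor),
                "actor_sex": "female" if int(actor) % 2 == 0 else "male",
            }

        print(decode("02-01-06-01-02-01-12.mp4"))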

    License information

    The RAVDESS is released under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, CC BY-NC-SA 4.0

    Commercial licenses for the RAVDESS can also be purchased. For more information, please visit our license fee page, or contact us at ravdess@gmail.com.

    Related Data sets

    RAVDESS Facial Landmark Tracking data set [Zenodo project page].

  15. List and description of song features.

    • figshare.com
    xls
    Updated May 31, 2023
    Cite
    Nicole Geberzahn; Manfred Gahr (2023). List and description of song features. [Dataset]. http://doi.org/10.1371/journal.pone.0026485.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Nicole Geberzahn; Manfred Gahr
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List and description of song features.

  16. genius-lyrics

    • huggingface.co
    Updated Aug 25, 2017
    Cite
    Bruno Kreiner (2017). genius-lyrics [Dataset]. https://huggingface.co/datasets/brunokreiner/genius-lyrics
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 25, 2017
    Authors
    Bruno Kreiner
    Description

    Dataset Card for Dataset Name

      Dataset Description
    
    
    
    
    
      Dataset Summary
    

    This dataset consists of roughly 480k English lyrics (classified using the NLTK language classifier) with some additional metadata. The id corresponds to the Spotify id. The metadata was taken from the Million Playlist Challenge @ AIcrowd. The lyrics were crawled using "[song name] [artist name]" as the search string via the lyricsgenius Python package, which uses the genius.com search function. There is no… See the full description on the dataset page: https://huggingface.co/datasets/brunokreiner/genius-lyrics.
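
    The crawling step described above might look roughly like this with lyricsgenius (a hedged sketch; token handling and rate limiting are left out, and the exact behaviour is an assumption about that package):

        import lyricsgenius  # pip install lyricsgenius

        # A Genius API access token is required; "YOUR_TOKEN" is a placeholder.
        genius = lyricsgenius.Genius("YOUR_TOKEN")

        # Search using "[song name] [artist name]", as described above.
        song = genius.search_song("Imagine", "John Lennon")
        if song is not None:
            print(song.lyrics[:200])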

  17. ‘500 Greatest Songs of All Time’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘500 Greatest Songs of All Time’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-500-greatest-songs-of-all-time-7e53/a52855b5/?iid=001-721&v=presentation
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘500 Greatest Songs of All Time’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/omarhanyy/500-greatest-songs-of-all-time on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    Rolling Stone is an American monthly magazine that focuses on popular culture. It was founded in San Francisco, California, in 1967 by Jann Wenner and the music critic Ralph J. Gleason. Here are their top 500 picks as the greatest songs of all time.

    Content

    The dataset includes 500 songs with attributes such as title, description, artist, etc.

    Acknowledgements

    The data was scraped from the publicly available Rolling Stone website.

    --- Original source retains full ownership of the source dataset ---

  18. The Jackson 5 Songs

    • kaggle.com
    Updated Nov 7, 2022
    Cite
    The Devastator (2022). The Jackson 5 Songs [Dataset]. https://www.kaggle.com/datasets/thedevastator/the-jackson-5-songs/discussion
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 7, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    The Jackson 5 Songs

    A Discography of Their Recorded Songs From Big Boy to I Want You Back

    About this dataset

    The Jackson 5 is an American music group formed around 1963-1965 by the Jackson family brothers Jackie, Jermaine, Marlon, Michael and Tito. In 1967, the quintet's first singles were recorded in Chicago and released by Steeltown Records, which was located in their hometown of Gary, Indiana. The songs released by Steeltown in 1968 included Big Boy (sung by Michael Jackson), You've Changed, and We Don't Have To Be Over 21 (to Fall in Love).[1] Although Steeltown is best known in Gary and northwest Indiana for giving the Jackson 5 their actual start in the music industry (and Motown) by releasing their first records,[2] music journalist Nelson George claimed that Michael Jackson and the Jackson 5's real recording history did not begin until their move to Motown Records in 1968.[1] The then Detroit-based company, led by Berry Gordy, housed established recording artists such as Stevie Wonder, Marvin Gaye and Diana Ross, as well as a producing-writing team known as The Corporation. In 1969 and 1970, the Jackson 5 released hit singles such as I Want You Back and ABC. Thanks for reading!

    Columns

    File: df_1.csv

    | Column name | Description |
    |:------------|:------------|
    | 0           |             |
    | 1           |             |
    | 2           |             |
    | 3           |             |

    File: df_4.csv

    File: df_3.csv

    | Column name | Description |
    |:------------|:-------------------------------------------------|
    | Song        | The name of the song. (String)                   |
    | Artist      | The artist who recorded the song. (String)       |
    | Album       | The album on which the song appears. (String)    |
    | Notes       | Additional information about the song. (String)  |
    | Ref         | A reference for the song. (String)               |

    File: df_2.csv

    | Column name | Description |
    |:------------|:-------------------------------------------------|
    | Song        | The name of the song. (String)                   |
    | Notes       | Additional information about the song. (String)  |
    | Ref         | A reference for the song. (String)               |

    File: df_6.csv

    | Column name | Description |
    |:-------------------------------------------|:------------------------------------------------------------------------------------------------|
    | vteThe Jackson 5 / The Jacksons singles    | The Jackson 5 / The Jacksons singles discography.                                                 |
    | vteThe Jackson 5 / The Jacksons singles.1  | The Jackson 5 / The Jacksons singles discography, including release date and label information    |

    File: df_5.csv

    | Column name | Description |
    |:------------|:------------|
    | 0           |             |
    | 1           |             |

  19. Description of the frequency and length of the four up-note variations.

    • plos.figshare.com
    xls
    Updated Apr 28, 2025
    Cite
    Wyatt J. Cummings; David D. L. Goodman; Craig D. Layne; Katherine I. Singer; M. Whitney Thomas (2025). Description of the frequency and length of the four up-note variations. [Dataset]. http://doi.org/10.1371/journal.pone.0312636.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Wyatt J. Cummings; David D. L. Goodman; Craig D. Layne; Katherine I. Singer; M. Whitney Thomas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description of the frequency and length of the four up-note variations.

  20. Brazil regional spotify charts

    • kaggle.com
    zip
    Updated Apr 14, 2024
    Cite
    Filipe Moura (2024). Brazil regional spotify charts [Dataset]. https://www.kaggle.com/datasets/filipeasm/brazil-regional-spotify-charts
    Explore at:
    Available download formats: zip (10117250 bytes)
    Dataset updated
    Apr 14, 2024
    Authors
    Filipe Moura
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Brazil
    Description

    This dataset provides a detailed regional overview of digital music consumption on Spotify in Brazil between 2021 and 2023. It includes acoustic features and all genres/artists listened to at least once in those years. The data comes from the Spotify API for Developers and Spotify Charts, which are used to collect the acoustic features and the summarized most listened songs per city, respectively.

    Data description

    It covers 17 cities in 16 different states of Brazil, totalling 5190 unique tracks, 487 different genres and 2056 artists. The covered cities are: Belém, Belo Horizonte, Brasília, Campinas, Campo Grande, Cuiabá, Curitiba, Florianópolis, Fortaleza, Goiânia, Manaus, Porto Alegre, Recife, Rio de Janeiro, Salvador, São Paulo and Uberlândia. Each city has 119 different weekly charts, and the week covered is described by the file name.

    Acoustic features

    The covered acoustic features are provided by Spotify and are described as:

    - Acousticness: a measure from 0.0 to 1.0 of whether the track is acoustic; 1.0 indicates a totally acoustic song and 0.0 a song without any acoustic element.
    - Danceability: describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
    - Energy: a measure from 0.0 to 1.0 representing a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
    - Instrumentalness: predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
    - Key: the key the track is in. Integers map to pitches using standard Pitch Class notation. E.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
    - Liveness: detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides a strong likelihood that the track is live.
    - Loudness: the overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
    - Mode: indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.
    - Speechiness: detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
    - Tempo: the overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
    - Time Signature: an estimated time signature. The time signature (meter) is a notational convention to specify how many beats are in each bar (or measure). The time signature ranges from 3 to 7, indicating time signatures of "3/4" to "7/4".
    - Valence: a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).

    Data Science Applications:

    • Time Series Analysis: Identify seasonal behaviors and the deviation of each city during those 2 years
    • Trend Analysis: Identify patterns and trends in digital music consumption based in genres and/or acoustic features in each city to understand seasonal changes
    • Clustering Tasks: Group cities based on genre and/or acoustic features to identify regional patterns across Brazil's regions and describe the differences between groups (a starter clustering sketch follows below)
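
A clustering task could start like this (a minimal sketch; the pre-aggregated per-city feature file and its column layout are assumptions):

    import pandas as pd
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Hypothetical file: one row per city with mean acoustic features.
    city_features = pd.read_csv("city_mean_features.csv", index_col="city")

    # Standardize so loudness (dB) does not dominate the 0-1 features.
    X = StandardScaler().fit_transform(city_features)

    # Group the 17 cities into a few regional listening profiles.
    labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(X)
    print(dict(zip(city_features.index, labels)))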