  1. Music Dataset: Song Information and Lyrics

    • kaggle.com
    zip
    Updated May 22, 2023
  2. Music Dataset: Lyrics and Metadata from 1950 to 2019

    • data.mendeley.com
    Updated Aug 24, 2020
    + more versions
  3. MGD: Music Genre Dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated May 28, 2021
  4. Spotify Global Music Dataset (2009–2025)

    • kaggle.com
    zip
    Updated Nov 11, 2025
  5. MusicCaps

    • huggingface.co
    • kaggle.com
    Updated Jan 26, 2023
    + more versions
  6. Data from: MusicOSet: An Enhanced Open Dataset for Music Data Mining

    • zenodo.org
    • data-staging.niaid.nih.gov
    bin, zip
    Updated Jun 7, 2021
  7. MuMu: Multimodal Music Dataset

    • data.niaid.nih.gov
    Updated Dec 6, 2022
  8. Song Describer Dataset

    • zenodo.org
    • dataverse.csuc.cat
    • +2 more
    csv, pdf, tsv, txt +1
    Updated Jul 10, 2024
    + more versions
  9. Spotify Music Analytics Dataset (2015–2025)

    • kaggle.com
    zip
    Updated Dec 4, 2025
  10. Indian Regional Music Dataset

    • zenodo.org
    bin
    Updated May 27, 2022
    + more versions
  11. LP-MusicCaps-MSD

    • huggingface.co
    Updated Jul 31, 2023
    + more versions
  12. Music Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jan 6, 2017
  13. Music Dataset

    • kaggle.com
    zip
    Updated Dec 20, 2024
  14. lastfm Music Recommendation Dataset

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip
    Updated Feb 15, 2022
  15. AAM: Artificial Audio Multitracks Dataset

    • zenodo.org
    • data-staging.niaid.nih.gov
    zip
    Updated Jul 16, 2025
  16. music-dataset-1

    • huggingface.co
    Updated Jan 21, 2024
  17. PDMX: A Large-Scale Public Domain MusicXML Dataset for Symbolic Music...

    • zenodo.org
    application/gzip
    Updated Jan 18, 2025
    + more versions
  18. Worldwide Music Artists Dataset (with image)

    • kaggle.com
    zip
    Updated Aug 18, 2024
  19. MusicBench

    • huggingface.co
    Updated Nov 14, 2023
    + more versions
  20. Spotify Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 10, 2024
    + more versions
Cite
Suraj (2023). Music Dataset: Song Information and Lyrics [Dataset]. https://www.kaggle.com/datasets/suraj520/music-dataset-song-information-and-lyrics

Music Dataset: Song Information and Lyrics

A Comprehensive Collection of Songs with Metadata and Lyrics for Research & Dev

Available download formats: zip (1992670 bytes)
Dataset updated
May 22, 2023
Authors
Suraj
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

The Dataset's Purpose: This dataset provides a collection of song information and lyrics for research and development. It is intended as a resource for applications such as music analysis, natural language processing, sentiment analysis, and recommendation systems. By combining song information with lyrics, it can help academics, developers, and music fans examine and analyse the link between listeners' preferences and lyrical content.

Dataset Description:

The music dataset contains around 660 songs. Each song is described by the following fields:

  • Name: The title of the song.
  • Lyrics: The lyrics of the song.
  • Singer: The name of the singer or artist who performed the song.
  • Movie: The movie or album associated with the song (if applicable).
  • Genre: The genre or genres to which the song belongs.
  • Rating: The rating or popularity score of the song from Spotify.
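The fields listed above can be modelled as a small record type. The following is a minimal sketch, not the dataset author's code. It assumes the download contains a CSV file whose column headers match the field names exactly, that multiple genres are comma-separated, and that Rating is numeric; none of this is confirmed by the description. The `Song` class, the helper functions, and the sample rows are hypothetical placeholders.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class Song:
    """One record, mirroring the six fields named in the description."""
    name: str
    lyrics: str
    singer: str
    movie: Optional[str]      # None when no movie or album is given
    genres: list[str]         # assumption: comma-separated in the file
    rating: Optional[float]   # assumption: numeric; None if missing or invalid


def parse_song(row: dict) -> Song:
    """Convert one csv.DictReader row into a Song, tolerating blank cells."""
    def cell(key: str) -> str:
        return (row.get(key) or "").strip()

    raw_rating = cell("Rating")
    try:
        rating = float(raw_rating) if raw_rating else None
    except ValueError:
        rating = None  # non-numeric rating: treat as missing

    return Song(
        name=cell("Name"),
        lyrics=cell("Lyrics"),
        singer=cell("Singer"),
        movie=cell("Movie") or None,
        genres=[g.strip() for g in cell("Genre").split(",") if g.strip()],
        rating=rating,
    )


def load_songs(handle) -> list[Song]:
    """Read every row from an open CSV file handle."""
    return [parse_song(row) for row in csv.DictReader(handle)]


# Made-up placeholder rows, NOT taken from the real dataset.
SAMPLE = """Name,Lyrics,Singer,Movie,Genre,Rating
Placeholder Song A,first line of made-up lyrics,Artist One,Film X,"Pop, Romance",7.5
Placeholder Song B,more made-up lyrics,Artist Two,,Rock,
"""

songs = load_songs(io.StringIO(SAMPLE))
```

To read the real file, pass an open file handle (for example `open(path, newline="", encoding="utf-8")`) to `load_songs` instead of the in-memory sample, and adjust the column names if the actual headers differ.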

The dataset is intended to cover a wide variety of songs from various genres, performers, and films. It includes popular songs from numerous eras and regions, as well as a wide spectrum of musical styles. The lyrics were obtained from publicly accessible services such as Spotify and Soundcloud, and were converted from audio to text using speech recognition algorithms. While every attempt has been made to ensure correctness, please keep in mind that, owing to the limits of the data sources and the speech recognition algorithms, the transcribed lyrics may contain inaccuracies or omissions.

Use Cases in Research and Development:

This music dataset has several research and development applications. Among the possible applications are:

  1. Music Analysis: By analysing the links between song attributes such as genre, singer, and rating, researchers can gain insights into the features and patterns of various music genres.
  2. Natural Language Processing (NLP): NLP researchers may use the lyrics to build language models, sentiment analysis algorithms, topic modelling approaches, and other text-based music studies.
  3. Recommendation Systems: Using the dataset, developers may create recommendation systems that suggest music based on user preferences, lyric sentiment, or genre similarities.
  4. Music Generating Machine Learning Models: The dataset may be used to train machine learning models for generating new lyrics or music compositions.
  5. Music Sentiment Analysis: To gain insights into the emotional components of music and its influence on listeners, researchers might analyse the feelings conveyed in song lyrics.
  6. Movie Soundtracks Analysis: Researchers can explore the association between song attributes and their use in movie soundtracks by investigating the movie attribute.
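Two of the use cases above, music analysis and genre-based recommendation, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not a method described by the dataset author: the records are made-up placeholders, the dictionary keys are hypothetical, ratings are assumed numeric, and Jaccard overlap of genre sets is just one simple choice of similarity measure.

```python
from collections import defaultdict
from statistics import mean

# Made-up placeholder records, NOT taken from the real dataset.
songs = [
    {"name": "Placeholder A", "genres": {"Pop", "Romance"}, "rating": 8.0},
    {"name": "Placeholder B", "genres": {"Pop"}, "rating": 6.0},
    {"name": "Placeholder C", "genres": {"Rock"}, "rating": 7.0},
]


def mean_rating_by_genre(records):
    """Music analysis: average rating per genre, skipping unrated songs."""
    buckets = defaultdict(list)
    for song in records:
        if song["rating"] is None:
            continue
        for genre in song["genres"]:
            buckets[genre].append(song["rating"])
    return {genre: mean(values) for genre, values in buckets.items()}


def jaccard(a, b):
    """Overlap of two genre sets: |a & b| / |a | b|, or 0.0 if both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_songs(target, records, top_n=2):
    """Recommendation: other songs ranked by genre overlap with the target."""
    scored = [
        (jaccard(target["genres"], song["genres"]), song["name"])
        for song in records
        if song is not target
    ]
    # Highest overlap first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


genre_means = mean_rating_by_genre(songs)
recommendations = similar_songs(songs[0], songs)
```

A recommender built on the real data would likely combine genre overlap with other signals mentioned above, such as lyric sentiment or user preferences; genre overlap alone is used here only to keep the sketch short.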

Overall, this music dataset aims to give academics, developers, and music fans a rich resource for investigating the relationships between song features and lyrics across a range of research and development applications in the music domain.
