4 datasets found
  1. Spotify Tracks Attributes and Popularity

    • kaggle.com
    Updated Jul 9, 2025
    Melissa Monfared (2025). Spotify Tracks Attributes and Popularity [Dataset]. https://www.kaggle.com/datasets/melissamonfared/spotify-tracks-attributes-and-popularity
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Melissa Monfared
    Description

    About Dataset

    Overview:

    This dataset provides detailed metadata and audio analysis for a wide collection of Spotify music tracks across various genres. It includes track-level information such as popularity, tempo, energy, danceability, and other musical features that can be used for music recommendation systems, genre classification, or trend analysis. The dataset is a rich source for exploring music consumption patterns and user preferences based on song characteristics.

    Dataset Details:

    This dataset contains rows of individual music tracks, each described by both metadata (such as track name, artist, album, and genre) and quantitative audio features. These features reflect different musical attributes such as energy, acousticness, instrumentalness, valence, and more, making it ideal for audio machine learning projects and exploratory data analysis.

    Schema and Column Descriptions:

    Column Name       Description
    index             Unique index for each track (can be ignored for analysis)
    track_id          Spotify's unique identifier for the track
    artists           Name of the performing artist(s)
    album_name        Title of the album the track belongs to
    track_name        Title of the track
    popularity        Popularity score on Spotify (0–100 scale)
    duration_ms       Duration of the track in milliseconds
    explicit          Indicates whether the track contains explicit content
    danceability      How suitable the track is for dancing (0.0 to 1.0)
    energy            Intensity and activity level of the track (0.0 to 1.0)
    key               Musical key (0 = C, 1 = C♯/D♭, …, 11 = B)
    loudness          Overall loudness of the track in decibels (dB)
    mode              Modality (major = 1, minor = 0)
    speechiness       Presence of spoken words in the track (0.0 to 1.0)
    acousticness      Confidence measure of whether the track is acoustic (0.0 to 1.0)
    instrumentalness  Predicts whether the track contains no vocals (0.0 to 1.0)
    liveness          Presence of an audience in the recording (0.0 to 1.0)
    valence           Musical positivity conveyed (0.0 = sad, 1.0 = happy)
    tempo             Estimated tempo in beats per minute (BPM)
    time_signature    Time signature of the track (e.g., 4 = 4/4)
    track_genre       Assigned genre label for the track

    Key Features:

    • Comprehensive Track Data: Metadata combined with detailed audio analysis.
    • Genre Diversity: Includes tracks from various music genres.
    • Audio Feature Rich: Suitable for audio classification, recommendation engines, or clustering.
    • Machine Learning Friendly: Clean and numerical format ideal for ML models.

    Usage:

    This dataset is valuable for:

    • 🎵 Music Recommendation Systems: Building collaborative or content-based recommenders.
    • 📊 Data Visualization & Dashboards: Analyzing genre or mood trends over time.
    • 🤖 Machine Learning Projects: Predicting song popularity or clustering similar tracks.
    • 🧠 Music Psychology & Behavioral Studies: Exploring how music features relate to emotions or behavior.
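A content-based recommender, one of the uses listed above, can be sketched by comparing tracks on their audio features. The example below is a minimal illustration and not part of the dataset: the sample rows are made up, the choice of three features is arbitrary, and cosine similarity is only one of several reasonable distance measures.

```python
import math

# Hypothetical sample rows; the values are invented for illustration only.
tracks = [
    {"track_name": "A", "danceability": 0.80, "energy": 0.90, "valence": 0.70},
    {"track_name": "B", "danceability": 0.78, "energy": 0.85, "valence": 0.65},
    {"track_name": "C", "danceability": 0.20, "energy": 0.10, "valence": 0.30},
]

# Assumption: these three 0.0-1.0 columns are enough for a toy comparison.
FEATURES = ("danceability", "energy", "valence")


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_u * norm_v)


def most_similar(query_name, rows, features=FEATURES):
    """Rank all other tracks by similarity to the named track, best first."""
    vectors = {r["track_name"]: [r[f] for f in features] for r in rows}
    query = vectors[query_name]
    scored = [
        (cosine_similarity(query, vec), name)
        for name, vec in vectors.items()
        if name != query_name
    ]
    return sorted(scored, reverse=True)


ranking = most_similar("A", tracks)
```

In practice, columns on different scales (for example tempo or loudness) would likely need normalizing before being added to the feature vector.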

    Data Maintenance:

    Additional Notes:

    • This dataset can be enhanced by merging it with user listening behavior data, lyrics datasets, or chart positions for more advanced analysis.
    • Some columns like key, mode, and explicit may need to be mapped for better readability in visualization.
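The mapping mentioned in the last note could look like the sketch below. The key names follow the pitch-class numbering given in the schema (0 = C through 11 = B) and the mode values follow the schema as well. The label strings, the "unknown" fallback, and the assumption that explicit has already been parsed into a boolean are illustrative choices, not part of the dataset.

```python
# Pitch-class names for key values 0..11, as described in the schema.
KEY_NAMES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]
MODE_NAMES = {1: "major", 0: "minor"}


def label_track(row):
    """Return a copy of the row with human-readable key, mode and explicit labels.

    Assumption: row["explicit"] is a real boolean. If it was read from CSV as
    the text "True"/"False", convert it first, since any non-empty string is truthy.
    """
    labeled = dict(row)
    key = row["key"]
    labeled["key_label"] = KEY_NAMES[key] if 0 <= key < len(KEY_NAMES) else "unknown"
    labeled["mode_label"] = MODE_NAMES.get(row["mode"], "unknown")
    labeled["explicit_label"] = "explicit" if row["explicit"] else "clean"
    return labeled


example = label_track({"key": 1, "mode": 0, "explicit": False})
```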
  2. Tamil Music Artists and Their Top Spotify Tracks

    • kaggle.com
    Updated Sep 1, 2024
    QUIXOTIX (2024). Tamil Music Artists and Their Top Spotify Tracks [Dataset]. https://www.kaggle.com/datasets/akashsiddharth/spotify-tamil-artists-top-tracks-popularity/code
    Dataset updated
    Sep 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    QUIXOTIX
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides insights into the most popular tracks of renowned Tamil music artists on Spotify, and popularity over time. The data includes information about artists such as Ilayaraja, A.R. Rahman, Yuvan Shankar Raja, Harris Jayaraj, Vijay Antony, G.V. Prakash, Santhosh Narayanan, Anirudh Ravichander, Hiphop Tamizha, Sean Roldan, Sam C.S., Devi Sri Prasad, Ghibhran and Thaman.

    The dataset includes:

    • Top Tracks: Track names, popularity scores, album details, and release dates.
    • Artist Popularity Over Time: Aggregated popularity scores for each artist per year, allowing analysis of trends and the identification of the top trending artist for each year.
    • Track Duration and Popularity: Explore the relationship between the duration of tracks and their popularity.
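The per-year artist aggregation described above could be reproduced along the following lines. This is a sketch under stated assumptions: the column names artist, release_date and popularity are hypothetical (the listing does not give the actual schema), the sample rows are invented, and the mean is used as the aggregate because the description does not say which aggregation was applied.

```python
from collections import defaultdict

# Hypothetical rows; names and values are placeholders, not real data.
rows = [
    {"artist": "Artist X", "release_date": "2019-05-01", "popularity": 60},
    {"artist": "Artist X", "release_date": "2019-11-20", "popularity": 70},
    {"artist": "Artist Y", "release_date": "2019-03-15", "popularity": 80},
    {"artist": "Artist Y", "release_date": "2020-01-10", "popularity": 50},
]


def mean_popularity_by_artist_year(rows):
    """Average track popularity for each (artist, release year) pair."""
    buckets = defaultdict(list)
    for row in rows:
        # Assumption: the date string starts with a four-digit year.
        year = int(row["release_date"][:4])
        buckets[(row["artist"], year)].append(row["popularity"])
    return {pair: sum(vals) / len(vals) for pair, vals in buckets.items()}


def top_artist_per_year(means):
    """Pick the artist with the highest mean popularity in each year."""
    best = {}
    for (artist, year), score in means.items():
        if year not in best or score > best[year][1]:
            best[year] = (artist, score)
    return best


means = mean_popularity_by_artist_year(rows)
top = top_artist_per_year(means)
```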

    This dataset is ideal for:

    • Data Analysis: Analyze trends in Tamil music over the years, identify top-performing artists, and examine the factors contributing to the popularity of specific tracks.
    • Visualization Projects: Create visualizations of artist popularity trends, track performance, and other key insights.
    • Music Industry Insights: Gain a better understanding of the success of Tamil music artists on global streaming platforms like Spotify.

    This dataset was collected using the Spotify API and is intended to support projects in data science, music analytics, and industry trend analysis.

  3. Spotify Song Attributes

    • kaggle.com
    Updated Aug 4, 2017
    GeorgeMcIntire (2017). Spotify Song Attributes [Dataset]. https://www.kaggle.com/forums/f/5360/spotify-song-attributes
    Dataset updated
    Aug 4, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    GeorgeMcIntire
    Description

    Context

    A dataset of 2017 songs with attributes from Spotify's API. Each song is labeled "1" if I like it and "0" if I don't. I used this data to see if I could build a classifier that could predict whether or not I would like a song.

    I wrote an article about the project I used this data for. It includes code on how to grab this data from the Spotipy API wrapper and the methods behind my modeling. https://opendatascience.com/blog/a-machine-learning-deep-dive-into-my-spotify-data/

    Content

    Each row represents a song.

    There are 16 columns: 13 are song attributes, one is the song name, one is the artist, and one, called "target", is the label for the song.

    Here are the 13 track attributes: acousticness, danceability, duration_ms, energy, instrumentalness, key, liveness, loudness, mode, speechiness, tempo, time_signature, valence.

    Information on what those traits mean can be found here: https://developer.spotify.com/web-api/get-audio-features/
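For a classifier like the one described, the 13 attributes and the "target" label first need to be separated. The sketch below shows one way to do that with the standard library. The attribute names and "target" come from the description above; the song_title and artist column names, the column order, and the single sample row are assumptions made for illustration.

```python
import csv
import io

# The 13 track attributes listed in the description.
ATTRIBUTES = [
    "acousticness", "danceability", "duration_ms", "energy",
    "instrumentalness", "key", "liveness", "loudness", "mode",
    "speechiness", "tempo", "time_signature", "valence",
]

# Hypothetical one-row CSV standing in for the real file; values are made up.
SAMPLE_CSV = (
    ",".join(ATTRIBUTES + ["target", "song_title", "artist"]) + "\n"
    "0.01,0.8,200000,0.4,0.0,2,0.1,-8.0,1,0.05,120.0,4,0.3,1,"
    "Example Song,Example Artist\n"
)


def split_features_and_target(rows):
    """Return (X, y): numeric feature vectors and the 0/1 like labels."""
    X, y = [], []
    for row in rows:
        # csv.DictReader yields strings, so convert each value explicitly.
        X.append([float(row[name]) for name in ATTRIBUTES])
        y.append(int(row["target"]))
    return X, y


reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
X, y = split_features_and_target(reader)
```

The resulting X and y lists can be passed to whichever classifier library is preferred; the name and artist columns are left out because they are identifiers rather than numeric features.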

    Acknowledgements

    I would like to thank Spotify for providing this readily accessible data.

    Inspiration

    I'm a music lover who's curious about why I love the music that I love.

  4. Audio Features for Playlist Creation

    • kaggle.com
    Updated Mar 3, 2017
    Aniruddha Achar (2017). Audio Features for Playlist Creation [Dataset]. https://www.kaggle.com/aniruddhaachar/audio-features/metadata
    Dataset updated
    Mar 3, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aniruddha Achar
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    This data was compiled as part of our undergrad project that used machine learning to classify songs based on the themes or activities songs are associated with. For the project, four activities were chosen.

    1. Dinner: Songs that sound good when played in a dinner setting or at a restaurant.
    2. Sleep: Songs that promote sleep when they are played.
    3. Party: Songs that sound good when played at a party.
    4. Workout: Songs that sound good when one is exercising/ working out.

    The collection of data started with collecting playlist details from Spotify. The Spotify web API was used to collect the playlists for each category. Track title, album name and artist names were used to extract low-level and high-level audio features like MFCC, Spectral Centroid, Spectral Roll-off, Spectral Bandwidth, Tempo, Spectral Contrast and Root Mean Square Energy of the songs. For ease of computation, the mean of the values was calculated and added to the tables.

    Data was also curated using Spotify's audio analysis API. A larger set of songs is part of this data set.

    Content

    The data set has eight tables.

    1. Four tables with names playlist_audio_features have the signal processing features like MFCC, spectral centroid, etc.
    2. Four more tables with names playlist_spotify_features have the data extracted from Spotify's audio feature API. These tables have a larger number of features. The data set size is quite large.

    Description of the "playlist"_audio_features columns:

    1. The first column has the simple integer id of the track. (This id is local to that file.)
    2. The second column has the name of the track.
    3. The third column is named mfcc: This is the mean of the calculated MFCC for that track. 20 MFC coefficients were extracted from one frame of the track.
    4. The fourth column is named scem: This is the mean of Spectral centroid. Spectral centroid was calculated for each frame.
    5. The fifth column is named scom: This is the mean of Spectral contrast. Spectral contrast was calculated for each frame.
    6. The sixth column is named srom: This is the mean of Spectral Roll-off. Spectral roll-off was calculated for each frame.
    7. The seventh column is named sbwm: This is the mean of Spectral Bandwidth. Spectral Bandwidth was calculated for each frame.
    8. The eighth column is named tempo: This is the estimated tempo of the track.
    9. The ninth column is named rmse: This is the mean of the RMSE, which was calculated for each frame.
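Several of the columns above are means of a value computed once per frame. The idea can be sketched in plain Python for the simplest of them, root mean square energy. This is not the authors' extraction code (they credit Librosa for that); the frame length, hop length and sample signal below are arbitrary assumptions chosen to keep the example small.

```python
import math


def frame_rms(samples, frame_length, hop_length):
    """Root mean square energy of each full frame of the signal."""
    values = []
    # Only full frames are used; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_length))
    return values


def mean_rms(samples, frame_length=4, hop_length=2):
    """Collapse the per-frame values into one number per track."""
    values = frame_rms(samples, frame_length, hop_length)
    return sum(values) / len(values) if values else 0.0


# Hypothetical 8-sample signal alternating between +1 and -1.
signal = [1.0, -1.0] * 4
per_frame = frame_rms(signal, frame_length=4, hop_length=2)
track_value = mean_rms(signal)
```

Averaging over frames gives a single fixed-size value per track, at the cost of discarding how the feature changes over the course of the song.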

    Description of the

    1. id: This is the Spotify id of the track.
    2. name: This is the name of the track.
    3. url: This is a Spotify uri of the track.
    4. artist: This is the one or more artists who worked on the track.
    5-13. A description of each of these columns can be found at https://developer.spotify.com/web-api/get-audio-features/

    Acknowledgements

    We would like to thank Librosa, an open-source audio feature extraction library in Python, for developing a great tool. We would also like to thank the large body of research on music genre classification using audio features, which helped us in developing this data set as well as the classification. A special thanks to Spotify.


Spotify Tracks Attributes and Popularity

A feature-rich dataset of Spotify tracks (audio characteristics & genre labels)
