This dataset provides detailed metadata and audio analysis for a wide collection of Spotify music tracks across various genres. It includes track-level information such as popularity, tempo, energy, danceability, and other musical features that can be used for music recommendation systems, genre classification, or trend analysis. The dataset is a rich source for exploring music consumption patterns and user preferences based on song characteristics.
This dataset contains rows of individual music tracks, each described by both metadata (such as track name, artist, album, and genre) and quantitative audio features. These features reflect different musical attributes such as energy, acousticness, instrumentalness, valence, and more, making it ideal for audio machine learning projects and exploratory data analysis.
Column Name | Description |
---|---|
index | Unique index for each track (can be ignored for analysis) |
track_id | Spotify's unique identifier for the track |
artists | Name of the performing artist(s) |
album_name | Title of the album the track belongs to |
track_name | Title of the track |
popularity | Popularity score on Spotify (0–100 scale) |
duration_ms | Duration of the track in milliseconds |
explicit | Indicates whether the track contains explicit content |
danceability | How suitable the track is for dancing (0.0 to 1.0) |
energy | Intensity and activity level of the track (0.0 to 1.0) |
key | Musical key (0 = C, 1 = C♯/D♭, …, 11 = B) |
loudness | Overall loudness of the track in decibels (dB) |
mode | Modality (major = 1, minor = 0) |
speechiness | Presence of spoken words in the track (0.0 to 1.0) |
acousticness | Confidence measure of whether the track is acoustic (0.0 to 1.0) |
instrumentalness | Predicts whether the track contains no vocals (0.0 to 1.0) |
liveness | Presence of an audience in the recording (0.0 to 1.0) |
valence | Musical positivity conveyed (0.0 = sad, 1.0 = happy) |
tempo | Estimated tempo in beats per minute (BPM) |
time_signature | Time signature of the track (e.g., 4 = 4/4) |
track_genre | Assigned genre label for the track |
This dataset is valuable for:
key, mode, and explicit may need to be mapped for better readability in visualization.
https://creativecommons.org/publicdomain/zero/1.0/
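The mapping of key, mode, and explicit mentioned above could look roughly like the sketch below. It uses only the encodings given in the column table (0 = C through 11 = B, major = 1, minor = 0). The function name `label_track`, the "explicit"/"clean" labels, and the handling of out-of-range keys are assumptions made for illustration, not part of the dataset.

```python
# Pitch-class names in the order used by the key column (0 = C ... 11 = B).
KEY_NAMES = ["C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
             "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B"]
MODE_NAMES = {0: "minor", 1: "major"}


def label_track(row):
    """Return a copy of `row` with readable key, mode, and explicit labels."""
    labeled = dict(row)
    key = int(row["key"])
    # Assumption: any value outside 0 to 11 (for example -1) means "no key detected".
    labeled["key"] = KEY_NAMES[key] if 0 <= key <= 11 else "unknown"
    labeled["mode"] = MODE_NAMES.get(int(row["mode"]), "unknown")
    # Assumption: explicit may arrive as a bool or as the text "True"/"False".
    is_explicit = str(row["explicit"]).strip().lower() == "true"
    labeled["explicit"] = "explicit" if is_explicit else "clean"
    return labeled


# Hypothetical row, only to show the shape of the result.
example = {"track_name": "demo", "key": 1, "mode": 0, "explicit": False}
print(label_track(example))
```

The same dictionaries can be passed to a column-wise mapping step in whatever dataframe library is in use.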
This dataset provides insights into the most popular tracks of renowned Tamil music artists on Spotify and how their popularity has changed over time. The data includes information about artists such as Ilayaraja, A.R. Rahman, Yuvan Shankar Raja, Harris Jayaraj, Vijay Antony, G.V. Prakash, Santhosh Narayanan, Anirudh Ravichander, Hiphop Tamizha, Sean Roldan, Sam C.S., Devi Sri Prasad, Ghibhran and Thaman.
The dataset includes:
This dataset is ideal for:
This dataset was collected using the Spotify API and is intended to support projects in data science, music analytics, and industry trend analysis.
A dataset of 2017 songs with attributes from Spotify's API. Each song is labeled "1" for songs I like and "0" for songs I don't like. I used this data to see if I could build a classifier that could predict whether or not I would like a song.
I wrote an article about the project I used this data for. It includes code on how to grab this data from the Spotipy API wrapper and the methods behind my modeling. https://opendatascience.com/blog/a-machine-learning-deep-dive-into-my-spotify-data/
Each row represents a song.
There are 16 columns: 13 are song attributes, one holds the song name, one holds the artist, and one called "target" holds the label for the song.
Here are the 13 track attributes: acousticness, danceability, duration_ms, energy, instrumentalness, key, liveness, loudness, mode, speechiness, tempo, time_signature, valence.
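As a rough sketch of how this layout could feed a classifier, the snippet below separates the 13 attributes from the "target" label using only the standard library. The column names `song_title` and `artist`, the function name, and the two sample rows are made up for illustration; only the 13 attribute names and "target" come from the description.

```python
import csv
import io

# The 13 track attributes listed in the dataset description.
FEATURES = ["acousticness", "danceability", "duration_ms", "energy",
            "instrumentalness", "key", "liveness", "loudness", "mode",
            "speechiness", "tempo", "time_signature", "valence"]


def split_features_and_labels(csv_file):
    """Read rows and return (X, y): numeric feature rows and 0/1 labels."""
    X, y = [], []
    for row in csv.DictReader(csv_file):
        X.append([float(row[name]) for name in FEATURES])
        y.append(int(row["target"]))
    return X, y


# Build a tiny in-memory CSV with invented values, standing in for the real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(FEATURES + ["song_title", "artist", "target"])  # hypothetical names
writer.writerow([0.01, 0.83, 204600, 0.43, 0.02, 2, 0.17, -8.8, 1,
                 0.43, 150.1, 4, 0.29, "Song A", "Artist A", 1])
writer.writerow([0.20, 0.74, 326933, 0.36, 0.01, 1, 0.14, -10.4, 1,
                 0.08, 160.1, 4, 0.59, "Song B", "Artist B", 0])
buffer.seek(0)

X, y = split_features_and_labels(buffer)
print(len(X), "rows,", len(X[0]), "features, labels:", y)
```

The resulting `X` and `y` lists are in the shape most classifier libraries expect, one feature row and one label per song.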
Information on what those traits mean can be found here: https://developer.spotify.com/web-api/get-audio-features/
I would like to thank Spotify for providing this readily accessible data.
I'm a music lover who's curious about why I love the music that I love.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data was compiled as part of our undergraduate project, which used machine learning to classify songs based on the themes or activities they are associated with. For the project, four activities were chosen.
Data collection started with gathering playlist details from Spotify. The Spotify Web API was used to collect the playlists for each category. Track title, album name, and artist names were used to extract low-level and high-level audio features of the songs, such as MFCC, spectral centroid, spectral roll-off, spectral bandwidth, tempo, spectral contrast, and root mean square energy. For ease of computation, the mean of the values was calculated and added to the tables.
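The averaging step described above can be sketched as follows. This is not the authors' code: it assumes each feature arrives as a matrix with one row per coefficient and one column per analysis frame, and the output column names (such as `mfcc_0_mean`) and the sample numbers are invented for illustration.

```python
from statistics import fmean


def mean_per_coefficient(matrix):
    """Collapse a (coefficients x frames) matrix to one mean per coefficient."""
    return [fmean(frames) for frames in matrix]


def summarize(features):
    """Turn per-frame features into a flat dict of per-song mean values."""
    summary = {}
    for name, matrix in features.items():
        for i, value in enumerate(mean_per_coefficient(matrix)):
            # Hypothetical naming scheme: <feature>_<coefficient index>_mean
            summary[f"{name}_{i}_mean"] = value
    return summary


# Invented per-frame values for one song: two MFCC coefficients over three
# frames, and a single-row spectral centroid over two frames.
song_features = {
    "mfcc": [[1.0, 2.0, 3.0], [4.0, 6.0, 8.0]],
    "spectral_centroid": [[1000.0, 2000.0]],
}
print(summarize(song_features))
```

Averaging over frames gives every song a fixed-length row regardless of its duration, at the cost of discarding how the features change over time.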
Data was also curated using Spotify's audio analysis API. A larger set of songs is part of this data set.
The data set has eight tables.
We would like to thank Librosa, an open-source audio feature extraction library in Python, for developing a great tool. We would also like to thank the large body of research on music genre classification using audio features, which helped us develop this data set as well as the classification. A special thanks to Spotify.