4 datasets found

Spotify Tracks Attributes and Popularity

kaggle.com

Updated Jul 9, 2025

Melissa Monfared (2025). Spotify Tracks Attributes and Popularity [Dataset]. https://www.kaggle.com/datasets/melissamonfared/spotify-tracks-attributes-and-popularity

Dataset updated

Jul 9, 2025

Dataset provided by

Kaggle

Authors

Melissa Monfared

Description

About Dataset

Overview:

This dataset provides detailed metadata and audio analysis for a wide collection of Spotify music tracks across various genres. It includes track-level information such as popularity, tempo, energy, danceability, and other musical features that can be used for music recommendation systems, genre classification, or trend analysis. The dataset is a rich source for exploring music consumption patterns and user preferences based on song characteristics.

Dataset Details:

This dataset contains rows of individual music tracks, each described by both metadata (such as track name, artist, album, and genre) and quantitative audio features. These features reflect different musical attributes such as energy, acousticness, instrumentalness, valence, and more, making it ideal for audio machine learning projects and exploratory data analysis.

Schema and Column Descriptions:

Column Name	Description
`index`	Unique index for each track (can be ignored for analysis)
`track_id`	Spotify's unique identifier for the track
`artists`	Name of the performing artist(s)
`album_name`	Title of the album the track belongs to
`track_name`	Title of the track
`popularity`	Popularity score on Spotify (0–100 scale)
`duration_ms`	Duration of the track in milliseconds
`explicit`	Indicates whether the track contains explicit content
`danceability`	How suitable the track is for dancing (0.0 to 1.0)
`energy`	Intensity and activity level of the track (0.0 to 1.0)
`key`	Musical key (0 = C, 1 = C♯/D♭, …, 11 = B)
`loudness`	Overall loudness of the track in decibels (dB)
`mode`	Modality (major = 1, minor = 0)
`speechiness`	Presence of spoken words in the track (0.0 to 1.0)
`acousticness`	Confidence measure of whether the track is acoustic (0.0 to 1.0)
`instrumentalness`	Predicts whether the track contains no vocals (0.0 to 1.0)
`liveness`	Presence of an audience in the recording (0.0 to 1.0)
`valence`	Musical positivity conveyed (0.0 = sad, 1.0 = happy)
`tempo`	Estimated tempo in beats per minute (BPM)
`time_signature`	Time signature of the track (e.g., 4 = 4/4)
`track_genre`	Assigned genre label for the track
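
The documented ranges in the table above can serve as a quick sanity check before analysis. The following is a minimal sketch, assuming each row is available as a dictionary keyed by the column names listed above; the function name `find_range_violations` and the example row are hypothetical and not part of the dataset.

```python
# Columns documented above as ranging from 0.0 to 1.0.
UNIT_RANGE_COLUMNS = [
    "danceability", "energy", "speechiness", "acousticness",
    "instrumentalness", "liveness", "valence",
]


def find_range_violations(row):
    """Return the names of columns whose values fall outside the documented range."""
    violations = []
    for column in UNIT_RANGE_COLUMNS:
        if not 0.0 <= float(row[column]) <= 1.0:
            violations.append(column)
    if not 0 <= float(row["popularity"]) <= 100:
        violations.append("popularity")
    if int(row["key"]) not in range(12):
        violations.append("key")
    if int(row["mode"]) not in (0, 1):
        violations.append("mode")
    return violations


# Hypothetical row invented for illustration (not taken from the dataset).
example_row = {
    "danceability": 0.7, "energy": 1.2, "speechiness": 0.05,
    "acousticness": 0.1, "instrumentalness": 0.0, "liveness": 0.2,
    "valence": 0.6, "popularity": 55, "key": 7, "mode": 1,
}
print(find_range_violations(example_row))
```

Rows that fail such a check are not necessarily wrong; they are simply worth inspecting before modeling.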

Key Features:

Comprehensive Track Data: Metadata combined with detailed audio analysis.
Genre Diversity: Includes tracks from various music genres.
Audio Feature Rich: Suitable for audio classification, recommendation engines, or clustering.
Machine Learning Friendly: Clean and numerical format ideal for ML models.

Usage:

This dataset is valuable for:

🎵 Music Recommendation Systems: Building collaborative or content-based recommenders.
📊 Data Visualization & Dashboards: Analyzing genre or mood trends over time.
🤖 Machine Learning Projects: Predicting song popularity or clustering similar tracks.
🧠 Music Psychology & Behavioral Studies: Exploring how music features relate to emotions or behavior.
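
As an illustration of the content-based recommender use case listed above, the sketch below ranks tracks by cosine similarity over a few audio features. It is one possible approach under stated assumptions, not a method prescribed by the dataset: the track names and feature values are invented, and the choice of three features is arbitrary.

```python
import math

# Subset of the audio feature columns; the choice here is an assumption.
FEATURES = ["danceability", "energy", "valence"]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(seed_name, tracks, top_n=2):
    """Return the names of the top_n tracks most similar to the seed track."""
    seed = [tracks[seed_name][f] for f in FEATURES]
    scored = [
        (cosine_similarity(seed, [feats[f] for f in FEATURES]), name)
        for name, feats in tracks.items()
        if name != seed_name
    ]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_n]]


# Hypothetical tracks invented for illustration.
TRACKS = {
    "track_a": {"danceability": 0.9, "energy": 0.8, "valence": 0.7},
    "track_b": {"danceability": 0.85, "energy": 0.75, "valence": 0.8},
    "track_c": {"danceability": 0.2, "energy": 0.1, "valence": 0.3},
    "track_d": {"danceability": 0.3, "energy": 0.9, "valence": 0.1},
}
print(most_similar("track_a", TRACKS, top_n=1))
```

Cosine similarity compares the direction of the feature vectors rather than their magnitude, so columns on other scales (such as `tempo` or `loudness`) would likely need normalizing before being added.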

Data Maintenance:

Source: https://huggingface.co/datasets/ConquestAce/spotify-songs
Last Updated: 2025/04/26
License: Check original source for usage terms.

Additional Notes:

This dataset can be enhanced by merging it with user listening behavior data, lyrics datasets, or chart positions for more advanced analysis.
Some columns like key, mode, and explicit may need to be mapped for better readability in visualization.
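
The mapping mentioned in the note above might look like the following sketch. The pitch-class names follow the `key` encoding in the schema (0 = C through 11 = B) and the `mode` encoding (major = 1, minor = 0). How `explicit` is stored is not stated in the description, so the handling of booleans and "true"/"1" strings here is an assumption, as are the function names.

```python
# Pitch classes in the order used by the `key` column (0 = C, ..., 11 = B).
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]
MODE_NAMES = {0: "minor", 1: "major"}


def describe_key(key, mode):
    """Turn integer key/mode codes into a label such as 'C major'."""
    if key not in range(len(PITCH_CLASSES)) or mode not in MODE_NAMES:
        return "unknown"
    return f"{PITCH_CLASSES[key]} {MODE_NAMES[mode]}"


def describe_explicit(explicit):
    """Assumes `explicit` is a boolean or a 'true'/'1'-style string."""
    is_explicit = str(explicit).strip().lower() in ("true", "1")
    return "explicit" if is_explicit else "clean"


print(describe_key(9, 0))
print(describe_explicit(False))
```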

Tamil Music Artists and Their Top Spotify Tracks
kaggle.com
Updated Sep 1, 2024
QUIXOTIX (2024). Tamil Music Artists and Their Top Spotify Tracks [Dataset]. https://www.kaggle.com/datasets/akashsiddharth/spotify-tamil-artists-top-tracks-popularity/code
Dataset updated
Sep 1, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
QUIXOTIX
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset provides insights into the most popular tracks of renowned Tamil music artists on Spotify and their popularity over time. The data includes information about artists such as Ilayaraja, A.R. Rahman, Yuvan Shankar Raja, Harris Jayaraj, Vijay Antony, G.V. Prakash, Santhosh Narayanan, Anirudh Ravichander, Hiphop Tamizha, Sean Roldan, Sam C.S., Devi Sri Prasad, Ghibhran and Thaman.

The dataset includes:

Top Tracks: Track names, popularity scores, album details, and release dates.

Artist Popularity Over Time: Aggregated popularity scores for each artist per year, allowing analysis of trends and the identification of the top trending artist for each year.

Track Duration and Popularity: Explore the relationship between the duration of tracks and their popularity.
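
The per-year artist aggregation described above could be reproduced along these lines. This is a sketch under assumptions: the description does not say how scores were aggregated, so the mean is used here, and the field names `artist`, `release_date`, and `popularity` as well as the sample rows are hypothetical.

```python
from collections import defaultdict


def top_artist_per_year(rows):
    """Map each release year to (artist, mean popularity) for the top artist."""
    totals = defaultdict(lambda: [0.0, 0])  # (year, artist) -> [sum, count]
    for row in rows:
        year = int(row["release_date"][:4])  # works for YYYY or YYYY-MM-DD
        entry = totals[(year, row["artist"])]
        entry[0] += row["popularity"]
        entry[1] += 1

    best = {}
    for (year, artist), (total, count) in totals.items():
        mean_popularity = total / count
        if year not in best or mean_popularity > best[year][1]:
            best[year] = (artist, mean_popularity)
    return best


# Hypothetical rows with placeholder artist names, invented for illustration.
sample_rows = [
    {"artist": "Artist A", "release_date": "2019-05-01", "popularity": 60},
    {"artist": "Artist A", "release_date": "2019-11-20", "popularity": 80},
    {"artist": "Artist B", "release_date": "2019-02-14", "popularity": 75},
    {"artist": "Artist A", "release_date": "2020", "popularity": 50},
]
print(top_artist_per_year(sample_rows))
```

Swapping the mean for a sum or a maximum changes which artist counts as "top trending", so the aggregation choice is worth stating in any analysis built on it.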

This dataset is ideal for:

Data Analysis: Analyze trends in Tamil music over the years, identify top-performing artists, and examine the factors contributing to the popularity of specific tracks.

Visualization Projects: Create visualizations of artist popularity trends, track performance, and other key insights.

Music Industry Insights: Gain a better understanding of the success of Tamil music artists on global streaming platforms like Spotify.

This dataset was collected using the Spotify API and is intended to support projects in data science, music analytics, and industry trend analysis.
Spotify Song Attributes
kaggle.com
Updated Aug 4, 2017
GeorgeMcIntire (2017). Spotify Song Attributes [Dataset]. https://www.kaggle.com/forums/f/5360/spotify-song-attributes
Dataset updated
Aug 4, 2017
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
GeorgeMcIntire
Description
Context

A dataset of 2017 songs with attributes from Spotify's API. Each song is labeled "1" for songs I like and "0" for songs I don't like. I used this data to see if I could build a classifier that could predict whether or not I would like a song.

I wrote an article about the project I used this data for. It includes code on how to grab this data from the Spotipy API wrapper and the methods behind my modeling. https://opendatascience.com/blog/a-machine-learning-deep-dive-into-my-spotify-data/

Content

Each row represents a song.

There are 16 columns: 13 are song attributes, one is the song name, one is the artist, and one, called "target", is the label for the song.

Here are the 13 track attributes: acousticness, danceability, duration_ms, energy, instrumentalness, key, liveness, loudness, mode, speechiness, tempo, time_signature, valence.
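
To show how the layout described above feeds a classifier, the sketch below separates the 13 attributes from the "target" label and computes a majority-class baseline. It is not the author's model (that is covered in the linked article); the helper names and placeholder rows are hypothetical, and the rows contain invented values rather than data from the file.

```python
# The 13 track attributes listed above.
ATTRIBUTES = [
    "acousticness", "danceability", "duration_ms", "energy",
    "instrumentalness", "key", "liveness", "loudness", "mode",
    "speechiness", "tempo", "time_signature", "valence",
]


def split_features_and_labels(rows):
    """Separate the attribute columns from the 'target' label column."""
    features = [[float(row[name]) for name in ATTRIBUTES] for row in rows]
    labels = [int(row["target"]) for row in rows]
    return features, labels


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common label."""
    likes = sum(labels)
    return max(likes, len(labels) - likes) / len(labels)


def placeholder_row(target, **attributes):
    """Build a made-up row; unspecified attributes default to 0.0."""
    row = {name: 0.0 for name in ATTRIBUTES}
    row.update(attributes)
    row["target"] = target
    return row


rows = [
    placeholder_row(1, energy=0.8, tempo=120.0),
    placeholder_row(0, energy=0.3, tempo=90.0),
    placeholder_row(1, energy=0.6, tempo=100.0),
]
features, labels = split_features_and_labels(rows)
print(len(features[0]), labels, majority_baseline_accuracy(labels))
```

Any trained classifier should beat this baseline to be considered useful.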

Information on what those traits mean can be found here: https://developer.spotify.com/web-api/get-audio-features/

Acknowledgements

I would like to thank Spotify for providing this readily accessible data.

Inspiration

I'm a music lover who's curious about why I love the music that I love.
Audio Features for Playlist Creation
kaggle.com
Updated Mar 3, 2017
Aniruddha Achar (2017). Audio Features for Playlist Creation [Dataset]. https://www.kaggle.com/aniruddhaachar/audio-features/metadata
Dataset updated
Mar 3, 2017
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Aniruddha Achar
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context

This data was compiled as part of our undergrad project that used machine learning to classify songs based on the themes or activities the songs are associated with. For the project, four activities were chosen.

Dinner: Songs that sound good when played in a dinner setting or at a restaurant.

Sleep: Songs that promote sleep when they are played.

Party: Songs that sound good when played at a party.

Workout: Songs that sound good when one is exercising/ working out.

The collection of data started with collecting playlist details from Spotify. The Spotify web API was used to collect the playlists for each category. Track title, album name and artist names were used to extract low-level and high-level audio features such as MFCC, Spectral Centroid, Spectral Roll-off, Spectral Bandwidth, Tempo, Spectral Contrast and Root Mean Square Energy of the songs. For ease of computation, the mean of the values was calculated and added to the tables.

Data was also curated using Spotify's audio analysis API. A larger set of songs is part of this data set.

Content

The data set has eight tables.

Four tables with names playlist_audio_features have the signal processing features like MFCC, spectral centroid etc.

Four more tables with names playlist_spotify_features have the data extracted from Spotify's audio feature API. These tables have a larger number of features. The data set size is quite large.

Description of the "playlist"_audio_features columns:

The first column has the simple integer id of the track. (This id is local to that file.)

The second column has the name of the track.

The third column is named mfcc: This is the mean of the calculated MFCC for that track. 20 MFC coefficients were extracted from one frame of the track.

The fourth column is named scem: This is the mean of Spectral centroid. Spectral centroid was calculated for each frame.

The fifth column is named scom: This is the mean of Spectral contrast. Spectral contrast was calculated for each frame.

The sixth column is named srom: This is the mean of Spectral Roll-off. Spectral roll-off was calculated for each frame.

The seventh column is named sbwm: This is the mean of Spectral Bandwidth. Spectral Bandwidth was calculated for each frame.

The eighth column is named tempo: This is the estimated tempo of the track.

The ninth column is named rmse: This is the mean of the RMSE (Root Mean Square Energy). RMSE was calculated for each frame.
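
The "calculated for each frame, then averaged" pattern used for columns such as rmse can be sketched in plain Python as below. This is only an illustration of the idea: the authors credit Librosa for feature extraction, and this code does not reproduce Librosa's framing or padding. The function names, frame length, hop length, and sample signal are all assumptions.

```python
import math


def frame_rms(samples, frame_length, hop_length):
    """Root mean square energy of each full frame of the signal."""
    values = []
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frame = samples[start:start + frame_length]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_length))
    return values


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical signal invented for illustration: a square wave of amplitude 1.
signal = [1.0, -1.0] * 4
per_frame = frame_rms(signal, frame_length=4, hop_length=2)
print(per_frame, mean(per_frame))
```

Reducing each per-frame series to a single mean keeps the tables small, at the cost of discarding how a feature changes over the course of a track.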

Description of the

id: This is the Spotify id of the track.

name: This is the name of the track.

url: This is a Spotify uri of the track.

artist: This is one or more artists who worked on the track.

5-13: A description of each of these columns can be found at https://developer.spotify.com/web-api/get-audio-features/

Acknowledgements

We would like to thank Librosa, an open-source audio feature extraction library in Python, for developing a great tool. We would also like to thank the large body of research on music genre classification using audio features, which helped us in developing this data set as well as the classification. A special thanks to Spotify