ts (Timestamp):
platform:
ms_played:
conn_country:
master_metadata_track_name:
master_metadata_album_artist_name:
master_metadata_album_album_name:
spotify_track_uri:
reason_start:
reason_end:
shuffle:
offline:
offline_timestamp:
incognito_mode:
This dataset is suitable for performing detailed Exploratory Data Analysis (EDA) to uncover patterns, trends, and insights into the user's music-listening behaviour. Potential analyses could include the distribution of listening durations, favourite artists and tracks, exploration of geographic listening patterns, and examination of usage patterns across different platforms.
Visualization tools such as Matplotlib and Seaborn can be used to create visual representations of the findings. The dataset offers an opportunity to apply data science techniques to real-world streaming data.
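As a starting point for the analyses suggested above, here is a minimal sketch that totals listening time per artist and per platform using only the Python standard library. The sample rows are made up for illustration, and only a subset of the listed fields is used; a real export would be read from a file instead.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows using some of the field names listed above.
# Real exports contain many more rows and columns.
SAMPLE_CSV = """ts,platform,ms_played,master_metadata_track_name,master_metadata_album_artist_name
2021-03-01T08:15:00Z,android,215000,Song A,Artist X
2021-03-01T08:19:00Z,android,30000,Song B,Artist Y
2021-03-02T21:05:00Z,windows,180000,Song A,Artist X
2021-03-03T07:45:00Z,android,240000,Song C,Artist Y
"""


def summarize(rows):
    """Return total listening time in milliseconds per artist and per platform."""
    by_artist = Counter()
    by_platform = Counter()
    for row in rows:
        ms = int(row["ms_played"])
        by_artist[row["master_metadata_album_artist_name"]] += ms
        by_platform[row["platform"]] += ms
    return by_artist, by_platform


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_artist, by_platform = summarize(rows)

# Print artists ordered by total listening time, converted to minutes.
for artist, ms in by_artist.most_common():
    print(f"{artist}: {ms / 60000:.1f} min")
```

The same `Counter` pattern extends to other fields such as `conn_country` or `reason_end`, and the resulting totals can be passed to Matplotlib or Seaborn for plotting.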
License: https://brightdata.com/license
Gain valuable insights into music trends, artist popularity, and streaming analytics with our comprehensive Spotify Dataset. Designed for music analysts, marketers, and businesses, this dataset provides structured and reliable data from Spotify to enhance market research, content strategy, and audience engagement.
Dataset Features
- Track Information: Access detailed data on songs, including track name, artist, album, genre, and release date.
- Streaming Popularity: Extract track popularity scores, listener engagement metrics, and ranking trends.
- Artist & Album Insights: Analyze artist performance, album releases, and genre trends over time.
- Related Searches & Recommendations: Track related search terms and suggested content for deeper audience insights.
- Historical & Real-Time Data: Retrieve historical streaming data or access continuously updated records for real-time trend analysis.
Customizable Subsets for Specific Needs
Our Spotify Dataset is fully customizable, allowing you to filter data based on track popularity, artist, genre, release date, or listener engagement. Whether you need broad coverage for industry analysis or focused data for content optimization, we tailor the dataset to your needs.
Popular Use Cases
- Market Analysis & Trend Forecasting: Identify emerging music trends, genre popularity, and listener preferences.
- Artist & Label Performance Tracking: Monitor artist rankings, album success, and audience engagement.
- Competitive Intelligence: Analyze competitor music strategies, playlist placements, and streaming performance.
- AI & Machine Learning Applications: Use structured music data to train AI models for recommendation engines, playlist curation, and predictive analytics.
- Advertising & Sponsorship Insights: Identify high-performing tracks and artists for targeted advertising and sponsorship opportunities.
Whether you're optimizing music marketing, analyzing streaming trends, or enhancing content strategies, our Spotify Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
License: https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Nilay Gaitonde
Released under CC0: Public Domain
This dataset was created by Imraan Virani
License: https://creativecommons.org/publicdomain/zero/1.0/
What makes a song popular on Spotify?
Do artist popularity and follower count influence track success more than audio features?
How do album types and release dates shape listening trends?
These were the questions that inspired me to build this dataset.
Using Spotify’s API, I collected data on over 8,700 tracks, capturing detailed metadata about songs, artists, and albums. This dataset is ideal for exploring the intersection of music analytics, artist influence, and streaming behavior.
This dataset contains one CSV file with over 8,700 rows. Each row represents a unique track and includes metadata across three dimensions: track, artist, and album.
| Column Name | Description |
|---|---|
| track_id | Unique identifier for the track |
| track_number | Track’s position on the album |
| track_popularity | Spotify popularity score (0–100) |
| track_duration_ms | Duration of the track in milliseconds |
| explicit | Whether the track contains explicit content |
| artist_name | Name of the performing artist |
| artist_popularity | Spotify popularity score for the artist |
| artist_followers | Number of Spotify followers for the artist |
| album_id | Unique identifier for the album |
| album_name | Name of the album |
| album_release_date | Original release date of the album |
| artist_genres | Genre tags associated with the artist |
| album_total_tracks | Total number of tracks on the album |
| album_type | Type of album (e.g., album, single, compilation) |
All data was collected using the Spotify Web API.
This dataset is intended for educational and research purposes only.
You can use this dataset to:
A cleaned version of the dataset (spotify_data_clean.csv) is now available. It includes:
The cleaned dataset (spotify_data_clean.csv) was generated through a multi-step SQL pipeline designed to ensure consistency, completeness, and usability for analysis. Below is a summary of the transformations applied:
- track_name.
- artist_name, artist_popularity, artist_followers, and artist_genres using album-level joins (e.g., for albums like 1989).
- 'N/A' for strings
- 0 for numeric fields
- '[]' for genre arrays (temporary placeholder)
- track_name, artist_name, album_name, album_type, explicit.
- explicit values to uppercase (TRUE / FALSE).
- artist_genres using regex to remove brackets and quotes.
- 2020), appended -06-30 to estimate mid-year.
- 2020-07), appended -01 to complete the date.
- DATE format using STR_TO_DATE().
- track_duration_min by converting track_duration_ms to minutes.
- track_duration_ms column after conversion.
- artist_genres for well-known artists using manual overrides:
- country, pop, indie, folk
- pop rock, alternative pop, pop punk
- alternative pop, electropop, dark pop
- 'N/A'.
- ROW_NUMBER() over track_name, artist_name, album_name, and album_release_date to identify duplicates.
- row_num column.

This SQL workflow ensures the dataset is clean, consistent, and ready for exploratory data analysis, genre modeling, and public sharing. All transformations were verified using sample queries and profiling tools.
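The original pipeline was written in SQL and is not reproduced here. As a rough illustration of three of the steps summarized above (completing partial release dates, converting `track_duration_ms` to minutes, and removing duplicates), here is a Python sketch. The function names, the two-decimal rounding, and the choice to keep the first occurrence of each duplicate are assumptions, not details taken from the author's scripts.

```python
from datetime import date


def complete_release_date(raw):
    """Pad partial release dates as the summary describes.

    Year only ("2020")     -> mid-year estimate, 2020-06-30
    Year-month ("2020-07") -> first of the month, 2020-07-01
    Full date              -> parsed unchanged
    """
    raw = raw.strip()
    parts = raw.split("-")
    if len(parts) == 1:
        raw = f"{parts[0]}-06-30"
    elif len(parts) == 2:
        raw = f"{parts[0]}-{parts[1]}-01"
    return date.fromisoformat(raw)


def duration_minutes(track_duration_ms):
    """Convert milliseconds to minutes (rounding to 2 decimals is an assumption)."""
    return round(track_duration_ms / 60000, 2)


def drop_duplicates(rows):
    """Keep the first row per (track, artist, album, release date) key.

    This mirrors filtering on ROW_NUMBER() = 1; which duplicate the original
    pipeline kept is not stated, so "first" is an assumption.
    """
    seen = set()
    unique = []
    for row in rows:
        key = (
            row["track_name"],
            row["artist_name"],
            row["album_name"],
            row["album_release_date"],
        )
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```

For example, `complete_release_date("2020")` yields `date(2020, 6, 30)` and `duration_minutes(210000)` yields `3.5`.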
Explore genre trends and usage patterns in this companion notebook:
👉 Top Genres Using Pandas
Feel free to fork the dataset or share your analyses!
If you clean, enrich, or expand the dataset, contributions are always welcome.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains my entire Spotify streaming history from June 2020 to February 2025.
I requested this data directly from Spotify.
Use case of the Dataset:
datetime column so you can also perform some time series analysis (to some extent).
Welcome to the "Spotify Top Artists by Monthly Listeners" dataset! Dive into the world of music streaming with this comprehensive collection of data, which offers valuable insights into the artists dominating the digital airwaves on Spotify, one of the world's leading music streaming platforms.
Columns:
Artist: The "Artist" column contains the names or unique identifiers of musical artists. Each row in this column represents a specific artist. This column serves as the primary identifier for the artists in the dataset.
Listeners: The "Listeners" column represents the number of listeners or fans for each artist. It quantifies the artist's fan base or the total number of people who have engaged with their music.
Daily Trend: The "Daily Trend" column contains data that reflects the daily trend or change in popularity of each artist. It may include metrics or values indicating whether an artist is gaining or losing listeners on a daily basis. This metric helps to track an artist's current momentum and popularity.
Peak: The "Peak" column signifies the highest point of popularity or the peak level of engagement that an artist has achieved within a specific timeframe. It provides insights into the artist's historical performance and when they were most widely appreciated.
PkListeners: The "PkListeners" column represents the number of listeners an artist had at the peak of their popularity. This metric offers a specific quantitative measure of an artist's highest level of engagement with their audience.
This dataset is a goldmine for music enthusiasts, data analysts, and researchers eager to explore the dynamics of popularity and musical diversity on Spotify. It provides a rich source of information for tracking artist trends, analyzing genre preferences, and gaining a deeper understanding of the global music landscape.
Whether you're interested in uncovering emerging artists, studying the impact of genres, or simply exploring the musical tastes of Spotify's user base, this dataset offers a robust foundation for insightful analyses and engaging visualizations.
Join us on a data-driven journey through the Spotify music ecosystem as we uncover the artists captivating the ears of millions of listeners, and let the data guide your exploration of the vibrant and ever-evolving world of music. Happy analyzing!
Eminem is one of the most influential hip-hop artists of all time, and the Rap God. I acquired this data using Spotify APIs and supplemented it with other research to add to my own analysis. You can find my original analysis here: https://kaivalyapowale.com/2020/01/25/eminems-album-trends-and-music-to-be-murdered-by-2020/
My analysis was also published by top hip-hop websites:
- HipHop 24x7 - Data analysis reveals M2BMB is the most negative album
- Eminem Pro - Album's data analysis
- Eminem Pro - Eminem's albums are getting shorter
You can also check out visualizations on Tableau Public for some ideas: https://public.tableau.com/profile/kaivalya.powale#!/
I have primarily used data from Spotify’s API using multiple endpoints for albums and tracks. I supplemented the data with stats from Billboard and calculations from this post.
Here's the explanation for all the audio features provided by Spotify!
I have researched data about album sales from multiple sources online. They are cited in my original analysis.
Here are Spotify's Album endpoints. Charts data is from Billboard. Swear data is from this source.
I'd love to see new visualizations using this data or using the sales, swear, or duration for an analysis. It would be wonderful if someone compares this with other hip-hop greats.
A dataset of 2017 songs with attributes from Spotify's API. Each song is labeled "1" meaning I like it and "0" for songs I don't like. I used this data to see if I could build a classifier that could predict whether or not I would like a song.
I wrote an article about the project I used this data for. It includes code on how to grab this data from the Spotipy API wrapper and the methods behind my modeling. https://opendatascience.com/blog/a-machine-learning-deep-dive-into-my-spotify-data/
Each row represents a song.
There are 16 columns: 13 are song attributes, one is the song name, one is the artist, and one, called "target", is the label for the song.
Here are the 13 track attributes: acousticness, danceability, duration_ms, energy, instrumentalness, key, liveness, loudness, mode, speechiness, tempo, time_signature, valence.
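To make the column layout concrete, here is a minimal sketch of separating the 13 attributes from the "target" label before training a classifier. The two sample rows are made up, and the helper name `split_features_target` is hypothetical; the linked article describes the author's actual modeling code.

```python
# The 13 track attributes named in the description, in the order listed.
FEATURES = [
    "acousticness", "danceability", "duration_ms", "energy",
    "instrumentalness", "key", "liveness", "loudness", "mode",
    "speechiness", "tempo", "time_signature", "valence",
]


def split_features_target(rows):
    """Split dict-like rows into a feature matrix X and a label vector y."""
    X = [[float(row[name]) for name in FEATURES] for row in rows]
    y = [int(row["target"]) for row in rows]
    return X, y


# Made-up rows standing in for lines read with csv.DictReader.
rows = [
    {**{name: "0.5" for name in FEATURES}, "target": "1"},
    {**{name: "0.1" for name in FEATURES}, "target": "0"},
]
X, y = split_features_target(rows)
print(len(X), len(X[0]), y)
```

The song name and artist columns are left out of `X` on purpose, since they are identifiers rather than numeric attributes.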
Information on what those traits mean can be found here: https://developer.spotify.com/web-api/get-audio-features/
I would like to thank Spotify for providing this readily accessible data.
I'm a music lover who's curious about why I love the music that I love.
By Priyanka Dobhal [source]
This dataset contains information about the music of Billie Eilish on Spotify, including track name, acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness and tempo. It also includes data about the popularity of each song and the artist behind it. Each song is also uniquely identified using its URI. This dataset gives us insight into what characteristics make up Billie Eilish's music and how popular her songs are. With this dataset we can analyse what factors influence a song's popularity to better understand why some songs become hits while others don't get as much attention. We can also compare the features of her music to other artists' songs in order to find similarities and differences between them both in sound style and how much people listen to them
- Analyzing the effect of musical attributes (e.g. acousticness, danceability, energy, etc.) on listeners' engagement with a specific artist's music.
- Exploring the relationship between lyrical content and popularity of an artist's songs to discover potential trends in songwriting approaches that increase or decrease a song's chances of success.
- Finding correlations between lyrical and musical elements to gain insights into popular music trends over time or within Billie Eilish’s discography specifically
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors
- You are free to:
  - Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
  - Adapt: remix, transform, and build upon the material for any purpose, even commercially.
- You must:
  - Give appropriate credit: provide a link to the license, and indicate if changes were made.
  - ShareAlike: distribute your contributions under the same license as the original.
  - Keep intact: all notices that refer to this license, including copyright notices.
File: Billie_Eilish_Spotify.csv
| Column name | Description |
|---|---|
| album | The name of the album the song is from. (String) |
| track_number | The track number of the song on the album. (Integer) |
| uri | The unique identifier for the song. (String) |
| acousticness | A measure of how acoustic the song is. (Float) |
| danceability | A measure of how suitable a song is for dancing. (Float) |
| energy | A measure of the intensity and activity of a song. (Float) |
| instrumentalness | A measure of how much of the song is instrumental. (Float) |
| liveness | A measure of how much the song was performed live. (Float) |
| loudness | A measure of the volume of the song. (Float) |
| speechiness | A measure of how much the song contains spoken words. (Float) |
| tempo | The speed of the song. (Float) |
| valence | A measure of the positivity of the song. (Float) |
| popularity | A measure of how popular the song is. (Integer) |
| artist | The artist who produced and performs the song. (String) |

File: Billie_Eilish_Lyrics_to_words.csv
| Column name | Description |
|---|---|
| album | The name of the album the song is from. (String) |
| track_number | The track number of the song on the album. (Integer) |
| uri | The unique identifier for the song. (String) |
| artist | The artist who produced and performs the song. (String) |
License: https://cdla.io/sharing-1-0/
For more in-depth information about audio features provided by Spotify: https://developer.spotify.com/documentation/web-api/reference/#/operations/get-audio-features
I reposted my old dataset because many people requested it. I don't plan to update the dataset further.
Title: Spotify Dataset 1921-2020, 600k+ Tracks
Subtitle: Audio features of 600k+ tracks, popularity metrics of 1M+ artists
Source: Spotify Web API
Creator: Yamac Eren Ay
Release Date (of Last Version): April 2021
Link to this dataset: https://www.kaggle.com/yamaerenay/spotify-dataset-19212020-600k-tracks
Link to the old dataset: https://www.kaggle.com/yamaerenay/spotify-dataset-1921-2020-160k-tracks
I am not posting third-party Spotify data here for arbitrary reasons or to get upvotes.
The old dataset has been cited in tens of scientific papers using the old link, which has not worked since July 2021, and most of the authors had some problems proving the validity of the dataset. You can cite the same dataset under the new link. I'll be posting more information regarding the old dataset.
If you have inquiries or complaints, please don't hesitate to reach out to me on LinkedIn or you can send me an email.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains a collection of the most popular songs on Spotify, along with various attributes that can be used for music analysis and recommendation systems. It includes audio features, lyrical details, and general metadata about each track, making it an excellent resource for machine learning, data science, and music analytics projects.
Each song in the dataset includes the following features:
🎧 Audio Features (Extracted from Spotify API):
📝 Lyrics-Based Features:
🎶 General Song Information:
This dataset is ideal for:
Data collected using the Spotify API and other sources. If you use this dataset, consider crediting it in your projects!
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The Spotify Playlist Analysis Dataset is a collection of 3,232 different songs obtained from a personal music library and has been uploaded to Kaggle for analysis and exploration. The dataset offers a comprehensive overview of various tracks, enabling researchers, music enthusiasts, and data scientists to gain insights into the musical preferences and characteristics present in the library.
I've been diving into the vibrant world of data for a solid two years, and guess what? I'm finally cracking the code on what it takes to soar in this industry! Early in my data adventures, I was like a kid on Limewire when I found Kaggle, downloading everything that caught my eye. But then, I stumbled upon Spotify's data and... let's just say, it was a bit of a reality check.
I found myself wrestling with duplicate records, scratching my head over inconsistent schemas, and feeling lost in the sauce without any guides. That experience was a game-changer for me. I made a promise to my future self: “When you've got the skills, create a dataset that's not just good, but legendary.” That time has come!
Introducing my unique Spotify dataset – a crystal clear reflection of dedication and clarity. What makes this set stand out? You're not just getting data; you're getting a story. You can literally trace my steps, unraveling the magic behind each table through my script on Github. It's like having a backstage pass to a data concert! (Yes, Swifties will love this dataset too 😉)
I'm all about transparency, and I believe it's the key to trust. With this dataset, I'm laying it all out there: no smoke and mirrors, just pure, unadulterated, CLEAN data. I want you to feel the same excitement I do when data just clicks into place. I encourage you all to check out the GitHub repo I linked above to see how this dataset came to life!
If you have any questions, suggestions or simply want to network, reach out to me on LinkedIn
This dataset is created using data sourced from Spotify and adheres to their Terms of Use. The dataset is intended for non-commercial, academic purposes and does not infringe upon Spotify's intellectual property rights. For full details on Spotify's terms, please visit Spotify's Terms and Conditions of Use.
You can find documentation for Spotify's Web APIs here.
As of 12/20/2023, this is V1 of my data and I'll most likely release a few more versions after working through kinks from former releases.
Other Datasets: - Zillow
By Sean Miller [source]
The dataset consists of two main files: Scrobble_Features.csv and My Streaming Activity.csv. The Scrobble_Features.csv file contains detailed information about the music tracks, including genre, duration, popularity, and various audio features. On the other hand, the My Streaming Activity.csv file offers 4 years' worth of music streaming data from multiple platforms.
Key columns in these files include:
- Performer: The name of the performer or artist.
- Song: The title of the song.
- Album: The name of the album that each song belongs to.
- spotify_genre: The genre(s) assigned to each song according to Spotify's classification.
- spotify_track_preview_url: URLs providing previews for each song on Spotify.
- spotify_track_duration_ms: The duration of each song in milliseconds.
- spotify_track_popularity: A popularity score indicating how popular each track is on Spotify.
- spotify_track_explicit: A boolean value indicating whether or not a track contains explicit content.

Further musical attributes are also included:
- danceability: A measure determining how suitable a song is for dancing based on various musical elements.
- energy: An indicator measuring the intensity and activity level present in a song's composition.
- key: Identifies the key signature (e.g., C major) that each track is performed in.
- loudness: Reveals how loud or soft a given track is overall in decibels (dB).
- mode: Indicates whether a given track is composed in major or minor scale/mode.

These attributes aim to provide insights into different aspects of a song's overall composition and impact.
Additionally, this dataset offers information about the timestamps when streaming activities occurred in both Central Time Zone (TimeStamp_Central) and Coordinated Universal Time (UTC) (TimeStamp_UTC).
In this guide, we will walk you through how to effectively use this dataset for your analysis or projects. Let's get started!
Understanding the Columns
Before diving into analyzing the data, let's understand the meaning of each column in the dataset:
- Performer: The name of the performer or artist of the song.
- Song: The title of the song.
- spotify_genre: The genre(s) of the song according to Spotify.
- spotify_track_preview_url: The URL of a preview of the song on Spotify.
- spotify_track_duration_ms: The duration of the song in milliseconds.
- spotify_track_popularity: The popularity score of the song on Spotify. (Numeric/Integer)
- spotify_track_explicit: Indicates whether the song contains explicit content. (Boolean)
- danceability: A measure of how suitable a song is for dancing based on a combination of musical elements. (Numeric/Float)
- energy: A measure of the intensity and activity level present in a track. (Numeric/Float)
- key: The musical key of the track.
- loudness: How loud or quiet the track is overall, in decibels (dB).
- mode: The modality of the track; major is denoted by 1 and minor by 0.
- speechiness: Detects the presence of spoken words in a track.
- acousticness: The probability that the track is acoustic.
- instrumentalness: An estimate of how instrumental the track is.
- liveness: The probability that the track was recorded during a live performance.
- valence: The musical positivity or cheerfulness conveyed by a track; 1 is the most positive and 0 the least positive (presumably sad).
- tempo: The rate at which beats recur, in beats per minute (BPM).
- Music Recommendation System: This dataset can be used to develop a music recommendation system by analyzing the streaming activity and audio features of different songs. By understanding the preferences and listening habits of users, personalized music recommendations can be generated for individuals or households.
- Genre Analysis and Trends: The dataset provides information about the performer, genre, and popularity of songs. This data can be utilized to analyze trends in music genres over the years, identify popular artists in different genres, and understand the ...
This dataset aims to get data from the Spotify API and perform a data analysis on it. The inspiration behind this is to analyse the tracks of my favourite artist (A.R. Rahman), using the features Spotify provides for each track.
The dataset is created in Python, using the spotipy module to get data from the Spotify API and the pandas module to process it.
For more information and source code, please refer to the following GitHub repository link:
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
🎧 Can you predict who’s about to hit “unsubscribe”?
This dynamic Spotify churn dataset gives you the chance to find out. Packed with real-world user behavior — from listening time, skips, and ads seen to premium usage, plan types, and login activity — this dataset is your backstage pass into how users interact with a top music streaming platform.
With key demographic info (age, gender, country) and a clear churn indicator, it’s perfect for building powerful machine learning models, uncovering retention trends, or exploring the secret sauce behind loyal listeners.
Whether you're aiming to reduce churn, boost engagement, or showcase your data science skills, this dataset hits all the right notes. 🎶 Ready to turn data into decisions? Hit play.
License: https://creativecommons.org/publicdomain/zero/1.0/
By maharshipandya (From Huggingface) [source]
This dataset provides comprehensive information about Spotify tracks encompassing a diverse collection of 125 genres. It has been compiled and cleaned using Spotify's Web API and Python. Presented in CSV format, this dataset is easily accessible and amenable to analysis. The dataset comprises multiple columns, each representing distinctive audio features associated with individual tracks.
The columns include: artists (the name of the artist or artists who performed the track), album_name (the title of the album to which the track belongs), track_name (the specific name of each track), popularity (a numerical score indicating the popularity of a song on Spotify ranging from 0 to 100), duration_ms (the duration of each track measured in milliseconds), explicit (a boolean value denoting whether a song contains explicit content or not).
Furthermore, there are various audio features that provide deep insights into the musical characteristics of each track. These features include danceability, energy, key, loudness, mode, speechiness (indicating whether spoken words are present in a song), acousticness (measuring how much a song leans towards acoustic sounds rather than electric ones), instrumentalness (indicating how likely it is for a song to be instrumental rather than vocal-oriented).
Additional audio attributes encompass liveness, reflecting the presence or absence of live audience elements within tracks; valence quantifying musical positiveness conveyed by a song; tempo denoting beats per minute; and time_signature revealing details about bar structures within tracks.
The dataset enables users to discern patterns across multiple genres while also facilitating genre prediction based on perceptible audio nuances derived through machine learning models.
Aspiring audiophiles, music enthusiasts, and data scientists can use this repository for research, exploring genre dynamics and the relationships between the musical attributes of these Spotify tracks.
Introduction:
Download and Load the Dataset: Start by downloading the dataset from Kaggle in CSV format. Once downloaded, load the dataset into your preferred programming environment or tool such as Python, R, or Excel.
Familiarize Yourself with the Columns: Take some time to understand the meaning of each column in the dataset:
- artists: The name of the artist(s) who performed the track.
- album_name: The name of the album that contains a given track.
- track_name: The name of a specific track.
- popularity: A score indicating how popular a track is on Spotify (ranging from 0 to 100).
- duration_ms: The duration of a track in milliseconds.
- explicit: Indicates whether a track contains explicit content (True or False).
Explore Audio Features: This dataset includes various audio features associated with each track. Here are some notable ones:
A. Danceability: Danceability measures how suitable a track is for dancing, ranging from 0 to 1. Tracks with high danceability scores are more energetic and rhythmic, making them ideal for dancing.
B. Energy: Energy represents intensity and activity within a song on a scale from 0 to 1. Tracks with high energy tend to be more fast-paced and intense.
C. Loudness: Loudness indicates how loud or quiet an entire song is in decibels (dB). Values are typically negative (roughly -60 to 0 dB), and values closer to 0 indicate louder songs.
D. Key: Key is the musical key of the track, encoded as an integer from 0 to 11, with each number representing a different pitch class (0 = C, 1 = C♯/D♭, and so on). Knowing the key can provide insights into the mood and tone of a song.
E. Valence: Valence measures the musical positiveness conveyed by a track, ranging from 0 to 1. High valence values indicate more positive or happy tracks, while lower values suggest more negative or sad ones.
F. Tempo: Tempo is the speed or pace of a song in beats per minute (BPM). It gives an idea about how fast or slow a track is.
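Since the key is stored as an integer, a small lookup makes it readable. The sketch below maps the 0 to 11 encoding to pitch-class names and combines it with the mode column (1 for major, 0 for minor, as in Spotify's audio features). The helper name `describe_key` and the "unknown" fallback for out-of-range values are assumptions of this example.

```python
# Pitch-class names for key values 0..11 (standard pitch-class notation).
PITCH_CLASSES = [
    "C", "C#/Db", "D", "D#/Eb", "E", "F",
    "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B",
]


def describe_key(key, mode):
    """Return a label such as 'A minor' for integer key and mode values."""
    if not 0 <= key <= 11:
        # Out-of-range values are treated as "no key available" in this sketch.
        return "unknown"
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {quality}"


print(describe_key(0, 1))
print(describe_key(9, 0))
```

Applied to every row, this turns the two numeric columns into one categorical label that is easier to group by or plot.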
Data Analysis and Visualization: Utilize various data analysis techniques and visualization tools to gain insights into the
- Music Recommendation System: With multiple audio features such as danceability, energy, and valence, this dataset can be used to bu...
License: https://creativecommons.org/publicdomain/zero/1.0/
Context and Inspiration
Your dataset creation is driven by the desire to understand the evolving landscape of music genres through the lens of data analysis. By examining the characteristics of songs across various genres, you aimed to investigate the boundaries that define these genres and explore the potential of machine learning models to classify songs in a manner akin to human perception. This exploration touches on the intersection of musicology and data science, aiming to reveal insights about the music we enjoy and how it's structured and perceived on a technical level. Through sentiment analysis, you're also diving into the emotional depth of music, connecting the sonic features of songs to the emotions conveyed in their lyrics, an ambitious and fascinating endeavor that bridges the gap between the quantitative and the qualitative aspects of music.
Under the data.csv, spotify_tracks_top50.csv
- track_id: Unique identifier for each track on Spotify.
- playlist_id: Unique identifier for the playlist from which the track was collected.
- date_added: Date the track was added to the playlist.
- track_name: Name of the track.
- first_artist: Name of the primary artist of the track.
- artist_id: Unique identifier for the primary artist on Spotify.
- track_preview: URL for the 30-second preview mp3 of the track.
- album_name: Name of the album the track is from.
- danceability: A measure of how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity.
- energy: A perceptual measure of intensity and activity.
- key: The key the track is in. Integers map to pitches using standard Pitch Class notation.
- loudness: The overall loudness of a track in decibels (dB).
- mode: Modality of the track (major or minor).
- speechiness: Measures the presence of spoken words in a track.
- acousticness: A confidence measure of whether the track is acoustic.
- instrumentalness: Predicts whether a track contains no vocals.
- liveness: Detects the presence of an audience in the recording.
- valence: Measures the musical positiveness conveyed by a track.
- tempo: The overall estimated tempo of a track in beats per minute (BPM).
- type, id, uri, track_href, analysis_url: Various identifiers that provide detailed information about the track or facilitate accessing more data about the track through the Spotify Web API.
- duration_ms: The duration of the track in milliseconds.
- time_signature: An estimated overall time signature of a track.
Spotify Developer Link - https://developer.spotify.com/documentation/web-api/reference/get-an-artist
10000 Spotify Playlist - https://open.spotify.com/playlist/1YL4XoegERoragv0RK2RC9?si=569f0a0b1149489b
Refer to the Spotify License and Agreement Terms https://developer.spotify.com/terms
This dataset contains a comprehensive list of the most famous songs of 2023 as listed on Spotify. The dataset offers a wealth of features beyond what is typically available in similar datasets. It provides insights into each song's attributes, popularity, and presence on various music platforms. The dataset includes information such as track name, artist(s) name, release date, Spotify playlists and charts, streaming statistics, Apple Music presence, Deezer presence, Shazam charts, and various audio features.
Here is the link for the 2024 data: Most Streamed Spotify Songs 2024 - https://www.kaggle.com/datasets/nelgiriyewithana/most-streamed-spotify-songs-2024
- track_name: Name of the song
- artist(s)_name: Name of the artist(s) of the song
- artist_count: Number of artists contributing to the song
- released_year: Year when the song was released
- released_month: Month when the song was released
- released_day: Day of the month when the song was released
- in_spotify_playlists: Number of Spotify playlists the song is included in
- in_spotify_charts: Presence and rank of the song on Spotify charts
- streams: Total number of streams on Spotify
- in_apple_playlists: Number of Apple Music playlists the song is included in
- in_apple_charts: Presence and rank of the song on Apple Music charts
- in_deezer_playlists: Number of Deezer playlists the song is included in
- in_deezer_charts: Presence and rank of the song on Deezer charts
- in_shazam_charts: Presence and rank of the song on Shazam charts
- bpm: Beats per minute, a measure of song tempo
- key: Key of the song
- mode: Mode of the song (major or minor)
- danceability_%: Percentage indicating how suitable the song is for dancing
- valence_%: Positivity of the song's musical content
- energy_%: Perceived energy level of the song
- acousticness_%: Amount of acoustic sound in the song
- instrumentalness_%: Amount of instrumental content in the song
- liveness_%: Presence of live performance elements
- speechiness_%: Amount of spoken words in the song
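As a rough illustration of how these columns might be read, the sketch below parses a few of them with Python's standard `csv` module. The two sample rows are invented for the example and are not taken from the dataset. The comma-separated artist list and the fallback for non-numeric `streams` values are assumptions about the file, and `parse_row` is a hypothetical helper.

```python
import csv
import io
from datetime import date

# Invented sample rows using column names from the list above.
SAMPLE_CSV = """track_name,artist(s)_name,released_year,released_month,released_day,streams
Example Song A,Artist One,2023,7,14,1000000
Example Song B,"Artist Two, Artist Three",2022,3,1,not_a_number
"""


def parse_row(row: dict) -> dict:
    """Convert the string fields of one CSV row into typed values."""
    try:
        streams = int(row["streams"])
    except ValueError:
        streams = None  # assumption: keep the row, mark the count as missing
    return {
        "track_name": row["track_name"],
        # assumption: multiple artists are separated by commas
        "artists": [name.strip() for name in row["artist(s)_name"].split(",")],
        # combine the three release columns into a single date
        "released": date(
            int(row["released_year"]),
            int(row["released_month"]),
            int(row["released_day"]),
        ),
        "streams": streams,
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]

if __name__ == "__main__":
    for parsed in rows:
        print(parsed)
```

To run this on the real file, replace `io.StringIO(SAMPLE_CSV)` with an opened file handle; the file name and its encoding are not stated here and would need to be checked.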
- Music analysis: Explore patterns in audio features to understand trends and preferences in popular songs.
- Platform comparison: Compare the song's popularity across different music platforms.
- Artist impact: Analyze how artist involvement and attributes relate to a song's success.
- Temporal trends: Identify any shifts in music attributes and preferences over time.
- Cross-platform presence: Investigate how songs perform across different streaming services.
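One possible starting point for the platform comparison is a correlation between playlist counts on two services. The sketch below computes a Pearson correlation in plain Python. The numbers are invented for illustration and the function name `pearson` is hypothetical; in practice the inputs would come from columns such as in_spotify_playlists and in_apple_playlists.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Invented playlist counts for five songs on two platforms.
    spotify_playlists = [500, 1200, 3000, 4500, 8000]
    apple_playlists = [20, 45, 90, 160, 240]
    print(round(pearson(spotify_playlists, apple_playlists), 3))
```

A value close to 1 would suggest that songs placed in many playlists on one service tend to be placed in many on the other; it says nothing about causation.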
If you find this dataset useful, your support through an upvote would be greatly appreciated ❤️🙂
Thank you