How many paid subscribers does Spotify have?
As of the fourth quarter of 2024, Spotify had 263 million premium subscribers worldwide, up from 236 million in the corresponding quarter of 2023. Spotify’s subscriber base has increased dramatically in the last few years and has more than doubled since early 2019.
Spotify and competitors
Spotify is a music streaming service founded in Sweden in 2006. The platform can be used from various devices and allows users to browse a catalogue of music licensed through multiple record labels, as well as to create playlists and share them with other users. Listeners can enjoy music for free with advertisements or purchase a subscription for unlimited ad-free streaming. Spotify’s largest competitors are Pandora, a company that offers a similar service and remains popular in the United States, and Apple Music, which was launched in 2015. While Pandora was once among the highest-grossing music apps in the Apple App Store, recent rankings show that global services like QQ Music, NetEase Cloud Music, and YouTube Music now generate higher monthly revenues.
Users can also register Spotify accounts through Facebook, directly on the website or in an app. This enables them to connect with Facebook friends and explore their music tastes and playlists. Spotify is a popular source for keeping up-to-date with music, and the ability to enjoy Spotify anywhere at any time allows consumers to shape their music consumption around their lifestyles and preferences.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created using the Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
Each of the aforementioned entities is represented by a unique ID (Spotify URI).
The data is stored in the following tables:
album
| id | name | uri |
id: Album ID as provided by Spotify
name: Album Name as provided by Spotify
uri: Album URI as provided by Spotify
artist
| id | name | uri |
id: Artist ID as provided by Spotify
name: Artist Name as provided by Spotify
uri: Artist URI as provided by Spotify
track
| id | name | duration | popularity | explicit | preview_url | uri | album_id |
id: Track ID as provided by Spotify
name: Track Name as provided by Spotify
duration: Track Duration (in milliseconds) as provided by Spotify
popularity: Track Popularity as provided by Spotify
explicit: Whether the track has explicit lyrics or not. (true or false)
preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
uri: Track Uri as provided by Spotify
album_id: Album Id to which the track belongs
playlist
| id | name | followers | uri | total_tracks |
id: Playlist ID as provided by Spotify
name: Playlist Name as provided by Spotify
followers: Playlist Followers as provided by Spotify
uri: Playlist Uri as provided by Spotify
total_tracks: Total number of tracks in the playlist.
track_artist1
| track_id | artist_id |
Track-Artist association table
track_playlist1
| track_id | playlist_id |
Track-Playlist association table
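As a rough illustration of how the tables above relate, the following sketch rebuilds the schema in an in-memory SQLite database and lists the tracks of one playlist together with their artists. The column types, the sample rows, and the use of SQLite are assumptions made for illustration; the actual dump may target a different database engine and define additional constraints.

```python
import sqlite3

# In-memory stand-in for the real database; types are assumed, not taken from the dump.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album    (id TEXT PRIMARY KEY, name TEXT, uri TEXT);
CREATE TABLE artist   (id TEXT PRIMARY KEY, name TEXT, uri TEXT);
CREATE TABLE track    (id TEXT PRIMARY KEY, name TEXT, duration INTEGER,
                       popularity INTEGER, explicit INTEGER, preview_url TEXT,
                       uri TEXT, album_id TEXT REFERENCES album(id));
CREATE TABLE playlist (id TEXT PRIMARY KEY, name TEXT, followers INTEGER,
                       uri TEXT, total_tracks INTEGER);
CREATE TABLE track_artist1   (track_id TEXT, artist_id TEXT);
CREATE TABLE track_playlist1 (track_id TEXT, playlist_id TEXT);
""")

# Hypothetical placeholder rows (not real Spotify IDs).
conn.execute("INSERT INTO album VALUES ('al1', 'Album One', 'spotify:album:al1')")
conn.execute("INSERT INTO artist VALUES ('ar1', 'Artist X', 'spotify:artist:ar1')")
conn.executemany(
    "INSERT INTO track VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        ("t1", "Track A", 210000, 55, 0, None, "spotify:track:t1", "al1"),
        ("t2", "Track B", 185500, 40, 1, None, "spotify:track:t2", "al1"),
    ],
)
conn.execute("INSERT INTO playlist VALUES ('p1', 'Demo', 3, 'spotify:playlist:p1', 2)")
conn.executemany("INSERT INTO track_artist1 VALUES (?, ?)", [("t1", "ar1"), ("t2", "ar1")])
conn.executemany("INSERT INTO track_playlist1 VALUES (?, ?)", [("t1", "p1"), ("t2", "p1")])

# Walk both association tables: playlist -> tracks -> artists.
# duration is stored in milliseconds, so divide by 1000 to get seconds.
rows = conn.execute(
    """
    SELECT t.name, ar.name, t.duration / 1000.0
    FROM track_playlist1 AS tp
    JOIN track AS t          ON t.id = tp.track_id
    JOIN track_artist1 AS ta ON ta.track_id = t.id
    JOIN artist AS ar        ON ar.id = ta.artist_id
    WHERE tp.playlist_id = ?
    ORDER BY t.name
    """,
    ("p1",),
).fetchall()

for track_name, artist_name, seconds in rows:
    print(f"{track_name} by {artist_name} ({seconds:.1f} s)")
```

A track with several artists would appear once per artist in this result, since track_artist1 is a many-to-many association.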
- - - - - SETUP - - - - -
The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35GB.
spotifydbdumpschemashare.sql contains the schema for the database (for reference):
spotifydbdumpshare.sql is the actual data dump.
Setup steps:
1. Create database
- - - - - PAPER - - - - -
The description of this dataset can be found in the following paper:
Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham
The Spotify Music Streaming Sessions Dataset (MSSD) consists of 160 million streaming sessions with associated user interactions, audio features and metadata describing the tracks streamed during the sessions, and snapshots of the playlists listened to during the sessions.
This dataset enables research on important problems including how to model user listening and interaction behaviour in streaming, as well as Music Information Retrieval (MIR), and session-based sequential recommendations.
Spotify Million Playlist Dataset Challenge
Summary
The Spotify Million Playlist Dataset Challenge consists of a dataset and evaluation to enable research in music recommendations. It is a continuation of the RecSys Challenge 2018, which ran from January to July 2018. The dataset contains 1,000,000 playlists, including playlist titles and track titles, created by users on the Spotify platform between January 2010 and October 2017. The evaluation task is automatic playlist continuation: given a seed playlist title and/or initial set of tracks in a playlist, predict the subsequent tracks in that playlist. This is an open-ended challenge intended to encourage research in music recommendations, and no prizes will be awarded (other than bragging rights).
Background
Playlists like Today’s Top Hits and RapCaviar have millions of loyal followers, while Discover Weekly and Daily Mix are just a couple of our personalized playlists made especially to match your unique musical tastes. Our users love playlists too. In fact, the Digital Music Alliance, in their 2018 Annual Music Report, state that 54% of consumers say that playlists are replacing albums in their listening habits. But our users don’t love just listening to playlists, they also love creating them. To date, over 4 billion playlists have been created and shared by Spotify users. People create playlists for all sorts of reasons: some playlists group together music categorically (e.g., by genre, artist, year, or city), by mood, theme, or occasion (e.g., romantic, sad, holiday), or for a particular purpose (e.g., focus, workout). Some playlists are even made to land a dream job, or to send a message to someone special.
The other thing we love here at Spotify is playlist research. By learning from the playlists that people create, we can learn all sorts of things about the deep relationship between people and music. Why do certain songs go together? What is the difference between “Beach Vibes” and “Forest Vibes”? And what words do people use to describe which playlists? By learning more about the nature of playlists, we may also be able to suggest other tracks that a listener would enjoy in the context of a given playlist. This can make playlist creation easier, and ultimately help people find more of the music they love.
Dataset
To enable this type of research at scale, in 2018 we sponsored the RecSys Challenge 2018, which introduced the Million Playlist Dataset (MPD) to the research community. Sampled from the over 4 billion public playlists on Spotify, this dataset of 1 million playlists consists of over 2 million unique tracks by nearly 300,000 artists, and represents the largest public dataset of music playlists in the world. The dataset includes public playlists created by US Spotify users between January 2010 and November 2017. The challenge ran from January to July 2018, and received 1,467 submissions from 410 teams. A summary of the challenge and the top scoring submissions was published in the ACM Transactions on Intelligent Systems and Technology.
In September 2020, we re-released the dataset as an open-ended challenge on AIcrowd.com. The dataset can now be downloaded by registered participants from the Resources page.
Each playlist in the MPD contains a playlist title, the track list (including track IDs and metadata), and other metadata fields (last edit time, number of playlist edits, and more). All data is anonymized to protect user privacy. Playlists are sampled with some randomization, are manually filtered for playlist quality and to remove offensive content, and have some dithering and fictitious tracks added to them. As such, the dataset is not representative of the true distribution of playlists on the Spotify platform, and must not be interpreted as such in any research or analysis performed on the dataset.
Dataset
Contains 1000 examples of each scenario:
Title only (no tracks)
Title and first track
Title and first 5 tracks
First 5 tracks only
Title and first 10 tracks
First 10 tracks only
Title and first 25 tracks
Title and 25 random tracks
Title and first 100 tracks
Title and 100 random tracks
Download Link
Full Details: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge
Download Link: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge/dataset_files
References: C.W. Chen, P. Lamere, M. Schedl, and H. Zamani. Recsys Challenge 2018: Automatic Music Playlist Continuation. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18), 2018.
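To make the playlist continuation scenarios above more concrete, here is a minimal sketch of how a single playlist could be split into a visible seed and a held-out remainder for local experiments. The function name, the dictionary layout (`name`, `tracks`), and the fixed random seed are assumptions for illustration and are not taken from the official challenge tooling.

```python
import random


def make_seed(playlist, n_seed, with_title=True, random_tracks=False, rng=None):
    """Split one playlist into (seed, holdout) for a continuation experiment.

    playlist is assumed to be a dict with a "name" string and a "tracks" list.
    """
    tracks = list(playlist["tracks"])
    if n_seed > len(tracks):
        raise ValueError("playlist is shorter than the requested seed size")

    if random_tracks:
        # "Title and N random tracks": pick seed positions at random.
        rng = rng or random.Random(0)
        seed_positions = set(rng.sample(range(len(tracks)), n_seed))
    else:
        # "Title and first N tracks": take the leading positions.
        seed_positions = set(range(n_seed))

    seed = {
        "name": playlist["name"] if with_title else None,
        "tracks": [t for i, t in enumerate(tracks) if i in seed_positions],
    }
    holdout = [t for i, t in enumerate(tracks) if i not in seed_positions]
    return seed, holdout


# Hypothetical playlist with placeholder track identifiers.
demo = {"name": "Beach Vibes", "tracks": [f"track_{i}" for i in range(30)]}

first5_seed, first5_holdout = make_seed(demo, 5)                       # title and first 5 tracks
random25_seed, random25_holdout = make_seed(demo, 25, random_tracks=True)  # title and 25 random tracks
tracks_only_seed, _ = make_seed(demo, 10, with_title=False)            # first 10 tracks only
```

A recommender would then be given only the seed and scored on how well it recovers the held-out tracks.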
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In principle, the dataset holds users, their playlists and the tracks contained in these playlists.
The CSV file holding the dataset contains the following columns: "user_id", "artistname", "trackname", "playlistname".
The separator is a comma (,), each entry is enclosed in double quotes, and the escape character is a backslash (\).
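The quoting rules described above map onto the options of Python's standard `csv` module. The sketch below parses a small in-memory sample; the sample row is invented for illustration, and only the four column names come from the description.

```python
import csv
import io

# Invented sample in the described format: comma-separated, every entry in
# double quotes, backslash as the escape character.
sample = (
    '"user_id","artistname","trackname","playlistname"\n'
    '"u1","Some Artist","A \\"Quoted\\" Title","Road Trip, 2014"\n'
)

reader = csv.DictReader(
    io.StringIO(sample),
    delimiter=",",
    quotechar='"',
    escapechar="\\",
    doublequote=False,  # quotes inside a field are escaped with \, not doubled
)
rows = list(reader)

for row in rows:
    print(row["user_id"], "|", row["trackname"], "|", row["playlistname"])
```

For the real file, the same reader options would apply to a handle opened with `open(path, newline="", encoding=...)`, where the path and encoding depend on the download.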
A description of the generation of the dataset and the dataset itself can be found in the following paper:
Pichl, Martin; Zangerle, Eva; Specht, Günther: "Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name?" in 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pp. 1360-1365, IEEE, Atlantic City, 2015.
https://brightdata.com/license
Gain valuable insights into music trends, artist popularity, and streaming analytics with our comprehensive Spotify Dataset. Designed for music analysts, marketers, and businesses, this dataset provides structured and reliable data from Spotify to enhance market research, content strategy, and audience engagement.
Dataset Features
Track Information: Access detailed data on songs, including track name, artist, album, genre, and release date.
Streaming Popularity: Extract track popularity scores, listener engagement metrics, and ranking trends.
Artist & Album Insights: Analyze artist performance, album releases, and genre trends over time.
Related Searches & Recommendations: Track related search terms and suggested content for deeper audience insights.
Historical & Real-Time Data: Retrieve historical streaming data or access continuously updated records for real-time trend analysis.
Customizable Subsets for Specific Needs
Our Spotify Dataset is fully customizable, allowing you to filter data based on track popularity, artist, genre, release date, or listener engagement. Whether you need broad coverage for industry analysis or focused data for content optimization, we tailor the dataset to your needs.
Popular Use Cases
Market Analysis & Trend Forecasting: Identify emerging music trends, genre popularity, and listener preferences.
Artist & Label Performance Tracking: Monitor artist rankings, album success, and audience engagement.
Competitive Intelligence: Analyze competitor music strategies, playlist placements, and streaming performance.
AI & Machine Learning Applications: Use structured music data to train AI models for recommendation engines, playlist curation, and predictive analytics.
Advertising & Sponsorship Insights: Identify high-performing tracks and artists for targeted advertising and sponsorship opportunities.
Whether you're optimizing music marketing, analyzing streaming trends, or enhancing content strategies, our Spotify Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
34% of Spotify’s monthly active users live in Europe. That means that Spotify has 147.22 million users in the EU regions alone. Here’s the breakdown of regions that contribute the most users to Spotify:
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Spotify Playlist-ORIGINS Dataset is a dataset of Spotify playlists called ORIGINS, which individuals have built from their favorite songs since 2014.
2) Data Utilization
(1) Characteristics of the Spotify Playlist-ORIGINS Dataset:
• This dataset contains detailed music information for each playlist, including song name, artist, album, genre, release year, and track ID, as well as structured metadata such as name, description, and song order for each playlist.
(2) Applications of the Spotify Playlist-ORIGINS Dataset:
• Playlist-based music recommendation and user preference analysis: The dataset can be used to develop machine learning or deep learning based music recommendation systems, or to study user preferences using playlist and song information.
• Music trend and genre popularity analysis: Release year, genre, and artist data can be analyzed to study the music industry and culture, including music trends by period and genre, and changes in popular artists and songs.
https://choosealicense.com/licenses/bsd/
Content
This is a dataset of Spotify tracks over a range of 125 different genres. Each track has some audio features associated with it. The data is in CSV format which is tabular and can be loaded quickly.
Usage
The dataset can be used for:
Building a Recommendation System based on some user input or preference
Classification purposes based on audio features and available genres
Any other application that you can think of. Feel free to discuss!
Column… See the full description on the dataset page: https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset.
https://creativecommons.org/publicdomain/zero/1.0/
Did We Solve the Problem?
The objective of this analysis was to predict high streaming counts on Spotify and perform a detailed cluster analysis to understand user behavior. Here’s a summary of how we addressed each part of the objective:
Prediction of High Streaming Counts:
Implemented Multiple Models: We utilized several machine learning models including Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine (SVM), and k-Nearest Neighbors (k-NN).
Comparison and Evaluation: These models were evaluated based on classification metrics like accuracy, precision, recall, and F1-score. The Gradient Boosting and Random Forest models were found to be the most effective in predicting high streaming counts.
Cluster Analysis:
K-means Clustering: We applied K-means clustering to segment users into three clusters based on their listening behavior.
Detailed Characterization: Each cluster was analyzed to understand the distinct characteristics, such as average playtime, skip rate, offline usage, and shuffle usage.
Visualizations: Histograms and scatter plots were used to visualize the distributions and relationships within each cluster.
Results and Insights
Effective Models: The Gradient Boosting and Random Forest models provided the highest accuracy and balanced performance for predicting high streaming counts.
User Segmentation: The cluster analysis revealed three distinct user segments:
Cluster 1: Users with longer playtimes and lower skip rates.
Cluster 2: Users with moderate playtimes and skip rates.
Cluster 3: Users with shorter playtimes and higher skip rates.
These insights can be leveraged for targeted marketing, personalized recommendations, and improving user engagement on Spotify.
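The three-cluster segmentation described above can be illustrated with a small, dependency-free version of K-means (Lloyd's algorithm). This is only a sketch under stated assumptions: the two features (average playtime in minutes, skip rate), the nine sample users, and the hand-picked starting centroids are all invented, and the original analysis may have used a library implementation with more features and feature scaling.

```python
import math
from statistics import mean


def kmeans(points, initial_centroids, max_iterations=20):
    """Lloyd's algorithm: assign points to the nearest centroid, then
    move each centroid to the mean of its assigned points."""
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iterations):
        # Assignment step: index of the closest centroid for every point.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid; keep the old one if it has no members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(old)
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


# Invented users: (average playtime in minutes, skip rate between 0 and 1).
users = [
    (55.0, 0.10), (60.0, 0.12), (58.0, 0.08),   # long playtime, low skip rate
    (30.0, 0.30), (32.0, 0.35), (28.0, 0.33),   # moderate playtime and skip rate
    (8.0, 0.70), (10.0, 0.65), (12.0, 0.75),    # short playtime, high skip rate
]

# Fixed starting centroids keep the example deterministic.
labels, centroids = kmeans(users, [users[0], users[3], users[6]])

for k, (playtime, skip_rate) in enumerate(centroids, start=1):
    print(f"Cluster {k}: avg playtime {playtime:.1f} min, avg skip rate {skip_rate:.2f}")
```

Because playtime is on a much larger numeric scale than skip rate, it dominates the distance here; standardizing features before clustering is a common way to balance them.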
Conclusion
Yes, we solved the problem. We successfully predicted high streaming counts using effective machine learning models and provided a detailed cluster analysis to understand user behavior. The analysis offers valuable insights for enhancing Spotify’s recommendation system and user experience.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This is my Spotify streaming history and playlist data, which I obtained from Spotify itself by requesting it.
This dataset contains two streaming history JSON files and a playlist JSON file.
Usage: This data can be used for EDA and to draw insights about the user, such as their favourite artists, songs, or albums.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As of January 2025, Spotify has over 640 million monthly active users. Here is the full breakdown of Spotify users by year since 2015:
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset was extracted from the Spotify platform using the Python library "Spotipy", which allows users to access music data provided via APIs. The dataset collected includes about 1 Million tracks with 19 features between 2000 and 2023. Also, there is a total of 61,445 unique artists and 82 genres in the data.
This clean data has been prepared and utilized for research purposes. Its significance lies in its potential to unravel patterns and predict song popularity prior to its release. This dataset could be used to create various predictive models with machine-learning/deep-learning techniques.
This dataset contains customer reviews and ratings for the Spotify application, collected directly from the Google Play Store. It offers insight into user experiences and satisfaction with one of the world's largest music streaming services, which serves over 422 million monthly active users [1]. The data provides a valuable resource for understanding public sentiment towards the application and identifying factors that contribute to user satisfaction or dissatisfaction [1]. The data was collected by scraping Spotify reviews from the Google Play Store [1].
The dataset comprises 61,594 rows (or records) of Spotify app reviews [2]. The data is typically available in a CSV file format [3]. Review ratings range from 1 to 5, with 22,095 reviews rated 4.80 - 5.00, and 17,653 reviews rated 1.00 - 1.20 [4]. The number of 'thumbs up' for reviews ranges from 0 to 8195, with the vast majority (61,378 reviews) having between 0 and 409.75 thumbs up [5].
This dataset is ideal for various analytical applications, including:
* Sentiment analysis to gauge overall user sentiment towards the Spotify app [1].
* Identifying key themes and reasons behind 1-star and 5-star reviews to understand user satisfaction drivers and pain points [1].
* Analysing trends in user feedback over time.
* Developing machine learning models for text classification or natural language processing (NLP).
* Understanding the impact of app updates or changes on user perception.
The dataset covers reviews submitted between 1st January 2022 and 9th July 2022 [2]. The reviews were collected globally from the Google Play Store [2, 6]. There are no specific demographic notes on data availability, as it is based on publicly available app store reviews.
CC-BY
Original Data Source: Spotify App Reviews
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Spotify Tracks Dataset contains information on tracks from over 125 music genres, including both audio features (e.g., danceability, energy, valence) and metadata (e.g., title, artist, genre).
2) Data Utilization
(1) Characteristics of the Spotify Tracks Dataset:
• The data is structured in a tabular format at the track level, where each column represents numerical or categorical features based on musical properties. This makes it suitable for recommendation systems, genre classification, and emotion analysis.
• It includes multi-dimensional attributes grounded in music theory such as track duration, time signature, energy, loudness, tempo, and speechiness, enabling its use in music classification and clustering tasks.
(2) Applications of the Spotify Tracks Dataset:
• Design of Music Recommendation Systems: It can be used to build content-based filtering systems or hybrid recommendation algorithms based on user preferences.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Spotify Recommendation’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/bricevergnou/spotify-recommendation on 28 January 2022.
--- Dataset description provided by original source is as follows ---
( You can check how I used this dataset on my github repository )
I am basically a HUGE fan of music ( mostly French rap though with some exceptions but I love music ). And someday , while browsing stuff on Internet , I found the Spotify's API . I knew I had to use it when I found out you could get information like danceability about your favorite songs just with their id's.
Once I saw that , my machine learning instincts forced me to work on this project.
I collected 100 liked songs and 95 disliked songs
For those I like , I made a playlist of my favorite 100 songs. It is mainly French Rap , sometimes American rap , rock or electro music.
For those I dislike , I collected songs from various kind of music so the model will have a broader view of what I don't like
There are :
- 25 metal songs ( Cannibal Corpse )
- 20 " I don't like " rap songs ( PNL )
- 25 classical songs
- 25 Disco songs
I didn't include any Pop song because I'm kinda neutral about it
From the Spotify API's "Get a playlist's Items" , I turned the playlists into json formatted data which contains the ID and the name of each track ( ids/yes.py and ids/no.py ). NB : on the website , specify "items(track(id,name))" in the fields format , to avoid being overwhelmed by useless data.
With a script ( ids/ids_to_data.py ) , I turned the json data into a long string with each ID separated with a comma.
Now I just had to enter the strings into the Spotify API "Get Audio Features from several tracks" and get my data files ( data/good.json and data/dislike.json )
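The ID-joining step described above can be sketched roughly as follows. This is not the author's ids/ids_to_data.py; the function name and sample IDs are made up, the JSON shape is what the "items(track(id,name))" fields filter is expected to return, and the 100-ID chunk size is an assumption about the per-request limit of the audio features endpoint.

```python
import json


def ids_to_query_strings(playlist_json, chunk_size=100):
    """Turn playlist-items JSON into comma-separated ID strings.

    chunk_size=100 is an assumed per-request limit; adjust it if the API
    documentation states otherwise.
    """
    items = json.loads(playlist_json)["items"]
    # Skip entries whose track is missing (for example, removed tracks).
    ids = [item["track"]["id"] for item in items if item.get("track")]
    return [
        ",".join(ids[start:start + chunk_size])
        for start in range(0, len(ids), chunk_size)
    ]


# Made-up response in the shape produced by fields=items(track(id,name)).
sample_json = json.dumps({
    "items": [
        {"track": {"id": "id_aaa", "name": "Song A"}},
        {"track": {"id": "id_bbb", "name": "Song B"}},
        {"track": None},
        {"track": {"id": "id_ccc", "name": "Song C"}},
    ]
})

query_strings = ids_to_query_strings(sample_json, chunk_size=2)
print(query_strings)
```

Each resulting string can then be pasted into, or sent as, the `ids` parameter of the audio features request.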
From Spotify's API documentation :
And the variable that has to be predicted :
--- Original source retains full ownership of the source dataset ---
Overview
Spotify is one of the largest music streaming service providers, with over 422 million monthly active users, including 182 million paying subscribers, as of March 2022. Some of them don't hesitate to share their experience of using the application, along with a rating that denotes how satisfied they are with it.
The way data was collected
Scraping Spotify reviews on Google Play Store
Ideas for using this dataset
Sentiment analysis
What makes the application receive 1-star and 5-star reviews
Original Data Source: Spotify App Reviews
https://choosealicense.com/licenses/bsd/
Context
This dataset consists of 24000 tracks from 30 genres, and is a shrunk version of the maharshipandya/spotify-tracks-dataset dataset. All non-heuristic data is cut and cleaned for better usability and performance. All data is taken from the Spotify API and is open source. This dataset can be used to train prediction models based on user preferences, or to categorise tracks by the corresponding heuristic.
Column Description
danceability: Danceability describes how suitable a track is… See the full description on the dataset page: https://huggingface.co/datasets/engels/spotify-tracks-lite.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The latest Spotify statistics from the company’s annual report show that 69% of Spotify premium subscribers are located in Europe and North America.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
North American Spotify users spend the most time on the platform, streaming an average of 140 minutes of content on the Spotify app daily.