Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created using Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
Each of the aforementioned entities are represented by unique IDs (Spotify URI).
Data is stored into following tables:
album
| id | name | uri |
id: Album ID as provided by Spotify
name: Album Name as provided by Spotify
uri: Album URI as provided by Spotify
artist
| id | name | uri |
id: Artist ID as provided by Spotify
name: Artist Name as provided by Spotify
uri: Artist URI as provided by Spotify
track
| id | name | duration | popularity | explicit | preview_url | uri | album_id |
id: Track ID as provided by Spotify
name: Track Name as provided by Spotify
duration: Track Duration (in milliseconds) as provided by Spotify
popularity: Track Popularity as provided by Spotify
explicit: Whether the track has explicit lyrics or not. (true or false)
preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
uri: Track Uri as provided by Spotify
album_id: Album Id to which the track belongs
playlist
| id | name | followers | uri | total_tracks |
id: Playlist ID as provided by Spotify
name: Playlist Name as provided by Spotify
followers: Playlist Followers as provided by Spotify
uri: Playlist Uri as provided by Spotify
total_tracks: Total number of tracks in the playlist.
track_artist1
| track_id | artist_id |
Track-Artist association table
track_playlist1
| track_id | playlist_id |
Track-Playlist association table
- - - - - SETUP - - - - -
The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35GB.
spotifydbdumpschemashare.sql contains the schema for the database (for reference):
spotifydbdumpshare.sql is the actual data dump.
Setup steps:
1. Create database
- - - - - PAPER - - - - -
The description of this dataset can be found in the following paper:
Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this blog are the latest Spotify statistics that paint a picture of how the company has succeeded so far and what’s likely to happen in the future.
Spotify Million Playlist Dataset Challenge
Summary
The Spotify Million Playlist Dataset Challenge consists of a dataset and evaluation to enable research in music recommendations. It is a continuation of the RecSys Challenge 2018, which ran from January to July 2018. The dataset contains 1,000,000 playlists, including playlist titles and track titles, created by users on the Spotify platform between January 2010 and October 2017. The evaluation task is automatic playlist continuation: given a seed playlist title and/or initial set of tracks in a playlist, to predict the subsequent tracks in that playlist. This is an open-ended challenge intended to encourage research in music recommendations, and no prizes will be awarded (other than bragging rights).
Background
Playlists like Today’s Top Hits and RapCaviar have millions of loyal followers, while Discover Weekly and Daily Mix are just a couple of our personalized playlists made especially to match your unique musical tastes.
Our users love playlists too. In fact, the Digital Music Alliance, in their 2018 Annual Music Report, state that 54% of consumers say that playlists are replacing albums in their listening habits.
But our users don’t love just listening to playlists, they also love creating them. To date, over 4 billion playlists have been created and shared by Spotify users. People create playlists for all sorts of reasons: some playlists group together music categorically (e.g., by genre, artist, year, or city), by mood, theme, or occasion (e.g., romantic, sad, holiday), or for a particular purpose (e.g., focus, workout). Some playlists are even made to land a dream job, or to send a message to someone special.
The other thing we love here at Spotify is playlist research. By learning from the playlists that people create, we can learn all sorts of things about the deep relationship between people and music. Why do certain songs go together? What is the difference between “Beach Vibes” and “Forest Vibes”? And what words do people use to describe which playlists?
By learning more about nature of playlists, we may also be able to suggest other tracks that a listener would enjoy in the context of a given playlist. This can make playlist creation easier, and ultimately help people find more of the music they love.
Dataset
To enable this type of research at scale, in 2018 we sponsored the RecSys Challenge 2018, which introduced the Million Playlist Dataset (MPD) to the research community. Sampled from the over 4 billion public playlists on Spotify, this dataset of 1 million playlists consist of over 2 million unique tracks by nearly 300,000 artists, and represents the largest public dataset of music playlists in the world. The dataset includes public playlists created by US Spotify users between January 2010 and November 2017. The challenge ran from January to July 2018, and received 1,467 submissions from 410 teams. A summary of the challenge and the top scoring submissions was published in the ACM Transactions on Intelligent Systems and Technology.
In September 2020, we re-released the dataset as an open-ended challenge on AIcrowd.com. The dataset can now be downloaded by registered participants from the Resources page.
Each playlist in the MPD contains a playlist title, the track list (including track IDs and metadata), and other metadata fields (last edit time, number of playlist edits, and more). All data is anonymized to protect user privacy. Playlists are sampled with some randomization, are manually filtered for playlist quality and to remove offensive content, and have some dithering and fictitious tracks added to them. As such, the dataset is not representative of the true distribution of playlists on the Spotify platform, and must not be interpreted as such in any research or analysis performed on the dataset.
Dataset Contains
1000 examples of each scenario:
Title only (no tracks) Title and first track Title and first 5 tracks First 5 tracks only Title and first 10 tracks First 10 tracks only Title and first 25 tracks Title and 25 random tracks Title and first 100 tracks Title and 100 random tracks
Download Link
Full Details: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge
Download Link: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge/dataset_files
This dataset was created by Vatsal Mavani
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In principle, the dataset holds users, their playlists and the tracks contained in these playlists.
The csv-file holding the dataset contains the following columns: "user_id", "artistname", "trackname", "playlistname", where
user_id is a hash of the user's Spotify user name
artistname is the name of the artist
trackname is the title of the track and
playlistname is the name of the playlist that contains this track.
The separator used is , each entry is enclosed by double quotes and the escape character used is .
A description of the generation of the dataset and the dataset itself can be found in the following paper:
Pichl, Martin; Zangerle, Eva; Specht, Günther: "Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name?" in 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pp. 1360-1365, IEEE, Atlantic City, 2015.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Despite not being very profitable, Spotify has maintained strong subscriber and revenue growth. Here are the key Spotify Statistics you need to know.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spotify has about 80 million individual tracks on the platform.
As of March 2018, Spotify’s user base was dominated by Millennials, with 29 percent of its users aged 25 to 34 and 26 percent aged between 18 and 24 years old. The streaming giant has permanently altered how consumers discover, engage with and share music, and according to a 2018 survey, Spotify reaches almost half of 16 to 24 year olds in the United States each week.
The power of Spotify
Spotify’s popularity is undeniable, accumulating millions of premium subscribers worldwide each quarter and hundreds of millions of unique visitors to Spotify.com every month. In the United States, Spotify is one of the most commonly used apps for listening to podcasts, and despite being in constant competition with Apple Music, remains a large part of U.S. music listeners’ lives.
A survey revealed that Spotify is also the preferred music streaming service among 18 to 29-year-olds, which may seem unremarkable given the data on Spotify’s user base, but serves as further evidence of Spotify’s popularity among younger users. Whether Spotify’s growth will last forever, only time will tell, particularly as Apple Music continues to put up a good fight and smaller but increasingly popular services such as Deezer begin to make their mark. But with the company recording a profit in early 2019 for the first time since its inception, Spotify remains very much a market leader and firmly on the path to future success.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
MGD: Music Genre Dataset
Over recent years, the world has seen a dramatic change in the way people consume music, moving from physical records to streaming services. Since 2017, such services have become the main source of revenue within the global recorded music market. Therefore, this dataset is built by using data from Spotify. It provides a weekly chart of the 200 most streamed songs for each country and territory it is present, as well as an aggregated global chart.
Considering that countries behave differently when it comes to musical tastes, we use chart data from global and regional markets from January 2017 to December 2019, considering eight of the top 10 music markets according to IFPI: United States (1st), Japan (2nd), United Kingdom (3rd), Germany (4th), France (5th), Canada (8th), Australia (9th), and Brazil (10th).
We also provide information about the hit songs and artists present in the charts, such as all collaborating artists within a song (since the charts only provide the main ones) and their respective genres, which is the core of this work. MGD also provides data about musical collaboration, as we build collaboration networks based on artist partnerships in hit songs. Therefore, this dataset contains:
Genre Networks: Success-based genre collaboration networks
Genre Mapping: Genre mapping from Spotify genres to super-genres
Artist Networks: Success-based artist collaboration networks
Artists: Some artist data
Hit Songs: Hit Song data and features
Charts: Enhanced data from Spotify Weekly Top 200 Charts
This dataset was originally built for a conference paper at ISMIR 2020. If you make use of the dataset, please also cite the following paper:
Gabriel P. Oliveira, Mariana O. Silva, Danilo B. Seufitelli, Anisio Lacerda, and Mirella M. Moro. Detecting Collaboration Profiles in Success-based Music Genre Networks. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), 2020.
@inproceedings{ismir/OliveiraSSLM20, title = {Detecting Collaboration Profiles in Success-based Music Genre Networks}, author = {Gabriel P. Oliveira and Mariana O. Silva and Danilo B. Seufitelli and Anisio Lacerda and Mirella M. Moro}, booktitle = {21st International Society for Music Information Retrieval Conference} pages = {726--732}, year = {2020} }
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Paying subscribers account for about half of Spotify’s monthly active users. This is the number of paying subscribers by year that Spotify has had since 2015.
According to the most recently available data, the average amount of time spent listening to Spotify by the platform’s active users was 25 hours per month. The idea of spending more than an entire day per month streaming Spotify content might sound unbelievable, but internet usage is higher than ever and vastly supersedes time spent with radio, print media and even television. Streaming has become one of the most popular media activities in the world, and Spotify is the global market leader when it comes to digital music streaming.
Spotify and the music market
Spotify dominates the global music streaming market and offers artists serious amounts of exposure, with some musicians garnering audiences of several million each month. The company’s user base has increased dramatically, even in the face of antitrust complaints and competition from Apple Music, as well as users expressing concern over how little Spotify pays its artists.
In 2018, music streaming revenue amounted to 8.9 billion U.S. dollars worldwide, more than six times the figure recorded for 2013. Whilst revenue growth has slowed, this is not the case when it comes to subscriber numbers, which in Spotify’s case have been increasing at a quarterly rate. Spotify reported 96 million premium subscribers worldwide at the end of 2018, finishing the year with a flourish after regular growth and accumulating 25 million more paid users in just twelve months.
The company also commands more than 30 percent of music streaming subscribers worldwide, outperforming both Apple Music and Amazon.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As of February 2024, Spotify has 574 million active users.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains information regarding the artist and the songs on the Spotify global weekly chart. This includes the position in the chart, number of streams, artist name, song title and much more.
How many paid subscribers does Spotify have? As of the fourth quarter of 2024, Spotify had 263 million premium subscribers worldwide, up from 236 million in the corresponding quarter of 2023. Spotify’s subscriber base has increased dramatically in the last few years and has more than doubled since early 2019. Spotify and competitors Spotify is a music streaming service originally founded in 2006 in Sweden. The platform can be used from various devices and allows users to browse through a catalogue of music licensed through multiple record labels, as well as creating and sharing playlists with other users. Additionally, listeners are able to enjoy music for free with advertisements or are also given the option to purchase a subscription to allow for unlimited ad-free music streaming. Spotify’s largest competitors are Pandora, a company that offers a similar service and remains popular in the United States, and Apple Music, which was launched in 2015. While Pandora was once among the highest-grossing music apps in the Apple App Store, recent rankings show that global services like QQ Music, NetEase Cloud Music, and YouTube Music now generate higher monthly revenues.Users are also able to register Spotify accounts using Facebook directly through the website using an app. This enables them to connect with other Facebook friends and explore their music tastes and playlists. Spotify is a popular source for keeping up-to-date with music, and the ability to enjoy Spotify anywhere at any time allows consumers to shape their music consumption around their lifestyles and preferences.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
34% of Spotify’s monthly active users live in Europe. That means that Spotify has 147.22 million users in the EU regions alone.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spotify has about 11 million artists and creators on the platform.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
29% of all Spotify users fall into the 25 to 34 age range. This is closely followed by 26% of users in the 18 to 24-year-old age.
The most successful music streaming service in the United States was Apple Music as of September, with the most up to date information showing that 49.5 million users accessed the platform each month. Spotify closely followed, with a similarly impressive 47.7 million monthly users.
What is a music streaming service?
Music streaming services provide their users with a database compiled of songs, playlists, albums and videos, where content can be accessed online, downloaded, shared, bookmarked and organized.
The music streaming business is huge, and has sometimes been lauded as the savior of the music industry. The biggest two services are in constant competition for the monopoly of the market. Apple Music was launched in 2015, whereas Spotify has been around since 2008. Other popular streaming services include Deezer, SoundCloud and iHeartRadio.
Do artists make a lot of money from streaming services?
In short, unfortunately not. Both Apple Music and Spotify have been frequently criticized for the tiny royalty payments they offer artists. Particularly for emerging talent, streaming services are far from a lucrative source of income. Bigger, established stars like Taylor Swift are more likely to regularly make a good amount of money this way. But either way, a track needs to go viral or be streamed several million times before it earns any real cash.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Results of the two samples’ Welch’s t-tests, Cohen’s d and Kolmogorov-Smirnov tests between the dance and baseline music datasets.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There are currently more than 4 million podcast titles on the platform today.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was created using Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
Each of the aforementioned entities are represented by unique IDs (Spotify URI).
Data is stored into following tables:
album
| id | name | uri |
id: Album ID as provided by Spotify
name: Album Name as provided by Spotify
uri: Album URI as provided by Spotify
artist
| id | name | uri |
id: Artist ID as provided by Spotify
name: Artist Name as provided by Spotify
uri: Artist URI as provided by Spotify
track
| id | name | duration | popularity | explicit | preview_url | uri | album_id |
id: Track ID as provided by Spotify
name: Track Name as provided by Spotify
duration: Track Duration (in milliseconds) as provided by Spotify
popularity: Track Popularity as provided by Spotify
explicit: Whether the track has explicit lyrics or not. (true or false)
preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
uri: Track Uri as provided by Spotify
album_id: Album Id to which the track belongs
playlist
| id | name | followers | uri | total_tracks |
id: Playlist ID as provided by Spotify
name: Playlist Name as provided by Spotify
followers: Playlist Followers as provided by Spotify
uri: Playlist Uri as provided by Spotify
total_tracks: Total number of tracks in the playlist.
track_artist1
| track_id | artist_id |
Track-Artist association table
track_playlist1
| track_id | playlist_id |
Track-Playlist association table
- - - - - SETUP - - - - -
The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35GB.
spotifydbdumpschemashare.sql contains the schema for the database (for reference):
spotifydbdumpshare.sql is the actual data dump.
Setup steps:
1. Create database
- - - - - PAPER - - - - -
The description of this dataset can be found in the following paper:
Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham