32 datasets found
  1. Data from: MSSD Dataset

    • paperswithcode.com
    Updated May 13, 2025
    Cite
    Brian Brost; Rishabh Mehrotra; Tristan Jehan (2025). MSSD Dataset [Dataset]. https://paperswithcode.com/dataset/mssd
    Dataset updated
    May 13, 2025
    Authors
    Brian Brost; Rishabh Mehrotra; Tristan Jehan
    Description

    The Spotify Music Streaming Sessions Dataset (MSSD) consists of 160 million streaming sessions with associated user interactions, audio features and metadata describing the tracks streamed during the sessions, and snapshots of the playlists listened to during the sessions.

    This dataset enables research on important problems including how to model user listening and interaction behaviour in streaming, as well as Music Information Retrieval (MIR), and session-based sequential recommendations.

  2. Spotify's monthly active users 2015-2024

    • statista.com
    • ai-chatbox.pro
    Updated Mar 21, 2025
    Cite
    Statista (2025). Spotify's monthly active users 2015-2024 [Dataset]. https://www.statista.com/statistics/367739/spotify-global-mau/
    Dataset updated
    Mar 21, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In the fourth quarter of 2024, the music streaming service Spotify reached an all-time high with 675 million active users worldwide. This marked an increase of around 12 percent in just one year.

    What is Spotify?

    Spotify is a music streaming service that offers digital audio content. Basic audio content can be accessed for free, whereas premium user subscriptions enable users to access offline mobile content as well as listen to music without advertising. In the fourth quarter of 2024, the company reported 263 million paying subscribers. Launched in 2008, Spotify originated in Sweden before expanding to European markets and the United States in 2011.

    Spotify’s U.S. launch was strongly marketed through Facebook, with the music streaming app profiting from the social listening integration via social media. Part of Spotify’s appeal can be attributed to the user- and brand-curated playlists, which can be shared publicly or between friends. Fans may choose what to listen to based on their current mood or preference, and the ability to share such content provides an element of social connectivity ordinarily reserved for networking sites.

  3. spotify-tracks-dataset

    • huggingface.co
    Updated Jun 30, 2023
    Cite
    maharshipandya (2023). spotify-tracks-dataset [Dataset]. https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset
    Dataset updated
    Jun 30, 2023
    Authors
    maharshipandya
    License

    https://choosealicense.com/licenses/bsd/

    Description

    Content

    This is a dataset of Spotify tracks over a range of 125 different genres. Each track has some audio features associated with it. The data is in CSV format which is tabular and can be loaded quickly.

    Usage

    The dataset can be used for:

    • Building a recommendation system based on some user input or preference
    • Classification purposes based on audio features and available genres
    • Any other application that you can think of. Feel free to discuss!

    Column… See the full description on the dataset page: https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset.
    
  4. Playlist2vec: Spotify Million Playlist Dataset

    • zenodo.org
    • data.niaid.nih.gov
    Updated Jun 22, 2021
    Cite
    Piyush Papreja (2021). Playlist2vec: Spotify Million Playlist Dataset [Dataset]. http://doi.org/10.5281/zenodo.5002584
    Available download formats: bin
    Dataset updated
    Jun 22, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Piyush Papreja
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was created using the Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
    The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
    The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
    Each of the aforementioned entities is represented by a unique ID (Spotify URI).
    Data is stored in the following tables:

    • album
    • artist
    • track
    • playlist
    • track_artist1
    • track_playlist1

    album

    | id | name | uri |

    id: Album ID as provided by Spotify
    name: Album Name as provided by Spotify
    uri: Album URI as provided by Spotify


    artist

    | id | name | uri |

    id: Artist ID as provided by Spotify
    name: Artist Name as provided by Spotify
    uri: Artist URI as provided by Spotify


    track

    | id | name | duration | popularity | explicit | preview_url | uri | album_id |

    id: Track ID as provided by Spotify
    name: Track Name as provided by Spotify
    duration: Track Duration (in milliseconds) as provided by Spotify
    popularity: Track Popularity as provided by Spotify
    explicit: Whether the track has explicit lyrics or not. (true or false)
    preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
    uri: Track Uri as provided by Spotify
    album_id: Album Id to which the track belongs


    playlist

    | id | name | followers | uri | total_tracks |

    id: Playlist ID as provided by Spotify
    name: Playlist Name as provided by Spotify
    followers: Playlist Followers as provided by Spotify
    uri: Playlist Uri as provided by Spotify
    total_tracks: Total number of tracks in the playlist.

    track_artist1

    | track_id | artist_id |

    Track-Artist association table

    track_playlist1

    | track_id | playlist_id |

    Track-Playlist association table
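
    A minimal sketch of how the association tables tie tracks to playlists. It assumes an in-memory SQLite database with a reduced version of the schema above; the column types and every sample row are made up for illustration, and the actual dump may target a different database engine.

```python
import sqlite3

# Reduced, illustrative version of the schema described above.
# Column types and all sample rows are assumptions, not taken from the dump.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE track (id TEXT PRIMARY KEY, name TEXT, duration INTEGER, album_id TEXT);
CREATE TABLE playlist (id TEXT PRIMARY KEY, name TEXT, followers INTEGER, total_tracks INTEGER);
CREATE TABLE track_playlist1 (track_id TEXT, playlist_id TEXT);
""")
conn.executemany(
    "INSERT INTO track VALUES (?, ?, ?, ?)",
    [("t1", "Song A", 210000, "a1"), ("t2", "Song B", 185000, "a1"), ("t3", "Song C", 240000, "a2")],
)
conn.executemany(
    "INSERT INTO playlist VALUES (?, ?, ?, ?)",
    [("p1", "Morning", 12, 2), ("p2", "Evening", 3, 1)],
)
conn.executemany(
    "INSERT INTO track_playlist1 VALUES (?, ?)",
    [("t1", "p1"), ("t2", "p1"), ("t3", "p2")],
)


def tracks_in_playlist(conn, playlist_id):
    """Return the names of all tracks in a playlist, sorted by name."""
    rows = conn.execute(
        """
        SELECT t.name
        FROM track AS t
        JOIN track_playlist1 AS tp ON tp.track_id = t.id
        WHERE tp.playlist_id = ?
        ORDER BY t.name
        """,
        (playlist_id,),
    ).fetchall()
    return [name for (name,) in rows]


print(tracks_in_playlist(conn, "p1"))
```

    The same join pattern would apply to track_artist1 for looking up the artists of a track.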

    - - - - - SETUP - - - - -


    The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35 GB.

    spotifydbdumpschemashare.sql contains the schema for the database (for reference).
    spotifydbdumpshare.sql is the actual data dump.


    Setup steps:
    1. Create database

    - - - - - PAPER - - - - -


    The description of this dataset can be found in the following paper:

    Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham

  5. Spotify Tracks Attributes and Popularity

    • kaggle.com
    Updated Jul 9, 2025
    Cite
    Melissa Monfared (2025). Spotify Tracks Attributes and Popularity [Dataset]. https://www.kaggle.com/datasets/melissamonfared/spotify-tracks-attributes-and-popularity
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Melissa Monfared
    Description

    About Dataset

    Overview:

    This dataset provides detailed metadata and audio analysis for a wide collection of Spotify music tracks across various genres. It includes track-level information such as popularity, tempo, energy, danceability, and other musical features that can be used for music recommendation systems, genre classification, or trend analysis. The dataset is a rich source for exploring music consumption patterns and user preferences based on song characteristics.

    Dataset Details:

    This dataset contains rows of individual music tracks, each described by both metadata (such as track name, artist, album, and genre) and quantitative audio features. These features reflect different musical attributes such as energy, acousticness, instrumentalness, valence, and more, making it ideal for audio machine learning projects and exploratory data analysis.

    Schema and Column Descriptions:

    | Column Name | Description |
    | index | Unique index for each track (can be ignored for analysis) |
    | track_id | Spotify's unique identifier for the track |
    | artists | Name of the performing artist(s) |
    | album_name | Title of the album the track belongs to |
    | track_name | Title of the track |
    | popularity | Popularity score on Spotify (0–100 scale) |
    | duration_ms | Duration of the track in milliseconds |
    | explicit | Indicates whether the track contains explicit content |
    | danceability | How suitable the track is for dancing (0.0 to 1.0) |
    | energy | Intensity and activity level of the track (0.0 to 1.0) |
    | key | Musical key (0 = C, 1 = C♯/D♭, …, 11 = B) |
    | loudness | Overall loudness of the track in decibels (dB) |
    | mode | Modality (major = 1, minor = 0) |
    | speechiness | Presence of spoken words in the track (0.0 to 1.0) |
    | acousticness | Confidence measure of whether the track is acoustic (0.0 to 1.0) |
    | instrumentalness | Predicts whether the track contains no vocals (0.0 to 1.0) |
    | liveness | Presence of an audience in the recording (0.0 to 1.0) |
    | valence | Musical positivity conveyed (0.0 = sad, 1.0 = happy) |
    | tempo | Estimated tempo in beats per minute (BPM) |
    | time_signature | Time signature of the track (e.g., 4 = 4/4) |
    | track_genre | Assigned genre label for the track |

    Key Features:

    • Comprehensive Track Data: Metadata combined with detailed audio analysis.
    • Genre Diversity: Includes tracks from various music genres.
    • Audio Feature Rich: Suitable for audio classification, recommendation engines, or clustering.
    • Machine Learning Friendly: Clean and numerical format ideal for ML models.

    Usage:

    This dataset is valuable for:

    • 🎵 Music Recommendation Systems: Building collaborative or content-based recommenders.
    • 📊 Data Visualization & Dashboards: Analyzing genre or mood trends over time.
    • 🤖 Machine Learning Projects: Predicting song popularity or clustering similar tracks.
    • 🧠 Music Psychology & Behavioral Studies: Exploring how music features relate to emotions or behavior.

    Data Maintenance:

    Additional Notes:

    • This dataset can be enhanced by merging it with user listening behavior data, lyrics datasets, or chart positions for more advanced analysis.
    • Some columns like key, mode, and explicit may need to be mapped for better readability in visualization.
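
    As an illustration of the mapping mentioned in the note above, the sketch below turns the numeric key, mode, and explicit values into readable labels. The function name and the "unknown" / "clean" labels are assumptions chosen for this example; only the key and mode encodings come from the column descriptions.

```python
# Pitch-class names for the integer `key` column (0 = C, 1 = C♯/D♭, ..., 11 = B).
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]
# Encoding from the column description: major = 1, minor = 0.
MODES = {1: "major", 0: "minor"}


def readable_labels(row):
    """Map the encoded key, mode and explicit fields of one row to labels.

    `row` is a plain dict; values outside the documented ranges map to
    "unknown" (an assumption made for this sketch).
    """
    key = row["key"]
    return {
        "key": PITCH_CLASSES[key] if 0 <= key < 12 else "unknown",
        "mode": MODES.get(row["mode"], "unknown"),
        "explicit": "explicit" if row["explicit"] else "clean",
    }


print(readable_labels({"key": 9, "mode": 0, "explicit": False}))
```
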
  6. Spotify's premium subscribers 2015-2025

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Spotify's premium subscribers 2015-2025 [Dataset]. https://www.statista.com/statistics/244995/number-of-paying-spotify-subscribers/
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    How many paid subscribers does Spotify have?

    As of the first quarter of 2025, Spotify had 268 million premium subscribers worldwide, up from 239 million in the corresponding quarter of 2024. Spotify’s subscriber base has increased dramatically in the last few years and has more than doubled since early 2019.

    Spotify and competitors

    Spotify is a music streaming service originally founded in 2006 in Sweden. The platform can be used from various devices and allows users to browse through a catalogue of music licensed through multiple record labels, as well as to create and share playlists with other users. Listeners can enjoy music for free with advertisements or purchase a subscription for unlimited ad-free music streaming. Spotify’s largest competitors are Pandora, a company that offers a similar service and remains popular in the United States, and Apple Music, which was launched in 2015. While Pandora was once among the highest-grossing music apps in the Apple App Store, recent rankings show that global services like QQ Music, NetEase Cloud Music, and YouTube Music now generate higher monthly revenues.

    Users are also able to register Spotify accounts with Facebook, directly through the website. This enables them to connect with other Facebook friends and explore their music tastes and playlists. Spotify is a popular source for keeping up-to-date with music, and the ability to enjoy Spotify anywhere at any time allows consumers to shape their music consumption around their lifestyles and preferences.

  7. Distribution Of Spotify Monthly Active Users By Region

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Distribution Of Spotify Monthly Active Users By Region [Dataset]. https://www.searchlogistics.com/learn/statistics/spotify-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    34% of Spotify’s monthly active users live in Europe. That means that Spotify has 147.22 million users in the EU regions alone. Here’s the breakdown of regions that contribute the most users to Spotify:

  8. Google Play Store Spotify User Reviews

    • opendatabay.com
    Updated Jul 3, 2025
    Cite
    Datasimple (2025). Google Play Store Spotify User Reviews [Dataset]. https://www.opendatabay.com/data/dataset/38b8af43-8609-485a-b332-0d8257e530ec
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Reviews & Ratings
    Description

    This dataset contains customer reviews and ratings for the Spotify application, collected directly from the Google Play Store. It offers insight into user experiences and satisfaction with one of the world's largest music streaming services, which serves over 422 million monthly active users [1]. The data provides a valuable resource for understanding public sentiment towards the application and identifying factors that contribute to user satisfaction or dissatisfaction [1]. The data was collected by scraping Spotify reviews from the Google Play Store [1].

    Columns

    • Time_submitted: Records the specific time and date when a review was submitted by a user [2].
    • Review: Contains the full text of the user's review [2].
    • Rating: Represents the numerical score given by the user, ranging from 1 to 5, indicating their satisfaction level [2].
    • Total_thumbsup: Indicates how many other users found the particular review helpful [2].
    • Reply: Shows any direct reply to the user's review, often from the application developer or support [2].

    Distribution

    The dataset comprises 61,594 rows (or records) of Spotify app reviews [2]. The data is typically available in a CSV file format [3]. Review ratings range from 1 to 5, with 22,095 reviews rated 4.80 - 5.00, and 17,653 reviews rated 1.00 - 1.20 [4]. The number of 'thumbs up' for reviews ranges from 0 to 8195, with the vast majority (61,378 reviews) having between 0 and 409.75 thumbs up [5].
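
    A rating distribution like the one described above can be tallied with a few lines of standard-library Python. This is a sketch only: the sample rows and the timestamp format are invented, and only the column names come from the listing.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the documented column layout; the rows are made up.
SAMPLE = """Time_submitted,Review,Rating,Total_thumbsup,Reply
2022-07-09 15:00:00,Great app,5,2,
2022-07-08 10:30:00,Keeps crashing,1,10,
2022-07-07 09:15:00,Love the playlists,5,0,
"""


def rating_counts(csv_text):
    """Count how many reviews were given each rating (1 to 5)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["Rating"]) for row in reader)


counts = rating_counts(SAMPLE)
for rating in sorted(counts):
    print(rating, counts[rating])
```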

    Usage

    This dataset is ideal for various analytical applications, including:

    • Sentiment analysis to gauge overall user sentiment towards the Spotify app [1].
    • Identifying key themes and reasons behind 1-star and 5-star reviews to understand user satisfaction drivers and pain points [1].
    • Analysing trends in user feedback over time.
    • Developing machine learning models for text classification or natural language processing (NLP).
    • Understanding the impact of app updates or changes on user perception.

    Coverage

    The dataset covers reviews submitted between 1st January 2022 and 9th July 2022 [2]. The reviews were collected globally from the Google Play Store [2, 6]. There are no specific demographic notes on data availability, as it is based on publicly available app store reviews.

    License

    CC-BY

    Who Can Use It

    • Data Scientists and Analysts: For conducting sentiment analysis, natural language processing, and building predictive models.
    • Product Managers: To gain insights into user feedback, identify features that users appreciate or dislike, and inform product development decisions.
    • Market Researchers: To understand consumer perception of streaming services and competitive analysis.
    • Academic Researchers: For studies on user behaviour, app reviews, and digital product success.
    • App Developers: To monitor user satisfaction and identify areas for improvement in their applications.

    Dataset Name Suggestions

    • Spotify App Reviews 2022 (Google Play Store)
    • Google Play Store Spotify User Reviews
    • Spotify Mobile App Ratings and Feedback
    • Spotify User Sentiment Data
    • Global Spotify App Reviews (Jan-Jul 2022)

    Attributes

    Original Data Source: Spotify App Reviews

  9. Data from: Spotify Playlists Dataset

    • zenodo.org
    • explore.openaire.eu
    • +1more
    Updated Jan 24, 2020
    Cite
    Martin Pichl; Eva Zangerle (2020). Spotify Playlists Dataset [Dataset]. http://doi.org/10.5281/zenodo.2594557
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Martin Pichl; Eva Zangerle
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description


    This dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In principle, the dataset holds users, their playlists and the tracks contained in these playlists.

    The csv-file holding the dataset contains the following columns: "user_id", "artistname", "trackname", "playlistname", where

    • user_id is a hash of the user's Spotify user name
    • artistname is the name of the artist
    • trackname is the title of the track and
    • playlistname is the name of the playlist that contains this track.

    The separator is a comma (,), each entry is enclosed in double quotes ("), and the escape character is a backslash (\).
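
    The quoting and escaping rules above translate directly into parameters of Python's csv module. The sketch below parses a made-up sample (the user hash, artist, track, and playlist values are invented) and assumes the file starts with a header row naming the four columns.

```python
import csv
import io

# Made-up sample following the described format: comma separator,
# double-quoted entries, backslash as the escape character.
SAMPLE = (
    '"user_id","artistname","trackname","playlistname"\n'
    '"9cc0cf","Some Artist","Say \\"Hi\\"","road trip, 2014"\n'
)


def read_playlist_rows(text):
    """Parse the CSV text into a list of dicts keyed by column name."""
    reader = csv.DictReader(
        io.StringIO(text),
        delimiter=",",
        quotechar='"',
        escapechar="\\",
        doublequote=False,  # quotes are escaped with a backslash, not doubled
    )
    return list(reader)


rows = read_playlist_rows(SAMPLE)
print(rows[0]["trackname"])
print(rows[0]["playlistname"])
```

    Setting doublequote=False matters here: with the default, a backslash-escaped quote inside a field could be misread as the end of the field.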

    A description of the generation of the dataset and the dataset itself can be found in the following paper:

    Pichl, Martin; Zangerle, Eva; Specht, Günther: "Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name?" in 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pp. 1360-1365, IEEE, Atlantic City, 2015.

  10. Spotify Million Playlist: Recsys Challenge 2018 Dataset

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1more
    Updated Sep 27, 2018
    Cite
    AIcrowd (2018). Spotify Million Playlist: Recsys Challenge 2018 Dataset [Dataset]. http://doi.org/10.5281/zenodo.6425593
    Dataset updated
    Sep 27, 2018
    Authors
    AIcrowd
    Description

    Spotify Million Playlist Dataset Challenge

    Summary

    The Spotify Million Playlist Dataset Challenge consists of a dataset and evaluation to enable research in music recommendations. It is a continuation of the RecSys Challenge 2018, which ran from January to July 2018. The dataset contains 1,000,000 playlists, including playlist titles and track titles, created by users on the Spotify platform between January 2010 and October 2017. The evaluation task is automatic playlist continuation: given a seed playlist title and/or initial set of tracks in a playlist, to predict the subsequent tracks in that playlist. This is an open-ended challenge intended to encourage research in music recommendations, and no prizes will be awarded (other than bragging rights).

    Background

    Playlists like Today’s Top Hits and RapCaviar have millions of loyal followers, while Discover Weekly and Daily Mix are just a couple of our personalized playlists made especially to match your unique musical tastes. Our users love playlists too. In fact, the Digital Music Alliance, in their 2018 Annual Music Report, state that 54% of consumers say that playlists are replacing albums in their listening habits. But our users don’t love just listening to playlists, they also love creating them. To date, over 4 billion playlists have been created and shared by Spotify users. People create playlists for all sorts of reasons: some playlists group together music categorically (e.g., by genre, artist, year, or city), by mood, theme, or occasion (e.g., romantic, sad, holiday), or for a particular purpose (e.g., focus, workout). Some playlists are even made to land a dream job, or to send a message to someone special.

    The other thing we love here at Spotify is playlist research. By learning from the playlists that people create, we can learn all sorts of things about the deep relationship between people and music. Why do certain songs go together? What is the difference between “Beach Vibes” and “Forest Vibes”? And what words do people use to describe which playlists? By learning more about the nature of playlists, we may also be able to suggest other tracks that a listener would enjoy in the context of a given playlist. This can make playlist creation easier, and ultimately help people find more of the music they love.

    Dataset

    To enable this type of research at scale, in 2018 we sponsored the RecSys Challenge 2018, which introduced the Million Playlist Dataset (MPD) to the research community. Sampled from the over 4 billion public playlists on Spotify, this dataset of 1 million playlists consists of over 2 million unique tracks by nearly 300,000 artists, and represents the largest public dataset of music playlists in the world. The dataset includes public playlists created by US Spotify users between January 2010 and November 2017. The challenge ran from January to July 2018, and received 1,467 submissions from 410 teams. A summary of the challenge and the top scoring submissions was published in the ACM Transactions on Intelligent Systems and Technology.

    In September 2020, we re-released the dataset as an open-ended challenge on AIcrowd.com. The dataset can now be downloaded by registered participants from the Resources page.

    Each playlist in the MPD contains a playlist title, the track list (including track IDs and metadata), and other metadata fields (last edit time, number of playlist edits, and more). All data is anonymized to protect user privacy. Playlists are sampled with some randomization, are manually filtered for playlist quality and to remove offensive content, and have some dithering and fictitious tracks added to them. As such, the dataset is not representative of the true distribution of playlists on the Spotify platform, and must not be interpreted as such in any research or analysis performed on the dataset.

    Dataset contains 1000 examples of each scenario:

    • Title only (no tracks)
    • Title and first track
    • Title and first 5 tracks
    • First 5 tracks only
    • Title and first 10 tracks
    • First 10 tracks only
    • Title and first 25 tracks
    • Title and 25 random tracks
    • Title and first 100 tracks
    • Title and 100 random tracks

    Download Link

    Full details: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge
    Download link: https://www.aicrowd.com/challenges/spotify-million-playlist-dataset-challenge/dataset_files

    References

    C.W. Chen, P. Lamere, M. Schedl, and H. Zamani. Recsys Challenge 2018: Automatic Music Playlist Continuation. In Proceedings of the 12th ACM Conference on Recommender Systems (RecSys '18), 2018.
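
    To make the automatic playlist continuation task concrete, here is a deliberately naive co-occurrence baseline. It is not the challenge's evaluation code or any participant's method; the function names and the toy playlists (lists of placeholder track IDs) are assumptions for illustration.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(playlists):
    """For each track, count how often every other track shares a playlist with it."""
    co = {}
    for tracks in playlists:
        for a, b in permutations(set(tracks), 2):
            co.setdefault(a, Counter())[b] += 1
    return co


def continue_playlist(seed, co, k=2):
    """Suggest up to k tracks that co-occur most often with the seed tracks.

    Ties are broken alphabetically by track ID so the output is deterministic.
    """
    scores = Counter()
    for track in seed:
        for other, count in co.get(track, {}).items():
            if other not in seed:
                scores[other] += count
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [track for track, _ in ranked[:k]]


# Toy training playlists with placeholder track IDs.
playlists = [["a", "b", "c"], ["a", "b", "d"], ["b", "c", "e"]]
co = build_cooccurrence(playlists)
print(continue_playlist(["a"], co))
```

    A baseline like this ignores playlist titles entirely, so it could not handle the "Title only" scenario listed above.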

  11. Spotify App User Sentiment Reviews

    • opendatabay.com
    Updated Jul 3, 2025
    Cite
    Datasimple (2025). Spotify App User Sentiment Reviews [Dataset]. https://www.opendatabay.com/data/dataset/12b0af3f-2a23-4d46-882e-92694771c721
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Datasimple
    Area covered
    Reviews & Ratings
    Description

    This dataset features over 51,000 user reviews for the Spotify application, collected from the Google Play Store between January and July 2022. Its primary purpose is to facilitate the analysis of user sentiments and feedback regarding the app. Each review has been carefully labelled as either "Positive" or "Negative" based on its sentiment, offering a clear basis for sentiment analysis. The collection is well-documented, well-maintained, and contains clean data, making it a valuable resource for understanding user experience and satisfaction.

    Columns

    The dataset contains user reviews and their corresponding sentiment labels. While specific column names are not detailed in the available information, it can be inferred that it includes columns for the review text itself and a sentiment classification (e.g., 'Review_Text', 'Sentiment_Label').

    Distribution

    The dataset comprises over 51,000 user reviews. Data files are typically in CSV format. The sentiment distribution within the dataset is notable: 56% of reviews are labelled as Positive, while 44% are Negative.

    Usage

    This dataset is highly versatile and can be applied to various analytical tasks. Ideal applications include sentiment analysis to gauge public opinion, trend analysis over time to observe shifts in user perception, and feature extraction to pinpoint specific aspects of the Spotify app that users commend or criticise. It is particularly useful for gaining deeper insights into user experience and satisfaction, and for identifying areas where the application could be improved. It has been used for learning, research, and application development.

    Coverage

    The data was collected from the Google Play Store, making its regional coverage global, reflecting reviews from users worldwide. The time range for the reviews spans from January to July 2022. The scope focuses exclusively on user feedback pertaining to the Spotify application.

    License

    CC-BY

    Who Can Use It

    This dataset is particularly beneficial for researchers and developers seeking to explore user perceptions and pinpoint areas for application enhancement. It is also suitable for individuals and organisations engaged in learning, academic research, and the development of various applications, especially those involving text analysis or machine learning.

    Dataset Name Suggestions

    • Spotify App User Sentiment Reviews
    • Google Play Spotify Reviews 2022
    • Spotify User Feedback Dataset
    • Spotify Mobile App Sentiment Analysis Data

    Attributes

    Original Data Source: Spotify User Reviews

  12. Spotify Monthly Active Users

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Spotify Monthly Active Users [Dataset]. https://www.searchlogistics.com/learn/statistics/spotify-statistics/
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    As of January 2025, Spotify has over 640 million monthly active users. Here is the full breakdown of Spotify users by year since 2015:

  13. Taylor Swift | The Eras Tour Official Setlist Data

    • kaggle.com
    Updated May 13, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    yuka_with_data (2024). Taylor Swift | The Eras Tour Official Setlist Data [Dataset]. https://www.kaggle.com/datasets/yukawithdata/taylor-swift-the-eras-tour-official-setlist-data
    Dataset updated
    May 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    yuka_with_data
    Description

    💁‍♀️Please take a moment to carefully read through this description and metadata to better understand the dataset and its nuances before proceeding to the Suggestions and Discussions section.

    Dataset Description:

    This dataset provides a comprehensive collection of setlists from Taylor Swift’s official era tours, curated expertly by Spotify. The playlist, available on Spotify under the title "Taylor Swift The Eras Tour Official Setlist," encompasses a diverse range of songs that have been performed live during the tour events of this global artist. Each dataset entry corresponds to a song featured in the playlist.

    Taylor Swift, a pivotal figure in both country and pop music scenes, has had a transformative impact on the music industry. Her tours are celebrated not just for their musical variety but also for their theatrical elements, narrative style, and the deep emotional connection they foster with fans worldwide. This dataset aims to provide fans and researchers an insight into the evolution of Swift's musical and performance style through her tours, capturing the essence of what makes her tour unique.

    Data Collection and Processing:

    Obtaining the Data: The data was obtained directly from the Spotify Web API, specifically focusing on the setlist tracks by the artist. The Spotify API provides detailed information about tracks, artists, and albums through various endpoints.

    Data Processing: To process and structure the data, Python scripts were developed using data science libraries such as pandas for data manipulation and spotipy for API interactions, specifically for Spotify data retrieval.

    Workflow:

    1. Authentication
    2. API Requests
    3. Data Cleaning and Transformation
    4. Saving the Data

    Attribute Descriptions:

    • artist_name: the name of the artist (Taylor Swift)
    • track_name: the title of the track
    • is_explicit: Indicates whether the track contains explicit content
    • album_release_date: The date when the track was released
    • genres: A list of genres associated with Beyoncé
    • danceability: A measure from 0.0 to 1.0 indicating how suitable a track is for - dancing based on a combination of musical elements
    • valence: A measure from 0.0 to 1.0 indicating the musical positiveness conveyed by a track
    • energy: A measure from 0.0 to 1.0 representing a perceptual measure of intensity and activity
    • loudness: The overall loudness of a track in decibels (dB)
    • acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic
    • instrumentalness: Predicts whether a track contains no vocals
    • liveness: Detects the presence of an audience in the recording
    • speechiness: Detects the presence of spoken words in a track
    • key: The key the track is in. Integers map to pitches using standard Pitch Class notation
    • tempo: The overall estimated tempo of a track in beats per minute (BPM)
    • mode: Modality of the track
    • duration_ms: The length of the track in milliseconds
    • time_signature: An estimated overall time signature of a track
    • popularity: A score between 0 and 100, with 100 being the most popular
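    As a small illustration of the key and mode attributes, the sketch below turns the two integers into a readable label. It assumes the usual Spotify conventions, which this attribute list does not spell out: mode 1 means major and 0 means minor, and a key outside 0 to 11 (such as -1) means no key was detected. The function name `describe_key` is hypothetical.

```python
# Pitch Class notation: index 0 is C, 1 is C♯/D♭, and so on up to 11 for B.
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def describe_key(key, mode):
    """Return a label such as 'A minor' for a (key, mode) pair."""
    if not 0 <= key <= 11:
        return "unknown"  # assumed convention: out-of-range means no key detected
    scale = "major" if mode == 1 else "minor"  # assumed convention: 1 = major
    return f"{PITCH_CLASSES[key]} {scale}"
```

    For example, `describe_key(9, 0)` gives "A minor".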

    Note: The popularity score reflects the value recorded on the day the dataset was retrieved. It can fluctuate daily.

    Potential Applications:

    • Predictive Analytics: Researchers might use this dataset to predict future setlist choices for tours based on album success, song popularity, and fan feedback.

    Disclaimer and Responsible Use:

    This dataset, derived from Spotify focusing on Taylor Swift's The Eras Tour setlist data, is intended for educational, research, and analysis purposes only. Users are urged to use this data responsibly, ethically, and within the bounds of legal stipulations.

    • Compliance with Terms of Service: Users should adhere to Spotify's Terms of Service and Developer Policies when utilizing this dataset.
    • Copyright Notice: The dataset presents music track information including names and artist details for analytical purposes and does not convey any rights to the music itself. Users must ensure that their use does not infringe on the copyright holders' rights. Any analysis, distribution, or derivative work should respect the intellectual property rights of all involved parties and comply with applicable laws.
    • No Warranty Disclaimer: The dataset is provided "as is," without warranty, and the creator disclaims any legal liability for its use by others.
    • Ethical Use: Users are encouraged to consider the ethical implications of their analyses and the potential impact on artists and the broader community.
    • Data Accuracy and Timeliness: The dataset reflects a snapshot in time and may not represent the most current information available. Users are encouraged to verify the data's accuracy and timeliness.
    • Source Verification: For the most accurate and up-to-date information, users are encouraged to refer directly to Spotify's official website.
    • Independence Declaration: ...
  14. Spotify daily top 200 songs with genres 2017-2021

    • kaggle.com
    zip
    Updated Aug 24, 2021
    Cite
    Ivan Natarov (2021). Spotify daily top 200 songs with genres 2017-2021 [Dataset]. https://www.kaggle.com/ivannatarov/spotify-daily-top-200-songs-with-genres-20172021
    Explore at:
    Available download formats: zip (4253635 bytes)
    Dataset updated
    Aug 24, 2021
    Authors
    Ivan Natarov
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    👍 If this dataset was useful to you, leave your vote at the top of the page 👍

    The dataset provides information on the daily top 200 tracks listened to by users of the Spotify digital platform around the world.

    I put together this dataset because I really love music (I listen to it for several hours a day) and have not found a similar dataset with track genres on kaggle.

    The dataset can be useful for beginners in the field of working with data. It contains missing values, arrays in columns, and so on, which can be great practice when conducting an EDA phase.
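    As a rough idea of the kind of cleaning this involves, the sketch below parses a column whose cells hold a stringified list and treats empty cells as missing. The column names (`track`, `artist`, `genres`), the cell format, and the sample rows are assumptions made for illustration; they are not taken from the actual file.

```python
import ast
import csv
import io

# Made-up sample in the assumed layout: one row has a list stored as text,
# the other has a missing value in the same column.
raw = io.StringIO(
    "track,artist,genres\n"
    "Song A,Artist 1,\"['pop', 'dance pop']\"\n"
    "Song B,Artist 2,\n"
)


def parse_genres(cell):
    """Turn a stringified list such as "['pop', 'rock']" into a real list."""
    if not cell or not cell.strip():
        return []  # treat a missing value as "no genres"
    return list(ast.literal_eval(cell))


rows = [dict(row, genres=parse_genres(row["genres"])) for row in csv.DictReader(raw)]
```

    `ast.literal_eval` only accepts Python literals, so it is a safer choice than `eval` for cells read from a file.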

    Soon I will post an example here showing how, based on this dataset, you can go on a musical journey around the world and see how humanity's musical tastes have changed.

    In addition, I will be very happy to see the work of the community on this dataset.

    Also, if there is interest in data by country, I am ready to post it on request.

    You can contact me through: telegram @natarov_ivan

  15. ‘Spotify Recommendation’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Spotify Recommendation’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-spotify-recommendation-3903/3a5b5131/?iid=006-678&v=presentation
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Spotify Recommendation’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/bricevergnou/spotify-recommendation on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Spotify Recommendation

    (You can check how I used this dataset on my GitHub repository.)

    I am basically a HUGE fan of music (mostly French rap, with some exceptions, but I love music in general). One day, while browsing the Internet, I found Spotify's API. I knew I had to use it when I found out you could get information such as danceability for your favorite songs just from their IDs.


    Once I saw that, my machine learning instincts forced me to work on this project.

    1. Data Collection

    1.1 Playlist creation

    I collected 100 liked songs and 95 disliked songs

    For those I like, I made a playlist of my 100 favorite songs. It is mainly French rap, with some American rap, rock, and electro music.

    For those I dislike, I collected songs from various kinds of music so the model would have a broader view of what I don't like.

    There are:

    • 25 metal songs (Cannibal Corpse)
    • 20 "I don't like" rap songs (PNL)
    • 25 classical songs
    • 25 disco songs

    I didn't include any pop songs because I'm kind of neutral about them.

    1.2 Getting the IDs

    1. Using the Spotify API endpoint "Get a playlist's Items", I turned the playlists into JSON-formatted data containing the ID and name of each track (ids/yes.py and ids/no.py). NB: on the website, specify "items(track(id,name))" in the fields parameter to avoid being overwhelmed by useless data.

    2. With a script (ids/ids_to_data.py), I turned the JSON data into one long string with the IDs separated by commas.
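    The second step can be sketched as follows. This is not the author's ids/ids_to_data.py, which is not shown here; it is a minimal stand-in that assumes the playlist response has the shape produced by the "items(track(id,name))" filter. The function name `ids_to_strings`, the sample IDs, and the batch size of 100 (my understanding of the per-request limit for the audio-features endpoint) are assumptions.

```python
import json


def ids_to_strings(playlist_json, batch_size=100):
    """Return comma-separated ID strings, each holding at most batch_size IDs."""
    items = json.loads(playlist_json)["items"]
    # Skip entries whose track is missing (assumed to appear as null).
    ids = [item["track"]["id"] for item in items if item.get("track")]
    return [",".join(ids[i:i + batch_size]) for i in range(0, len(ids), batch_size)]


# Made-up sample in the assumed response shape.
sample = json.dumps({"items": [
    {"track": {"id": "id_one", "name": "First"}},
    {"track": {"id": "id_two", "name": "Second"}},
    {"track": None},
]})
batches = ids_to_strings(sample)
```

    With 100 liked and 95 disliked songs, each playlist fits in a single string under this assumed limit.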

    1.3 Getting the statistics

    Now I just had to enter the strings into the Spotify API endpoint "Get Audio Features from several tracks" and save the results as my data files (data/good.json and data/dislike.json).

    2. Data features

    From Spotify's API documentation :

    • acousticness : A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
    • danceability : Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
    • duration_ms : The duration of the track in milliseconds.
    • energy : Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
    • instrumentalness : Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
    • key : The key the track is in. Integers map to pitches using standard Pitch Class notation. E.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on.
    • liveness : Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.
    • loudness : The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
    • mode : Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.
    • speechiness : Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
    • tempo : The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
    • time_signature : An estimated overall time signature of a track. The time signature (meter) is a notational convention to specify how many beats are in each bar (or measure).
    • valence : A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).

    And the variable that has to be predicted :

    • liked : 1 for liked songs , 0 for disliked songs
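    Putting the features and the target together might look like the sketch below. It assumes that data/good.json and data/dislike.json each hold the raw endpoint response, with the per-track objects under an "audio_features" key; the description does not confirm this, and the function names and sample values are hypothetical.

```python
FEATURES = [
    "acousticness", "danceability", "duration_ms", "energy",
    "instrumentalness", "key", "liveness", "loudness", "mode",
    "speechiness", "tempo", "time_signature", "valence",
]


def label_tracks(response, liked):
    """Keep only the listed features and attach the liked label (1 or 0)."""
    rows = []
    for entry in response["audio_features"]:
        if entry is None:
            continue  # assumed: unresolved IDs come back as null
        row = {name: entry[name] for name in FEATURES}
        row["liked"] = liked
        rows.append(row)
    return rows


def make_entry(track_id, **overrides):
    """Build a made-up audio-features object for the example."""
    entry = {name: 0.5 for name in FEATURES}
    entry.update(id=track_id, **overrides)
    return entry


good = {"audio_features": [make_entry("liked_1", danceability=0.9)]}
dislike = {"audio_features": [make_entry("disliked_1", energy=0.95), None]}

dataset = label_tracks(good, liked=1) + label_tracks(dislike, liked=0)
```

    Dropping the track ID and other non-feature fields leaves one row per song with the thirteen features plus the `liked` target.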

    --- Original source retains full ownership of the source dataset ---

  16. Spotify User Demographics Statistics

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Spotify User Demographics Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/spotify-statistics/
    Explore at:
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    29% of all Spotify users fall into the 25 to 34 age range. This is closely followed by the 18 to 24 age range, which accounts for 26% of users.

  17. spotify-tracks-lite

    • huggingface.co
    Updated May 14, 2024
    Cite
    Anton Blu (2024). spotify-tracks-lite [Dataset]. https://huggingface.co/datasets/engels/spotify-tracks-lite
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 14, 2024
    Authors
    Anton Blu
    License

    https://choosealicense.com/licenses/bsd/

    Description

    Context

    This dataset consists of 24000 tracks from 30 genres and is a reduced version of the maharshipandya/spotify-tracks-dataset dataset. All non-heuristic data is cut and cleaned for better usability and performance. All data was taken from the Spotify API and is open source. This dataset can be used to train prediction models based on user preferences, or to categorise tracks by the corresponding heuristic.

    Column Description

    danceability: Danceability describes how suitable a track is… See the full description on the dataset page: https://huggingface.co/datasets/engels/spotify-tracks-lite.

  18. MGD: Music Genre Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 28, 2021
    Cite
    Danilo B. Seufitelli (2021). MGD: Music Genre Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4778562
    Explore at:
    Dataset updated
    May 28, 2021
    Dataset provided by
    Anisio Lacerda
    Mirella M. Moro
    Gabriel P. Oliveira
    Danilo B. Seufitelli
    Mariana O. Silva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MGD: Music Genre Dataset

    Over recent years, the world has seen a dramatic change in the way people consume music, moving from physical records to streaming services. Since 2017, such services have become the main source of revenue within the global recorded music market. Therefore, this dataset is built using data from Spotify, which provides a weekly chart of the 200 most streamed songs for each country and territory in which it is present, as well as an aggregated global chart.

    Considering that countries behave differently when it comes to musical tastes, we use chart data from global and regional markets from January 2017 to December 2019, considering eight of the top 10 music markets according to IFPI: United States (1st), Japan (2nd), United Kingdom (3rd), Germany (4th), France (5th), Canada (8th), Australia (9th), and Brazil (10th).

    We also provide information about the hit songs and artists present in the charts, such as all collaborating artists within a song (since the charts only provide the main ones) and their respective genres, which is the core of this work. MGD also provides data about musical collaboration, as we build collaboration networks based on artist partnerships in hit songs. Therefore, this dataset contains:

    Genre Networks: Success-based genre collaboration networks

    Genre Mapping: Genre mapping from Spotify genres to super-genres

    Artist Networks: Success-based artist collaboration networks

    Artists: Some artist data

    Hit Songs: Hit Song data and features

    Charts: Enhanced data from Spotify Weekly Top 200 Charts
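    To give a feel for what an artist collaboration network built from hit songs involves, the sketch below counts how often each pair of artists appears on the same song. This is plain co-occurrence counting under made-up data; the description does not detail how MGD weights its success-based networks, so this should not be read as the authors' construction. The function name `collaboration_edges` and the artist names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def collaboration_edges(songs):
    """Count, for each unordered artist pair, the songs they share."""
    edges = Counter()
    for artists in songs:
        # Sort so that (A, B) and (B, A) land on the same edge key.
        for pair in combinations(sorted(set(artists)), 2):
            edges[pair] += 1
    return edges


# Each inner list holds all artists credited on one hit song (made-up data).
hit_songs = [
    ["Artist A", "Artist B"],
    ["Artist B", "Artist A", "Artist C"],
    ["Artist C"],
]
edges = collaboration_edges(hit_songs)
```

    Solo songs contribute no edges, and the resulting counts can serve as edge weights in a graph library of choice.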

    This dataset was originally built for a conference paper at ISMIR 2020. If you make use of the dataset, please also cite the following paper:

    Gabriel P. Oliveira, Mariana O. Silva, Danilo B. Seufitelli, Anisio Lacerda, and Mirella M. Moro. Detecting Collaboration Profiles in Success-based Music Genre Networks. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), 2020.

    @inproceedings{ismir/OliveiraSSLM20,
      title = {Detecting Collaboration Profiles in Success-based Music Genre Networks},
      author = {Gabriel P. Oliveira and Mariana O. Silva and Danilo B. Seufitelli and Anisio Lacerda and Mirella M. Moro},
      booktitle = {21st International Society for Music Information Retrieval Conference},
      pages = {726--732},
      year = {2020}
    }

  19. Spotify Artists Statistics

    • searchlogistics.com
    Updated Apr 1, 2025
    Cite
    (2025). Spotify Artists Statistics [Dataset]. https://www.searchlogistics.com/learn/statistics/spotify-statistics/
    Explore at:
    Dataset updated
    Apr 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spotify has about 11 million artists and creators on the platform.

  20. Music Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Jan 6, 2017
    Cite
    Bright Data (2017). Music Dataset [Dataset]. https://brightdata.com/products/datasets/music
    Explore at:
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Jan 6, 2017
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Unlock powerful insights with our custom music datasets, offering access to millions of records from popular music platforms like Spotify, SoundCloud, Amazon Music, YouTube Music, and more. These datasets provide comprehensive data points such as track titles, artists, albums, genres, release dates, play counts, playlist details, popularity scores, user-generated tags, and much more, allowing you to analyze music trends, listener behavior, and industry patterns with precision. Use these datasets to optimize your music strategies by identifying trending tracks, analyzing artist performance, understanding playlist dynamics, and tracking audience preferences across platforms. Gain valuable insights into streaming habits, regional popularity, and emerging genres to make data-driven decisions that enhance your marketing campaigns, content creation, and audience engagement. Whether you’re a music producer, marketer, data analyst, or researcher, our music datasets empower you with the data needed to stay ahead in the ever-evolving music industry. Available in various formats such as JSON, CSV, and Parquet, and delivered via flexible options like API, S3, or email, these datasets ensure seamless integration into your workflows.

