100+ datasets found
  1. Playlist2vec: Spotify Million Playlist Dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jun 22, 2021
    Cite
    Piyush Papreja (2021). Playlist2vec: Spotify Million Playlist Dataset [Dataset]. http://doi.org/10.5281/zenodo.5002584
    Available download formats: bin
    Dataset updated
    Jun 22, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Piyush Papreja
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was created using the Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
    The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
    The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
    Each of the aforementioned entities is represented by a unique ID (Spotify URI).
    Data is stored in the following tables:

    • album
    • artist
    • track
    • playlist
    • track_artist1
    • track_playlist1

    album

    | id | name | uri |

    id: Album ID as provided by Spotify
    name: Album Name as provided by Spotify
    uri: Album URI as provided by Spotify


    artist

    | id | name | uri |

    id: Artist ID as provided by Spotify
    name: Artist Name as provided by Spotify
    uri: Artist URI as provided by Spotify


    track

    | id | name | duration | popularity | explicit | preview_url | uri | album_id |

    id: Track ID as provided by Spotify
    name: Track Name as provided by Spotify
    duration: Track Duration (in milliseconds) as provided by Spotify
    popularity: Track Popularity as provided by Spotify
    explicit: Whether the track has explicit lyrics or not. (true or false)
    preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
    uri: Track Uri as provided by Spotify
    album_id: Album Id to which the track belongs


    playlist

    | id | name | followers | uri | total_tracks |

    id: Playlist ID as provided by Spotify
    name: Playlist Name as provided by Spotify
    followers: Playlist Followers as provided by Spotify
    uri: Playlist Uri as provided by Spotify
    total_tracks: Total number of tracks in the playlist.

    track_artist1

    | track_id | artist_id |

    Track-Artist association table

    track_playlist1

    | track_id | playlist_id |

    Track-Playlist association table
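
    The table layout above can be tried out on a small scale before loading the full dump. The following is a minimal sketch, not the authors' setup: it recreates the six tables in an in-memory SQLite database (the published dump may target a different SQL dialect), and the column types, sample rows, and the helper name tracks_in_playlist are all assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite stand-in for the real database (dialect and column types are assumptions).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album (id TEXT PRIMARY KEY, name TEXT, uri TEXT);
CREATE TABLE artist (id TEXT PRIMARY KEY, name TEXT, uri TEXT);
CREATE TABLE track (
    id TEXT PRIMARY KEY, name TEXT, duration INTEGER, popularity INTEGER,
    explicit INTEGER, preview_url TEXT, uri TEXT, album_id TEXT
);
CREATE TABLE playlist (
    id TEXT PRIMARY KEY, name TEXT, followers INTEGER, uri TEXT, total_tracks INTEGER
);
CREATE TABLE track_artist1 (track_id TEXT, artist_id TEXT);
CREATE TABLE track_playlist1 (track_id TEXT, playlist_id TEXT);
""")

# Made-up rows, only to exercise the joins.
conn.execute("INSERT INTO album VALUES ('al1', 'Example Album', 'spotify:album:al1')")
conn.execute("INSERT INTO artist VALUES ('ar1', 'Example Artist', 'spotify:artist:ar1')")
conn.execute(
    "INSERT INTO track VALUES "
    "('t1', 'Example Track', 215000, 40, 0, NULL, 'spotify:track:t1', 'al1')"
)
conn.execute("INSERT INTO playlist VALUES ('p1', 'Example Playlist', 12, 'spotify:playlist:p1', 1)")
conn.execute("INSERT INTO track_artist1 VALUES ('t1', 'ar1')")
conn.execute("INSERT INTO track_playlist1 VALUES ('t1', 'p1')")


def tracks_in_playlist(connection, playlist_id):
    """Return (track, artist, album, duration in ms) rows for one playlist."""
    query = """
        SELECT t.name, ar.name, al.name, t.duration
        FROM track_playlist1 AS tp
        JOIN track AS t ON t.id = tp.track_id
        JOIN album AS al ON al.id = t.album_id
        JOIN track_artist1 AS ta ON ta.track_id = t.id
        JOIN artist AS ar ON ar.id = ta.artist_id
        WHERE tp.playlist_id = ?
        ORDER BY t.name, ar.name
    """
    return connection.execute(query, (playlist_id,)).fetchall()


rows = tracks_in_playlist(conn, "p1")
```

    The two association tables are what make the many-to-many relationships work: a track with several artists yields one result row per artist.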

    - - - - - SETUP - - - - -


    The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35 GB.

    spotifydbdumpschemashare.sql contains the schema for the database (for reference).
    spotifydbdumpshare.sql is the actual data dump.


    Setup steps:
    1. Create database

    - - - - - PAPER - - - - -


    The description of this dataset can be found in the following paper:

    Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham

  2. Spotify Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Cite
    Bright Data, Spotify Dataset [Dataset]. https://brightdata.com/products/datasets/spotify
    Available download formats: .json, .csv, .xlsx
    Dataset authored and provided by
    Bright Data (https://brightdata.com/)
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights into music trends, artist popularity, and streaming analytics with our comprehensive Spotify Dataset. Designed for music analysts, marketers, and businesses, this dataset provides structured and reliable data from Spotify to enhance market research, content strategy, and audience engagement.

    Dataset Features

    • Track Information: Access detailed data on songs, including track name, artist, album, genre, and release date.
    • Streaming Popularity: Extract track popularity scores, listener engagement metrics, and ranking trends.
    • Artist & Album Insights: Analyze artist performance, album releases, and genre trends over time.
    • Related Searches & Recommendations: Track related search terms and suggested content for deeper audience insights.
    • Historical & Real-Time Data: Retrieve historical streaming data or access continuously updated records for real-time trend analysis.

    Customizable Subsets for Specific Needs

    Our Spotify Dataset is fully customizable, allowing you to filter data based on track popularity, artist, genre, release date, or listener engagement. Whether you need broad coverage for industry analysis or focused data for content optimization, we tailor the dataset to your needs.

    Popular Use Cases

    • Market Analysis & Trend Forecasting: Identify emerging music trends, genre popularity, and listener preferences.
    • Artist & Label Performance Tracking: Monitor artist rankings, album success, and audience engagement.
    • Competitive Intelligence: Analyze competitor music strategies, playlist placements, and streaming performance.
    • AI & Machine Learning Applications: Use structured music data to train AI models for recommendation engines, playlist curation, and predictive analytics.
    • Advertising & Sponsorship Insights: Identify high-performing tracks and artists for targeted advertising and sponsorship opportunities.

    Whether you're optimizing music marketing, analyzing streaming trends, or enhancing content strategies, our Spotify Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  3. Spotify's Long Hits (2014-2024) 🎶

    • kaggle.com
    Updated Feb 23, 2024
    Cite
    Kanchana1990 (2024). Spotify's Long Hits (2014-2024) 🎶 [Dataset]. http://doi.org/10.34740/kaggle/dsv/7685397
    Dataset updated
    Feb 23, 2024
    Dataset provided by
    Kaggle
    Authors
    Kanchana1990
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    This dataset, "Spotify's Long Hits (2014-2024) 🎶," offers a unique collection of over 800 tracks, each standing out for its extended playtime, marking the years from 2014 to 2024. It serves as a unique lens through which the evolution of musical duration and listener preferences can be observed over a significant period. Each track in this dataset not only surpasses the conventional lengths but also encapsulates the essence of its time, making it a valuable resource for in-depth musical analysis.

    Data Science Applications: The dataset's structure lends itself to various analytical pursuits within the data science realm. Researchers and enthusiasts can delve into trend analysis to uncover shifts in musical durations over the years, perform genre-based studies to explore the relationship between genre and track length, or even train machine learning models to predict track popularity based on various features. However, make sure to use the dataset only for educational purposes as per Spotify guidelines.

    Column Descriptors:

    • ID: The unique identifier for each track on Spotify, facilitating direct access to the track.
    • Name: The title of the track, revealing its identity.
    • Duration (Minutes): The length of each track, provided in minutes, highlighting the extended nature of these compositions.
    • Artists: The names of the artists involved, offering insights into the collaborative landscape of each piece.

    Ethically Mined Data: This dataset has been compiled with strict adherence to ethical data mining practices, utilizing Spotify's public API in full compliance with their guidelines. It represents a harmonious blend of technology and creativity, showcasing the vast musical archive that Spotify offers.

    Gratitude is extended to Spotify for the data provided and the usage of their logo in the dataset thumbnail, which adds a recognizable visual cue to this academic resource. This dataset stands as a testament to the power of music and data combined, inviting exploration into the depths of musical analysis.

  4. Data from: MusicOSet: An Enhanced Open Dataset for Music Data Mining

    • zenodo.org
    bin, zip
    Updated Jun 7, 2021
    Cite
    Mariana O. Silva; Laís Mota; Mirella M. Moro (2021). MusicOSet: An Enhanced Open Dataset for Music Data Mining [Dataset]. http://doi.org/10.5281/zenodo.4904639
    Available download formats: zip, bin
    Dataset updated
    Jun 7, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mariana O. Silva; Laís Mota; Mirella M. Moro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MusicOSet is an open and enhanced dataset of musical elements (artists, songs and albums) based on musical popularity classification. It provides a directly accessible collection of data suitable for numerous tasks in music data mining (e.g., data visualization, classification, clustering, similarity search, MIR, HSS and so forth). To create MusicOSet, the potential information sources were divided into three main categories: music popularity sources, metadata sources, and acoustic and lyrical features sources. Data from all three categories were initially collected between January and May 2019; the data were then updated and enhanced in June 2019.

    The attractive features of MusicOSet include:

    • Integration and centralization of different musical data sources
    • Calculation of popularity scores and classification of hit and non-hit musical elements, covering 1962 to 2018
    • Enriched metadata for music, artists, and albums from the US popular music industry
    • Availability of acoustic and lyrical resources
    • Unrestricted access in two formats: SQL database and compressed .csv files

    | Data              | # Records |
    |:-----------------:|:---------:|
    | Songs             | 20,405    |
    | Artists           | 11,518    |
    | Albums            | 26,522    |
    | Lyrics            | 19,664    |
    | Acoustic Features | 20,405    |
    | Genres            | 1,561     |

  5. spotify-tracks-dataset

    • huggingface.co
    Updated Jun 30, 2023
    Cite
    maharshipandya (2023). spotify-tracks-dataset [Dataset]. https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset
    Dataset updated
    Jun 30, 2023
    Authors
    maharshipandya
    License

    https://choosealicense.com/licenses/bsd/

    Description

    Content

    This is a dataset of Spotify tracks over a range of 125 different genres. Each track has some audio features associated with it. The data is in CSV format which is tabular and can be loaded quickly.

      Usage
    

    The dataset can be used for:

    • Building a Recommendation System based on some user input or preference
    • Classification purposes based on audio features and available genres
    • Any other application that you can think of. Feel free to discuss!

      Column… See the full description on the dataset page: https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset.
    
  6. spotify-million-song-dataset

    • huggingface.co
    Updated Jun 16, 2024
    Cite
    Vishnu Priya VR (2024). spotify-million-song-dataset [Dataset]. https://huggingface.co/datasets/vishnupriyavr/spotify-million-song-dataset
    Dataset updated
    Jun 16, 2024
    Authors
    Vishnu Priya VR
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for Spotify Million Song Dataset

      Dataset Summary
    

    This is the Spotify Million Song Dataset. It contains song names, artist names, links to the songs, and lyrics. It can be used for recommending, classifying, or clustering songs.

      Supported Tasks and Leaderboards
    

    [More Information Needed]

      Languages
    

    [More Information Needed]

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    [More Information Needed]

      Data… See the full description on the dataset page: https://huggingface.co/datasets/vishnupriyavr/spotify-million-song-dataset.
    
  7. My Spotify Data

    • dataone.org
    Updated Nov 8, 2023
    Cite
    Mulholland, Ty (2023). My Spotify Data [Dataset]. http://doi.org/10.7910/DVN/FVCXKG
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Mulholland, Ty
    Description
  8. Data from: Spotify Playlists Dataset

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Jan 24, 2020
    Cite
    Martin Pichl (2020). Spotify Playlists Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_2594556
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Martin Pichl
    Eva Zangerle
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In principle, the dataset holds users, their playlists and the tracks contained in these playlists.

    The csv-file holding the dataset contains the following columns: "user_id", "artistname", "trackname", "playlistname", where

    user_id is a hash of the user's Spotify user name

    artistname is the name of the artist

    trackname is the title of the track and

    playlistname is the name of the playlist that contains this track.

    The separator used is a comma (,), each entry is enclosed by double quotes, and the escape character used is .
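
    As a rough illustration of the file layout described above, the sketch below parses such a CSV with Python's standard csv module and groups tracks by user and playlist. The escape character is not legible in the description, so a backslash is assumed here and exposed as a parameter; the sample rows and the function name read_playlists are made up for the example.

```python
import csv
import io
from collections import defaultdict


def read_playlists(handle, escapechar="\\"):
    """Group (artist, track) pairs by (user_id, playlistname).

    The backslash escape character is an assumption; pass a different
    value if the real file uses another one.
    """
    reader = csv.reader(
        handle,
        delimiter=",",
        quotechar='"',
        escapechar=escapechar,
        doublequote=False,
    )
    header = next(reader)
    playlists = defaultdict(list)
    for row in reader:
        record = dict(zip(header, row))
        key = (record["user_id"], record["playlistname"])
        playlists[key].append((record["artistname"], record["trackname"]))
    return dict(playlists)


# Made-up sample in the documented column order.
sample = io.StringIO(
    '"user_id","artistname","trackname","playlistname"\n'
    '"u1","Artist A","Song, Part 1","Road Trip"\n'
    '"u1","Artist B","Say \\"Hi\\"","Road Trip"\n'
    '"u2","Artist A","Song, Part 1","Focus"\n'
)
playlists = read_playlists(sample)
```

    Keying on the pair (user_id, playlistname) rather than on the playlist name alone keeps two users' identically named playlists apart.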

    A description of the generation of the dataset and the dataset itself can be found in the following paper:

    Pichl, Martin; Zangerle, Eva; Specht, Günther: "Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name?" in 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pp. 1360-1365, IEEE, Atlantic City, 2015.

  9. Spotify Tracks Attributes and Popularity

    • kaggle.com
    Updated Jul 9, 2025
    Cite
    Melissa Monfared (2025). Spotify Tracks Attributes and Popularity [Dataset]. https://www.kaggle.com/datasets/melissamonfared/spotify-tracks-attributes-and-popularity
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Kaggle
    Authors
    Melissa Monfared
    Description

    About Dataset

    Overview:

    This dataset provides detailed metadata and audio analysis for a wide collection of Spotify music tracks across various genres. It includes track-level information such as popularity, tempo, energy, danceability, and other musical features that can be used for music recommendation systems, genre classification, or trend analysis. The dataset is a rich source for exploring music consumption patterns and user preferences based on song characteristics.

    Dataset Details:

    This dataset contains rows of individual music tracks, each described by both metadata (such as track name, artist, album, and genre) and quantitative audio features. These features reflect different musical attributes such as energy, acousticness, instrumentalness, valence, and more, making it ideal for audio machine learning projects and exploratory data analysis.

    Schema and Column Descriptions:

    | Column Name | Description |
    |:-----------:|:-----------:|
    | index | Unique index for each track (can be ignored for analysis) |
    | track_id | Spotify's unique identifier for the track |
    | artists | Name of the performing artist(s) |
    | album_name | Title of the album the track belongs to |
    | track_name | Title of the track |
    | popularity | Popularity score on Spotify (0–100 scale) |
    | duration_ms | Duration of the track in milliseconds |
    | explicit | Indicates whether the track contains explicit content |
    | danceability | How suitable the track is for dancing (0.0 to 1.0) |
    | energy | Intensity and activity level of the track (0.0 to 1.0) |
    | key | Musical key (0 = C, 1 = C♯/D♭, …, 11 = B) |
    | loudness | Overall loudness of the track in decibels (dB) |
    | mode | Modality (major = 1, minor = 0) |
    | speechiness | Presence of spoken words in the track (0.0 to 1.0) |
    | acousticness | Confidence measure of whether the track is acoustic (0.0 to 1.0) |
    | instrumentalness | Predicts whether the track contains no vocals (0.0 to 1.0) |
    | liveness | Presence of an audience in the recording (0.0 to 1.0) |
    | valence | Musical positivity conveyed (0.0 = sad, 1.0 = happy) |
    | tempo | Estimated tempo in beats per minute (BPM) |
    | time_signature | Time signature of the track (e.g., 4 = 4/4) |
    | track_genre | Assigned genre label for the track |

    Key Features:

    • Comprehensive Track Data: Metadata combined with detailed audio analysis.
    • Genre Diversity: Includes tracks from various music genres.
    • Audio Feature Rich: Suitable for audio classification, recommendation engines, or clustering.
    • Machine Learning Friendly: Clean and numerical format ideal for ML models.

    Usage:

    This dataset is valuable for:

    • 🎵 Music Recommendation Systems: Building collaborative or content-based recommenders.
    • 📊 Data Visualization & Dashboards: Analyzing genre or mood trends over time.
    • 🤖 Machine Learning Projects: Predicting song popularity or clustering similar tracks.
    • 🧠 Music Psychology & Behavioral Studies: Exploring how music features relate to emotions or behavior.

    Data Maintenance:

    Additional Notes:

    • This dataset can be enhanced by merging it with user listening behavior data, lyrics datasets, or chart positions for more advanced analysis.
    • Some columns like key, mode, and explicit may need to be mapped for better readability in visualization.
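
    The mapping mentioned in the last note could look roughly like the sketch below. It follows the column descriptions above (key 0 = C through 11 = B, mode 1 = major and 0 = minor, duration_ms in milliseconds); the function names are illustrative, and treating any key outside 0–11 as "unknown" is an assumption about how missing keys should be shown.

```python
# Pitch-class names for the integer "key" column (0 = C, ..., 11 = B).
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def describe_key(key, mode):
    """Turn the numeric key and mode columns into a label such as 'C major'."""
    if not 0 <= key <= 11:
        return "unknown"  # assumption: out-of-range keys are shown as unknown
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {quality}"


def format_duration(duration_ms):
    """Render duration_ms as minutes:seconds, dropping the millisecond remainder."""
    minutes, seconds = divmod(duration_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"


label = describe_key(0, 1)
length = format_duration(215000)
```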
  10. Spotify Tracks Dataset

    • cubig.ai
    Updated May 20, 2025
    Cite
    CUBIG (2025). Spotify Tracks Dataset [Dataset]. https://cubig.ai/store/products/276/spotify-tracks-dataset
    Explore at:
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Spotify Tracks Dataset contains information on tracks from over 125 music genres, including both audio features (e.g., danceability, energy, valence) and metadata (e.g., title, artist, genre).

    2) Data Utilization
    (1) Characteristics of the Spotify Tracks Dataset:
    • The data is structured in a tabular format at the track level, where each column represents numerical or categorical features based on musical properties. This makes it suitable for recommendation systems, genre classification, and emotion analysis.
    • It includes multi-dimensional attributes grounded in music theory such as track duration, time signature, energy, loudness, tempo, and speechiness, enabling its use in music classification and clustering tasks.

    (2) Applications of the Spotify Tracks Dataset:
    • Design of Music Recommendation Systems: It can be used to build content-based filtering systems or hybrid recommendation algorithms based on user preferences.

  11. Data from: Spotify Playlists

    • zenodo.org
    csv
    Updated Jan 24, 2025
    Cite
    Francesco Cambria (2025). Spotify Playlists [Dataset]. http://doi.org/10.5281/zenodo.14728731
    Available download formats: csv
    Dataset updated
    Jan 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Francesco Cambria
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was constructed from Spotify data found on Kaggle.

    The files listed here can be used to build a property graph in Neo4j:

    • song.csv - contains all the data for the Song nodes.
    • artist.csv - contains the data for the Artist nodes.
    • playlist.csv - contains the data for the Playlist nodes.
    • user.csv - contains the data for the User nodes (those creating Playlists).
    • genre.csv - contains the data for the Genre nodes (a category for the Artists).
    • type.csv - contains the data for the Type nodes (a category for the Playlists).
    • sing.csv - contains the data for the SING relationship from Artist to Song nodes.
    • created.csv - contains the data for the CREATED relationship from User to Playlist nodes.
    • in.csv - contains the data for the IN relationship from Song to Playlist nodes.
    • of_type.csv - contains the data for the OFTYPE relationship from Playlist to Type nodes.
    • labelled.csv - contains the data for the LABELLED relationship from Artist to Genre nodes.
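
    To make the node and relationship files above more concrete, here is a small sketch that follows Artist -SING-> Song -IN-> Playlist paths in plain Python rather than in Neo4j. The column layout of the CSV files is not given in the description, so the edge lists, the identifiers, and the helper names below are hypothetical stand-ins for rows of sing.csv and in.csv.

```python
from collections import defaultdict

# Hypothetical (source, target) pairs standing in for rows of sing.csv and in.csv.
sing_edges = [("artist_1", "song_1"), ("artist_1", "song_2"), ("artist_2", "song_3")]
in_edges = [
    ("song_1", "playlist_1"),
    ("song_2", "playlist_1"),
    ("song_3", "playlist_2"),
    ("song_1", "playlist_2"),
]


def build_adjacency(edges):
    """Map each source node to the set of nodes it points to."""
    adjacency = defaultdict(set)
    for source, target in edges:
        adjacency[source].add(target)
    return adjacency


def playlists_featuring(artist_id, sing_adj, in_adj):
    """Follow Artist -> Song -> Playlist and return the playlists reached, sorted."""
    reached = set()
    for song in sing_adj.get(artist_id, ()):
        reached |= in_adj.get(song, set())
    return sorted(reached)


sing_adj = build_adjacency(sing_edges)
in_adj = build_adjacency(in_edges)
result = playlists_featuring("artist_1", sing_adj, in_adj)
```

    In a graph database the same traversal would be a single pattern match; the dictionary version is only meant to show how the relationship files connect the node files.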

    This data was used as test dataset in the paper "MINE GRAPH RULE: A New GQL Operator for Mining Association Rules in Property Graph Databases".

  12. Spotify Music Data

    • kaggle.com
    Updated Jul 23, 2020
    Cite
    Pavan Sanagapati (2020). Spotify Music Data [Dataset]. https://www.kaggle.com/datasets/pavansanagapati/spotify-music-data/data
    Dataset updated
    Jul 23, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Pavan Sanagapati
    Description

    Dataset

    This dataset was created by Pavan Sanagapati

    Contents

  13. spotify-data-001

    • kaggle.com
    Updated Dec 9, 2024
    Cite
    Kayla McComb (2024). spotify-data-001 [Dataset]. https://www.kaggle.com/datasets/kaylamccomb/spotify-data-001/discussion
    Dataset updated
    Dec 9, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kayla McComb
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Kayla McComb

    Released under MIT

    Contents

  14. The Spotify Audio Features Hit Predictor Dataset (1960-2019)

    • data.4tu.nl
    zip
    Updated Feb 4, 2020
    Cite
    Farooq Ansari (2020). The Spotify Audio Features Hit Predictor Dataset (1960-2019) [Dataset]. http://doi.org/10.4121/uuid:d77e74b0-66bc-47ac-8b25-5796d3084478
    Available download formats: zip
    Dataset updated
    Feb 4, 2020
    Dataset provided by
    4TU.Centre for Research Data
    Authors
    Farooq Ansari
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Time period covered
    1960 - 2019
    Description

    This is a dataset consisting of features for tracks fetched using Spotify's Web API. The tracks are labeled '1' or '0' ('Hit' or 'Flop') depending on some criteria of the author. This dataset can be used to make a classification model that predicts whether a track would be a 'Hit' or not. (Note: The author does not objectively consider a track inferior, bad, or a failure if it is labeled 'Flop'. 'Flop' here merely implies that it is a track that probably could not be considered popular in the mainstream.) Here's an implementation of this idea in the form of a website that I made: http://www.hitpredictor.in/
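
    One very simple way to build such a classifier is a nearest-centroid baseline, sketched below in plain Python. This is not the author's model: the two feature columns (imagined here as danceability and energy), the toy rows, and the function names are all assumptions, and a real experiment would use the dataset's actual feature columns with a held-out test split.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def fit(features, labels):
    """Compute one centroid per label (1 = 'Hit', 0 = 'Flop')."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in grouped.items()}


def predict(centroids, vector):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vector))


# Toy rows with two hypothetical features per track.
X = [[0.8, 0.7], [0.9, 0.8], [0.2, 0.3], [0.3, 0.2]]
y = [1, 1, 0, 0]
model = fit(X, y)
prediction = predict(model, [0.7, 0.9])
```

    A baseline like this mainly gives a reference score that a more capable model should beat.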

  15. Data consumption of mobile Spotify users in Italy 2018

    • statista.com
    Updated Jul 10, 2025
    Cite
    Statista (2025). Data consumption of mobile Spotify users in Italy 2018 [Dataset]. https://www.statista.com/statistics/866880/data-consumption-of-spotify-users-in-italy/
    Explore at:
    Dataset updated
    Jul 10, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2018 - May 2018
    Area covered
    Italy
    Description

    This statistic shows the average data consumption of mobile Spotify users in Italy from January to May 2018, including both WiFi and mobile data. According to the data tracked by Walletsaver, the average data consumption of mobile users for Spotify increased from ** megabytes (MB) in February to ** MB in May 2018.

  16. Spotify Data Broken Down by Decade

    • kaggle.com
    Updated Sep 19, 2020
    Cite
    Elit Dogu (2020). Spotify Data Broken Down by Decade [Dataset]. https://www.kaggle.com/elitdogu/spotify-data-broken-down-by-decade/activity
    Dataset updated
    Sep 19, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Elit Dogu
    Description

    Dataset

    This dataset was created by Elit Dogu

    Contents

  17. MGD: Music Genre Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated May 28, 2021
    Cite
    Danilo B. Seufitelli (2021). MGD: Music Genre Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4778562
    Explore at:
    Dataset updated
    May 28, 2021
    Dataset provided by
    Danilo B. Seufitelli
    Mirella M. Moro
    Gabriel P. Oliveira
    Anisio Lacerda
    Mariana O. Silva
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MGD: Music Genre Dataset

    Over recent years, the world has seen a dramatic change in the way people consume music, moving from physical records to streaming services. Since 2017, such services have become the main source of revenue within the global recorded music market. Therefore, this dataset is built by using data from Spotify, which provides a weekly chart of the 200 most streamed songs for each country and territory in which it is present, as well as an aggregated global chart.

    Considering that countries behave differently when it comes to musical tastes, we use chart data from global and regional markets from January 2017 to December 2019, considering eight of the top 10 music markets according to IFPI: United States (1st), Japan (2nd), United Kingdom (3rd), Germany (4th), France (5th), Canada (8th), Australia (9th), and Brazil (10th).

    We also provide information about the hit songs and artists present in the charts, such as all collaborating artists within a song (since the charts only provide the main ones) and their respective genres, which is the core of this work. MGD also provides data about musical collaboration, as we build collaboration networks based on artist partnerships in hit songs. Therefore, this dataset contains:

    Genre Networks: Success-based genre collaboration networks

    Genre Mapping: Genre mapping from Spotify genres to super-genres

    Artist Networks: Success-based artist collaboration networks

    Artists: Some artist data

    Hit Songs: Hit Song data and features

    Charts: Enhanced data from Spotify Weekly Top 200 Charts

    This dataset was originally built for a conference paper at ISMIR 2020. If you make use of the dataset, please also cite the following paper:

    Gabriel P. Oliveira, Mariana O. Silva, Danilo B. Seufitelli, Anisio Lacerda, and Mirella M. Moro. Detecting Collaboration Profiles in Success-based Music Genre Networks. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR 2020), 2020.

    @inproceedings{ismir/OliveiraSSLM20,
      title = {Detecting Collaboration Profiles in Success-based Music Genre Networks},
      author = {Gabriel P. Oliveira and Mariana O. Silva and Danilo B. Seufitelli and Anisio Lacerda and Mirella M. Moro},
      booktitle = {21st International Society for Music Information Retrieval Conference},
      pages = {726--732},
      year = {2020}
    }

  18. Spotify and Youtube

    • data.niaid.nih.gov
    Updated Dec 4, 2023
    Cite
    Guarisco, Marco (2023). Spotify and Youtube [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10253414
    Explore at:
    Dataset updated
    Dec 4, 2023
    Dataset provided by
    Sallustio, Marco
    Guarisco, Marco
    Rastelli, Salvatore
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    YouTube
    Description

    These are statistics for the top 10 songs of various Spotify artists and their YouTube videos. The creators listed above generated the data and uploaded it to Kaggle on February 6-7, 2023. The license to use this data is "CC0: Public Domain", allowing the data to be copied, modified, distributed, and worked on without having to ask permission. The data is in numerical and textual CSV format as attached. This dataset contains the statistics and attributes of the top 10 songs of various artists in the world. As described by the creators above, it includes 26 variables for each of the songs collected from Spotify. These variables are briefly described next:

    Track: name of the song, as visible on the Spotify platform.
    Artist: name of the artist.
    Url_spotify: the URL of the artist.
    Album: the album in which the song is contained on Spotify.
    Album_type: indicates whether the song is released on Spotify as a single or contained in an album.
    Uri: a Spotify link used to find the song through the API.
    Danceability: describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
    Energy: a measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
    Key: the key the track is in. Integers map to pitches using standard Pitch Class notation, e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
    Loudness: the overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
    Speechiness: detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
    Acousticness: a confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.
    Instrumentalness: predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
    Liveness: detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.
    Valence: a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
    Tempo: the overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
    Duration_ms: the duration of the track in milliseconds.
    Stream: number of streams of the song on Spotify.
    Url_youtube: URL of the video linked to the song on YouTube, if it has one.
    Title: title of the video clip on YouTube.
    Channel: name of the channel that published the video.
    Views: number of views.
    Likes: number of likes.
    Comments: number of comments.
    Description: description of the video on YouTube.
    Licensed: indicates whether the video represents licensed content, which means that the content was uploaded to a channel linked to a YouTube content partner and then claimed by that partner.
    official_video: boolean value that indicates whether the video found is the official video of the song.

    The data was last updated on February 7, 2023.
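    The Key encoding and the Speechiness thresholds described above can be made concrete with a short Python sketch. The helper names `key_to_pitch` and `speechiness_band` are hypothetical and are not part of the dataset or of the Spotify API. The pitch names past 0 to 2 follow standard pitch class notation, and placing the boundary values 0.33 and 0.66 in the middle band is an assumption, since the description does not say where they fall.

```python
from typing import Optional

# Pitch classes in standard Pitch Class notation (0 = C, 1 = C♯/D♭, 2 = D, ...).
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def key_to_pitch(key: int) -> Optional[str]:
    """Return the pitch name for a Key value, or None if no key was detected (-1)."""
    if key == -1:
        return None
    if not 0 <= key < len(PITCH_CLASSES):
        raise ValueError(f"unexpected key value: {key}")
    return PITCH_CLASSES[key]


def speechiness_band(value: float) -> str:
    """Bucket a Speechiness value using the 0.33 / 0.66 thresholds.

    Assumption: values exactly at 0.33 or 0.66 fall in the middle band.
    """
    if value > 0.66:
        return "probably all speech"
    if value >= 0.33:
        return "music and speech"
    return "mostly music"
```

    For instance, `key_to_pitch(2)` gives the pitch name for D, and `speechiness_band(0.5)` falls in the mixed music-and-speech band.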

  19. ‘Spotify Song Attributes’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Aug 6, 2017
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2017). ‘Spotify Song Attributes’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-spotify-song-attributes-25ce/latest
    Dataset updated
    Aug 6, 2017
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Spotify Song Attributes’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/geomack/spotifyclassification on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    A dataset of 2017 songs with attributes from Spotify's API. Each song is labeled "1" for songs I like and "0" for songs I don't like. I used this data to see if I could build a classifier that could predict whether or not I would like a song.

    I wrote an article about the project I used this data for. It includes code on how to grab this data from the Spotipy API wrapper and the methods behind my modeling. https://opendatascience.com/blog/a-machine-learning-deep-dive-into-my-spotify-data/

    Content

    Each row represents a song.

    There are 16 columns: 13 are song attributes, one is the song name, one is the artist, and one, called "target", is the label for the song.

    Here are the 13 track attributes: acousticness, danceability, duration_ms, energy, instrumentalness, key, liveness, loudness, mode, speechiness, tempo, time_signature, valence.

    Information on what those traits mean can be found here: https://developer.spotify.com/web-api/get-audio-features/
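    As a minimal sketch of how the column layout above maps to classifier input, the following Python snippet splits CSV text into a feature matrix and a label vector. The 13 attribute names and the "target" column come from the description; the function name `load_xy` and the sample row are made up for illustration, and the sample omits the song name and artist columns, which the reader would simply ignore.

```python
import csv
import io

# The 13 track attributes listed in the dataset description.
FEATURES = [
    "acousticness", "danceability", "duration_ms", "energy",
    "instrumentalness", "key", "liveness", "loudness", "mode",
    "speechiness", "tempo", "time_signature", "valence",
]


def load_xy(csv_text: str):
    """Split CSV text into feature rows (X) and like/dislike labels (y)."""
    X, y = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        X.append([float(row[name]) for name in FEATURES])
        y.append(int(row["target"]))  # 1 = liked, 0 = not liked
    return X, y


# Made-up sample row for illustration only; not taken from the dataset.
SAMPLE = (
    ",".join(FEATURES) + ",target\n"
    "0.01,0.83,204600,0.43,0.02,2,0.17,-8.8,1,0.43,150.1,4,0.29,1\n"
)

X, y = load_xy(SAMPLE)
```

    The resulting `X` and `y` lists are in the shape most classifier libraries expect, one row of 13 numbers per song and one label per row.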

    Acknowledgements

    I would like to thank Spotify for providing this readily accessible data.

    Inspiration

    I'm a music lover who's curious about why I love the music that I love.

    --- Original source retains full ownership of the source dataset ---

  20. spotify data

    • data.niaid.nih.gov
    Updated Jul 5, 2023
    Cite
    Ryan Hulke (2023). spotify data [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_8114617
    Dataset updated
    Jul 5, 2023
    Dataset authored and provided by
    Ryan Hulke
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    From Kaggle.

Cite
Piyush Papreja; Piyush Papreja (2021). Playlist2vec: Spotify Million Playlist Dataset [Dataset]. http://doi.org/10.5281/zenodo.5002584

Playlist2vec: Spotify Million Playlist Dataset

Available download formats: bin
Dataset updated
Jun 22, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Piyush Papreja; Piyush Papreja
License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

This dataset was created using the Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
Each of the aforementioned entities is represented by a unique ID (Spotify URI).
Data is stored in the following tables:

  • album
  • artist
  • track
  • playlist
  • track_artist1
  • track_playlist1

album

| id | name | uri |

id: Album ID as provided by Spotify
name: Album Name as provided by Spotify
uri: Album URI as provided by Spotify


artist

| id | name | uri |

id: Artist ID as provided by Spotify
name: Artist Name as provided by Spotify
uri: Artist URI as provided by Spotify


track

| id | name | duration | popularity | explicit | preview_url | uri | album_id |

id: Track ID as provided by Spotify
name: Track Name as provided by Spotify
duration: Track Duration (in milliseconds) as provided by Spotify
popularity: Track Popularity as provided by Spotify
explicit: Whether the track has explicit lyrics or not. (true or false)
preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
uri: Track Uri as provided by Spotify
album_id: Album Id to which the track belongs


playlist

| id | name | followers | uri | total_tracks |

id: Playlist ID as provided by Spotify
name: Playlist Name as provided by Spotify
followers: Playlist Followers as provided by Spotify
uri: Playlist Uri as provided by Spotify
total_tracks: Total number of tracks in the playlist.

track_artist1

| track_id | artist_id |

Track-Artist association table

track_playlist1

| track_id | playlist_id |

Track-Playlist association table
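
The association tables above link tracks to playlists through ID pairs. The Python sketch below shows that join on an in-memory SQLite database used as a stand-in for the real database. The column types are assumptions, only a few of the listed columns are kept, the rows are made up, and the helper name `tracks_in_playlist` is hypothetical.

```python
import sqlite3

# In-memory SQLite stand-in; column types are assumptions, and only the
# columns needed for the join are kept from the track and playlist tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE track (id TEXT PRIMARY KEY, name TEXT, duration INTEGER);
CREATE TABLE playlist (id TEXT PRIMARY KEY, name TEXT, total_tracks INTEGER);
CREATE TABLE track_playlist1 (track_id TEXT, playlist_id TEXT);
""")

# Made-up rows for illustration only.
conn.executemany("INSERT INTO track VALUES (?, ?, ?)",
                 [("t1", "Song A", 210000), ("t2", "Song B", 185000)])
conn.execute("INSERT INTO playlist VALUES ('p1', 'Morning Mix', 2)")
conn.executemany("INSERT INTO track_playlist1 VALUES (?, ?)",
                 [("t1", "p1"), ("t2", "p1")])


def tracks_in_playlist(conn, playlist_id):
    """Return (track name, duration in ms) pairs for one playlist."""
    return conn.execute(
        """
        SELECT t.name, t.duration
        FROM track_playlist1 AS tp
        JOIN track AS t ON t.id = tp.track_id
        WHERE tp.playlist_id = ?
        ORDER BY t.name
        """,
        (playlist_id,),
    ).fetchall()


rows = tracks_in_playlist(conn, "p1")
```

The same pattern would apply to track_artist1, joining on artist_id instead of playlist_id.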

- - - - - SETUP - - - - -


The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35 GB.

spotifydbdumpschemashare.sql contains the schema for the database (for reference).
spotifydbdumpshare.sql is the actual data dump.


Setup steps:
1. Create database

- - - - - PAPER - - - - -


The description of this dataset can be found in the following paper:

Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham
