100+ datasets found
  1. Spotify dataset

    • kaggle.com
    zip
    Updated Jul 20, 2023
    Cite
    Sanjana chaudhari☑️ (2023). Spotify dataset [Dataset]. https://www.kaggle.com/datasets/sanjanchaudhari/spotify-dataset
    Available download formats: zip (2045049 bytes)
    Dataset updated
    Jul 20, 2023
    Authors
    Sanjana chaudhari☑️
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    Introduction to the Spotify Dataset

    • Overview of the Dataset Source and Purpose
    • Description of the Data Collection Process

    Data Exploration

    • Understanding the Structure and Size of the Dataset
    • Overview of the Features and Columns

    Key Features in the Spotify Dataset

    • Explanation of Important Columns (e.g., track name, artist, album, duration, popularity)

    Genre and Music Category Analysis

    • Categorizing Songs by Genre and Music Type
    • Most Popular Genres on Spotify

    Artist Analysis

    • Identifying Top Artists Based on Popularity and Number of Songs
    • Relationship between Artist and Song Attributes

    Song Duration Analysis

    • Distribution of Song Durations
    • Impact of Song Duration on Popularity and Listener Engagement

    Song Popularity and Listener Engagement

    • Analyzing the Popularity Scores of Songs
    • Correlation between Popularity and Other Song Features

    Audio Features Analysis

    • Examination of Audio Features (danceability, energy, instrumentalness, etc.)
    • Clustering Songs Based on Audio Features

    Time-Based Analysis

    • Seasonal Trends in Song Releases and Popularity
    • Time Series Analysis of Listening Patterns

    Collaborations and Featured Artists

    • Frequency of Collaborations and Featured Artists
    • Impact of Collaborations on Song Popularity

    Recommendation Systems

    • Overview of Spotify's Recommendation Algorithms
    • Building Simple Recommendation Models

    User Behavior and Playlist Analysis

    • Analysis of User-Generated Playlists
    • Common Song Additions and Removals

  2. Spotify Dataset

    • brightdata.com
    .json, .csv, .xlsx
    Updated Apr 10, 2024
    Cite
    Bright Data (2024). Spotify Dataset [Dataset]. https://brightdata.com/products/datasets/spotify
    Available download formats: .json, .csv, .xlsx
    Dataset updated
    Apr 10, 2024
    Dataset authored and provided by
    Bright Data: https://brightdata.com/
    License

    https://brightdata.com/license

    Area covered
    Worldwide
    Description

    Gain valuable insights into music trends, artist popularity, and streaming analytics with our comprehensive Spotify Dataset. Designed for music analysts, marketers, and businesses, this dataset provides structured and reliable data from Spotify to enhance market research, content strategy, and audience engagement.

    Dataset Features

    • Track Information: Access detailed data on songs, including track name, artist, album, genre, and release date.
    • Streaming Popularity: Extract track popularity scores, listener engagement metrics, and ranking trends.
    • Artist & Album Insights: Analyze artist performance, album releases, and genre trends over time.
    • Related Searches & Recommendations: Track related search terms and suggested content for deeper audience insights.
    • Historical & Real-Time Data: Retrieve historical streaming data or access continuously updated records for real-time trend analysis.

    Customizable Subsets for Specific Needs

    Our Spotify Dataset is fully customizable, allowing you to filter data based on track popularity, artist, genre, release date, or listener engagement. Whether you need broad coverage for industry analysis or focused data for content optimization, we tailor the dataset to your needs.

    Popular Use Cases

    • Market Analysis & Trend Forecasting: Identify emerging music trends, genre popularity, and listener preferences.
    • Artist & Label Performance Tracking: Monitor artist rankings, album success, and audience engagement.
    • Competitive Intelligence: Analyze competitor music strategies, playlist placements, and streaming performance.
    • AI & Machine Learning Applications: Use structured music data to train AI models for recommendation engines, playlist curation, and predictive analytics.
    • Advertising & Sponsorship Insights: Identify high-performing tracks and artists for targeted advertising and sponsorship opportunities.

    Whether you're optimizing music marketing, analyzing streaming trends, or enhancing content strategies, our Spotify Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.

  3. Spotify dataset

    • kaggle.com
    zip
    Updated Jun 17, 2024
    Cite
    Gati Ambaliya (2024). Spotify dataset [Dataset]. https://www.kaggle.com/datasets/ambaliyagati/spotify-dataset-for-playing-around-with-sql
    Available download formats: zip (309669 bytes)
    Dataset updated
    Jun 17, 2024
    Authors
    Gati Ambaliya
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Description for Spotify Songs Dataset on Kaggle

    Dataset Title: Spotify Songs Dataset

    Description: This dataset contains a collection of songs fetched from the Spotify API, covering various genres including "acoustic", "afrobeat", "alt-rock", "alternative", "ambient", "anime", "black-metal", "bluegrass", "blues", "bossanova", "brazil", "breakbeat", "british", "cantopop", "chicago-house", "children", "chill", "classical", "club", "comedy", "country", "dance", "dancehall", "death-metal", "deep-house", "detroit-techno", "disco", "disney", "drum-and-bass", "dub", "dubstep", "edm", "electro", "electronic", "emo", "folk", "forro", "french", "funk", "garage", "german", "gospel", "goth", "grindcore", "groove", "grunge", "guitar", "happy", "hard-rock", "hardcore", "hardstyle", "heavy-metal", "hip-hop", "holidays", "honky-tonk", "house", "idm", "indian", "indie", "indie-pop", "industrial", "iranian", "j-dance", "j-idol", "j-pop", "j-rock", "jazz", "k-pop", "kids", "latin", "latino", "malay", "mandopop", "metal", "metal-misc", "metalcore", "minimal-techno", "movies", "mpb", "new-age", "new-release", "opera", "pagode", "party", "philippines-opm", "piano", "pop", "pop-film", "post-dubstep", "power-pop", "progressive-house", "psych-rock", "punk", "punk-rock", "r-n-b", "rainy-day", "reggae", "reggaeton", "road-trip", "rock", "rock-n-roll", "rockabilly", "romance", "sad", "salsa", "samba", "sertanejo", "show-tunes", "singer-songwriter", "ska", "sleep", "songwriter", "soul", "soundtracks", "spanish", "study", "summer", "swedish", "synth-pop", "tango", "techno", "trance", "trip-hop", "turkish", "work-out", "world-music". Each entry in the dataset provides detailed information about a song, including its name, artists, album, popularity, duration, and whether it is explicit.

    Data Collection Method: The data was collected using the Spotify Web API through a Python script. The script performed searches for different genres and retrieved the top tracks for each genre. The fetched data was then compiled and saved into a CSV file.

    Columns Description:

    • id: Unique identifier for the track on Spotify.
    • name: Name of the track.
    • genre: Genre of the song.
    • artists: Names of the artists who performed the track, separated by commas if there are multiple artists.
    • album: Name of the album the track belongs to.
    • popularity: Popularity score of the track (0-100, where higher is more popular).
    • duration_ms: Duration of the track in milliseconds.
    • explicit: Boolean indicating whether the track contains explicit content.
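As a rough illustration of the column layout above, the following sketch parses a few rows with Python's standard `csv` module. The sample rows, the `parse_tracks` helper, and the assumption that `explicit` is serialized as "True"/"False" are hypothetical and not taken from the dataset itself.

```python
import csv
import io

# Hypothetical sample rows that follow the column layout described above.
SAMPLE_CSV = """id,name,genre,artists,album,popularity,duration_ms,explicit
t1,Song A,rock,"Artist One, Artist Two",Album X,72,215000,False
t2,Song B,jazz,Artist Three,Album Y,40,183500,True
"""


def parse_tracks(text):
    """Parse CSV text into a list of dicts with typed fields."""
    tracks = []
    for row in csv.DictReader(io.StringIO(text)):
        tracks.append({
            "id": row["id"],
            "name": row["name"],
            "genre": row["genre"],
            # Multiple artists share one field, separated by commas.
            "artists": [a.strip() for a in row["artists"].split(",")],
            "album": row["album"],
            "popularity": int(row["popularity"]),
            # Convert milliseconds to minutes for readability.
            "duration_min": int(row["duration_ms"]) / 60000,
            # Assumption: booleans are stored as the text "True"/"False".
            "explicit": row["explicit"].strip().lower() == "true",
        })
    return tracks


tracks = parse_tracks(SAMPLE_CSV)
```

Splitting `artists` on commas will mis-split any artist whose name itself contains a comma, so a real analysis may need a more careful rule.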

    Potential Uses: This dataset can be used for a variety of purposes, including but not limited to:

    • Music Analysis: Analyze the popularity and characteristics of songs across different genres.
    • Recommendation Systems: Develop and test music recommendation algorithms.
    • Trend Analysis: Study trends in music preferences and popularity over time.
    • Machine Learning: Train machine learning models for tasks like genre classification or popularity prediction.

    Acknowledgements: This dataset was created using the Spotify Web API. Special thanks to Spotify for providing access to their extensive music library through their API.

    License: This dataset is made available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license. You are free to use, modify, and distribute this dataset, provided you give appropriate credit to the original creator.

  4. Playlist2vec: Spotify Million Playlist Dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin
    Updated Jun 22, 2021
    Cite
    Piyush Papreja (2021). Playlist2vec: Spotify Million Playlist Dataset [Dataset]. http://doi.org/10.5281/zenodo.5002584
    Available download formats: bin
    Dataset updated
    Jun 22, 2021
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Piyush Papreja
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was created using Spotify developer API. It consists of user-created as well as Spotify-curated playlists.
    The dataset consists of 1 million playlists, 3 million unique tracks, 3 million unique albums, and 1.3 million artists.
    The data is stored in a SQL database, with the primary entities being songs, albums, artists, and playlists.
    Each of the aforementioned entities is represented by a unique ID (Spotify URI).
    The data is stored in the following tables:

    • album
    • artist
    • track
    • playlist
    • track_artist1
    • track_playlist1

    album

    | id | name | uri |

    id: Album ID as provided by Spotify
    name: Album Name as provided by Spotify
    uri: Album URI as provided by Spotify


    artist

    | id | name | uri |

    id: Artist ID as provided by Spotify
    name: Artist Name as provided by Spotify
    uri: Artist URI as provided by Spotify


    track

    | id | name | duration | popularity | explicit | preview_url | uri | album_id |

    id: Track ID as provided by Spotify
    name: Track Name as provided by Spotify
    duration: Track Duration (in milliseconds) as provided by Spotify
    popularity: Track Popularity as provided by Spotify
    explicit: Whether the track has explicit lyrics or not. (true or false)
    preview_url: A link to a 30 second preview (MP3 format) of the track. Can be null
    uri: Track Uri as provided by Spotify
    album_id: Album Id to which the track belongs


    playlist

    | id | name | followers | uri | total_tracks |

    id: Playlist ID as provided by Spotify
    name: Playlist Name as provided by Spotify
    followers: Playlist Followers as provided by Spotify
    uri: Playlist Uri as provided by Spotify
    total_tracks: Total number of tracks in the playlist.

    track_artist1

    | track_id | artist_id |

    Track-Artist association table

    track_playlist1

    | track_id | playlist_id |

    Track-Playlist association table
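To show how the association tables tie the entities together, here is a minimal sketch that rebuilds the table layout above in an in-memory SQLite database and lists the tracks of one playlist with their artists. SQLite is only a stand-in (the source does not say which SQL dialect the dump targets), and the column types, sample rows, and the `tracks_with_artists` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and column names follow the description above; types are assumed.
conn.executescript("""
CREATE TABLE album (id TEXT PRIMARY KEY, name TEXT, uri TEXT);
CREATE TABLE artist (id TEXT PRIMARY KEY, name TEXT, uri TEXT);
CREATE TABLE track (
    id TEXT PRIMARY KEY, name TEXT, duration INTEGER, popularity INTEGER,
    explicit INTEGER, preview_url TEXT, uri TEXT, album_id TEXT
);
CREATE TABLE playlist (
    id TEXT PRIMARY KEY, name TEXT, followers INTEGER, uri TEXT,
    total_tracks INTEGER
);
CREATE TABLE track_artist1 (track_id TEXT, artist_id TEXT);
CREATE TABLE track_playlist1 (track_id TEXT, playlist_id TEXT);
""")

# Made-up rows, only to exercise the joins.
conn.executemany("INSERT INTO artist VALUES (?, ?, ?)", [
    ("a1", "Artist One", "uri:a1"),
    ("a2", "Artist Two", "uri:a2"),
])
conn.execute("INSERT INTO album VALUES ('al1', 'Album X', 'uri:al1')")
conn.executemany("INSERT INTO track VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    ("t1", "First Track", 200000, 50, 0, None, "uri:t1", "al1"),
    ("t2", "Second Track", 180000, 60, 1, None, "uri:t2", "al1"),
])
conn.execute("INSERT INTO playlist VALUES ('p1', 'Demo', 10, 'uri:p1', 2)")
conn.executemany("INSERT INTO track_artist1 VALUES (?, ?)", [
    ("t1", "a1"), ("t2", "a1"), ("t2", "a2"),
])
conn.executemany("INSERT INTO track_playlist1 VALUES (?, ?)", [
    ("t1", "p1"), ("t2", "p1"),
])


def tracks_with_artists(connection, playlist_id):
    """Return (track name, artist name) pairs for one playlist."""
    query = """
        SELECT t.name, a.name
        FROM track_playlist1 AS tp
        JOIN track AS t ON t.id = tp.track_id
        JOIN track_artist1 AS ta ON ta.track_id = t.id
        JOIN artist AS a ON a.id = ta.artist_id
        WHERE tp.playlist_id = ?
        ORDER BY t.name, a.name
    """
    return connection.execute(query, (playlist_id,)).fetchall()


rows = tracks_with_artists(conn, "p1")
```

A track with several artists appears once per artist, which is the usual consequence of joining through a many-to-many association table.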

    - - - - - SETUP - - - - -


    The data is in the form of a SQL dump. The download size is about 10 GB, and the database populated from it comes out to about 35GB.

    spotifydbdumpschemashare.sql contains the schema for the database (for reference).
    spotifydbdumpshare.sql is the actual data dump.


    Setup steps:
    1. Create database

    - - - - - PAPER - - - - -


    The description of this dataset can be found in the following paper:

    Papreja P., Venkateswara H., Panchanathan S. (2020) Representation, Exploration and Recommendation of Playlists. In: Cellier P., Driessens K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2019. Communications in Computer and Information Science, vol 1168. Springer, Cham

  5. spotify-tracks-dataset

    • huggingface.co
    Updated Jun 30, 2023
    + more versions
    Cite
    maharshipandya (2023). spotify-tracks-dataset [Dataset]. https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 30, 2023
    Authors
    maharshipandya
    License

    https://choosealicense.com/licenses/bsd/

    Description

    Content

    This is a dataset of Spotify tracks over a range of 125 different genres. Each track has some audio features associated with it. The data is in CSV format which is tabular and can be loaded quickly.

    Usage

    The dataset can be used for:

    • Building a Recommendation System based on some user input or preference
    • Classification purposes based on audio features and available genres
    • Any other application that you can think of. Feel free to discuss!

      Column… See the full description on the dataset page: https://huggingface.co/datasets/maharshipandya/spotify-tracks-dataset.
    
  6. 🎧 Spotify Global Streaming Data (2024)

    • kaggle.com
    zip
    Updated Apr 30, 2025
    Cite
    Atharva Soundankar (2025). 🎧 Spotify Global Streaming Data (2024) [Dataset]. https://www.kaggle.com/datasets/atharvasoundankar/spotify-global-streaming-data-2024
    Available download formats: zip (28022 bytes)
    Dataset updated
    Apr 30, 2025
    Authors
    Atharva Soundankar
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    📊 About the Dataset

    This dataset captures the global music streaming trends on Spotify for the year 2024. It provides valuable insights into user preferences across various countries, top-performing artists and albums, streaming hours, and listener behavior patterns. It is designed to support data analysis, machine learning models, and business intelligence dashboards in the music and media industry.

    With over 500 rows of clean, non-duplicated, and realistic entries from countries around the world, this dataset is ideal for uncovering:

    • Global music popularity patterns
    • Listener engagement across genres and demographics
    • Artist performance across countries
    • Revenue forecasting and content recommendations


  7. spotify-million-song-dataset

    • huggingface.co
    Updated Jun 16, 2024
    Cite
    Vishnu Priya VR (2024). spotify-million-song-dataset [Dataset]. https://huggingface.co/datasets/vishnupriyavr/spotify-million-song-dataset
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 16, 2024
    Authors
    Vishnu Priya VR
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for Spotify Million Song Dataset

      Dataset Summary
    

    This is the Spotify Million Song Dataset. It contains song names, artist names, links to the songs, and lyrics. It can be used for recommending, classifying, or clustering songs.

      Supported Tasks and Leaderboards
    

    [More Information Needed]

      Languages
    

    [More Information Needed]

      Dataset Structure
    
    
    
    
    
      Data Instances
    

    [More Information Needed]

      Data… See the full description on the dataset page: https://huggingface.co/datasets/vishnupriyavr/spotify-million-song-dataset.
    
  8. My Spotify Data - Cleaned

    • kaggle.com
    zip
    Updated Jan 26, 2024
    Cite
    Malinga Rajapaksha (2024). My Spotify Data - Cleaned [Dataset]. https://www.kaggle.com/datasets/malingarajapaksha/my-spotify-data-cleaned
    Available download formats: zip (2952139 bytes)
    Dataset updated
    Jan 26, 2024
    Authors
    Malinga Rajapaksha
    Description

    The dataset contains records of the user's Spotify streaming history, with each row representing a specific instance of a played track. The data includes various attributes providing insights into the user's music listening habits.

    Columns:

    1. ts (Timestamp):

      • The timestamp when the track was played.
    2. platform:

      • The platform or device used for streaming (e.g., Windows 10).
    3. ms_played:

      • The duration in milliseconds of how long the track was played.
    4. conn_country:

      • The country code indicating the user's location during streaming (e.g., LK for Sri Lanka).
    5. master_metadata_track_name:

      • The name of the track played.
    6. master_metadata_album_artist_name:

      • The artist of the album to which the track belongs.
    7. master_metadata_album_album_name:

      • The name of the album containing the track.
    8. spotify_track_uri:

      • The unique Spotify URI for the track.
    9. reason_start:

      • The reason for starting the track (e.g., play button clicked).
    10. reason_end:

      • The reason for ending the track (e.g., track done).
    11. shuffle:

      • Indicates whether shuffle mode was enabled (True/False).
    12. offline:

      • Indicates whether the track was played offline (True/False).
    13. offline_timestamp:

      • Timestamp indicating when the track was played offline (if applicable).
    14. incognito_mode:

      • Indicates whether incognito mode was enabled (True/False).
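One of the simplest analyses these columns support is total listening time per artist. The sketch below assumes the rows have already been loaded as dictionaries keyed by the column names above; the sample rows and the `minutes_by_artist` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows using a subset of the columns listed above.
SAMPLE_HISTORY = [
    {"ts": "2023-01-01T10:00:00Z", "ms_played": 180000,
     "master_metadata_album_artist_name": "Artist One"},
    {"ts": "2023-01-01T10:05:00Z", "ms_played": 60000,
     "master_metadata_album_artist_name": "Artist Two"},
    {"ts": "2023-01-02T09:00:00Z", "ms_played": 120000,
     "master_metadata_album_artist_name": "Artist One"},
    # Rows without track metadata are skipped below.
    {"ts": "2023-01-02T09:30:00Z", "ms_played": 30000,
     "master_metadata_album_artist_name": None},
]


def minutes_by_artist(rows):
    """Sum ms_played per artist and convert the totals to minutes."""
    totals = defaultdict(int)
    for row in rows:
        artist = row["master_metadata_album_artist_name"]
        if artist is None:
            continue
        totals[artist] += row["ms_played"]
    return {artist: ms / 60000 for artist, ms in totals.items()}


listening = minutes_by_artist(SAMPLE_HISTORY)
```

The same grouping pattern works for `platform` or `conn_country` by swapping the key that is read from each row.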

    Purpose:

    This dataset is suitable for performing detailed Exploratory Data Analysis (EDA) to uncover patterns, trends, and insights into the user's music-listening behaviour. Potential analyses could include the distribution of listening durations, favourite artists and tracks, exploration of geographic listening patterns, and examination of usage patterns across different platforms.

    Visualization tools such as Matplotlib and Seaborn could be used to create visual representations of the findings. The dataset offers opportunities to apply analytical techniques to real-world streaming data.

  9. Data from: Spotify Playlists Dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jan 24, 2020
    + more versions
    Cite
    Martin Pichl; Eva Zangerle (2020). Spotify Playlists Dataset [Dataset]. http://doi.org/10.5281/zenodo.2594557
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Martin Pichl; Eva Zangerle
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description


    This dataset is based on the subset of users in the #nowplaying dataset who publish their #nowplaying tweets via Spotify. In principle, the dataset holds users, their playlists and the tracks contained in these playlists.

    The csv-file holding the dataset contains the following columns: "user_id", "artistname", "trackname", "playlistname", where

    • user_id is a hash of the user's Spotify user name
    • artistname is the name of the artist
    • trackname is the title of the track and
    • playlistname is the name of the playlist that contains this track.

    The separator is a comma (,), each entry is enclosed in double quotes ("), and the escape character is a backslash (\).
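The quoting and escaping rules described here map directly onto the options of Python's standard `csv` module. The sketch below is one way to read such a file; the sample line and the `read_playlists` helper are hypothetical, and the real file may have quirks not covered here.

```python
import csv
import io

# Hypothetical sample in the described format: comma-separated,
# double-quoted entries, backslash as the escape character.
SAMPLE = (
    '"user_id","artistname","trackname","playlistname"\n'
    '"abc123","Some Artist","A \\"Quoted\\" Title","Road Trip"\n'
)


def read_playlists(fileobj):
    """Read rows as dicts keyed by the header line."""
    reader = csv.DictReader(
        fileobj,
        delimiter=",",
        quotechar='"',
        escapechar="\\",
        # Quotes inside a field are escaped with a backslash, not doubled.
        doublequote=False,
    )
    return list(reader)


rows = read_playlists(io.StringIO(SAMPLE))
```

For a file on disk, the `csv` documentation recommends opening it with `newline=""` before passing it to the reader.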

    A description of the generation of the dataset and the dataset itself can be found in the following paper:

    Pichl, Martin; Zangerle, Eva; Specht, Günther: "Towards a Context-Aware Music Recommendation Approach: What is Hidden in the Playlist Name?" in 15th IEEE International Conference on Data Mining Workshops (ICDM 2015), pp. 1360-1365, IEEE, Atlantic City, 2015.

  10. Spotify Tracks Dataset

    • cubig.ai
    zip
    Updated May 20, 2025
    Cite
    CUBIG (2025). Spotify Tracks Dataset [Dataset]. https://cubig.ai/store/products/276/spotify-tracks-dataset
    Available download formats: zip
    Dataset updated
    May 20, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction

    • The Spotify Tracks Dataset contains information on tracks from over 125 music genres, including both audio features (e.g., danceability, energy, valence) and metadata (e.g., title, artist, genre).

    2) Data Utilization

    (1) Characteristics of the Spotify Tracks Dataset:

    • The data is structured in a tabular format at the track level, where each column represents numerical or categorical features based on musical properties. This makes it suitable for recommendation systems, genre classification, and emotion analysis.
    • It includes multi-dimensional attributes grounded in music theory such as track duration, time signature, energy, loudness, tempo, and speechiness, enabling its use in music classification and clustering tasks.

    (2) Applications of the Spotify Tracks Dataset:

    • Design of Music Recommendation Systems: It can be used to build content-based filtering systems or hybrid recommendation algorithms based on user preferences.

  11. spotify data

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Jul 5, 2023
    Cite
    Ryan Hulke (2023). spotify data [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_8114617
    Dataset updated
    Jul 5, 2023
    Dataset provided by
    student
    Authors
    Ryan Hulke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    from kaggle

  12. World's Spotify TOP-50 playlist musicality data

    • kaggle.com
    zip
    Updated Nov 26, 2023
    Cite
    Miquel Neck (2023). World's Spotify TOP-50 playlist musicality data [Dataset]. https://www.kaggle.com/datasets/miquelneck/worlds-spotify-top-50-playlist-musicality-data
    Available download formats: zip (175413 bytes)
    Dataset updated
    Nov 26, 2023
    Authors
    Miquel Neck
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    World
    Description

    Every week, Spotify updates its Top-50 playlists for each country. This dataset includes every country's list for the 45th week of 2023 (6th November - 12th November). There are 73 available countries.

    The dataset has a column for every musical aspect of each song, and also the name, country, artist and publication date of the track.

    Data extracted from the Spotify Official API.

    Columns

    These features are created by Spotify to analyze tracks. Here I copy the definition of each column, based on Spotify's API documentation.

    Danceability: Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

    Acousticness: A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.

    Duration_ms: The duration of the track in milliseconds.

    Energy: Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.

    Instrumentalness: Predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal".

    Key: The key the track is in. Integers map to pitches using standard Pitch Class notation. E.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.

    Liveness: Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live.

    Loudness: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing relative loudness of tracks.

    Mode: Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0.

    Speechiness: Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value.

    Tempo: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

    Time_signature: An estimated time signature. The time signature (meter) is a notational convention to specify how many beats are in each bar (or measure). The time signature ranges from 3 to 7 indicating time signatures of "3/4", to "7/4".

    Valence: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
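The Key and Mode columns are integer codes, so a small lookup makes them readable. The sketch below follows the definitions above (pitch classes 0 to 11, -1 for no detected key, mode 1 for major and 0 for minor); the `describe_key` helper is a hypothetical name, not part of the dataset or the Spotify API.

```python
# Standard pitch-class notation: 0 = C, 1 = C♯/D♭, ..., 11 = B.
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def describe_key(key, mode):
    """Turn the integer Key and Mode columns into a label like 'D minor'."""
    if key == -1:
        # -1 means no key was detected.
        return "unknown"
    if not 0 <= key <= 11:
        raise ValueError(f"key out of range: {key}")
    if mode not in (0, 1):
        raise ValueError(f"mode must be 0 or 1: {mode}")
    scale = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {scale}"
```

For example, `describe_key(0, 1)` gives "C major" and `describe_key(2, 0)` gives "D minor".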

  13. Data from: Spotify Playlists

    • zenodo.org
    csv
    Updated Jan 24, 2025
    Cite
    Francesco Cambria (2025). Spotify Playlists [Dataset]. http://doi.org/10.5281/zenodo.14728731
    Available download formats: csv
    Dataset updated
    Jan 24, 2025
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Francesco Cambria
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was constructed based on the data found in Kaggle from Spotify.

    The files here reported can be used to build a property graph in Neo4J:

    • song.csv - contains all the data for the Song nodes.
    • artist.csv - contains the data for the Artist nodes.
    • playlist.csv - contains the data for the Playlist nodes.
    • user.csv - contains the data for the User nodes (those creating Playlists).
    • genre.csv - contains the data for the Genre nodes (a category for the Artists).
    • type.csv - contains the data for the Type nodes (a category for the Playlists).
    • sing.csv - contains the data for the SING relationship from Artist to Song nodes.
    • created.csv - contains the data for the CREATED relationship from User to Playlist nodes.
    • in.csv - contains the data for the IN relationship from Song to Playlist nodes.
    • of_type.csv - contains the data for the OFTYPE relationship from Playlist to Type nodes.
    • labelled.csv - contains the data for the LABELLED relationship from Artist to Genre nodes.

    This data was used as test dataset in the paper "MINE GRAPH RULE: A New GQL Operator for Mining Association Rules in Property Graph Databases".

  14. Data from: MusicOSet: An Enhanced Open Dataset for Music Data Mining

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, zip
    Updated Jun 7, 2021
    Cite
    Mariana O. Silva; Laís Mota; Mirella M. Moro (2021). MusicOSet: An Enhanced Open Dataset for Music Data Mining [Dataset]. http://doi.org/10.5281/zenodo.4904639
    Available download formats: zip, bin
    Dataset updated
    Jun 7, 2021
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Mariana O. Silva; Laís Mota; Mirella M. Moro
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MusicOSet is an open and enhanced dataset of musical elements (artists, songs and albums) based on musical popularity classification. It provides a directly accessible collection of data suitable for numerous tasks in music data mining (e.g., data visualization, classification, clustering, similarity search, MIR, HSS and so forth). To create MusicOSet, the potential information sources were divided into three main categories: music popularity sources, metadata sources, and acoustic and lyrical features sources. Data from all three categories were initially collected between January and May 2019, and the data were then updated and enhanced in June 2019.

    The attractive features of MusicOSet include:

    • Integration and centralization of different musical data sources
    • Calculation of popularity scores and classification of hit and non-hit musical elements, ranging from 1962 to 2018
    • Enriched metadata for music, artists, and albums from the US popular music industry
    • Availability of acoustic and lyrical resources
    • Unrestricted access in two formats: SQL database and compressed .csv files

    | Data              | # Records |
    |:-----------------:|:---------:|
    | Songs             | 20,405    |
    | Artists           | 11,518    |
    | Albums            | 26,522    |
    | Lyrics            | 19,664    |
    | Acoustic Features | 20,405    |
    | Genres            | 1,561     |

  15. DATA-spotify-data-analysis

    • kaggle.com
    zip
    Updated Feb 16, 2024
    Cite
    LIA GASPARIN (2024). DATA-spotify-data-analysis [Dataset]. https://www.kaggle.com/datasets/liagasparin/data-spotify-data-analysis
    Available download formats: zip (88896013 bytes)
    Dataset updated
    Feb 16, 2024
    Authors
    LIA GASPARIN
    Description

    Dataset

    This dataset was created by LIA GASPARIN

    Contents

  16. Spotify Playlist ORIGINS Dataset

    • cubig.ai
    zip
    Updated Jun 5, 2025
    Cite
    CUBIG (2025). Spotify Playlist ORIGINS Dataset [Dataset]. https://cubig.ai/store/products/402/spotify-playlist-origins-dataset
    Explore at:
    zip
    Dataset updated
    Jun 5, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Spotify Playlist-ORIGINS Dataset is a collection of Spotify playlists named ORIGINS, which individuals have compiled from their favorite songs since 2014.

    2) Data Utilization
    (1) Characteristics of the Spotify Playlist-ORIGINS Dataset:
    • It contains detailed music information for each playlist, including song name, artist, album, genre, release year, and track ID, along with structured playlist metadata such as name, description, and song order.
    (2) Uses of the Spotify Playlist-ORIGINS Dataset:
    • Playlist-based music recommendation and user preference analysis: the playlist and song information can be used to develop machine learning or deep learning music recommendation systems, or to study user preferences.
    • Music trend and genre popularity analysis: the release year, genre, and artist data can be used to study the music industry and culture, including music trends by period and genre and changes in popular artists and songs.

  17. 160k Spotify songs from 1921 to 2020 (Sorted)

    • kaggle.com
    Updated Sep 17, 2022
    Cite
    FCPercival (2022). 160k Spotify songs from 1921 to 2020 (Sorted) [Dataset]. https://www.kaggle.com/datasets/fcpercival/160k-spotify-songs-sorted
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 17, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    FCPercival
    Description

    This is an analysis of the data on Spotify tracks from 1921-2020 with Jupyter Notebook and Python Data Science tools.

    About the Dataset

    The Spotify dataset (titled data.csv) consists of 160,000+ tracks from 1921-2020, sorted by name, that were available on Spotify as of June 2020. Collected by Kaggle user and Turkish data scientist Yamaç Eren Ay, the data was retrieved and tabulated from the Spotify Web API. Each row in the dataset corresponds to a track, with variables such as the title, artist, and year in their respective columns. Beyond these fundamental variables, musical elements of each track, such as the tempo, danceability, and key, were also extracted; Spotify generates these values algorithmically from a range of technical parameters.

    Exploratory Data Analysis (EDA)

    1. Studying the correlations between the variables in the Spotify data.
    2. The evolution of different musical elements through the years.
    3. The divide between explicit and non-explicit songs through the years.

    Further Investigation and Inference (FII)

    1. Determining if there is a significant difference in popularity between explicit and non-explicit songs.
    2. Finding the most frequent emotions in Spotify tracks and analyzing their musical elements based on the track's mode and key.
    3. Determining the classifications of the Spotify tracks through K-Means Clustering.
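
As a rough illustration of the first FII item, the sketch below computes Welch's t statistic for two groups of popularity scores using only the Python standard library. It is not taken from the project notebook. The `explicit` and `popularity` field names and the toy rows are assumptions made for illustration, and the function returns only the statistic and approximate degrees of freedom, so a p-value would still need a t-distribution lookup.

```python
import math
import statistics


def welch_t(a, b):
    """Return Welch's t statistic and approximate degrees of freedom."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    n_a, n_b = len(a), len(b)
    # Squared standard error of the difference in means (unequal variances).
    se2 = var_a / n_a + var_b / n_b
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = se2 ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Toy rows for illustration only; these are not values from data.csv.
rows = [
    {"explicit": 1, "popularity": 60},
    {"explicit": 1, "popularity": 70},
    {"explicit": 1, "popularity": 80},
    {"explicit": 0, "popularity": 40},
    {"explicit": 0, "popularity": 50},
    {"explicit": 0, "popularity": 60},
]

explicit = [r["popularity"] for r in rows if r["explicit"] == 1]
non_explicit = [r["popularity"] for r in rows if r["explicit"] == 0]
t_stat, dof = welch_t(explicit, non_explicit)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

Welch's variant is used here because it does not assume the two groups share a variance; whether the original analysis made the same choice is not stated.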

    Project Directory Guide

    1. Spotify Data.ipynb is the main notebook where the data is imported for EDA and FII.
    2. data.csv is the dataset downloaded from Kaggle.
    3. spotify_eda.html is the HTML file for the comprehensive EDA done using the Pandas Profiling module.

    Project Notes

    1. This is in partial fulfillment of the course Statistical Modelling and Simulation (CSMODEL).

    Credits to gabminamedez for the original dataset.

  18. Spotify and Youtube

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Dec 4, 2023
    Cite
    Guarisco, Marco; Sallustio, Marco; Rastelli, Salvatore (2023). Spotify and Youtube [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_10253414
    Explore at:
    Dataset updated
    Dec 4, 2023
    Authors
    Guarisco, Marco; Sallustio, Marco; Rastelli, Salvatore
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    YouTube
    Description

    These are the statistics for the top 10 songs of various Spotify artists and their YouTube videos. The creators above generated the data and uploaded it to Kaggle on February 6-7, 2023. The license for this data is "CC0: Public Domain", which allows the data to be copied, modified, distributed, and worked on without asking permission. The data is in numerical and textual CSV format, as attached. This dataset contains the statistics and attributes of the top 10 songs of various artists in the world. As described by the creators above, it includes 26 variables for each of the songs collected from Spotify. These variables are briefly described next:

    • Track: name of the song, as visible on the Spotify platform.
    • Artist: name of the artist.
    • Url_spotify: the URL of the artist.
    • Album: the album in which the song is contained on Spotify.
    • Album_type: indicates whether the song is released on Spotify as a single or contained in an album.
    • Uri: a Spotify link used to find the song through the API.
    • Danceability: describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.
    • Energy: a measure from 0.0 to 1.0 that represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.
    • Key: the key the track is in. Integers map to pitches using standard Pitch Class notation, e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on. If no key was detected, the value is -1.
    • Loudness: the overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track and are useful for comparing the relative loudness of tracks. Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values typically range between -60 and 0 dB.
    • Speechiness: detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
    • Acousticness: a confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence that the track is acoustic.
    • Instrumentalness: predicts whether a track contains no vocals. "Ooh" and "aah" sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly "vocal". The closer the instrumentalness value is to 1.0, the greater the likelihood that the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.
    • Liveness: detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides a strong likelihood that the track is live.
    • Valence: a measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
    • Tempo: the overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.
    • Duration_ms: the duration of the track in milliseconds.
    • Stream: number of streams of the song on Spotify.
    • Url_youtube: URL of the video linked to the song on YouTube, if it has one.
    • Title: title of the video clip on YouTube.
    • Channel: name of the channel that published the video.
    • Views: number of views.
    • Likes: number of likes.
    • Comments: number of comments.
    • Description: description of the video on YouTube.
    • Licensed: indicates whether the video represents licensed content, which means that the content was uploaded to a channel linked to a YouTube content partner and then claimed by that partner.
    • official_video: boolean value that indicates whether the video found is the official video of the song.

    The data was last updated on February 7, 2023.
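
The Key and Speechiness encodings described above can be decoded with a few lines of Python. This is a minimal sketch, not part of the dataset. The function names and band labels are illustrative assumptions, the pitch names beyond 0, 1, and 2 follow standard pitch class notation, and the handling of the exact boundary values 0.33 and 0.66 is a choice made here because the description does not specify it.

```python
# Standard pitch class notation: index 0 is C, 1 is C♯/D♭, ..., 11 is B.
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]


def key_name(key: int) -> str:
    """Map a Key integer to a pitch name; -1 means no key was detected."""
    if key == -1:
        return "no key detected"
    if not 0 <= key <= 11:
        raise ValueError(f"key must be -1 or in 0..11, got {key}")
    return PITCH_CLASSES[key]


def speechiness_band(value: float) -> str:
    """Bucket a Speechiness value using the 0.33 and 0.66 thresholds."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"speechiness must be in 0.0..1.0, got {value}")
    if value > 0.66:
        return "probably all spoken words"
    if value >= 0.33:
        return "music and speech"
    return "mostly music"


print(key_name(1), "|", speechiness_band(0.45))
```

Keeping the lookup table and thresholds in one place makes it easy to adjust the labels without touching the rows of the CSV.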

  19. My Spotify Data

    • dataverse.harvard.edu
    Updated Oct 7, 2022
    Cite
    Ty Mulholland (2022). My Spotify Data [Dataset]. http://doi.org/10.7910/DVN/FVCXKG
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 7, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Ty Mulholland
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    My Spotify Data

  20. Data from: P4KxSpotify: A Dataset of Pitchfork Music Reviews and Spotify...

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +1 more
    Updated Jan 24, 2020
    Cite
    Pinter, Anthony T.; Paul, Jacob M.; Jessie Smith; Brubaker, Jed R. (2020). P4KxSpotify: A Dataset of Pitchfork Music Reviews and Spotify Musical Features [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3603329
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    University of Colorado Boulder
    Authors
    Pinter, Anthony T.; Paul, Jacob M.; Jessie Smith; Brubaker, Jed R.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    18,403 music reviews scraped from Pitchfork, including relevant metadata such as author, review date, record release year, score, and genre, along with each album's audio features pulled from Spotify's API.

