Since The Eras Tour Film was just released, this time we're exploring Taylor Swift song data!
Are you Ready for It?
The taylor R package from W. Jake Thompson is a curated data set of Taylor Swift songs, including lyrics and audio characteristics. The data comes from Genius and the Spotify API.
There are three main datasets.
The first is taylor_album_songs, which includes lyrics and audio features from the Spotify API for all songs on Taylor’s official studio albums. Notably this excludes singles released separately from an album (e.g., Only the Young, Christmas Tree Farm, etc.), and non-Taylor-owned albums that have a Taylor-owned alternative (e.g., Fearless is excluded in favor of Fearless (Taylor’s Version)). We stan artists owning their own songs.
You can access Taylor’s entire discography with taylor_all_songs. This includes all of the songs in taylor_album_songs plus EPs, individual singles, and the original versions of albums that have been re-released as Taylor’s Version.
Finally, there is a small data set, taylor_albums, summarizing Taylor’s album release history.
Information on the audio features in the dataset from Spotify is included in their API documentation.
For your visualizations, the {taylor} package comes with its own class of color palettes, inspired by the work of Josiah Parry in the {cpcinema} package.
You might also be interested in the tayloRswift package by Alex Stephenson, a ggplot2 color palette based on Taylor Swift album covers. "For when your colors absolutely should not be excluded from the narrative."
taylor_album_songs.csv
variable | class | description |
---|---|---|
album_name | character | Album name |
ep | logical | Is it an EP |
album_release | double | Album release date |
track_number | integer | Track number |
track_name | character | Track name |
artist | character | Artists |
featuring | character | Artists featured |
bonus_track | logical | Is it a bonus track |
promotional_release | double | Date of promotional release |
single_release | double | Date of single release |
track_release | double | Date of track release |
danceability | double | Spotify danceability score. A value of 0.0 is least danceable and 1.0 is most danceable. |
energy | double | Spotify energy score. Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. |
key | integer | The key the track is in. |
loudness | double | Spotify loudness score. The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track. |
mode | integer | Mode indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor is 0. |
speechiness | double | Spotify speechiness score. Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. |
acousticness | double | Spotify acousticness score. A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic. |
instrumentalness | double | Spotify instrumentalness score. Predicts whether a track contains no vocals. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0. |
liveness | double | Spotify liveness score. Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live. |
valence | double | Spotify valence score. A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry). |
tempo | double | The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration. |
time_signature | integer | An estimated time signature. The time signature (meter) is a notational convention to specify how many beats ar... |
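As a rough illustration of how the coded columns above can be read, here is a minimal sketch in Python using only the standard library. The sample rows are made-up placeholders that mimic a few columns of taylor_album_songs.csv, not real values from the dataset, and treating "NA" as the missing-value marker is an assumption based on the usual R export convention.

```python
import csv
import io

# Made-up placeholder rows that only mimic the column layout of
# taylor_album_songs.csv; they are NOT real values from the dataset.
SAMPLE_CSV = """track_name,mode,liveness,valence
Placeholder Track A,1,0.12,0.81
Placeholder Track B,0,0.85,0.23
Placeholder Track C,NA,0.40,NA
"""


def parse_float(value):
    """Return a float, or None for an empty cell or the assumed "NA" marker."""
    return None if value in ("", "NA") else float(value)


def describe_track(row):
    """Translate the coded columns of one row into readable labels."""
    # Per the table: major is represented by 1 and minor by 0.
    modality = {"1": "major", "0": "minor"}.get(row["mode"], "unknown")
    liveness = parse_float(row["liveness"])
    return {
        "track_name": row["track_name"],
        "modality": modality,
        # Per the table: a value above 0.8 gives a strong likelihood of a live track.
        "likely_live": liveness is not None and liveness > 0.8,
        "valence": parse_float(row["valence"]),
    }


def load_tracks(csv_text):
    """Parse CSV text and describe every row."""
    return [describe_track(row) for row in csv.DictReader(io.StringIO(csv_text))]


tracks = load_tracks(SAMPLE_CSV)
for track in tracks:
    print(track)
```

To run this against the real file, the same `describe_track` function could be applied to rows read from `open("taylor_album_songs.csv", newline="")` instead of the in-memory sample.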
According to a survey from March 2023 among U.S. adults, the most popular Taylor Swift album was "1989" with ***** percent of respondents stating that they liked the album. "Fearless" was also popular with another ***** percent saying that they enjoyed it. Less popular was Swift's album "Folklore" with which she surprised her fans during the Covid-19 pandemic.
💁♀️Please take a moment to carefully read through this description and metadata to better understand the dataset and its nuances before proceeding to the Suggestions and Discussions section.
This dataset provides a comprehensive collection of setlists from Taylor Swift’s official era tours, curated expertly by Spotify. The playlist, available on Spotify under the title "Taylor Swift The Eras Tour Official Setlist," encompasses a diverse range of songs that have been performed live during the tour events of this global artist. Each dataset entry corresponds to a song featured in the playlist.
Taylor Swift, a pivotal figure in both country and pop music scenes, has had a transformative impact on the music industry. Her tours are celebrated not just for their musical variety but also for their theatrical elements, narrative style, and the deep emotional connection they foster with fans worldwide. This dataset aims to provide fans and researchers an insight into the evolution of Swift's musical and performance style through her tours, capturing the essence of what makes her tour unique.
Obtaining the Data: The data was obtained directly from the Spotify Web API, specifically focusing on the setlist tracks by the artist. The Spotify API provides detailed information about tracks, artists, and albums through various endpoints.
Data Processing: To process and structure the data, Python scripts were developed using data science libraries such as pandas for data manipulation and spotipy for API interactions, specifically for Spotify data retrieval.
Workflow:
1. Authentication
2. API Requests
3. Data Cleaning and Transformation
4. Saving the Data
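The cleaning and saving steps of this workflow might look roughly like the sketch below. It is not the dataset author's script: the payload is a made-up stand-in shaped like a Spotify playlist-items response, the column names and the `retrieved_on` field are illustrative choices, and the standard-library `csv` module is used in place of pandas so the example runs without extra dependencies. Authentication and the API request are only outlined in comments because they need credentials and network access.

```python
import csv
import io
from datetime import date

# The first two steps (authentication and API requests) are only outlined here.
# With spotipy they would typically look like:
#   sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())
#   payload = sp.playlist_items(PLAYLIST_ID)  # PLAYLIST_ID is a placeholder
# The dictionary below is a made-up stand-in with the same general shape.
SAMPLE_PAYLOAD = {
    "items": [
        {"track": {"name": " Placeholder Song ", "popularity": 80,
                   "duration_ms": 200000,
                   "album": {"name": "Placeholder Album"},
                   "artists": [{"name": "Artist A"}, {"name": "Artist B"}]}},
        {"track": None},  # unavailable tracks can come back empty
    ]
}

# Hypothetical output columns, chosen for illustration.
FIELDS = ["track_name", "album", "artists", "popularity",
          "duration_s", "retrieved_on"]


def clean_items(payload, retrieved_on):
    """Flatten nested playlist items into one flat row per track."""
    rows = []
    for item in payload.get("items", []):
        track = item.get("track")
        if not track:
            continue  # skip entries without track details
        rows.append({
            "track_name": track["name"].strip(),
            "album": track["album"]["name"],
            "artists": "; ".join(a["name"] for a in track["artists"]),
            "popularity": track["popularity"],
            "duration_s": round(track["duration_ms"] / 1000, 1),
            # Popularity changes over time, so record when it was read.
            "retrieved_on": retrieved_on.isoformat(),
        })
    return rows


def save_rows(rows, handle):
    """Write the cleaned rows as CSV to any file-like object."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


rows = clean_items(SAMPLE_PAYLOAD, date(2024, 1, 1))  # placeholder date
buffer = io.StringIO()  # stands in for open("setlist.csv", "w", newline="")
save_rows(rows, buffer)
print(buffer.getvalue())
```

Stamping each row with a retrieval date keeps the popularity values interpretable later, since they reflect a single day's snapshot.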
Note: The popularity score reflects the value recorded on the day the dataset was retrieved. Popularity scores can fluctuate daily.
This dataset, derived from Spotify focusing on Taylor Swift's The Eras Tour setlist data, is intended for educational, research, and analysis purposes only. Users are urged to use this data responsibly, ethically, and within the bounds of legal stipulations.
According to a 2025 survey among adults in the United States, Taylor Swift did not play a major role in the excitement about watching Super Bowl LIX. While 38 percent of respondents said they were not more excited, nearly 28 percent said that their decision to watch did not depend on Taylor Swift's attendance. By contrast, one in four Americans were more excited to watch because of Taylor Swift.
According to the results of a survey conducted in the United States in October 2023, about ** percent of respondents aged 30 to 44 years old stated that they considered themselves Taylor Swift fans. More than one in ** survey respondents aged 65 years or older stated the same.
The phenomenon of Taylor Swift
Since her first album in 2006, Taylor Swift has managed to build a global brand around her music. According to an analysis from 2023, she officially became a billionaire after making approximately *** million U.S. dollars through ticket sales and merchandise. Another main component of her income is the release and re-release of her music. However, 2023 seemed to be the year of Taylor Swift in other regards as well. Her Eras Tour, which she launched in March, grossed over *********** U.S. dollars, becoming the highest grossing tour of all time. Additionally, it is estimated that Swift even boosted the U.S. economy with her tour, by *** billion U.S. dollars. Yet she did not stop there, turning her tour into a concert film, which quickly became the most successful concert film in history. By the end of the year, she was Spotify’s most streamed artist globally and subsequently Time Magazine named her “Person of the Year.”
According to a study carried out between January and September 2023, American singer-songwriter Taylor Swift was mentioned over 2.8 million times in online discussions. American singer, songwriter, and businesswoman Beyoncé Giselle Knowles-Carter was mentioned 2.4 million times in online posts. In January 2023, Rolling Stone announced Beyoncé as one of the greatest vocalists of all time, closely following legends such as Aretha Franklin, Whitney Houston, and Sam Cooke.
Spotify data about Taylor Swift's songs. The streams were collected until April 24, 2022.
This statistic presents information on the first week unit sales of Taylor Swift's albums from 2006 to 2017. Swift's fifth album '1989' sold **** million copies during the first week after its release on ****************. Her most recent album, 'reputation', released in *************, sold **** million copies in its first week in the United States.
https://www.apache.org/licenses/LICENSE-2.0
English news articles that mention "Taylor Swift". Crawl date: October 2024. Document count: 700.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
What makes Taylor Swift so successful? And can developing artists harness the same techniques to jumpstart their next album development and release? A dataset for the analysis of relationships and elements common within Swift's catalogue of albums, from Spotify API data and Metacritic.
Includes album level data on:
Spotify Popularity Index
Spotify streaming numbers
Metacritic scores
Spotify algorithm metrics - acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness, tempo, valence
Theme
Genre
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This data set was created by PromptCloud (a Data-as-a-Service provider), using the API exposed by Genius.com.
It has the following data fields:
You can check out this article to understand the following initial set of analyses:
– Exploratory analysis
– Text mining
According to an analysis from 2023, Taylor Swift has become a billionaire after making approximately *** million U.S. dollars through ticket sales and merchandise. Besides her highly successful Eras Tour, the other main component of her income is the release and re-release of her music: she has been re-recording previous albums to secure ownership of her work after the rights to her early recordings were sold.
This dataset was created by Kailane Felix
Following the 2019 release of "Lover," Taylor Swift undertook the task of re-recording her earlier albums. This decision was spurred by the sale of the rights to her music catalog by her former record label, Big Machine Records, to music manager Scooter Braun. With the release of each subsequent album, Taylor introduced new merchandise, propelling the American artist to unprecedented heights. Since 2019, the traffic to Taylor's online store has steadily increased. This trend intensified in 2023, marked by the launch of "The Eras Tour," where each concert is a journey through her various musical "eras" of the past 17 years. From January to September 2023, the singer's online store surpassed ** million DTC (direct-to-consumer) visits.
According to a survey conducted between September and October 2023, over ** percent of respondents stated that they were very or somewhat familiar with Taylor Swift. Only *** percent said that they were not at all familiar with the American singer.
According to a survey from March 2023 among U.S. Taylor Swift fans, also called 'Swifties', the largest share of Swift fans were in the group of **********. The second largest share was within the group of ************, followed by Gen X. In 2023, Swift announced her first tour in years, 'The Eras Tour', which caused chaos when her significant fan base tried to purchase tickets for the long-anticipated concerts.
As of 2023, the top-selling album in the United States was '1989 (Taylor's Version)' by Taylor Swift, with over 1.9 million unit sales. Overall, Taylor Swift dominated U.S. album sales: she not only entered the top 10 with five more albums, she also registered the second- and third-best-selling albums that year.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides detailed information about songs, including the artist, title, and lyrics. It is primarily designed for use in Natural Language Processing (NLP) tasks, offering a valuable resource for analysing textual data within the entertainment and media domain.
The dataset is presented in a tabular format. While specific numbers for rows or records are not explicitly provided, it includes information on 742 unique artist values and 745 unique title values. Data files are typically in CSV format.
This dataset is ideally suited for a variety of applications, including:
* Developing and training NLP models for text analysis.
* Researching lyrical themes and patterns across different artists or genres.
* Building music recommendation systems based on lyrical content.
* Exploring trends in popular music by artist and song.
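As one small example of the text analysis this kind of data supports, the sketch below counts the most frequent words across lyrics. The records are made-up placeholders using the three described columns (artist, title, lyrics), and the tokenizer and stopword list are deliberately simplistic assumptions rather than a recommended NLP pipeline.

```python
import re
from collections import Counter

# Made-up placeholder records with the three described columns;
# the text is invented and not taken from the dataset.
SONGS = [
    {"artist": "Artist A", "title": "Song One",
     "lyrics": "Hello hello world, it's a new day"},
    {"artist": "Artist B", "title": "Song Two",
     "lyrics": "A new world, a new song"},
]

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "it's", "the"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(songs, n=3):
    """Return the n most common non-stopword tokens across all lyrics."""
    counts = Counter()
    for song in songs:
        counts.update(w for w in tokenize(song["lyrics"]) if w not in STOPWORDS)
    return counts.most_common(n)


top = top_words(SONGS)
print(top)
```

Grouping the counts by the `artist` field instead of pooling all songs would be a natural next step for comparing lyrical themes between artists.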
The dataset has a global regional coverage, making it applicable for worldwide analysis. Specific time ranges or demographic scopes are not detailed.
CC0
This dataset is particularly useful for:
* Data scientists and machine learning engineers focusing on NLP.
* Researchers studying musicology or cultural trends through lyrical analysis.
* Students and beginners in data science looking for an accessible dataset for text-based projects.
* Anyone interested in entertainment and media consumption data.
Original Data Source: Lyrics
The graph shows the revenue generated by the American singer and songwriter Taylor Swift in the United States in 2018, broken down by source. Taylor Swift generated *** million U.S. dollars with her publishing activities in that year.
Taylor Swift was a regular feature at Kansas City Chiefs games during the 2023 NFL season to support her boyfriend and Chiefs tight end, Travis Kelce. During a January 2024 survey in the United States, almost one quarter of fans stated that it was great that the pop icon had become a presence at NFL games. By contrast, 14 percent of respondents were not happy with her presence.