https://creativecommons.org/publicdomain/zero/1.0/
By data.world's Admin [source]
This dataset contains a comprehensive collection of broadcast Super Bowl ads. The data comes from superbowl-ads.com, which provides the URLs to watch each ad on YouTube. It includes seven defining characteristics of these advertisements, including funniness, patriotism, celebrity presence, animals featured, and the use of sex to sell the product, that offer insight into the cultural trends present in each year's advertising campaigns. The dataset also invites questions about the relationship between popular culture and the kinds of ads companies have used both to promote their products and to relate to their audience through images and themes that reflect current society. With the data available in an easily accessible format, exploring it could help marketers who want to gain an advantage in understanding their target demographic, or provide a fresh perspective for those looking to consume something new.
There are a few different ways you can use this data to uncover America’s secrets through Super Bowl ads. Let’s explore some potential uses!
Analyze changes in themes across years: By looking at the data for each year separately and identifying trends or similarities in particular themes (like funny ads or dangerous ads), you can gain an understanding of any changes in how Americans view these aspects of their entertainment. For example, is there a trend towards more funny ads? Or more patriotic ones?
Brand analysis: Pull up all of an individual brand’s data from all years and ask what types of messages the brand has been sending throughout its Super Bowl advertising over time. Does it feature animals? Are there famous people in most of its ads? Understanding what types of ads brands put out offers insight into how Americans perceive them overall.
Analyze correlations between themes: Compare two columns at a time over multiple years to find correlations between different aspects. Examples include the correlation between using sex and using animals in advertising, or between featuring a celebrity and using patriotic ad content.
Creating an interactive visualization that allows users to explore the different trends surrounding Super Bowl ads over the last two decades. This could include visuals such as bar graphs, line charts and scatter plots that show how often certain characteristics are used in ads, and how these characteristics have evolved over time.
Running a classifier model to predict which characteristics will be used in an upcoming Super Bowl ad. This could use factors such as past data from similar brands or from the same company over multiple years.
Using the data to create a machine learning algorithm that recommends which kinds of elements (e.g., funny jokes, celebrity appearances, animals, etc.) should be included in a new ad, based on user input about the desired outcome for the ad (e.g., increasing brand awareness or positioning the brand image).
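The trend and correlation ideas above can be sketched with the standard library alone. This is a minimal illustration, not the dataset's own tooling: the rows are made up, and `funny` and `patriotic` are assumed names for two of the theme flags (the column table for the file is truncated, so the real names may differ). Only `year` and `brand` are taken from the listed columns.

```python
from collections import defaultdict

# Made-up rows. "year" and "brand" appear in the column table;
# "funny" and "patriotic" are ASSUMED names for two theme flags.
ads = [
    {"year": 2019, "brand": "A", "funny": True, "patriotic": False},
    {"year": 2019, "brand": "B", "funny": False, "patriotic": True},
    {"year": 2020, "brand": "A", "funny": True, "patriotic": False},
    {"year": 2020, "brand": "B", "funny": True, "patriotic": True},
]

def share_by_year(rows, flag):
    """Return {year: fraction of that year's ads with `flag` set}."""
    counts = defaultdict(lambda: [0, 0])  # year -> [ads with flag, all ads]
    for row in rows:
        counts[row["year"]][0] += int(row[flag])
        counts[row["year"]][1] += 1
    return {year: hits / total for year, (hits, total) in sorted(counts.items())}

def phi(rows, flag_a, flag_b):
    """Phi coefficient: the Pearson correlation of two binary flags."""
    n11 = sum(r[flag_a] and r[flag_b] for r in rows)
    n10 = sum(r[flag_a] and not r[flag_b] for r in rows)
    n01 = sum(not r[flag_a] and r[flag_b] for r in rows)
    n00 = sum(not r[flag_a] and not r[flag_b] for r in rows)
    denom = ((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)) ** 0.5
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

funny_share = share_by_year(ads, "funny")
funny_vs_patriotic = phi(ads, "funny", "patriotic")
```

Filtering `ads` by `brand` before calling `share_by_year` gives the per-brand view described in the brand analysis idea.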
If you use this dataset in your research, please credit the original authors. Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: superbowl-ads.csv

| Column name | Description |
|:------------------------------|:--------------------------------------------------------------|
| year | The year the ad was broadcasted. (Integer) |
| brand | The brand associated with the ad. (String) |
| superbowl_ads_dot_com_url | The URL of the ad on Superbowl-ads.com. (String) |
| youtube_url | The URL of the ad on YouTube. (String) |

...
The Super Bowl is the highlight of the NFL season, watched by millions in the United States and many more across the world. During a 2025 survey in the United States, around 78 percent of respondents stated that they planned to watch Super Bowl LIX between the Kansas City Chiefs and the Philadelphia Eagles.
By Throwback Thursday [source]
Winning and Losing Teams: The dataset includes columns for both the winning team and the losing team. You can analyze trends or patterns in team performance over time by comparing their records in different Super Bowl games.
Scores: The final score of each Super Bowl game is provided in two separate columns: Winning Team Points and Losing Team Points. You can explore which games had high-scoring or low-scoring outcomes, and identify any interesting patterns or outliers.
Conferences: The dataset includes information about the conferences to which both the winning and losing teams belong. You can analyze the success rates of teams from different conferences or compare their performances in specific seasons.
Venue and City: You can find information about where each Super Bowl game was played by referring to the Venue and City columns. This allows you to explore geographical aspects of the games' locations.
Attendance: The number of people in attendance at each Super Bowl game is provided under the column Attendance. This data point allows you to understand how popular a particular game was among fans.
Networks: Television networks that broadcasted each Super Bowl game are included in this dataset under Network. Analyzing network preferences for airing these games may reveal interesting insights into TV viewership habits over time.
Average U.S. Viewers, Rating, and Share: Columns like Average U.S. Viewers provide information about viewership trends across different years, while Rating provides insight into audience interest. Advertisers may be interested in exploring instances where the Cost Per 30s Ad increased in line with higher ratings.
Cost Per 30s Ad: The cost of a 30-second advertisement during each Super Bowl game is listed under the Cost Per 30s Ad column. This allows you to examine trends in advertising costs or identify Super Bowl games that commanded particularly high advertising rates.
Notes: Additional notes or details about each Super Bowl game are provided under the Notes column. These notes may contain interesting information, trivia, or historical context that can enrich your analysis.
- Analyzing the popularity of Super Bowl games: With data on average U.S. viewers, rating, share, and cost per 30s ad, this dataset can be used to analyze the popularity and viewership trends of different Super Bowl games over the years. This can help identify patterns and factors that contribute to a successful Super Bowl event.
- Comparing team performance: By analyzing the winning and losing team points for each game, as well as their conferences, this dataset can be used to compare the performance of different teams in Super Bowl games. It can help determine which conferences or teams have historically performed better or worse in these high-stakes games.
- Studying advertising trends: The cost per 30s ad information in this dataset allows for an analysis of advertising trends during the Super Bowl. By examining how ad costs have changed over time, advertisers can gain insights into the value and effectiveness of Super Bowl commercials, as well as understand shifts in consumer behavior and preferences during these major sporting events.
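As a rough sketch of the team-performance and advertising-trend ideas above, the snippet below computes the point margin of each game and the game-over-game change in ad cost. The rows are invented for illustration, and the dictionary keys are assumptions based on the column names used in the prose (`Winning Team Points`, `Losing Team Points`, `Cost Per 30s Ad`); the actual CSV headers may be spelled differently.

```python
# Invented rows; keys follow the column names mentioned in the description
# and are ASSUMED to match the CSV headers.
games = [
    {"Game": 1, "Winning Team Points": 30, "Losing Team Points": 10, "Cost Per 30s Ad": 100_000},
    {"Game": 2, "Winning Team Points": 21, "Losing Team Points": 20, "Cost Per 30s Ad": 125_000},
    {"Game": 3, "Winning Team Points": 17, "Losing Team Points": 3, "Cost Per 30s Ad": 150_000},
]

def point_margins(rows):
    """Return {game number: winning points minus losing points}."""
    return {r["Game"]: r["Winning Team Points"] - r["Losing Team Points"] for r in rows}

def ad_cost_growth(rows):
    """Return {game number: fractional change in ad cost versus the previous game}."""
    ordered = sorted(rows, key=lambda r: r["Game"])
    return {
        cur["Game"]: (cur["Cost Per 30s Ad"] - prev["Cost Per 30s Ad"]) / prev["Cost Per 30s Ad"]
        for prev, cur in zip(ordered, ordered[1:])
    }

margins = point_margins(games)
growth = ad_cost_growth(games)
closest_game = min(margins, key=margins.get)  # game with the smallest margin
```

The first game has no predecessor, so it is absent from `growth` by design.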
If you use this dataset in your research, please credit the original authors. Data Source
See the dataset description for more information.
File: ThrowbackDataThursday 2019 Week 5 - Super Bowl.csv

| Column name | Description |
|:----------------------------|:--------------------------------------------------------------|
| Game | The number assigned to each Super Bowl game. (Numeric) |
| Date | The date on which the Super Bowl game took place. (Date) |
| Winning team | The name of the team ...
The Super Bowl is one of the highlights of the sporting calendar, but many viewers tune in for more than just the game itself. During a January 2025 survey in the United States, almost 35 percent of respondents stated that the famous Super Bowl commercials were one of the parts of the event they were looking forward to.
stores.csv
This file contains anonymized information about the 45 stores, indicating the type and size of store.
train.csv
This is the historical training data, which covers 2010-02-05 to 2012-11-01. Within this file you will find the following fields:
test.csv
This file is identical to train.csv, except we have withheld the weekly sales. You must predict the sales for each triplet of store, department, and date in this file.
features.csv
This file contains additional data related to the store, department, and regional activity for the given dates. It contains the following fields:
For convenience, the four holidays fall within the following weeks in the dataset (not all holidays are in the data):
Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labor Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
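The holiday weeks listed above can be turned into a lookup keyed by date, which is one possible way to label rows with the holiday they fall on. This is a sketch: the names `HOLIDAY_WEEKS`, `parse_week`, and `holiday_for` are hypothetical, and `%b` month parsing assumes an English locale.

```python
from datetime import date, datetime

# Dates copied from the list above (day-month-two-digit-year).
HOLIDAY_WEEKS = {
    "Super Bowl": ["12-Feb-10", "11-Feb-11", "10-Feb-12", "8-Feb-13"],
    "Labor Day": ["10-Sep-10", "9-Sep-11", "7-Sep-12", "6-Sep-13"],
    "Thanksgiving": ["26-Nov-10", "25-Nov-11", "23-Nov-12", "29-Nov-13"],
    "Christmas": ["31-Dec-10", "30-Dec-11", "28-Dec-12", "27-Dec-13"],
}

def parse_week(text):
    """Parse a string such as '8-Feb-13' into a datetime.date."""
    return datetime.strptime(text, "%d-%b-%y").date()

# Map each holiday week date to the holiday name.
HOLIDAY_BY_DATE = {
    parse_week(text): name
    for name, texts in HOLIDAY_WEEKS.items()
    for text in texts
}

def holiday_for(week_date):
    """Return the holiday name for a week date, or None for a non-holiday week."""
    return HOLIDAY_BY_DATE.get(week_date)
```

For example, `holiday_for(date(2010, 2, 12))` returns `"Super Bowl"`, while a date that is not in the list returns `None`.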
The Challenge - One challenge of modeling retail data is the need to make decisions based on limited history. Holidays and select major events come once a year, and so does the chance to see how strategic decisions impacted the bottom line. In addition, markdowns are known to affect sales – the challenge is to predict which departments will be affected and to what extent.
You are provided with historical sales data for 45 stores located in different regions - each store contains a number of departments. The company also runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labor Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks.
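To make the five-times holiday weighting concrete, here is a minimal sketch of a weighted error score. The text states only the weighting; the choice of mean absolute error as the underlying metric is an assumption made for illustration, and the function name and sample numbers are hypothetical.

```python
def weighted_mae(actual, predicted, is_holiday, holiday_weight=5.0):
    """Weighted mean absolute error.

    Holiday weeks get `holiday_weight`, all other weeks get weight 1.
    Using absolute error here is an ASSUMPTION; the source only says that
    holiday weeks are weighted five times higher in the evaluation.
    """
    weights = [holiday_weight if flag else 1.0 for flag in is_holiday]
    total_error = sum(w * abs(a - p) for w, a, p in zip(weights, actual, predicted))
    return total_error / sum(weights)

# Made-up weekly sales for three weeks; the last one is a holiday week.
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
is_holiday = [False, False, True]

score = weighted_mae(actual, predicted, is_holiday)
unweighted = weighted_mae(actual, predicted, is_holiday, holiday_weight=1.0)
```

In this toy example the holiday-week miss dominates the weighted score, which is why a model that does well on ordinary weeks can still score poorly overall.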
Origin Link: https://www.kaggle.com/datasets/manjeetsingh/retaildataset.
https://creativecommons.org/publicdomain/zero/1.0/
One of the leading retail stores in the US, Walmart, would like to predict sales and demand accurately. Certain events and holidays impact sales on each day. Sales data are available for 45 Walmart stores. The business faces a challenge from unforeseen demand and sometimes runs out of stock because its machine learning algorithm is inappropriate. An ideal ML algorithm will predict demand accurately and ingest factors such as economic conditions, including CPI and the Unemployment Index.
Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available.
The dataset is taken from Kaggle.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We estimated ideological preferences of 3.8 million Twitter users and, using a dataset of 150 million tweets concerning 12 political and non-political issues, explored whether online communication resembles an “echo chamber” due to selective exposure and ideological segregation or a “national conversation.” We observed that information was exchanged primarily among individuals with similar ideological preferences for political issues (e.g., presidential election, government shutdown) but not for many other current events (e.g., Boston marathon bombing, Super Bowl). Discussion of the Newtown shootings in 2012 reflected a dynamic process, beginning as a “national conversation” before being transformed into a polarized exchange. With respect to political and non-political issues, liberals were more likely than conservatives to engage in cross-ideological dissemination, highlighting an important asymmetry with respect to the structure of communication that is consistent with psychological theory and research. We conclude that previous work may have overestimated the degree of ideological segregation in social media usage.
DESCRIPTION
One of the leading retail stores in the US, Walmart, would like to predict sales and demand accurately. Certain events and holidays impact sales on each day. Sales data are available for 45 Walmart stores. The business faces a challenge from unforeseen demand and sometimes runs out of stock because its machine learning algorithm is inappropriate. An ideal ML algorithm will predict demand accurately and ingest factors such as economic conditions, including CPI and the Unemployment Index.
Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available.
**Dataset Description**
This is the historical data that covers sales from 2010-02-05 to 2012-11-01, in the file Walmart_Store_sales. Within this file you will find the following fields:
Store - the store number
Date - the week of sales
Weekly_Sales - sales for the given store
Holiday_Flag - whether the week is a special holiday week (1 = holiday week, 0 = non-holiday week)
Temperature - Temperature on the day of sale
Fuel_Price - Cost of fuel in the region
CPI – Prevailing consumer price index
Unemployment - Prevailing unemployment rate
Holiday Events
Super Bowl: 12-Feb-10, 11-Feb-11, 10-Feb-12, 8-Feb-13
Labour Day: 10-Sep-10, 9-Sep-11, 7-Sep-12, 6-Sep-13
Thanksgiving: 26-Nov-10, 25-Nov-11, 23-Nov-12, 29-Nov-13
Christmas: 31-Dec-10, 30-Dec-11, 28-Dec-12, 27-Dec-13
Analysis Tasks
Basic Statistics tasks
Which store has maximum sales
Which store has maximum standard deviation, i.e., the sales vary a lot. Also, find the coefficient of variation (the ratio of standard deviation to mean)
Which store/s has good quarterly growth rate in Q3’2012
Some holidays have a negative impact on sales. Find out holidays which have higher sales than the mean sales in non-holiday season for all stores together
Provide a monthly and semester view of sales in units and give insights
Statistical Model
For Store 1 – Build prediction models to forecast demand
Linear Regression – Utilize variables like date and restructure dates as 1 for 5 Feb 2010 (starting from the earliest date in order). Hypothesize if CPI, unemployment, and fuel price have any impact on sales.
Change dates into days by creating new variable.
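The basic statistics tasks and the date restructuring can be sketched as follows. The rows are invented and stand in for (Store, Date, Weekly_Sales) records. "Maximum sales" is interpreted here as the largest total across weeks, and the day index counts calendar days with the earliest date as 1; both are assumptions about what the tasks intend.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, stdev

# Invented (Store, Date, Weekly_Sales) rows for two stores.
rows = [
    (1, date(2010, 2, 5), 100.0),
    (1, date(2010, 2, 12), 140.0),
    (1, date(2010, 2, 19), 120.0),
    (2, date(2010, 2, 5), 200.0),
    (2, date(2010, 2, 12), 210.0),
    (2, date(2010, 2, 19), 190.0),
]

# Group weekly sales by store.
sales_by_store = defaultdict(list)
for store, _, sales in rows:
    sales_by_store[store].append(sales)

totals = {s: sum(v) for s, v in sales_by_store.items()}
spread = {s: stdev(v) for s, v in sales_by_store.items()}          # sample std dev
cv = {s: stdev(v) / mean(v) for s, v in sales_by_store.items()}    # coefficient of variation

top_store = max(totals, key=totals.get)        # store with the largest total sales
most_variable = max(spread, key=spread.get)    # store with the largest std dev

# Restructure dates as a day index: the earliest date maps to 1.
first_day = min(d for _, d, _ in rows)
day_index = {d: (d - first_day).days + 1 for _, d, _ in rows}
```

The resulting `day_index` values can serve as the numeric date variable in a linear regression alongside CPI, unemployment, and fuel price.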
https://creativecommons.org/publicdomain/zero/1.0/
Abstract: We estimated ideological preferences of 3.8 million Twitter users and, using a dataset of 150 million tweets concerning 12 political and non-political issues, explored whether online communication resembles an "echo chamber" due to selective exposure and ideological segregation or a "national conversation." We observed that information was exchanged primarily among individuals with similar ideological preferences for political issues (e.g., presidential election, government shutdown) but not for many other current events (e.g., Boston marathon bombing, Super Bowl). Discussion of the Newtown shootings in 2012 reflected a dynamic process, beginning as a "national conversation" before being transformed into a polarized exchange. With respect to political and non-political issues, liberals were more likely than conservatives to engage in cross-ideological dissemination, highlighting an important asymmetry with respect to the structure of communication that is consistent with psychological theory and research. We conclude that previous work may have overestimated the degree of ideological segregation in social media usage.
Our datasets are divided in four different folders (zipped in Dataverse):
tweet-collections/ contains the list of tweet IDs of the nearly 150 million tweets we use in our analysis. In compliance with Twitter's Terms of Service, we cannot share the full text of the tweets, but the code in [01_data_collection/01-collect-tweets.r](https://github.com/pablobarbera/echo_chambers/blob/master/01_data_collection/01-collect-tweets.r) shows how to re-generate this dataset directly from the Twitter API (it may take a while, though).
input/ contains datasets that we generated prior to the analysis, such as the list of political accounts we consider (elites-data.csv), the ideal point estimates for members of Congress based on roll-call votes and estimated by Simon Jackman (house.csv and senate.csv), and the matches of Twitter IDs with the voter registration files in five different states (voter-matches.csv).
Imported from: Barbera, Pablo; Jost, John; Nagler, Jonathan; Tucker, Joshua; Bonneau, Richard, 2015, "Replication Data for: Tweeting from Left to Right: Is Online Political Communication More Than an Echo Chamber?", https://doi.org/10.7910/DVN/F9ICHH, Harvard Dataverse, V1
Related publication: Barbera, P., Jost, J.T., Nagler, J., Tucker, J. & Bonneau, R. (2015) "Tweeting from Left to Right: Is Online Political Communication More Than an Echo Chamber?" Psychological Science, https://doi.org/10.1177/0956797615594620
Conventional retail stores still play a prominent role in a world dominated by e-commerce. Retail is the process of selling consumer goods or services to customers through multiple channels of distribution to earn a profit. From groceries to clothing to electronics, customers keep flooding the gates of retail stores to satisfy their needs. As time has passed, retailers have had to evolve in order to keep up with changes in demand and the ever-changing mindset of customers. One retail industry juggernaut that has kept up with the demands of customers as well as changed the face of the retail industry for the better is Walmart Inc.
Walmart Inc. is an American multinational retail corporation that operates a chain of hypermarkets, discount department stores, and grocery stores, headquartered in Bentonville, Arkansas. It has many stores across the globe and is the largest retail company by revenue.
We have historical sales data for 45 Walmart stores located in different regions. Each store contains a number of departments. Apart from these, weekly data of Fuel price, Holiday, Temperature with some other features are also present in the data set.
In addition, Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks.
The data consists of 421570 records of weekly sales from stores, spanning 05-Feb-2010 to 26-Oct-2012. This comprises 143 weeks of sales data.
A total of 16 attributes are provided in the data set, including the target variable. The attribute definitions are:
Store: The store number
Size: Size of the store
Dept: Department of the store
Date: Specifying the week (Friday of every week)
Temperature: Average temperature in the region (in ℉)
FuelPrice: Cost of fuel in the region
MarkDown1-5: Anonymized data related to promotional markdowns that Walmart is running. Markdown data is only available after November 2011, and is not available for all stores all the time. Any missing value is marked with Null.
CPI: Consumer price index
Unemployment: Unemployment rate
IsHoliday: Whether the week is a special holiday week
Using the above given features, we have to predict the weekly sales of the store with given parameters.