Retail Scanner Data consist of weekly pricing, volume, and store environment information generated by point-of-sale systems from more than 90 participating retail chains across all US markets.
Store Demographics: Includes store chain code, channel type, and area location. Retailer names are masked to protect identity.
Weekly Product Data: For each UPC code, participating stores report units, price, price multiplier, baseline units, baseline price, feature indicator, and display indicator.
Products: Weekly product data for 2.6-4.5* million UPCs, including food, nonfood grocery items, health and beauty aids, and select general merchandise, aggregated into 1,100 product categories. Store environment variables (i.e., feature and display indicators) are available from a subset of stores. The 1,100 product categories are grouped into 125 product groups and 10 departments. The structure matches that of the consumer panel data. All private-label goods have a masked UPC to protect the identity of the retailers.
Product Characteristics: All products include UPC code and description, brand, multipack, and size, as well as NielsenIQ codes for department, product group, and product module. Some products contain additional characteristics (e.g., flavor).
Geographies: Scanner Data from 35,000-50,000* participating grocery, drug, mass merchandiser, and other stores, covering more than half the total sales volume of US grocery and drug stores and more than 30 percent of all US mass merchandiser sales volume. Data cover the entire United States, divided into 52 major markets, and include the same codes as those used in the consumer panel data.
Retail Channels: Food, drug, mass merchandise, convenience, and liquor.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Means and standard deviations of explanatory variables in the Tobit random effect model.
Conditional marginal effects of household food and beverage expenditures and marginal effects associated with the probability of purchasing by store type.
Analysis of ‘Store Transaction data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/iamprateek/store-transaction-data on 14 February 2022.
--- Dataset description provided by original source is as follows ---
Nielsen receives transaction-level scanning data (POS data) from its partner stores on a regular basis. Stores sharing POS data include bigger format store types such as supermarkets and hypermarkets, as well as smaller traditional trade grocery stores (Kirana stores), medical stores, and other outlets that use a POS machine.
In a bigger format store, all items for all transactions are scanned using a POS machine. Smaller and more localized shops, however, do not have a 100% compliance rate in scanning and entering information into the POS machine for all transactions.
A transaction involving a single packet of chips or a single piece of candy may not be scanned and recorded, either to spare the customer the inconvenience or because the store is crowded during rush hours.
Thus, the data received from such stores are often incomplete and do not cover all transactions completed within a day.
In addition to incomplete transaction data within a day, certain stores do not share data for all active days. Stores share data for anywhere from 2 to 28 days in a month. While it is possible to impute/extrapolate data for 2 days of a month using 28 days of actual historical data, the reverse (filling 28 days from 2 observed days) is not recommended.
Nielsen encourages you to create a model which can help impute/extrapolate data to fill in the missing data gaps in the store level POS data currently received.
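The extrapolation idea described above can be illustrated with a minimal sketch. It is not the method used by Nielsen or by any hackathon entry: it simply fills the missing days of a month with the average of the observed days, and declines to extrapolate when too few days were reported. The function name `extrapolate_month_total`, its parameters, and the default threshold of 14 observed days are all hypothetical choices made for illustration.

```python
from statistics import mean
from typing import Mapping, Optional


def extrapolate_month_total(
    observed: Mapping[int, float],
    days_in_month: int,
    min_observed_days: int = 14,  # hypothetical cut-off, not from the source
) -> Optional[float]:
    """Estimate a store's monthly sales value from partially reported days.

    observed maps day-of-month -> reported sales value for that day.
    Missing days are filled with the mean of the observed days.
    Returns None when fewer than min_observed_days were reported, since
    extrapolating most of a month from a handful of days is unreliable.
    """
    if not 1 <= days_in_month <= 31:
        raise ValueError("days_in_month must be between 1 and 31")
    if len(observed) > days_in_month:
        raise ValueError("more observed days than days in the month")
    if len(observed) < min_observed_days:
        return None  # too little data to extrapolate from

    daily_average = mean(observed.values())
    missing_days = days_in_month - len(observed)
    # Observed total plus the average day repeated for each missing day.
    return sum(observed.values()) + daily_average * missing_days


# A store that reported 28 of 30 days can be filled in ...
well_covered = {day: 100.0 for day in range(1, 29)}
estimate = extrapolate_month_total(well_covered, days_in_month=30)

# ... while a store that reported only 2 days is left unestimated.
sparse = {1: 100.0, 2: 120.0}
no_estimate = extrapolate_month_total(sparse, days_in_month=30)
```

A mean-fill like this ignores weekday effects, promotions, and differences between brands or categories, which is where a model trained on the more complete stores would be expected to do better.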
You are provided with a dataset that contains store-level data by brand and category for select stores:
Hackathon_Ideal_Data - This file contains brand-level data for 10 stores for the last 3 months. This can be referred to as the ideal data.
Hackathon_Working_Data - This contains data for selected stores which are missing and/or incomplete.
Hackathon_Mapping_File - This file is provided to help understand the column names in the data set.
Hackathon_Validation_Data - This file contains the stores and product groups for which you have to predict the Total_VALUE.
Sample Submission - This file shows the format in which the candidate must upload the output. Sample data is included in the file to help explain the required columns and values.
Nielsen Holdings plc (NYSE: NLSN) is a global measurement and data analytics company that provides the most complete and trusted view available of consumers and markets worldwide. Nielsen is divided into two business units. Nielsen Global Media, the arbiter of truth for media markets, provides media and advertising industries with unbiased and reliable metrics that create a shared understanding of the industry required for markets to function. Nielsen Global Connect provides consumer packaged goods manufacturers and retailers with accurate, actionable information and insights and a complete picture of the complex and changing marketplace that companies need to innovate and grow. Our approach marries proprietary Nielsen data with other data sources to help clients around the world understand what’s happening now, what’s happening next, and how to best act on this knowledge. An S&P 500 company, Nielsen has operations in over 100 countries, covering more than 90% of the world’s population.
Know more: https://www.nielsen.com/us/en/
Build an imputation and/or extrapolation model to fill the missing data gaps for select stores by analyzing the data and determining which factors/variables/features best predict store sales.
--- Original source retains full ownership of the source dataset ---
Demographic stratification according to diagnostic support for fatal acute myocardial infarction.
Demographics of participants providing fungal samples in the MicroCOPD study.
In 2025, **** million Americans watched the Grammy Awards ceremony. This figure marked a decrease from the previous year, and represented the fourth-lowest TV audience since the 2000 ceremony.

Throwback to the Grammys’ glory days

The Grammy Awards have suffered viewer losses for the better part of a decade. The show's rankings last peaked in 2012, when an estimated 39 million people tuned in to watch Music's Biggest Night. After that, the format failed to draw the same impressive audience numbers as it did in the early 2010s. But what made the 54th annual Grammy Awards so special? For one, the event incorporated various musical tributes to Whitney Houston, who had died the day before the show. On top of that, the ceremony was hosted by LL Cool J, who was the first Grammys host in seven years.

Spotlight on other awards ceremonies

The Grammys are not the only awards ceremony that has lost viewers and relevance over the past decade. Even the Oscars, Hollywood's most prestigious celebration, have recently failed to draw television viewers' attention. In 2024, the number of Academy Awards viewers stood at **** million, and even though this figure marked a significant improvement compared to the previous years, the audience was still only half as big as it was back in 2015. A similar downward trend also unfolded at the Golden Globes, where the number of TV viewers dropped to *** million in 2025.
Demographic information of parents from the outdoor kindergarten and the conventional kindergarten groups.
Demographic characteristics and risk factors.