Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in the United States increased 0.60 percent in August of 2025 over the previous month. This dataset provides - U.S. December Retail Sales Increased More Than Forecast - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Grocery Sales Prediction
This dataset provides a rich resource for researchers and practitioners interested in retail sales prediction and analysis. It contains information about various grocery products, the outlets where they are sold, and their historical sales data.
Product Characteristics:
Item_Identifier: Unique identifier for each product.
Item_Weight: Weight of the product item.
Item_Fat_Content: Categorical variable indicating the fat content of the product (e.g., low fat, regular).
Item_Visibility: Numerical attribute reflecting the visibility of the product in the store (likely a promotional measure).
Item_Type: Category of the product (e.g., Snacks, Beverages, Bakery).
Item_MRP: Maximum Retail Price of the product.
Outlet Information:
Outlet_Identifier: Unique identifier for each outlet (store).
Outlet_Establishment_Year: Year the outlet was established.
Outlet_Size: Categorical variable indicating the size of the outlet (e.g., Small, Medium, Large). (Note: This data may have missing values)
Outlet_Location_Type: Categorical variable indicating the type of location the outlet is in (e.g., Tier 1 City, Tier 2 City, Upstate).
Outlet_Type: Categorical variable indicating the type of outlet (e.g., Supermarket, Grocery Store, Convenience Store).
Sales Data:
Item_Outlet_Sales: The historical sales data for each product-outlet combination.
Profit: The profit margin earned on each product sold.
Potential Uses
This dataset can be used for various retail sales analysis and prediction tasks, including:
Demand forecasting: Build models to predict future sales of individual products or product categories at specific outlets.
Promotion optimization: Analyze the effectiveness of different promotional strategies (reflected by Item_Visibility) on sales.
Assortment planning: Optimize product selection and placement within stores based on sales history and outlet characteristics.
Outlet performance analysis: Compare the performance of different outlets based on sales figures and profit margins.
Customer segmentation: Identify customer segments with distinct purchasing behavior based on product types and outlet locations.
By analyzing these data points, retailers can gain insights to improve their sales strategies, optimize inventory management, and maximize profits.
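As a rough illustration of the outlet performance analysis mentioned above, the sketch below totals Item_Outlet_Sales per Outlet_Identifier and counts rows with a missing Outlet_Size, using only the Python standard library. The sample rows are made-up placeholders rather than values from the dataset, and treating an empty string as "missing" is an assumption about how gaps are encoded in the file.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows that reuse column names from the description above.
# The real values and the real file name are not given in the source.
SAMPLE_CSV = """Item_Identifier,Outlet_Identifier,Outlet_Size,Item_Outlet_Sales
ITEM_A,OUT_1,Medium,100.0
ITEM_B,OUT_1,Medium,50.5
ITEM_A,OUT_2,,25.0
"""

def summarize_outlets(csv_text):
    """Return (total sales per outlet, number of rows with missing Outlet_Size)."""
    totals = defaultdict(float)
    missing_size = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["Outlet_Identifier"]] += float(row["Item_Outlet_Sales"])
        # Assumption: a missing Outlet_Size appears as an empty field.
        if not row["Outlet_Size"].strip():
            missing_size += 1
    return dict(totals), missing_size

totals, missing_size = summarize_outlets(SAMPLE_CSV)
print(totals, missing_size)
```

To run this against the real file, replace the `io.StringIO` wrapper with an open file handle.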
This statistic shows the retail sales value in Saudi Arabia in 2018, with estimates from 2019 to 2025. In 2018, the retail sales value amounted to ***** billion U.S. dollars. It was estimated that the retail sales value would grow until 2025, reaching around ***** billion U.S. dollars.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research Domain:
The dataset is part of a project focused on retail sales forecasting. Specifically, it is designed to predict daily sales for Rossmann, a chain of over 3,000 drug stores operating across seven European countries. The project falls under the broader domain of time series analysis and machine learning applications for business optimization. The goal is to apply machine learning techniques to forecast future sales based on historical data, which includes factors like promotions, competition, holidays, and seasonal trends.
Purpose:
The primary purpose of this dataset is to help Rossmann store managers predict daily sales for up to six weeks in advance. By making accurate sales predictions, Rossmann can improve inventory management, staffing decisions, and promotional strategies. This dataset serves as a training set for machine learning models aimed at reducing forecasting errors and supporting decision-making processes across the company’s large network of stores.
How the Dataset Was Created:
The dataset was compiled from several sources, including historical sales data from Rossmann stores, promotional calendars, holiday schedules, and external factors such as competition. The data is split into multiple features, such as the store's location, promotion details, whether the store was open or closed, and weather information. The dataset is publicly available on platforms like Kaggle and was initially created for the Kaggle Rossmann Store Sales competition. The data is made accessible via an API for further analysis and modeling, and it is structured to help machine learning models predict future sales based on various input variables.
Dataset Structure:
The dataset consists of three main files, each with its specific role:
Train:
This file contains the historical sales data, which is used to train machine learning models. It includes daily sales information for each store, as well as various features that could influence the sales (e.g., promotions, holidays, store type, etc.).
https://handle.test.datacite.org/10.82556/yb6j-jw41
PID: b1c59499-9c6e-42c2-af8f-840181e809db
Test2:
The test dataset mirrors the structure of train.csv but does not include the actual sales values (i.e., the target variable). This file is used for making predictions using the trained machine learning models. It is used to evaluate the accuracy of predictions when the true sales data is unknown.
https://handle.test.datacite.org/10.82556/jerg-4b84
PID: 7cbb845c-21dd-4b60-b990-afa8754a0dd9
Store:
This file provides metadata about each store, including information such as the store’s location, type, and assortment level. This data is essential for understanding the context in which the sales data is gathered.
https://handle.test.datacite.org/10.82556/nqeg-gy34
PID: 9627ec46-4ee6-4969-b14a-bda555fe34db
Id: A unique identifier for each (Store, Date) combination within the test set.
Store: A unique identifier for each store.
Sales: The daily turnover (target variable) for each store on a specific day (this is what you are predicting).
Customers: The number of customers visiting the store on a given day.
Open: An indicator of whether the store was open (1 = open, 0 = closed).
StateHoliday: Indicates if the day is a state holiday, with values like:
'a' = public holiday,
'b' = Easter holiday,
'c' = Christmas,
'0' = no holiday.
SchoolHoliday: Indicates whether the store is affected by school closures (1 = yes, 0 = no).
StoreType: Differentiates between four types of stores: 'a', 'b', 'c', 'd'.
Assortment: Describes the level of product assortment in the store:
'a' = basic,
'b' = extra,
'c' = extended.
CompetitionDistance: Distance (in meters) to the nearest competitor store.
CompetitionOpenSince[Month/Year]: The month and year when the nearest competitor store opened.
Promo: Indicates whether the store is running a promotion on a particular day (1 = yes, 0 = no).
Promo2: Indicates whether the store is participating in Promo2, a continuing promotion for some stores (1 = participating, 0 = not participating).
Promo2Since[Year/Week]: The year and calendar week when the store started participating in Promo2.
PromoInterval: Describes the months when Promo2 is active, e.g., "Feb,May,Aug,Nov" means the promotion starts in February, May, August, and November.
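The PromoInterval encoding above can be turned into month numbers with a small helper. This is a minimal sketch, not the project's own feature engineering: the function names are hypothetical, and the set of month abbreviations (including accepting both "Sep" and "Sept") is an assumption about how the column is spelled.

```python
# Month abbreviations are an assumption; "Sept" is accepted alongside "Sep"
# in case the data uses either spelling.
MONTH_NUMBERS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Sept": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def promo2_start_months(promo_interval):
    """Parse a PromoInterval string such as "Feb,May,Aug,Nov" into month numbers."""
    if not promo_interval:
        return []  # stores without Promo2 are assumed to have an empty value
    return [MONTH_NUMBERS[name.strip()] for name in promo_interval.split(",")]

def promo2_starts_in(promo2, promo_interval, month):
    """True if the store participates in Promo2 and a round starts in `month`."""
    return promo2 == 1 and month in promo2_start_months(promo_interval)

print(promo2_start_months("Feb,May,Aug,Nov"))
print(promo2_starts_in(1, "Feb,May,Aug,Nov", 5))
```

A derived flag like this could be joined to the daily rows as an extra feature. Whether it helps a given model is something to check empirically.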
To work with this dataset, you will need the following access credentials and software:
DBRepo Authorization: This is required to access the datasets via the DBRepo API. You may need to authenticate with an API key or login credentials to retrieve the datasets.
Python Libraries: Key libraries for working with the dataset include:
pandas for data manipulation,
numpy for numerical operations,
matplotlib and seaborn for data visualization,
scikit-learn for machine learning algorithms.
Several additional resources are available for working with the dataset:
Presentation:
A presentation summarizing the exploratory data analysis (EDA), feature engineering process, and key insights from the analysis is provided. This presentation also includes visualizations that help in understanding the dataset’s trends and relationships.
Jupyter Notebook:
A Jupyter notebook, titled Retail_Sales_Prediction_Capstone_Project.ipynb, is provided, which details the entire machine learning pipeline, from data loading and cleaning to model training and evaluation.
Model Evaluation Results:
The project includes a detailed evaluation of various machine learning models, including their performance metrics like training and testing scores, Mean Absolute Percentage Error (MAPE), and Root Mean Squared Error (RMSE). This allows for a comparison of model effectiveness in forecasting sales.
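For reference, the two error metrics named above can be computed as follows. This is a plain-Python sketch using the standard definitions, not the project's evaluation code. Skipping days with zero actual sales in MAPE (for example, days when a store is closed) is an assumption made here to avoid division by zero, and the sample numbers are made up.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error over paired observations."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumption: observations with zero actual sales are skipped,
    because the percentage error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no observations with non-zero actual values")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)

# Made-up values for illustration only.
actual_sales = [100.0, 200.0, 0.0, 400.0]
predicted_sales = [110.0, 180.0, 5.0, 400.0]
print(rmse(actual_sales, predicted_sales))
print(mape(actual_sales, predicted_sales))
```

RMSE is expressed in the same units as Sales and penalizes large misses more heavily, while MAPE is scale-free, so the two can rank models differently.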
Trained Models (.pkl files):
The models trained during the project are saved as .pkl files. These files contain the trained machine learning models (e.g., Random Forest, Linear Regression, etc.) that can be loaded and used to make predictions without retraining the models from scratch.
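Loading a .pkl file generally follows the pattern sketched below with Python's standard pickle module. The file name and the dictionary standing in for a trained model are hypothetical. The project's actual files hold fitted estimators, which would also require scikit-learn to be installed when they are loaded.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model. The project's real .pkl files hold fitted
# estimators, which are not reproduced here.
stand_in_model = {"name": "placeholder", "coefficients": [0.5, 1.5]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example_model.pkl")  # hypothetical file name

    # Save the object to disk.
    with open(path, "wb") as f:
        pickle.dump(stand_in_model, f)

    # Load it back without rebuilding it from scratch.
    with open(path, "rb") as f:
        loaded_model = pickle.load(f)

print(loaded_model == stand_in_model)
```

Pickle files can execute arbitrary code when loaded, so only open .pkl files from a source you trust.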
sample_submission.csv:
This file is a sample submission file that demonstrates the format of predictions expected when using the trained model. The sample_submission.csv contains predictions made on the test dataset using the trained Random Forest model. It provides an example of how the output should be structured for submission.
These resources provide a comprehensive guide to implementing and analyzing the sales forecasting model, helping you understand the data, methods, and results in greater detail.
Based on a forecast, retail sales revenues in Germany will amount to over *** billion euros in 2025. Figures are expected to increase annually. This timeline shows the retail sales revenue development in Germany from 2011 to 2025.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in China increased 3.40 percent in August of 2025 over the same month in the previous year. This dataset provides - China Retail Sales YoY - actual values, historical data, forecast, chart, statistics, economic calendar and news.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Estimated Retail Sales in the US 2023 - 2027
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The historical sales dataset for this research is obtained from a Bangladeshi retailer. The dataset covers a period of 1826 days and includes daily sales data for a particular product from 01 January 2013 to 31 December 2017. The raw sales data has 2 columns: the first column contains timestamps, while the remaining column reflects the quantity sold.
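The stated length of 1826 days can be checked from the two end dates, and the same calendar can be used to look for gaps in the timestamp column. The sketch below uses only the standard library. The helper names are hypothetical, and it assumes the timestamps parse to one calendar date per row.

```python
from datetime import date, timedelta

START = date(2013, 1, 1)
END = date(2017, 12, 31)

def expected_days(start, end):
    """All calendar days from start to end, inclusive."""
    n_days = (end - start).days + 1
    return [start + timedelta(days=i) for i in range(n_days)]

def missing_days(observed, start, end):
    """Days in the inclusive range that have no row in the observed timestamps."""
    seen = set(observed)
    return [d for d in expected_days(start, end) if d not in seen]

days = expected_days(START, END)
print(len(days))  # 1826: five years, with 2016 as the only leap year
```

An empty result from `missing_days` on the real timestamp column would confirm that the series has one row for every day in the range.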
In 2020, global retail sales fell by 2.9 percent as a result of the COVID-19 pandemic, bouncing back in 2021 with a growth of 9.7 percent. Global retail sales were projected to amount to around 27.3 trillion U.S. dollars by 2022, up from approximately 23.7 trillion U.S. dollars in 2020.
American retailers worldwide
As a result of globalization and various trade agreements between markets and countries, many retailers are capable of doing business on a global scale. Many of the world’s leading retailers are American companies. Walmart and Amazon are examples of such American retailers. The success of U.S. retailers can also be seen through their performance in online retail.
Retail in the U.S.
The domestic retail market in the United States is a lucrative market, in which many companies compete. Walmart, a retail chain offering low prices and a wide selection of products, is the leading retailer in the United States. Amazon, The Kroger Co., Costco, and Target are a selection of other leading U.S. retailers.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Retail Sales in Japan 2023 - 2027
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in Taiwan increased 0.80 percent in August of 2025 over the previous month. This dataset provides the latest reported value for - Taiwan Retail Sales MoM - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
https://creativecommons.org/publicdomain/zero/1.0/
🗂 Dataset Description
Title: Custom Sales Forecasting Dataset
This dataset contains a synthetic yet realistic representation of product sales across multiple stores and time periods. It is designed for use in time series forecasting, retail analytics, or machine learning experiments focusing on demand prediction and inventory planning. Each row corresponds to daily sales data for a given product at a particular store, enriched with contextual information like promotions and holidays.
This dataset is ideal for:
Building and testing time series models (ARIMA, Prophet, LSTM, etc.)
Forecasting product demand
Evaluating store-level sales trends
Training machine learning models with tabular time series data
Column Name | Description |
---|---|
order_id | Unique identifier for the order placed by a customer. |
customer_id | Unique identifier for the customer making the purchase. |
order_date | Date on which the order was placed (YYYY-MM-DD). |
product_category | Category of the product purchased (e.g., Sports, Home, Beauty). |
product_price | Original price of a single unit of the product (before discount). |
quantity | Number of units of the product ordered. |
payment_method | Method used for payment (e.g., PayPal, Cash on Delivery). |
delivery_status | Current delivery status of the order (e.g., Delivered, Pending). |
city | City to which the order was delivered. |
state | U.S. state where the customer is located. |
zipcode | Postal code of the delivery location. |
product_id | Unique identifier for the purchased product. |
discount_applied | Fractional discount applied to the order (e.g., 0.20 for 20% off). |
order_value | Total value of the order after discount (product_price * quantity * (1 - discount_applied)). |
review_rating | Customer’s review rating of the order on a 1–5 scale. |
return_requested | Boolean value indicating if the customer requested a return (True/False). |
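The order_value formula in the table can be written as a small function, which is also handy for checking rows for consistency. This is a sketch only: rounding to two decimal places is an assumption, since the source does not say how values are rounded, and the example numbers are made up.

```python
def order_value(product_price, quantity, discount_applied):
    """Total order value after discount, per the formula in the table above.

    Assumption: the result is rounded to two decimal places.
    """
    if not 0.0 <= discount_applied <= 1.0:
        raise ValueError("discount_applied is expected to be a fraction in [0, 1]")
    return round(product_price * quantity * (1 - discount_applied), 2)

# Made-up example: two units at 50.00 each with a 20% discount.
print(order_value(50.0, 2, 0.20))
```

Comparing this computed value against the stored order_value column is a quick way to spot rows where the fields disagree.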
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in France decreased 0.90 percent in August of 2025 over the same month in the previous year. This dataset provides - France Retail Sales YoY - actual values, historical data, forecast, chart, statistics, economic calendar and news.
According to estimates, the hyper store generated the highest retail sales in Mexico in 2021. The estimated value of retail sales in hyper stores was **** billion U.S. dollars in that year. While traditional channels, such as hyper stores and discounters, were the main channels both in 2016 and 2021, projections showed that e-commerce will grow the fastest, and the retail sales garnered by e-commerce channels will ultimately reach ** billion U.S. dollars in Mexico.
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
A first estimate of retail sales in value and volume terms for Great Britain, seasonally and non-seasonally adjusted.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: General Merchandise Retail Sales in Japan 2023 - 2027
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Retail Sales of Consumer Goods in China 2022 - 2026
Global retail sales were projected to amount to around **** trillion U.S. dollars by 2026, up from approximately **** trillion U.S. dollars in 2021. The retail industry encompasses the journey of a good or service. This typically starts with the manufacturing of a product and ends with said product being purchased by a consumer from a retailer. Retail establishments come in many forms such as grocery stores, restaurants, and bookstores.
American retailers worldwide
As a result of globalization and various trade agreements between markets and countries, many retailers are capable of doing business on a global scale. Many of the world’s leading retailers are American companies. Walmart and Amazon are examples of such American retailers. The success of U.S. retailers can also be seen through their performance in online retail.
Retail in the U.S.
The domestic retail market in the United States is a lucrative market, in which many companies compete. Walmart, a retail chain offering low prices and a wide selection of products, is the leading retailer in the United States. Amazon, The Kroger Co., Costco, and Target are a selection of other leading U.S. retailers.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This comprehensive fashion retail synthetic dataset contains 2,176 real-world style records spanning seasonal collections, customer purchasing behavior, pricing strategies, and return analytics. Perfect for data science projects, machine learning models, and business intelligence dashboards focused on retail analytics and e-commerce insights.
Column Name | Data Type | Description | Business Impact |
---|---|---|---|
product_id | String | Unique product identifier (FB000001-FB002176) | Product tracking and inventory management |
category | Categorical | Product type (Dresses, Tops, Bottoms, Outerwear, Shoes, Accessories) | Category performance analysis |
brand | Categorical | Fashion brand name (Zara, H&M, Forever21, Mango, Uniqlo, Gap, Banana Republic, Ann Taylor) | Brand comparison and market positioning |
season | Categorical | Collection season (Spring, Summer, Fall, Winter) | Seasonal trend analysis and forecasting |
size | Categorical | Clothing size (XS, S, M, L, XL, XXL) - Null for accessories | Size demand optimization |
color | Categorical | Product color (Black, White, Navy, Gray, Beige, Red, Blue, Green, Pink, Brown, Purple) | Color preference analysis |
original_price | Numerical | Base product price ($15.14 - $249.98) | Pricing strategy development |
markdown_percentage | Numerical | Discount percentage (0% - 59.9%) | Markdown effectiveness analysis |
current_price | Numerical | Final selling price after discounts | Revenue and margin analysis |
purchase_date | Date | Transaction date (2024-2025 range) | Time series analysis and seasonality |
stock_quantity | Numerical | Available inventory (0-50 units) | Inventory optimization |
customer_rating | Numerical | Product rating (1.0-5.0 scale) - Includes nulls | Quality assessment and customer satisfaction |
is_returned | Boolean | Return status (True/False) | Return rate calculation and analysis |
return_reason | Categorical | Specific return reason (Size Issue, Quality Issue, Color Mismatch, Damaged, Changed Mind, Wrong Item) | Return pattern analysis |
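Two of the analyses listed in the Business Impact column, margin analysis and return rate calculation, can be sketched as follows. The relationship current_price = original_price * (1 - markdown_percentage / 100) is an assumption, since the table only describes current_price as the "final selling price after discounts". The function names and sample records are hypothetical.

```python
def expected_current_price(original_price, markdown_percentage):
    """Price after markdown.

    Assumption: the markdown is a simple percentage off the original price,
    rounded to two decimal places.
    """
    return round(original_price * (1 - markdown_percentage / 100), 2)

def return_rate(records):
    """Fraction of records whose is_returned flag is True."""
    if not records:
        raise ValueError("records must not be empty")
    returned = sum(1 for record in records if record["is_returned"])
    return returned / len(records)

# Made-up records for illustration only.
sample_records = [
    {"product_id": "X1", "is_returned": True},
    {"product_id": "X2", "is_returned": False},
    {"product_id": "X3", "is_returned": False},
    {"product_id": "X4", "is_returned": False},
]
print(expected_current_price(100.0, 25.0))
print(return_rate(sample_records))
```

Grouping the records by category, brand, or season before calling `return_rate` gives the per-segment rates that the return pattern analysis refers to.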
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Forecast: Machinery and Equipment Retail Sales in Japan 2023 - 2027