Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in the United States decreased 0.90 percent in May of 2025 over the previous month. This dataset provides actual values, historical data, forecasts, charts, statistics, an economic calendar, and news for United States retail sales.
This dataset contains a list of sales and movement data by item and department, appended monthly.
Update Frequency: Monthly
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This synthetic dataset simulates two years of transactional data for a multinational fashion retailer, featuring:
- 📈 4+ million sales records
- 🏪 35 stores across 7 countries:
🇺🇸 United States | 🇨🇳 China | 🇩🇪 Germany | 🇬🇧 United Kingdom | 🇫🇷 France | 🇪🇸 Spain | 🇵🇹 Portugal
Currencies Covered:
Each transaction includes detailed currency information, covering multiple currencies:
💵 USD (United States) | 💶 EUR (Eurozone) | 💴 CNY (China) | 💷 GBP (United Kingdom)
🌐 Geographic Sales Comparison
Gain insights into how sales performance varies between regions and countries, and identify trends that drive success in different markets.
👥 Analyze Staffing and Performance
Evaluate store staffing ratios and analyze the impact of employee performance on store success.
🛍️ Customer Behavior and Segmentation
Understand regional customer preferences, analyze demographic factors such as age and occupation, and segment customers based on their purchasing habits.
💱 Multi-Currency Analysis
Explore how transactions in different currencies (USD, EUR, CNY, GBP) are handled, analyze currency exchange effects, and compare sales across regions using multiple currencies.
👗 Product Trends
Assess how product categories (e.g., Feminine, Masculine, Children) and specific product attributes (size, color) perform across different regions.
🎯 Pricing and Discount Analysis
Study how different pricing models and discounts affect sales and customer decisions across diverse geographies.
📊 Advanced Cross-Country & Currency Analysis
Conduct complex, multi-dimensional analytics that interconnect countries, currencies, and sales data, identifying hidden correlations between economic factors, regional demand, and financial performance.
Generated using algorithms, it simulates real-world retail dynamics while ensuring privacy.
This dataset is an ideal resource for retail analysts, data scientists, and business intelligence professionals aiming to explore multinational retail data, optimize operations, and uncover new insights into customer behavior, sales trends, and employee efficiency.
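As a rough illustration of the multi-currency comparison described above, the sketch below converts transaction totals to one reporting currency before summing them by country. The exchange rates, field names, and sample rows are made-up placeholders for illustration, not values or column names taken from the dataset.

```python
from collections import defaultdict

# Hypothetical exchange rates to USD (placeholders, not real market rates).
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10, "GBP": 1.25, "CNY": 0.14}


def sales_by_country_usd(transactions, rates=RATES_TO_USD):
    """Sum transaction totals per country after converting each one to USD."""
    totals = defaultdict(float)
    for tx in transactions:
        totals[tx["country"]] += tx["total"] * rates[tx["currency"]]
    return dict(totals)


# Hypothetical sample rows; the field names are assumptions.
sample = [
    {"country": "Germany", "currency": "EUR", "total": 100.0},
    {"country": "Germany", "currency": "EUR", "total": 50.0},
    {"country": "China", "currency": "CNY", "total": 1000.0},
]
usd_totals = sales_by_country_usd(sample)
```

Using a single fixed rate per currency keeps the sketch short. An analysis of exchange-rate effects would more likely look up a rate per transaction date.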
Total retail sales in the United States were forecast to amount to 5.23 trillion U.S. dollars in 2024, up by 13 billion U.S. dollars from the previous year. Retail establishments come in many forms, such as grocery stores, restaurants, and bookstores. There are around four million retail establishments in the United States.
Leading companies in U.S. retail
The domestic retail market in the United States is very competitive, with many companies recording substantial retail sales. Walmart, a retail chain offering low prices and a wide selection of products, is the leading retailer in the United States. Amazon, The Kroger Co., Costco, and Target are other leading U.S. retailers.
American retailers worldwide
Many of the world’s leading retailers are American companies. Walmart and Amazon are examples of American retailers doing business on a global scale. The success of U.S. retailers can also be seen in their online retail performance. Amazon is a prime example, with the company’s sales revenue growing over the previous years.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in the United States increased 3.30 percent in May of 2025 over the same month in the previous year. This dataset provides - United States Retail Sales YoY - actual values, historical data, forecast, chart, statistics, economic calendar and news.
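The month-over-month (MoM) and year-over-year (YoY) figures quoted in these entries are percent changes against the value 1 or 12 months earlier. A minimal sketch of that arithmetic follows. The monthly values are invented for illustration, and the helper name `pct_change` is an assumption, not part of any dataset.

```python
def pct_change(series, lag):
    """Percent change of each value versus the value `lag` periods earlier.

    Positions without an earlier value to compare against are None.
    """
    return [
        None if i < lag else (series[i] - series[i - lag]) / series[i - lag] * 100
        for i in range(len(series))
    ]


# Hypothetical monthly sales levels: 13 months, so the last month has a YoY value.
monthly_sales = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0,
                 106.0, 107.0, 108.0, 109.0, 110.0, 111.0, 110.0]

mom = pct_change(monthly_sales, lag=1)   # month over month
yoy = pct_change(monthly_sales, lag=12)  # year over year
```

In this invented series the last month is below the prior month but above the same month a year earlier, so MoM is negative while YoY is positive.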
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
A series of retail sales data for Great Britain in value and volume terms, seasonally and non-seasonally adjusted.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:
Context:
Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.
Inspiration:
The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.
Dataset Information:
The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:
Use Cases:
Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.
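To make the market basket idea from the context above concrete, here is a small sketch that computes support, confidence, and lift for item pairs. The baskets are invented and the function name is hypothetical. The dataset's actual columns are not listed here, so the input is assumed to be a plain list of item lists.

```python
from collections import Counter
from itertools import combinations


def pair_metrics(baskets):
    """Return support, confidence (A -> B), and lift for each item pair (A, B)."""
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))  # de-duplicate and fix the pair ordering
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets with both items
        confidence = count / item_counts[a]       # share of A-baskets that also hold B
        lift = confidence / (item_counts[b] / n)  # confidence relative to B's base rate
        metrics[(a, b)] = {"support": support, "confidence": confidence, "lift": lift}
    return metrics


# Hypothetical baskets for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
metrics = pair_metrics(baskets)
```

A lift above 1 suggests the two items appear together more often than their individual frequencies would predict, and a lift below 1 suggests the opposite.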
According to a Swedish pharmacy annual report, e-commerce channels in the different drugstore retail segments overall showed a higher growth rate than bricks-and-mortar sales during 2024. Other e-commerce sales took the lead with ***** percent growth, compared with bricks-and-mortar sales, which showed around **** percent growth in this segment. Total online sales of all segments grew by around ** percent, while growth in retail sales in physical stores stood at *** percent.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides a comprehensive collection of key performance indicators (KPIs) for retail stores, offering insights into factors influencing store performance, customer engagement, and financial outcomes. The dataset is suitable for various machine learning and data analysis tasks, including regression, classification, and clustering. It can help in understanding the relationships between operational metrics, store characteristics, and sales performance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in Japan increased 2.20 percent in May of 2025 over the same month in the previous year. This dataset provides the latest reported value for - Japan Retail Sales YoY - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
This statistic displays the change in like-for-like sales of the UK clothing and fashion retailer New Look, from financial year 2015/15 to 2018/19. In the financial year ending in March 2019, like-for-like sales fell by *** percent. This was still an improvement on the previous year for New Look, which experienced a **** percent decline in like-for-like sales.
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This fictional sales dataset was created using R code for the purpose of visualizing trends in customer demographics, product performance, and sales over time. A link to my GitHub repository containing all the code used to generate the data frame and all the preceding processes can be found here
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Sample Sales Data is a retail sales dataset of 2,823 orders and 25 columns that includes a variety of sales-related data, including order numbers, product information, quantity, unit price, sales, order date, order status, and customer and delivery information.
2) Data Utilization
(1) Sample Sales Data has the following characteristics:
• This dataset consists of numerical (sales, quantity, unit price, etc.), categorical (product, country, city, customer name, transaction size, etc.), and date (order date) variables, with missing values in some columns (STATE, ADDRESSLINE2, POSTALCODE, etc.).
(2) Sample Sales Data can be used for:
• Analysis of sales trends and performance by product: Key variables such as order date, product line, and country can be used to visualize and analyze monthly and yearly sales trends, the proportion of sales by product line, and top sales by country and region.
• Segmentation and marketing strategies: Customer groups can be segmented based on customer information, transaction size, and regional data, and the segments used to design targeted marketing and customized promotion strategies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Key information about Italy Retail Sales Growth
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retail Sales in Germany decreased 1.60 percent in May of 2025 over the previous month. This dataset provides the latest reported value for - Germany Retail Sales MoM - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
naics_code
kind_of_business
sales_month
sales
estimate_type
(NA) and (S) values, which were converted to null values.
This dataset can be applied to a variety of analytical and machine learning tasks, including:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research Domain:
The dataset is part of a project focused on retail sales forecasting. Specifically, it is designed to predict daily sales for Rossmann, a chain of over 3,000 drug stores operating across seven European countries. The project falls under the broader domain of time series analysis and machine learning applications for business optimization. The goal is to apply machine learning techniques to forecast future sales based on historical data, which includes factors like promotions, competition, holidays, and seasonal trends.
Purpose:
The primary purpose of this dataset is to help Rossmann store managers predict daily sales for up to six weeks in advance. By making accurate sales predictions, Rossmann can improve inventory management, staffing decisions, and promotional strategies. This dataset serves as a training set for machine learning models aimed at reducing forecasting errors and supporting decision-making processes across the company’s large network of stores.
How the Dataset Was Created:
The dataset was compiled from several sources, including historical sales data from Rossmann stores, promotional calendars, holiday schedules, and external factors such as competition. The data is split into multiple features, such as the store's location, promotion details, whether the store was open or closed, and weather information. The dataset is publicly available on platforms like Kaggle and was initially created for the Kaggle Rossmann Store Sales competition. The data is made accessible via an API for further analysis and modeling, and it is structured to help machine learning models predict future sales based on various input variables.
Dataset Structure:
The dataset consists of three main files, each with its specific role:
Train:
This file contains the historical sales data, which is used to train machine learning models. It includes daily sales information for each store, as well as various features that could influence the sales (e.g., promotions, holidays, store type, etc.).
https://handle.test.datacite.org/10.82556/yb6j-jw41
PID: b1c59499-9c6e-42c2-af8f-840181e809db
Test2:
The test dataset mirrors the structure of train.csv but does not include the actual sales values (i.e., the target variable). This file is used for making predictions using the trained machine learning models. It is used to evaluate the accuracy of predictions when the true sales data is unknown.
https://handle.test.datacite.org/10.82556/jerg-4b84
PID: 7cbb845c-21dd-4b60-b990-afa8754a0dd9
Store:
This file provides metadata about each store, including information such as the store’s location, type, and assortment level. This data is essential for understanding the context in which the sales data is gathered.
https://handle.test.datacite.org/10.82556/nqeg-gy34
PID: 9627ec46-4ee6-4969-b14a-bda555fe34db
Id: A unique identifier for each (Store, Date) combination within the test set.
Store: A unique identifier for each store.
Sales: The daily turnover (target variable) for each store on a specific day (this is what you are predicting).
Customers: The number of customers visiting the store on a given day.
Open: An indicator of whether the store was open (1 = open, 0 = closed).
StateHoliday: Indicates if the day is a state holiday, with values like:
'a' = public holiday,
'b' = Easter holiday,
'c' = Christmas,
'0' = no holiday.
SchoolHoliday: Indicates whether the store is affected by school closures (1 = yes, 0 = no).
StoreType: Differentiates between four types of stores: 'a', 'b', 'c', 'd'.
Assortment: Describes the level of product assortment in the store:
'a' = basic,
'b' = extra,
'c' = extended.
CompetitionDistance: Distance (in meters) to the nearest competitor store.
CompetitionOpenSince[Month/Year]: The month and year when the nearest competitor store opened.
Promo: Indicates whether the store is running a promotion on a particular day (1 = yes, 0 = no).
Promo2: Indicates whether the store is participating in Promo2, a continuing promotion for some stores (1 = participating, 0 = not participating).
Promo2Since[Year/Week]: The year and calendar week when the store started participating in Promo2.
PromoInterval: Describes the months when Promo2 is active, e.g., "Feb,May,Aug,Nov" means the promotion starts in February, May, August, and November.
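The coded fields above usually need decoding before modeling. The sketch below maps the StateHoliday codes to labels and checks whether a Promo2 round starts in a given month according to PromoInterval. The helper name and the three-letter matching rule are assumptions for illustration. Month names are compared on their first three letters so that a longer spelling such as "Sept" would still match.

```python
# Labels taken from the column descriptions above.
STATE_HOLIDAY_LABELS = {
    "a": "public holiday",
    "b": "Easter holiday",
    "c": "Christmas",
    "0": "no holiday",
}

MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def promo2_starts_in_month(promo2, promo_interval, month):
    """Return True if a Promo2 store starts a new round in `month` (1-12).

    Hypothetical helper: `promo2` is the 0/1 Promo2 flag and `promo_interval`
    is a string such as "Feb,May,Aug,Nov" (or an empty value when absent).
    """
    if promo2 != 1 or not promo_interval:
        return False
    # Compare on the first three letters of each listed month name.
    start_months = {name.strip()[:3] for name in promo_interval.split(",")}
    return MONTH_ABBR[month - 1] in start_months


label = STATE_HOLIDAY_LABELS["b"]
starts_in_may = promo2_starts_in_month(1, "Feb,May,Aug,Nov", 5)
```

The resulting boolean can be stored as an extra feature column next to Promo and Promo2.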
To work with this dataset, you will need to have specific software installed, including:
DBRepo Authorization: This is required to access the datasets via the DBRepo API. You may need to authenticate with an API key or login credentials to retrieve the datasets.
Python Libraries: Key libraries for working with the dataset include:
pandas for data manipulation,
numpy for numerical operations,
matplotlib and seaborn for data visualization,
scikit-learn for machine learning algorithms.
Several additional resources are available for working with the dataset:
Presentation:
A presentation summarizing the exploratory data analysis (EDA), feature engineering process, and key insights from the analysis is provided. This presentation also includes visualizations that help in understanding the dataset’s trends and relationships.
Jupyter Notebook:
A Jupyter notebook, titled Retail_Sales_Prediction_Capstone_Project.ipynb, is provided, which details the entire machine learning pipeline, from data loading and cleaning to model training and evaluation.
Model Evaluation Results:
The project includes a detailed evaluation of various machine learning models, including their performance metrics like training and testing scores, Mean Absolute Percentage Error (MAPE), and Root Mean Squared Error (RMSE). This allows for a comparison of model effectiveness in forecasting sales.
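For reference, the two error metrics named above can be computed as in the sketch below. The sample values are invented. Skipping rows whose actual sales are zero in MAPE (for example, days when a store is closed) is an assumption made here to avoid division by zero, not necessarily how the project handled them.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error over paired actual and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent, ignoring rows where actual is 0."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return sum(abs((a - p) / a) for a, p in pairs) / len(pairs) * 100


# Hypothetical daily sales and predictions; the third day has zero sales.
actual = [100.0, 200.0, 0.0, 400.0]
predicted = [110.0, 180.0, 5.0, 400.0]

rmse_value = rmse(actual, predicted)
mape_value = mape(actual, predicted)
```

RMSE is in the same units as sales and weights large misses heavily, while MAPE is scale-free, which makes it easier to compare across stores of different sizes.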
Trained Models (.pkl files):
The models trained during the project are saved as .pkl files. These files contain the trained machine learning models (e.g., Random Forest, Linear Regression, etc.) that can be loaded and used to make predictions without retraining the models from scratch.
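Saving and reloading a .pkl file typically follows the round trip sketched below with Python's standard pickle module. A plain dictionary of made-up coefficients stands in for a trained model here, and the file name and `predict` helper are hypothetical. The project's actual model files are not reproduced.

```python
import os
import pickle
import tempfile

# Stand-in for a trained model: coefficients of a hypothetical linear fit.
model = {"intercept": 50.0, "promo_coef": 20.0}


def predict(fitted, promo_flags):
    """Predict sales from 0/1 promo flags using the stand-in coefficients."""
    return [fitted["intercept"] + fitted["promo_coef"] * flag for flag in promo_flags]


# Save the "model" to a .pkl file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Load it back and predict without refitting.
with open(path, "rb") as f:
    restored = pickle.load(f)

predictions = predict(restored, [0, 1])
```

Unpickling can execute arbitrary code, so .pkl files should only be loaded from sources you trust. Loading a pickled scikit-learn model also generally requires a compatible library version to be installed.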
sample_submission.csv:
This file is a sample submission file that demonstrates the format of predictions expected when using the trained model. The sample_submission.csv contains predictions made on the test dataset using the trained Random Forest model. It provides an example of how the output should be structured for submission.
These resources provide a comprehensive guide to implementing and analyzing the sales forecasting model, helping you understand the data, methods, and results in greater detail.
The U.S. Census Bureau's economic indicator surveys provide monthly and quarterly data that are timely, reliable, and offer comprehensive measures of the U.S. economy. These surveys produce a variety of statistics covering construction, housing, international trade, retail trade, wholesale trade, services, and manufacturing. The survey data provide measures of economic activity that allow analysis of economic performance and inform business investment and policy decisions. Other data included, which are not considered principal economic indicators, are the Quarterly Summary of State & Local Taxes, the Quarterly Survey of Public Pensions, and the Manufactured Homes Survey. For information on the reliability and use of the data, including important notes on estimation and sampling variance, seasonal adjustment, measures of sampling variability, and other information pertinent to the economic indicators, visit the individual programs' webpages - http://www.census.gov/cgi-bin/briefroom/BriefRm.
The Retail Sales Index (RSI) measures the short-term performance of retail industries based on the sales records of retail establishments.
The RSI is presented at both current prices and constant prices. The indices at current prices measure the changes of sales values which can result from changes in both price and quantity. By removing the price effect, the indices at constant prices measure the changes in the volume of economic activity.
The base year is 2014. (2014 = 100)
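The difference between the two index types can be shown with a small worked sketch. It assumes the price effect is removed by dividing sales values by a price index, which is one common approach. The source does not state the exact deflation method, and all numbers below are invented.

```python
def to_index(values, base_value):
    """Express each value relative to the base-year value (base year = 100)."""
    return [v / base_value * 100 for v in values]


def deflate(current_values, price_index):
    """Remove the price effect by dividing each value by its price index / 100."""
    return [v / (p / 100) for v, p in zip(current_values, price_index)]


# Hypothetical figures: base year first, then one later year.
sales_value = [200.0, 220.0]   # sales at current prices
price_index = [100.0, 110.0]   # prices rose 10 percent since the base year

current_price_index = to_index(sales_value, base_value=sales_value[0])
constant_price_index = to_index(deflate(sales_value, price_index),
                                base_value=sales_value[0])
```

Here the index at current prices rises to about 110, while the index at constant prices stays at about 100: the whole increase in sales value came from higher prices, not from higher volume.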
Performance data and success metrics for Code X Coast retail sales program in Toledo, Ohio