Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset offers a valuable resource for businesses operating in the retail furniture sector. By analyzing historical sales data from the superstore dataset, users can gain insights into future sales patterns and trends. This information can be utilized to optimize inventory management strategies, anticipate customer demand, and enhance overall operational efficiency. Whether for retail managers, analysts, or data scientists, this dataset provides a foundation for informed decision-making, helping businesses maintain stability and drive sustained growth in the dynamic retail environment.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Grocery Sales Prediction
This dataset provides a rich resource for researchers and practitioners interested in retail sales prediction and analysis. It contains information about various grocery products, the outlets where they are sold, and their historical sales data.
Product Characteristics:
- Item_Identifier: Unique identifier for each product.
- Item_Weight: Weight of the product item.
- Item_Fat_Content: Categorical variable indicating the fat content of the product (e.g., low fat, regular).
- Item_Visibility: Numerical attribute reflecting the visibility of the product in the store (likely a promotional measure).
- Item_Type: Category of the product (e.g., Snacks, Beverages, Bakery).
- Item_MRP: Maximum Retail Price of the product.

Outlet Information:
- Outlet_Identifier: Unique identifier for each outlet (store).
- Outlet_Establishment_Year: Year the outlet was established.
- Outlet_Size: Categorical variable indicating the size of the outlet (e.g., Small, Medium, Large). Note: this field may have missing values.
- Outlet_Location_Type: Categorical variable indicating the type of location the outlet is in (e.g., Tier 1 City, Tier 2 City, Upstate).
- Outlet_Type: Categorical variable indicating the type of outlet (e.g., Supermarket, Grocery Store, Convenience Store).

Sales Data:
- Item_Outlet_Sales: The historical sales data for each product-outlet combination.
- Profit: The profit margin earned on each product sold.

Potential Uses

This dataset can be used for various retail sales analysis and prediction tasks, including:
- Demand forecasting: Build models to predict future sales of individual products or product categories at specific outlets.
- Promotion optimization: Analyze the effectiveness of different promotional strategies (reflected by Item_Visibility) on sales.
- Assortment planning: Optimize product selection and placement within stores based on sales history and outlet characteristics.
- Outlet performance analysis: Compare the performance of different outlets based on sales figures and profit margins.
- Customer segmentation: Identify customer segments with distinct purchasing behavior based on product types and outlet locations.

By analyzing these data points, retailers can gain insights to improve their sales strategies, optimize inventory management, and maximize profits.
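
As a rough illustration of the outlet performance analysis mentioned above, the sketch below totals and averages Item_Outlet_Sales per outlet and fills missing Outlet_Size values with an explicit placeholder. The column names follow the field list above; the sample rows, the outlet IDs, the function names, and the "Unknown" placeholder are all assumptions made for illustration, not part of the dataset.

```python
from collections import defaultdict

# Hypothetical sample rows; all values are invented for illustration only.
rows = [
    {"Outlet_Identifier": "OUT_A", "Outlet_Size": "Medium", "Item_Outlet_Sales": 1200.0},
    {"Outlet_Identifier": "OUT_A", "Outlet_Size": "Medium", "Item_Outlet_Sales": 800.0},
    {"Outlet_Identifier": "OUT_B", "Outlet_Size": None, "Item_Outlet_Sales": 500.0},
    {"Outlet_Identifier": "OUT_B", "Outlet_Size": None, "Item_Outlet_Sales": 700.0},
]


def sales_by_outlet(records):
    """Return total and mean Item_Outlet_Sales for each outlet."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for r in records:
        totals[r["Outlet_Identifier"]] += r["Item_Outlet_Sales"]
        counts[r["Outlet_Identifier"]] += 1
    return {k: {"total": totals[k], "mean": totals[k] / counts[k]} for k in totals}


def fill_missing_size(records, placeholder="Unknown"):
    """Replace a missing Outlet_Size with an explicit placeholder category."""
    return [dict(r, Outlet_Size=r["Outlet_Size"] or placeholder) for r in records]


summary = sales_by_outlet(rows)
cleaned = fill_missing_size(rows)
print(summary)
```

Keeping missing sizes as their own category, rather than dropping those rows, is one simple option; whether it suits a given model is a judgment call.
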
https://creativecommons.org/publicdomain/zero/1.0/
The "Store Sales - Time Series Forecasting" dataset is designed to help predict future sales for various stores based on historical data. It includes daily sales figures for multiple locations, along with features such as store types, promotions, holidays, and regional factors. The objective is to create models that can accurately forecast future sales trends while considering the impact of external influences like seasonality and special events. This dataset is an excellent resource for practicing time series forecasting techniques in retail analytics and improving business decision-making.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains simulated daily sales records for 10 retail stores across a two-year period (2022–2023). It is designed specifically for practicing and showcasing time series forecasting, seasonal analysis, and retail trend modeling.
Each record includes:
https://creativecommons.org/publicdomain/zero/1.0/
Retail Sales Promotions and Demand Forecasting:
This dataset contains synthetic retail sales data designed to analyze how pricing, promotions, discounts, inventory levels, and time-based factors influence product demand across retail stores. Each record represents the daily sales performance of a product in a specific store.
The dataset is structured for moderate-level machine learning and forecasting tasks, particularly demand prediction, and is suitable for exploratory data analysis (EDA), regression modeling, and retail business analytics.
Dataset Details:
Use Cases:
This dataset is synthetically generated and does not contain any real customer, store, or sales data.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This dataset contains historical sales data from a large grocery store located in Islamabad, Pakistan. With an average daily footfall of around 1,500 customers, the store serves a broad consumer base, making it ideal for analyzing and predicting sales trends.
In this project, we focus specifically on predicting the sale of rice by leveraging historical data from January 22, 2024, to October 14, 2024. Using this dataset, we trained a Random Forest Regressor model to forecast rice sales based on past patterns.
The dataset includes the following columns:
The goal of this project is to predict future sales of rice at this store using historical data. By accurately forecasting sales, the store can optimize inventory and improve stock management for this essential product.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains synthetic retail sales records designed for machine learning, business analytics, and forecasting. It includes product information, store attributes, pricing, and outlet-level sales values.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This project builds an end-to-end Retail Sales Forecasting system using the “Retail Sales Forecasting Dataset” on Kaggle. It analyzes historical sales data to identify trends, seasonality, product behavior, and store-level performance. Using machine learning and time-series forecasting techniques, the model predicts future sales, helping retailers optimize inventory, reduce losses, and improve profit margins.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains synthetic daily retail sales data spanning January 2019 to December 2023. It simulates realistic demand patterns across multiple stores and items, incorporating trend, seasonality, pricing, and promotional effects.
Sales are generated using:
- Store- and item-level base demand
- Long-term upward trend
- Weekly and yearly seasonal patterns
- Promotional uplift
- Random noise to mimic real-world variability
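
The components listed above could be combined along the following lines. This is only a sketch under stated assumptions: the description does not say how the components are actually combined, so the multiplicative form, the sine-shaped seasonality, and every parameter name and default value here are hypothetical.

```python
import math
import random
from datetime import date, timedelta


def simulate_sales(start, days, base_demand, daily_trend=0.01,
                   weekly_amp=0.2, yearly_amp=0.3, promo_uplift=0.5,
                   promo_prob=0.1, noise_sd=0.05, seed=0):
    """Simulate daily sales for one store-item pair (illustrative only)."""
    rng = random.Random(seed)  # seeded so the series is reproducible
    out = []
    for t in range(days):
        day = start + timedelta(days=t)
        # Base demand plus a slow linear upward trend.
        level = base_demand + daily_trend * t
        # Weekly and yearly seasonal multipliers (assumed sinusoidal).
        weekly = 1.0 + weekly_amp * math.sin(2 * math.pi * day.weekday() / 7)
        yearly = 1.0 + yearly_amp * math.sin(
            2 * math.pi * day.timetuple().tm_yday / 365.25)
        # Random promotion flag with a fixed multiplicative uplift.
        promo = rng.random() < promo_prob
        uplift = 1.0 + promo_uplift if promo else 1.0
        # Multiplicative Gaussian noise; clamp so sales never go negative.
        noise = rng.gauss(1.0, noise_sd)
        sales = max(0.0, level * weekly * yearly * uplift * noise)
        out.append({"date": day, "promo": promo, "sales": round(sales, 2)})
    return out


# January 2019 to December 2023 covers 1826 days (2020 is a leap year).
series = simulate_sales(date(2019, 1, 1), days=365 * 5 + 1, base_demand=20.0)
print(series[0]["date"], series[-1]["date"], len(series))
```
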
This dataset is suitable for time-series forecasting, demand prediction, promotion impact analysis, and machine learning experiments.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by InSaNe03
Released under MIT
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by sagexx
Released under MIT
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can be quite varied.
In their first Kaggle competition, Rossmann is challenging you to predict 6 weeks of daily sales for 1,115 stores located across Germany. Reliable sales forecasts enable store managers to create effective staff schedules that increase productivity and motivation. By helping Rossmann create a robust prediction model, you will help store managers stay focused on what’s most important to them: their customers and their teams!
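
A six-week daily forecast like the one described above is often compared against a simple baseline first. The sketch below is a seasonal-naive forecast that repeats the last observed week over a 42-day horizon. It is an illustrative baseline only, not Rossmann's or the competition's method; the function name and the sample history are assumptions.

```python
def seasonal_naive_forecast(history, horizon=42, season=7):
    """Forecast by repeating the last full season of observed values."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


# Hypothetical daily sales for one store over two weeks (invented numbers;
# the zeros stand in for a day on which the store is closed).
history = [100, 120, 110, 130, 150, 170, 0,
           105, 125, 115, 135, 155, 175, 0]

forecast = seasonal_naive_forecast(history)  # 6 weeks x 7 days = 42 values
print(forecast[:7])
```

Any model that uses promotions, competition, or holidays should beat this baseline to justify its added complexity.
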
https://creativecommons.org/publicdomain/zero/1.0/
🗂 Dataset Description
Title: Custom Sales Forecasting Dataset
This dataset contains a synthetic yet realistic representation of product sales across multiple stores and time periods. It is designed for use in time series forecasting, retail analytics, or machine learning experiments focusing on demand prediction and inventory planning. Each row corresponds to daily sales data for a given product at a particular store, enriched with contextual information like promotions and holidays.
This dataset is ideal for:
- Building and testing time series models (ARIMA, Prophet, LSTM, etc.)
- Forecasting product demand
- Evaluating store-level sales trends
- Training machine learning models with tabular time series data
| Column Name | Description |
|---|---|
| order_id | Unique identifier for the order placed by a customer. |
| customer_id | Unique identifier for the customer making the purchase. |
| order_date | Date on which the order was placed (YYYY-MM-DD). |
| product_category | Category of the product purchased (e.g., Sports, Home, Beauty). |
| product_price | Original price of a single unit of the product (before discount). |
| quantity | Number of units of the product ordered. |
| payment_method | Method used for payment (e.g., PayPal, Cash on Delivery). |
| delivery_status | Current delivery status of the order (e.g., Delivered, Pending). |
| city | City to which the order was delivered. |
| state | U.S. state where the customer is located. |
| zipcode | Postal code of the delivery location. |
| product_id | Unique identifier for the purchased product. |
| discount_applied | Fractional discount applied to the order (e.g., 0.20 for 20% off). |
| order_value | Total value of the order after discount (product_price * quantity * (1 - discount_applied)). |
| review_rating | Customer’s review rating of the order on a 1–5 scale. |
| return_requested | Boolean value indicating if the customer requested a return (True/False). |
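
The order_value formula in the table can be checked with a few lines of code. This is a minimal sketch: the function name, the range check on the discount, and the rounding to two decimals are assumptions, and the sample figures are invented.

```python
def order_value(product_price, quantity, discount_applied):
    """Order total after discount: price * quantity * (1 - discount)."""
    if not 0.0 <= discount_applied <= 1.0:
        raise ValueError("discount_applied must be a fraction between 0 and 1")
    # Rounding to cents is an assumption; the table does not specify it.
    return round(product_price * quantity * (1 - discount_applied), 2)


# Three units at 50.00 each with a 20% discount.
print(order_value(50.0, 3, 0.20))
```

Recomputing this column and comparing it with the stored order_value is a quick consistency check before modeling.
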
This dataset was created by Anushka Kalra
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset contains historical sales data from multiple retail stores, designed for time series forecasting and demand prediction. It captures daily sales across stores and product families, enriched with promotions, holidays, oil prices, and transactions, making it ideal for real-world forecasting problems.
📅 Daily historical sales data
🔮 Future dates for which sales must be predicted
📍 Store metadata
📆 National & local holidays/events
💰 Daily oil prices
🧮 Daily transaction counts per store
📝 Submission format reference
✅ Realistic
✅ Industry-grade
✅ Perfect for portfolios
✅ Ideal for interviews & competitions
This is a merged dataset created from the data provided in the competition "Store Sales - Time Series Forecasting". The other files provided there besides train and test (for example holidays_events, oil, and stores) could not be used in the final prediction. As I understand it, EDA on the merged dataset should give a clearer picture of the other factors that might affect the final prediction of grocery sales. I therefore created this merged dataset and posted it here for further analysis.
##### Data Description

Data Field Information (this is a copy of the description as provided in the actual dataset)

Train.csv
- id: store id
- date: date of the sale
- store_nbr: identifies the store at which the products are sold.
- **family**: identifies the type of product sold.
- sales: gives the total sales for a product family at a particular store at a given date. Fractional values are possible since products can be sold in fractional units (1.5 kg of cheese, for instance, as opposed to 1 bag of chips).
- onpromotion: gives the total number of items in a product family that were being promoted at a store on a given date.
- Store metadata, including **city, state, type, and cluster**. cluster is a grouping of similar stores.
- Holidays and Events, with metadata. NOTE: Pay special attention to the transferred column. A holiday that is transferred officially falls on that calendar day but was moved to another date by the government. A transferred day is more like a normal day than a holiday. To find the day on which it was celebrated, look for the corresponding row where the type is Transfer. For example, the holiday Independencia de Guayaquil was transferred from 2012-10-09 to 2012-10-12, which means it was celebrated on 2012-10-12. Days of type Bridge are extra days added to a holiday (e.g., to extend the break across a long weekend). These are frequently made up by the type Work Day, which is a day not normally scheduled for work (e.g., Saturday) that is meant to pay back the Bridge. Additional holidays are days added to a regular calendar holiday, as typically happens around Christmas (making Christmas Eve a holiday).
- dcoilwtico: Daily oil price. Includes values during both the train and test data timeframes. (Ecuador is an oil-dependent country and its economic health is highly vulnerable to shocks in oil prices.)

**Note:** *There is a transaction column in the training dataset which displays the sales transactions on that particular date.*

Test.csv
- The test data, with the same features as the training data. You will predict the target sales for the dates in this file.
- The dates in the test data are for the 15 days after the last date in the training data.

**Note:** *There is no transaction column in the test dataset, unlike the training dataset. Therefore, when building the model, you might exclude this column and use it only for EDA.*

submission.csv
- A sample submission file in the correct format.
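
The holiday rules described above (transferred days, Transfer rows, and Work Day make-up days) can be turned into a simple day-off flag. The sketch below is one possible reading of those rules, not the dataset author's code. The `type` and `transferred` fields come from the description; the dictionary layout, the `date` and `description` keys, the helper name, and the last two sample rows are assumptions made for illustration.

```python
from datetime import date

# Hypothetical rows. The first two mirror the example in the description;
# the last two are invented to show the Additional and Work Day types.
holidays = [
    {"date": date(2012, 10, 9), "type": "Holiday",
     "description": "Independencia de Guayaquil", "transferred": True},
    {"date": date(2012, 10, 12), "type": "Transfer",
     "description": "Independencia de Guayaquil (moved)", "transferred": False},
    {"date": date(2012, 12, 24), "type": "Additional",
     "description": "Example additional day", "transferred": False},
    {"date": date(2013, 1, 5), "type": "Work Day",
     "description": "Example make-up work day", "transferred": False},
]


def is_day_off(row):
    """Return True if the row marks a day that was actually celebrated."""
    if row["transferred"]:
        # The official date was moved, so it behaves like a normal day.
        return False
    # Work Day rows are make-up working days, not days off.
    return row["type"] != "Work Day"


days_off = sorted(r["date"] for r in holidays if is_day_off(r))
print(days_off)
```

A flag like this can be joined to the sales rows by date; the locale of each holiday would also need to be matched to the store's city or state, which this sketch leaves out.
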
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Ahmed Gulab Khan
Released under MIT
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides detailed insights into retail sales, featuring a range of factors that influence sales performance. It includes records on sales revenue, units sold, discount percentages, marketing spend, and the impact of seasonal trends and holidays.
This dataset is synthetic and generated for analysis purposes. It reflects typical retail sales patterns and is designed to support a wide range of data science and business analytics projects.
This dataset was created by Shrijeet16