Facebook
TwitterThe link for the Excel project to download can be found on GitHub here.
It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer. The image below is included for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media" alt="">
The link for the Tableau adjusted dashboard can be found here.
A screenshot of the interactive Excel dashboard is also included below for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media" alt="">
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1.Introduction
Sales data collection is a crucial aspect of any manufacturing industry as it provides valuable insights about the performance of products, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments like the dairy supply chain.
One of the most important benefits of the sales data collection process is that it allows manufacturers to identify their most successful products and target their efforts towards those areas. For example, if a manufacturer could notice that a particular product is selling well in a certain region, this information could be utilised to develop new products, optimise the supply chain or improve existing ones to meet the changing needs of customers.
This dataset includes information about 7 of MEVGAL’s products [1]. According to the above information the data published will help researchers to understand the dynamics of the dairy market and its consumption patterns, which is creating the fertile ground for synergies between academia and industry and eventually help the industry in making informed decisions regarding product development, pricing and market strategies in the IoT playground. The use of this dataset could also aim to understand the impact of various external factors on the dairy market such as the economic, environmental, and technological factors. It could help in understanding the current state of the dairy industry and identifying potential opportunities for growth and development.
Please cite the following papers when using this dataset:
I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted
The dataset includes data regarding the daily sales of a series of dairy product codes offered by MEVGAL. In particular, the dataset includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products included in this dataset represent the daily sales and logistics of a variety of yogurt-based stock. Each of the different files include the logistics for that product on a daily basis for three years, from 2020 to 2022.
3.1 Data Collection
The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.
The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.
Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case this is conducted through direct communication and reports generated each day by representatives & selling points.
It is also important for MEVGAL to ensure that the data collection process conducted is in an ethical and compliant manner, adhering to data privacy laws and regulation. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.
The published dataset is consisted of 13 features providing information about the date and the number of products that have been sold. Finally, the dataset was anonymised in consideration to the privacy requirement of the data owner (MEVGAL).
File
Period
Number of Samples (days)
product 1 2020.xlsx
01/01/2020–31/12/2020
363
product 1 2021.xlsx
01/01/2021–31/12/2021
364
product 1 2022.xlsx
01/01/2022–31/12/2022
365
product 2 2020.xlsx
01/01/2020–31/12/2020
363
product 2 2021.xlsx
01/01/2021–31/12/2021
364
product 2 2022.xlsx
01/01/2022–31/12/2022
365
product 3 2020.xlsx
01/01/2020–31/12/2020
363
product 3 2021.xlsx
01/01/2021–31/12/2021
364
product 3 2022.xlsx
01/01/2022–31/12/2022
365
product 4 2020.xlsx
01/01/2020–31/12/2020
363
product 4 2021.xlsx
01/01/2021–31/12/2021
364
product 4 2022.xlsx
01/01/2022–31/12/2022
364
product 5 2020.xlsx
01/01/2020–31/12/2020
363
product 5 2021.xlsx
01/01/2021–31/12/2021
364
product 5 2022.xlsx
01/01/2022–31/12/2022
365
product 6 2020.xlsx
01/01/2020–31/12/2020
362
product 6 2021.xlsx
01/01/2021–31/12/2021
364
product 6 2022.xlsx
01/01/2022–31/12/2022
365
product 7 2020.xlsx
01/01/2020–31/12/2020
362
product 7 2021.xlsx
01/01/2021–31/12/2021
364
product 7 2022.xlsx
01/01/2022–31/12/2022
365
3.2 Dataset Overview
The following table enumerates and explains the features included across all of the included files.
Feature
Description
Unit
Day
day of the month
-
Month
Month
-
Year
Year
-
daily_unit_sales
Daily sales - the amount of products, measured in units, that during that specific day were sold
units
previous_year_daily_unit_sales
Previous Year’s sales - the amount of products, measured in units, that during that specific day were sold the previous year
units
percentage_difference_daily_unit_sales
The percentage difference between the two above values
%
daily_unit_sales_kg
The amount of products, measured in kilograms, that during that specific day were sold
kg
previous_year_daily_unit_sales_kg
Previous Year’s sales - the amount of products, measured in kilograms, that during that specific day were sold, the previous year
kg
percentage_difference_daily_unit_sales_kg
The percentage difference between the two above values
kg
daily_unit_returns_kg
The percentage of the products that were shipped to selling points and were returned
%
previous_year_daily_unit_returns_kg
The percentage of the products that were shipped to selling points and were returned the previous year
%
points_of_distribution
The amount of sales representatives through which the product was sold to the market for this year
previous_year_points_of_distribution
The amount of sales representatives through which the product was sold to the market for the same day for the previous year
Table 1 – Dataset Feature Description
4.1 Dataset Structure
The provided dataset has the following structure:
Where:
Name
Type
Property
Readme.docx
Report
A File that contains the documentation of the Dataset.
product X
Folder
A folder containing the data of a product X.
product X YYYY.xlsx
Data file
An excel file containing the sales data of product X for year YYYY.
Table 2 - Dataset File Description
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957406 (TERMINET).
References
[1] MEVGAL is a Greek dairy production company
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Sample data for exercises in Further Adventures in Data Cleaning.
Facebook
TwitterThis is a dataset downloaded off excelbianalytics.com created off of random VBA logic. I recently performed an extensive exploratory data analysis on it and I included new columns to it, namely: Unit margin, Order year, Order month, Order weekday and Order_Ship_Days which I think can help with analysis on the data. I shared it because I thought it was a great dataset to practice analytical processes on for newbies like myself.
Facebook
Twitterhttps://cdla.io/sharing-1-0/https://cdla.io/sharing-1-0/
The Superstore Sales Data dataset, available in an Excel format as "Superstore.xlsx," is a comprehensive collection of sales and customer-related information from a retail superstore. This dataset comprises* three distinct tables*, each providing specific insights into the store's operations and customer interactions.
Facebook
Twitterhttps://cdla.io/sharing-1-0/https://cdla.io/sharing-1-0/
This dataset was created by Andrés Armando Sánchez Martín
Released under Community Data License Agreement - Sharing - Version 1.0
Facebook
TwitterOpen Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Standard error reference tables for the Retail Sales Index in Great Britain.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a spreadsheet of 1 of 10 companies in the shoe industry. Highlighting COGS, Total Revenue, Market share and Industry share.
Facebook
TwitterThe dataset includes customer id,Martial Status,Gender,Income,Children,Education,Occupation,Home Owner,Cars,Commute Distance,Region,Age,Purchased Bike. Blog
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This is the Car sales data set which include information about different cars . This data set is being taken from the Analytixlabs for the purpose of prediction In this we have to see two things
First we have see which feature has more impact on car sales and carry out result of this
Secondly we have to train the classifier and to predict car sales and check the accuracy of the prediction.
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
About Datasets:
Domain : Sales Project: Coca Cola Sales Analysis Datasets: Power BI Dataset vF Dataset Type: Excel Data Dataset Size: 52k+ records
KPI's: 1. Analyze Profit Margins per Brand 2. Sales by Region 3. Price per unit 4. Operating Profit 5. Additional Analysis
Process: 1. Understanding the problem 2. Data Collection 3. Exploring and analyzing the data 4. Interpreting the results
This data contains Power Query, Q&A visual, Key influencers visual, map chart, matrix, dynamic timeline, dashboard, formatting, text box.
Facebook
Twitterhttps://www.koncile.ai/en/termsandconditionshttps://www.koncile.ai/en/termsandconditions
OCR API to extract sales agreement data for road transport. Capture key legal or logistical fields from PDF/image and export to Excel.
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
https://raw.githubusercontent.com/Masterx-AI/Project_Retail_Analysis_with_Walmart/main/Wallmart1.jpg" alt="">
One of the leading retail stores in the US, Walmart, would like to predict the sales and demand accurately. There are certain events and holidays which impact sales on each day. There are sales data available for 45 stores of Walmart. The business is facing a challenge due to unforeseen demands and runs out of stock some times, due to the inappropriate machine learning algorithm. An ideal ML algorithm will predict demand accurately and ingest factors like economic conditions including CPI, Unemployment Index, etc.
Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of all, which are the Super Bowl, Labour Day, Thanksgiving, and Christmas. The weeks including these holidays are weighted five times higher in the evaluation than non-holiday weeks. Part of the challenge presented by this competition is modeling the effects of markdowns on these holiday weeks in the absence of complete/ideal historical data. Historical sales data for 45 Walmart stores located in different regions are available.
The dataset is taken from Kaggle.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary : Fuel demand is shown to be influenced by fuel prices, people's income and motorization rates. We explore the effects of electric vehicle's rates in gasoline demand using this panel dataset.
Files : dataset.csv - Panel dimensions are the Brazilian state ( i ) and year ( t ). The other columns are: gasoline sales per capita (ln_Sg_pc), prices of gasoline (ln_Pg) and ethanol (ln_Pe) and their lags, motorization rates of combustion vehicles (ln_Mi_c) and electric vehicles (ln_Mi_e) and GDP per capita (ln_gdp_pc). All variables are all under the natural log function, since we use this to calculate demand elasticities in a regression model.
adjacency.csv - The adjacency matrix used in interaction with electric vehicles' motorization rates to calculate spatial effects. At first, it follows a binary adjacency formula: for each pair of states i and j, the cell (i, j) is 0 if the states are not adjacent and 1 if they are. Then, each row is normalized to have sum equal to one.
regression.do - Series of Stata commands used to estimate the regression models of our study. dataset.csv must be imported to work, see comment section.
dataset_predictions.xlsx - Based on the estimations from Stata, we use this excel file to make average predictions by year and by state. Also, by including years beyond the last panel sample, we also forecast the model into the future and evaluate the effects of different policies that influence gasoline prices (taxation) and EV motorization rates (electrification). This file is primarily used to create images, but can be used to further understand how the forecasting scenarios are set up.
Sources: Fuel prices and sales: ANP (https://www.gov.br/anp/en/access-information/what-is-anp/what-is-anp) State population, GDP and vehicle fleet: IBGE (https://www.ibge.gov.br/en/home-eng.html?lang=en-GB) State EV fleet: Anfavea (https://anfavea.com.br/en/site/anuarios/)
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
About Datasets:
Domain : Sales Project: McDonalds Sales Analysis Project Dataset: START-Dashboard Dataset Type: Excel Data Dataset Size: 100 records
KPI's: 1. Customer Satisfaction 2. Sales by Country 2022 3. 2021-2022 Sales Trend 4. Sales 5. Profit 6. Customers
Process: 1. Understanding the problem 2. Data Collection 3. Exploring and analyzing the data 4. Interpreting the results
This data contains dashboard, hyperlink, shapes, icons, map, radar chart, line chart, doughnut chart, KPIs, formatting.
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
his dataset contains transactional grocery data from BlinkIT, a grocery delivery platform. It includes product names, categories, prices, units sold, and potentially order or date-based features (depending on the columns in the file
Facebook
TwitterThis dataset was created by Shiva Vashishtha
Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains video game sales data prepared for an Excel data analysis and dashboard project.
It includes detailed information on:
Game titles
Platforms
Genres
Publishers
Regional and global sales
The dataset was cleaned, structured, and analyzed in Microsoft Excel to explore patterns in the global video game market. It can be used to:
Practice data cleaning and pivot tables
Build interactive dashboards
Perform sales comparisons across regions and genres
Develop business insights from entertainment data
🧩 File Information
Format: .xlsx (Excel Workbook)
Columns: Name, Platform, Year, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, Global_Sales
💡 Use Cases
Excel dashboard and chart creation
Data visualization and storytelling
Business and market analysis practice
Portfolio or learning projects
👤 Prepared by
Adewale Lateef W — for data analysis and Excel dashboard learning purposes.
Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Dirty Retail Store Sales dataset contains 12,575 rows of synthetic data representing sales transactions from a retail store. The dataset includes eight product categories with 25 items per category, each having static prices. It is designed to simulate real-world sales data, including intentional "dirtiness" such as missing or inconsistent values. This dataset is suitable for practicing data cleaning, exploratory data analysis (EDA), and feature engineering.
retail_store_sales.csv| Column Name | Description | Example Values |
|---|---|---|
Transaction ID | A unique identifier for each transaction. Always present and unique. | TXN_1234567 |
Customer ID | A unique identifier for each customer. 25 unique customers. | CUST_01 |
Category | The category of the purchased item. | Food, Furniture |
Item | The name of the purchased item. May contain missing values or None. | Item_1_FOOD, None |
Price Per Unit | The static price of a single unit of the item. May contain missing or None values. | 4.00, None |
Quantity | The quantity of the item purchased. May contain missing or None values. | 1, None |
Total Spent | The total amount spent on the transaction. Calculated as Quantity * Price Per Unit. | 8.00, None |
Payment Method | The method of payment used. May contain missing or invalid values. | Cash, Credit Card |
Location | The location where the transaction occurred. May contain missing or invalid values. | In-store, Online |
Transaction Date | The date of the transaction. Always present and valid. | 2023-01-15 |
Discount Applied | Indicates if a discount was applied to the transaction. May contain missing values. | True, False, None |
The dataset includes the following categories, each containing 25 items with corresponding codes, names, and static prices:
| Item Code | Item Name | Price |
|---|---|---|
| Item_1_EHE | Blender | 5.0 |
| Item_2_EHE | Microwave | 6.5 |
| Item_3_EHE | Toaster | 8.0 |
| Item_4_EHE | Vacuum Cleaner | 9.5 |
| Item_5_EHE | Air Purifier | 11.0 |
| Item_6_EHE | Electric Kettle | 12.5 |
| Item_7_EHE | Rice Cooker | 14.0 |
| Item_8_EHE | Iron | 15.5 |
| Item_9_EHE | Ceiling Fan | 17.0 |
| Item_10_EHE | Table Fan | 18.5 |
| Item_11_EHE | Hair Dryer | 20.0 |
| Item_12_EHE | Heater | 21.5 |
| Item_13_EHE | Humidifier | 23.0 |
| Item_14_EHE | Dehumidifier | 24.5 |
| Item_15_EHE | Coffee Maker | 26.0 |
| Item_16_EHE | Portable AC | 27.5 |
| Item_17_EHE | Electric Stove | 29.0 |
| Item_18_EHE | Pressure Cooker | 30.5 |
| Item_19_EHE | Induction Cooktop | 32.0 |
| Item_20_EHE | Water Dispenser | 33.5 |
| Item_21_EHE | Hand Blender | 35.0 |
| Item_22_EHE | Mixer Grinder | 36.5 |
| Item_23_EHE | Sandwich Maker | 38.0 |
| Item_24_EHE | Air Fryer | 39.5 |
| Item_25_EHE | Juicer | 41.0 |
| Item Code | Item Name | Price |
|---|---|---|
| Item_1_FUR | Office Chair | 5.0 |
| Item_2_FUR | Sofa | 6.5 |
| Item_3_FUR | Coffee Table | 8.0 |
| Item_4_FUR | Dining Table | 9.5 |
| Item_5_FUR | Bookshelf | 11.0 |
| Item_6_FUR | Bed F... |
Facebook
Twitterhttp://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/
In the case study titled "Blinkit: Grocery Product Analysis," a dataset called 'Grocery Sales' contains 12 columns with information on sales of grocery items across different outlets. Using Tableau, you as a data analyst can uncover customer behavior insights, track sales trends, and gather feedback. These insights will drive operational improvements, enhance customer satisfaction, and optimize product offerings and store layout. Tableau enables data-driven decision-making for positive outcomes at Blinkit.
The table Grocery Sales is a .CSV file and has the following columns, details of which are as follows:
• Item_Identifier: A unique ID for each product in the dataset. • Item_Weight: The weight of the product. • Item_Fat_Content: Indicates whether the product is low fat or not. • Item_Visibility: The percentage of the total display area in the store that is allocated to the specific product. • Item_Type: The category or type of product. • Item_MRP: The maximum retail price (list price) of the product. • Outlet_Identifier: A unique ID for each store in the dataset. • Outlet_Establishment_Year: The year in which the store was established. • Outlet_Size: The size of the store in terms of ground area covered. • Outlet_Location_Type: The type of city or region in which the store is located. • Outlet_Type: Indicates whether the store is a grocery store or a supermarket. • Item_Outlet_Sales: The sales of the product in the particular store. This is the outcome variable that we want to predict.
Facebook
TwitterThe link for the Excel project to download can be found on GitHub here.
It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer. The image below is included for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media" alt="">
The link for the Tableau adjusted dashboard can be found here.
A screenshot of the interactive Excel dashboard is also included below for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media" alt="">