Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1. Introduction
Sales data collection is a crucial aspect of any manufacturing industry as it provides valuable insights about the performance of products, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments like the dairy supply chain.
One of the most important benefits of the sales data collection process is that it allows manufacturers to identify their most successful products and target their efforts towards those areas. For example, if a manufacturer notices that a particular product is selling well in a certain region, this information could be used to develop new products, improve existing ones or optimise the supply chain to meet the changing needs of customers.
This dataset includes information about 7 of MEVGAL’s products [1]. The published data can help researchers understand the dynamics of the dairy market and its consumption patterns, creating fertile ground for synergies between academia and industry and eventually helping the industry make informed decisions regarding product development, pricing and market strategies in the IoT playground. The dataset could also be used to study the impact of external factors on the dairy market, such as economic, environmental, and technological factors. It could help in understanding the current state of the dairy industry and identifying potential opportunities for growth and development.
Please cite the following paper when using this dataset:
I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted
The dataset includes data regarding the daily sales of a series of dairy product codes offered by MEVGAL. In particular, the dataset includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products included in this dataset represent the daily sales and logistics of a variety of yogurt-based stock. Each file includes the daily logistics for one product in a single year, and together the files cover three years, from 2020 to 2022.
3.1 Data Collection
The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.
The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.
Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case this is conducted through direct communication and reports generated each day by representatives and selling points.
It is also important for MEVGAL to ensure that the data collection process is conducted in an ethical and compliant manner, adhering to data privacy laws and regulations. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.
The published dataset consists of 13 features providing information about the date and the number of products sold. Finally, the dataset was anonymised in consideration of the privacy requirements of the data owner (MEVGAL).
File | Period | Number of Samples (days)
product 1 2020.xlsx | 01/01/2020–31/12/2020 | 363
product 1 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 1 2022.xlsx | 01/01/2022–31/12/2022 | 365
product 2 2020.xlsx | 01/01/2020–31/12/2020 | 363
product 2 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 2 2022.xlsx | 01/01/2022–31/12/2022 | 365
product 3 2020.xlsx | 01/01/2020–31/12/2020 | 363
product 3 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 3 2022.xlsx | 01/01/2022–31/12/2022 | 365
product 4 2020.xlsx | 01/01/2020–31/12/2020 | 363
product 4 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 4 2022.xlsx | 01/01/2022–31/12/2022 | 364
product 5 2020.xlsx | 01/01/2020–31/12/2020 | 363
product 5 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 5 2022.xlsx | 01/01/2022–31/12/2022 | 365
product 6 2020.xlsx | 01/01/2020–31/12/2020 | 362
product 6 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 6 2022.xlsx | 01/01/2022–31/12/2022 | 365
product 7 2020.xlsx | 01/01/2020–31/12/2020 | 362
product 7 2021.xlsx | 01/01/2021–31/12/2021 | 364
product 7 2022.xlsx | 01/01/2022–31/12/2022 | 365
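The sample counts listed for each file are slightly lower than the number of calendar days in the stated period (2020 was a leap year with 366 days). The following Python sketch makes that gap explicit. The counts are copied from the file listing; the description does not say which dates are absent, so the code only reports how many days are missing per file.

```python
import calendar

# Row counts per (product, year), copied from the file listing.
SAMPLES = {}
for product in range(1, 8):
    SAMPLES[(product, 2020)] = 362 if product in (6, 7) else 363
    SAMPLES[(product, 2021)] = 364
    SAMPLES[(product, 2022)] = 364 if product == 4 else 365


def days_in_year(year: int) -> int:
    """Number of calendar days in the given year."""
    return 366 if calendar.isleap(year) else 365


def missing_days(samples: dict) -> dict:
    """Calendar days in the year minus the recorded samples, per file."""
    return {key: days_in_year(key[1]) - count for key, count in samples.items()}


if __name__ == "__main__":
    for (product, year), gap in sorted(missing_days(SAMPLES).items()):
        print(f"product {product} {year}: {gap} day(s) without a sample")
```

Anyone treating the files as regular daily time series may want to check for these gaps before resampling or windowing.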
3.2 Dataset Overview
The following table enumerates and explains the features included across all of the included files.
Feature | Description | Unit
Day | Day of the month | -
Month | Month | -
Year | Year | -
daily_unit_sales | Daily sales: the number of products, measured in units, sold on that specific day | units
previous_year_daily_unit_sales | Previous year’s sales: the number of products, measured in units, sold on the same day of the previous year | units
percentage_difference_daily_unit_sales | The percentage difference between the two above values | %
daily_unit_sales_kg | The amount of products, measured in kilograms, sold on that specific day | kg
previous_year_daily_unit_sales_kg | Previous year’s sales: the amount of products, measured in kilograms, sold on the same day of the previous year | kg
percentage_difference_daily_unit_sales_kg | The percentage difference between the two above values | %
daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned | %
previous_year_daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned the previous year | %
points_of_distribution | The number of sales representatives through which the product was sold to the market this year |
previous_year_points_of_distribution | The number of sales representatives through which the product was sold to the market on the same day of the previous year |
Table 1 – Dataset Feature Description
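The feature description does not state how the two percentage_difference columns are calculated. A minimal Python sketch of one common definition, the change relative to the previous year's value, is given below. The formula and the function name `percentage_difference` are assumptions for illustration and may not match the convention used by the data owner.

```python
from typing import Optional


def percentage_difference(current: float, previous: float) -> Optional[float]:
    """Relative change of `current` versus `previous`, in percent.

    ASSUMPTION: (current - previous) / previous * 100. The dataset
    documentation does not confirm this formula.
    Returns None when the previous-year value is zero, because the
    ratio is undefined in that case.
    """
    if previous == 0:
        return None
    return (current - previous) * 100.0 / previous


if __name__ == "__main__":
    # Made-up values standing in for daily_unit_sales and
    # previous_year_daily_unit_sales on a single day.
    print(percentage_difference(120, 100))
    print(percentage_difference(80, 100))
    print(percentage_difference(5, 0))
```

Comparing the output of such a function with the published column on a few rows is a quick way to check which convention the files actually follow.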
4.1 Dataset Structure
The provided dataset has the following structure:
Where:
Name | Type | Property
Readme.docx | Report | A file that contains the documentation of the dataset.
product X | Folder | A folder containing the data of product X.
product X YYYY.xlsx | Data file | An Excel file containing the sales data of product X for year YYYY.
Table 2 - Dataset File Description
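The naming pattern in Table 2 can be turned into a list of expected file paths. The Python sketch below assumes that each "product X YYYY.xlsx" file sits inside its "product X" folder and that the dataset was extracted to a directory called `dataset`; both the nesting and the root name are assumptions, since the structure diagram is not reproduced here.

```python
from pathlib import PurePosixPath


def expected_files(root, products=range(1, 8), years=(2020, 2021, 2022)):
    """Build the paths implied by the 'product X/product X YYYY.xlsx' pattern.

    ASSUMPTION: data files are nested inside the folder of their product.
    """
    paths = []
    for product in products:
        folder = f"product {product}"
        for year in years:
            paths.append(PurePosixPath(root) / folder / f"{folder} {year}.xlsx")
    return paths


if __name__ == "__main__":
    for path in expected_files("dataset"):
        print(path)
```

Each resulting path could then be passed to a spreadsheet reader such as `pandas.read_excel`, provided pandas and an Excel engine are installed.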
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957406 (TERMINET).
References
[1] MEVGAL is a Greek dairy production company
The annual Retail store data CD-ROM is an easy-to-use tool for quickly discovering retail trade patterns and trends. The current product presents results from the 1999 and 2000 Annual Retail Store and Annual Retail Chain surveys. This product contains numerous cross-classified data tables using the North American Industry Classification System (NAICS). The data tables provide access to a wide range of financial variables, such as revenues, expenses, inventory, sales per square footage (chain stores only) and the number of stores. Most data tables contain detailed information on industry (as low as 5-digit NAICS codes), geography (Canada, provinces and territories) and store type (chains, independents, franchises). The electronic product also contains survey metadata, questionnaires, information on industry codes and definitions, and the list of retail chain store respondents.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This spreadsheet covers 1 of 10 companies in the shoe industry, highlighting COGS, total revenue, market share and industry share.
https://creativecommons.org/publicdomain/zero/1.0/
Analyzing Coffee Shop Sales: Excel Insights 📈
In my first data analytics project, I uncover the secrets of a fictional coffee shop's success through data-driven analysis. By analyzing a 5-sheet Excel dataset, I've uncovered valuable sales trends, customer preferences, and insights that can guide future business decisions. 📊☕
DATA CLEANING 🧹
• REMOVED DUPLICATES OR IRRELEVANT ENTRIES: Thoroughly eliminated duplicate records and irrelevant data to refine the dataset for analysis.
• FIXED STRUCTURAL ERRORS: Rectified any inconsistencies or structural issues within the data to ensure uniformity and accuracy.
• CHECKED FOR DATA CONSISTENCY: Verified the integrity and coherence of the dataset by identifying and resolving any inconsistencies or discrepancies.
DATA MANIPULATION 🛠️
• UTILIZED LOOKUPS: Used Excel's lookup functions for efficient data retrieval and analysis.
• IMPLEMENTED INDEX MATCH: Leveraged the Index Match function to perform advanced data searches and matches.
• APPLIED SUMIFS FUNCTIONS: Utilized SumIFs to calculate totals based on specified criteria.
• CALCULATED PROFITS: Used relevant formulas and techniques to determine profit margins and insights from the data.
PIVOTING THE DATA 𝄜
• CREATED PIVOT TABLES: Utilized Excel's PivotTable feature to pivot the data for in-depth analysis.
• FILTERED DATA: Utilized pivot tables to filter and analyze specific subsets of data, enabling focused insights. Used in particular for the “PEAK HOURS” and “TOP 3 PRODUCTS” charts.
VISUALIZATION 📊
• KEY INSIGHTS: Unveiled the grand total sales revenue while also analyzing the average bill per person, offering comprehensive insights into the coffee shop's performance and customer spending habits.
• SALES TREND ANALYSIS: Used a Line chart to plot total sales across various time intervals, revealing valuable insights into evolving sales trends.
• PEAK HOUR ANALYSIS: Leveraged Clustered Column chart to identify peak sales hours, shedding light on optimal operating times and potential staffing needs.
• TOP 3 PRODUCTS IDENTIFICATION: Utilized Clustered Bar chart to determine the top three coffee types, facilitating strategic decisions regarding inventory management and marketing focus.
*I also used a Timeline to visualize chronological data trends and identify key patterns over specific times.
While it's a significant milestone for me, I recognize that there's always room for growth and improvement. Your feedback and insights are invaluable to me as I continue to refine my skills and tackle future projects. I'm eager to hear your thoughts and suggestions on how I can make my next endeavor even more impactful and insightful.
THANKS TO: WsCube Tech, Mo Chen, Alex Freberg
TOOLS USED: Microsoft Excel
The link for the Excel project to download can be found on GitHub here.
It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer them. The image below is included for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media
The link for the Tableau adjusted dashboard can be found here.
A screenshot of the interactive Excel dashboard is also included below for ease.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media
At CompanyData.com (BoldData), we provide direct access to comprehensive, verified retail company data from around the world—available in easy-to-use Excel files. With a curated list of 38 million retail companies, our database is built on official trade registers, ensuring accuracy, compliance, and depth. Whether you're targeting retailers globally or analyzing markets, our dataset is a reliable foundation for your business strategies.
Each record includes detailed company information such as legal entity details, industry codes, company hierarchies, contact names, direct emails, phone numbers (including mobile when available), and firmographics like revenue, size, and geography. The data is continuously updated, fully GDPR-compliant, and meticulously verified, making it ideal for precise targeting, compliance tasks, and strategic outreach.
Our retail company data serves a wide range of industries and use cases, including KYC verification, compliance checks, global sales prospecting, multichannel marketing, CRM enrichment, and AI model training. Whether you're mapping retail supply chains or launching a new product globally, our data ensures you're connecting with the right companies at the right time.
Delivery is simple and scalable: receive tailored Excel files, access our self-service platform, integrate via real-time API, or enhance your existing records through our data enrichment services. With coverage of 380 million verified companies across all sectors and regions, CompanyData.com (BoldData) empowers your business with the global retail insights needed to thrive in a fast-moving market.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains video game sales data prepared for an Excel data analysis and dashboard project.
It includes detailed information on:
Game titles
Platforms
Genres
Publishers
Regional and global sales
The dataset was cleaned, structured, and analyzed in Microsoft Excel to explore patterns in the global video game market. It can be used to:
Practice data cleaning and pivot tables
Build interactive dashboards
Perform sales comparisons across regions and genres
Develop business insights from entertainment data
🧩 File Information
Format: .xlsx (Excel Workbook)
Columns: Name, Platform, Year, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, Global_Sales
💡 Use Cases
Excel dashboard and chart creation
Data visualization and storytelling
Business and market analysis practice
Portfolio or learning projects
👤 Prepared by
Adewale Lateef W — for data analysis and Excel dashboard learning purposes.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
The extent to which individual businesses in Great Britain experienced actual changes in their sales.
Access 380M+ verified retail companies worldwide in Excel format. CompanyData.com (BoldData) delivers GDPR-compliant data from official trade registers—ideal for sales, marketing, KYC, and market analysis across 380M global companies.
Success.ai’s Ecommerce Store Data for the APAC E-commerce Sector provides a reliable and accurate dataset tailored for businesses aiming to connect with e-commerce professionals and organizations across the Asia-Pacific region. Covering roles and businesses involved in online retail, marketplace management, logistics, and digital commerce, this dataset includes verified business profiles, decision-maker contact details, and actionable insights.
With access to continuously updated, AI-validated data and over 700 million global profiles, Success.ai ensures your outreach, market analysis, and partnership strategies are effective and data-driven. Backed by our Best Price Guarantee, this solution helps you excel in one of the world’s fastest-growing e-commerce markets.
Why Choose Success.ai’s Ecommerce Store Data?
Verified Profiles for Precision Engagement
Comprehensive Coverage of the APAC E-commerce Sector
Continuously Updated Datasets
Ethical and Compliant
Data Highlights:
Key Features of the Dataset:
Comprehensive E-commerce Business Profiles
Advanced Filters for Precision Campaigns
Regional and Sector-specific Insights
AI-Driven Enrichment
Strategic Use Cases:
Marketing Campaigns and Outreach
Partnership Development and Vendor Collaboration
Market Research and Competitive Analysis
Recruitment and Talent Acquisition
Why Choose Success.ai?
Best Price Guarantee
Seamless Integration
Access GDPR-proof, verified data on 27 million wholesale companies worldwide. Sourced from official trade registers and part of our 380M+ company database. Delivered in Excel files or via API—ideal for sales, compliance, and B2B targeting.
https://creativecommons.org/publicdomain/zero/1.0/
Management Dashboard - Fantasie GmbH
Description:
This is an interactive management dashboard I created in Microsoft Excel using fictional data. It visualizes key business metrics such as annual revenue, sales by employees and branches, as well as product trends. The dashboard incorporates VBA-powered buttons for navigation and control, along with functions like IF and VLOOKUP for dynamic data processing.
This dashboard is intended for orientation and inspiration for your own projects. The dataset used is entirely fictional and is not included.
License: CC0 - Free to use and adapt.
This dataset contains a list of sales and movement data by item and department appended monthly. Update Frequency : Monthly
CompanyData.com powered by BoldData is your gateway to verified, high-quality business data from around the world. We specialize in delivering structured company information sourced directly from official trade registers, giving you reliable data to fuel smarter business decisions.
Our USA company database includes over 69,853,300 verified business records, making it one of the most comprehensive sources of company information available. Each record contains detailed firmographics such as industry classification, company size and revenue, corporate hierarchies and verified contact details including decision-maker names, email addresses, direct dials and mobile numbers.
This rich dataset supports a wide range of use cases including
- Regulatory compliance and KYC verification
- Sales prospecting and lead generation
- B2B marketing and audience segmentation
- CRM enrichment and data cleansing
- Training data for AI and machine learning models
We offer flexible delivery options tailored to your workflow
- Tailored company lists filtered by location, size, industry and more
- Full USA company database exports in Excel or CSV
- Real-time API access for seamless data integration
- Data enrichment services to enhance your internal records
The United States is a key part of our global database of over 69,853,300 verified companies across more than 200 countries. Whether you are expanding into the US market or enriching global CRM systems, we deliver the accuracy, scale and flexibility your business demands.
Partner with CompanyData.com to unlock actionable company intelligence in the USA delivered how you need it, when you need it, with the precision your business deserves.
Success.ai’s Ecommerce Market Data for South-east Asia E-commerce Contacts provides a robust and accurate dataset tailored for businesses and organizations looking to connect with professionals in the fast-growing e-commerce industry across South-east Asia. Covering roles such as e-commerce managers, digital strategists, logistics experts, and online marketplace leaders, this dataset offers verified contact details, professional insights, and actionable market data.
With access to over 170 million verified profiles globally, Success.ai ensures your outreach, marketing, and research strategies are powered by accurate, continuously updated, and AI-validated data. Backed by our Best Price Guarantee, this solution empowers you to excel in one of the world’s most dynamic e-commerce regions.
Why Choose Success.ai’s Ecommerce Market Data?
Verified Contact Data for Precision Outreach
Comprehensive Coverage of South-east Asia’s E-commerce Market
Continuously Updated Datasets
Ethical and Compliant
Data Highlights:
Key Features of the Dataset:
Comprehensive Professional Profiles in E-commerce
Advanced Filters for Precision Campaigns
Regional and Market-specific Insights
AI-Driven Enrichment
Strategic Use Cases:
Marketing Campaigns and Digital Outreach
Market Research and Competitive Analysis
Partnership Development and Vendor Collaboration
Recruitment and Talent Acquisition
Why Choose Success.ai?
Best Price Guarantee
Seamless Integration
Get verified company data from official trade registers — 1.92 million records in Norway or 380 million globally. Delivered via tailored lists, full databases, API, Excel, or CSV. Reliable, up-to-date data to support compliance, sales, and business growth.
https://creativecommons.org/publicdomain/zero/1.0/
Google Ads Sales Dataset for Data Analytics Campaigns (Raw & Uncleaned)
📝 Dataset Overview
This dataset contains raw, uncleaned advertising data from a simulated Google Ads campaign promoting data analytics courses and services. It closely mimics what real digital marketers and analysts would encounter when working with exported campaign data, including typos, formatting issues, missing values, and inconsistencies.
It is ideal for practicing:
Data cleaning
Exploratory Data Analysis (EDA)
Marketing analytics
Campaign performance insights
Dashboard creation using tools like Excel, Python, or Power BI
📁 Columns in the Dataset
Column Name | Description
Ad_ID | Unique ID of the ad campaign
Campaign_Name | Name of the campaign (with typos and variations)
Clicks | Number of clicks received
Impressions | Number of ad impressions
Cost | Total cost of the ad (in ₹ or $ format with missing values)
Leads | Number of leads generated
Conversions | Number of actual conversions (signups, sales, etc.)
Conversion Rate | Calculated conversion rate (Conversions ÷ Clicks)
Sale_Amount | Revenue generated from the conversions
Ad_Date | Date of the ad activity (in inconsistent formats like YYYY/MM/DD, DD-MM-YY)
Location | City where the ad was served (includes spelling/case variations)
Device | Device type (Mobile, Desktop, Tablet with mixed casing)
Keyword | Keyword that triggered the ad (with typos)
⚠️ Data Quality Issues (Intentional)
This dataset was intentionally left raw and uncleaned to reflect real-world messiness, such as:
Inconsistent date formats
Spelling errors (e.g., "analitics", "anaytics")
Duplicate rows
Mixed units and symbols in cost/revenue columns
Missing values
Irregular casing in categorical fields (e.g., "mobile", "Mobile", "MOBILE")
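Several of the issues listed above can be handled with small, single-purpose helpers. The Python sketch below uses only the standard library and covers three of them: the two date formats named in the column description, currency symbols in the cost column, and mixed casing in the device column. The function names are hypothetical, the real file may contain further date formats, and the amount parser discards the currency symbol without converting between ₹ and $.

```python
import re
from datetime import date, datetime
from typing import Optional

# The two formats named in the column description (YYYY/MM/DD, DD-MM-YY).
DATE_FORMATS = ("%Y/%m/%d", "%d-%m-%y")


def parse_ad_date(raw: str) -> Optional[date]:
    """Try each known format in turn; return None if none matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def parse_amount(raw: Optional[str]) -> Optional[float]:
    """Extract the number from strings such as '₹1,200.50' or '$45'.

    The currency symbol is dropped; no conversion is attempted.
    """
    if raw is None:
        return None
    match = re.search(r"\d+(?:\.\d+)?", raw.replace(",", ""))
    return float(match.group()) if match else None


def normalize_device(raw: str) -> str:
    """Map 'mobile', 'MOBILE', ' Mobile ' to a single spelling."""
    return raw.strip().capitalize()


if __name__ == "__main__":
    print(parse_ad_date("2024/11/16"), parse_ad_date("16-11-24"))
    print(parse_amount("₹1,200.50"), parse_amount(""))
    print(normalize_device("MOBILE"))
```

Returning None for unparseable values keeps missing data visible instead of silently turning it into zeros.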
🎯 Use Cases
Data cleaning exercises in Python (Pandas), R, Excel
Data preprocessing for machine learning
Campaign performance analysis
Conversion optimization tracking
Building dashboards in Power BI, Tableau, or Looker
💡 Sample Analysis Ideas
Track campaign cost vs. return (ROI)
Analyze click-through rates (CTR) by device or location
Clean and standardize campaign names and keywords
Investigate keyword performance vs. conversions
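Two of the ideas above reduce to simple ratios. The Python sketch below uses the standard definitions, CTR as clicks divided by impressions and ROI as (revenue − cost) ÷ cost, applied to rows that have already been cleaned into numbers. The function names and the sample rows are made up for illustration; only the column names come from the dataset description.

```python
from collections import defaultdict
from typing import Optional


def ctr_by_device(rows):
    """Click-through rate (clicks / impressions) aggregated per device."""
    clicks = defaultdict(int)
    impressions = defaultdict(int)
    for row in rows:
        # Fold 'mobile', 'Mobile', 'MOBILE' into one group.
        device = row["Device"].strip().capitalize()
        clicks[device] += row["Clicks"]
        impressions[device] += row["Impressions"]
    return {
        device: clicks[device] / impressions[device]
        for device in impressions
        if impressions[device] > 0
    }


def roi(sale_amount: float, cost: float) -> Optional[float]:
    """Return on investment; None when cost is zero (ratio undefined)."""
    if cost == 0:
        return None
    return (sale_amount - cost) / cost


if __name__ == "__main__":
    # Made-up rows, not taken from the dataset.
    sample = [
        {"Device": "mobile", "Clicks": 10, "Impressions": 200},
        {"Device": "MOBILE", "Clicks": 30, "Impressions": 200},
        {"Device": "Desktop", "Clicks": 5, "Impressions": 100},
    ]
    print(ctr_by_device(sample))
    print(roi(150.0, 100.0))
```

Summing clicks and impressions before dividing weights each device by its traffic, which differs from averaging per-row rates.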
🔖 Tags
Digital Marketing · Google Ads · Marketing Analytics · Data Cleaning · Pandas Practice · Business Analytics · CRM Data
Get verified company data from official trade registers with 979K records in New Zealand or 380M globally. Access full databases or tailored lists via Excel, CSV or API. Accurate, up-to-date business data to power your compliance, sales and marketing success.
SIBS provides statistical information on strategic decisions, innovation activities and operational tactics used by Canadian enterprises. The survey also collects information on enterprise involvement in global value chains. The data was collected from January 2007 to December 2009. The questions address the following themes:
Business strategies and monitoring
Enterprise structure
Operational activities
Relocation of business activities
Sales activities
Business practices and relationships with suppliers
Advanced technology use
Product / process / marketing / organizational innovations
Production performance management
Human resources management
Main product and market structure
Government support programs
Obstacles to innovation
https://dataintelo.com/privacy-and-policy
The global market size of Coffee Shop is $XX million in 2018 with a CAGR of XX% from 2014 to 2018, and it is expected to reach $XX million by the end of 2024 with a CAGR of XX% from 2019 to 2024.
Global Coffee Shop Market Report 2019 - Market Size, Share, Price, Trend and Forecast is a professional and in-depth study on the current state of the global Coffee Shop industry. The key insights of the report:
1. The report provides key statistics on the market status of the Coffee Shop manufacturers and is a valuable source of guidance and direction for companies and individuals interested in the industry.
2. The report provides a basic overview of the industry including its definition, applications and manufacturing technology.
3. The report presents the company profile, product specifications, capacity, production value, and 2013-2018 market shares for key vendors.
4. The total market is further divided by company, by country, and by application/type for the competitive landscape analysis.
5. The report estimates 2019-2024 market development trends of the Coffee Shop industry.
6. Analysis of upstream raw materials, downstream demand, and current market dynamics is also carried out.
7. The report makes some important proposals for a new project of the Coffee Shop industry before evaluating its feasibility.
There are 4 key segments covered in this report: competitor segment, product type segment, end use/application segment and geography segment.
For competitor segment, the report includes global key players of Coffee Shop as well as some small players.
The information for each competitor includes:
* Company Profile
* Main Business Information
* SWOT Analysis
* Sales, Revenue, Price and Gross Margin
* Market Share
For the product type segment, this report lists the main product types of the Coffee Shop market:
* Product Type I
* Product Type II
* Product Type III
For the end use/application segment, this report focuses on the status and outlook for key applications. End users are also listed.
* Application I
* Application II
* Application III
For the geography segment, regional supply, application-wise and type-wise demand, major players, and price are presented from 2013 to 2023. This report covers the following regions:
* North America
* South America
* Asia & Pacific
* Europe
* MEA (Middle East and Africa)
The key countries in each region are taken into consideration as well, such as United States, China, Japan, India, Korea, ASEAN, Germany, France, UK, Italy, Spain, CIS, and Brazil etc.
Reasons to Purchase this Report:
* Analyzing the outlook of the market with the recent trends and SWOT analysis
* Market dynamics scenario, along with growth opportunities of the market in the years to come
* Market segmentation analysis including qualitative and quantitative research incorporating the impact of economic and non-economic aspects
* Regional and country level analysis integrating the demand and supply forces that are influencing the growth of the market.
* Market value (USD Million) and volume (Units Million) data for each segment and sub-segment
* Competitive landscape involving the market share of major players, along with the new projects and strategies adopted by players in the past five years
* Comprehensive company profiles covering the product offerings, key financial information, recent developments, SWOT analysis, and strategies employed by the major market players
* 1-year analyst support, along with the data support in excel format.
We can also offer customized reports to fulfill the special requirements of our clients. Regional and country reports can be provided as well.