This dataset contains a list of sales and movement data by item and department, appended monthly. Update Frequency: Monthly
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Business Goal
Date: 2023/09/15
Dataset: Sales quantity of a certain brand from January to December 2022 and from January to September 2023.
Please describe what you observe (no specific presentation format required). Among your observations, identify at least three valuable insights and explain why you consider them valuable.
If more resources were available to you (including time, information, etc.), what would you need, and what more could you achieve?
Metadata of the file
Data Period: January 2022 - September 2023
Data Fields:
- item
- store_id
- sales of each month
Sample question & answer
1. Product insights: analyze product sales, for example with a BCG matrix
2. Store insights: identify the sales performance of the stores
3. Supply chain insights: identify the demand
4. Time series forecasting: identify trend and seasonality
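As a rough illustration of the trend and seasonality questions above, the following sketch compares January to September 2023 with the same months of 2022 and reports the strongest month of 2022. The row layout, the item and store names, and all sales figures are hypothetical; the real file may organise its monthly columns differently.

```python
# Hypothetical rows: (item, store_id, 21 monthly sales values, Jan 2022 .. Sep 2023).
rows = [
    ("A", "S1", [10, 12, 11, 13, 15, 18, 20, 19, 16, 14, 12, 11,
                 12, 14, 13, 15, 18, 21, 24, 22, 19]),
    ("B", "S1", [30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19,
                 18, 17, 16, 15, 14, 13, 12, 11, 10]),
]


def yoy_growth_pct(sales):
    """Growth of Jan-Sep 2023 over Jan-Sep 2022, the months present in both years."""
    base = sum(sales[0:9])       # Jan-Sep 2022
    current = sum(sales[12:21])  # Jan-Sep 2023
    if base == 0:
        return None              # growth is undefined without a baseline
    return (current - base) / base * 100.0


def peak_month(sales):
    """1-based calendar month with the highest 2022 sales (a crude seasonality hint)."""
    first_year = sales[:12]
    return first_year.index(max(first_year)) + 1


summary = {
    (item, store): (yoy_growth_pct(sales), peak_month(sales))
    for item, store, sales in rows
}

for key, (growth, peak) in summary.items():
    print(key, f"YoY growth: {growth:.1f}%", f"peak month in 2022: {peak}")
```

Comparing only the overlapping months avoids penalising 2023 for the three months that have not been observed yet.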
This dataset contains sales data, including order dates, order IDs, item details, costs, and revenues, primarily featuring USB novelty items and mugs.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1. Introduction
Sales data collection is a crucial aspect of any manufacturing industry as it provides valuable insights about the performance of products, customer behaviour, and market trends. By gathering and analysing this data, manufacturers can make informed decisions about product development, pricing, and marketing strategies in Internet of Things (IoT) business environments like the dairy supply chain.
One of the most important benefits of the sales data collection process is that it allows manufacturers to identify their most successful products and target their efforts towards those areas. For example, if a manufacturer notices that a particular product is selling well in a certain region, this information can be used to develop new products, optimise the supply chain, or improve existing products to meet the changing needs of customers.
This dataset includes information about 7 of MEVGAL’s products [1]. The published data can help researchers understand the dynamics of the dairy market and its consumption patterns, creating fertile ground for synergies between academia and industry and eventually helping the industry make informed decisions about product development, pricing, and market strategies in the IoT playground. The dataset could also be used to study the impact of external factors, such as economic, environmental, and technological factors, on the dairy market. It could help in understanding the current state of the dairy industry and identifying potential opportunities for growth and development.
Please cite the following papers when using this dataset:
I. Siniosoglou, K. Xouveroudis, V. Argyriou, T. Lagkas, S. K. Goudos, K. E. Psannis and P. Sarigiannidis, "Evaluating the Effect of Volatile Federated Timeseries on Modern DNNs: Attention over Long/Short Memory," in the 12th International Conference on Circuits and Systems Technologies (MOCAST 2023), April 2023, Accepted
The dataset includes data regarding the daily sales of a series of dairy product codes offered by MEVGAL. In particular, the dataset includes information gathered by the logistics division and agencies within the industrial infrastructures overseeing the production of each product code. The products included in this dataset represent the daily sales and logistics of a variety of yogurt-based stock. Each file includes the logistics for its product on a daily basis, and together the files cover three years, from 2020 to 2022.
3.1 Data Collection
The process of building this dataset involves several steps to ensure that the data is accurate, comprehensive and relevant.
The first step is to determine the specific data that is needed to support the business objectives of the industry, i.e., in this publication’s case the daily sales data.
Once the data requirements have been identified, the next step is to implement an effective sales data collection method. In MEVGAL’s case this is conducted through direct communication and reports generated each day by representatives & selling points.
It is also important for MEVGAL to ensure that the data collection process is conducted in an ethical and compliant manner, adhering to data privacy laws and regulations. The industry also has a data management plan in place to ensure that the data is securely stored and protected from unauthorised access.
The published dataset consists of 13 features providing information about the date and the number of products that have been sold. Finally, the dataset was anonymised in consideration of the privacy requirements of the data owner (MEVGAL).
File
Period
Number of Samples (days)
product 1 2020.xlsx
01/01/2020–31/12/2020
363
product 1 2021.xlsx
01/01/2021–31/12/2021
364
product 1 2022.xlsx
01/01/2022–31/12/2022
365
product 2 2020.xlsx
01/01/2020–31/12/2020
363
product 2 2021.xlsx
01/01/2021–31/12/2021
364
product 2 2022.xlsx
01/01/2022–31/12/2022
365
product 3 2020.xlsx
01/01/2020–31/12/2020
363
product 3 2021.xlsx
01/01/2021–31/12/2021
364
product 3 2022.xlsx
01/01/2022–31/12/2022
365
product 4 2020.xlsx
01/01/2020–31/12/2020
363
product 4 2021.xlsx
01/01/2021–31/12/2021
364
product 4 2022.xlsx
01/01/2022–31/12/2022
364
product 5 2020.xlsx
01/01/2020–31/12/2020
363
product 5 2021.xlsx
01/01/2021–31/12/2021
364
product 5 2022.xlsx
01/01/2022–31/12/2022
365
product 6 2020.xlsx
01/01/2020–31/12/2020
362
product 6 2021.xlsx
01/01/2021–31/12/2021
364
product 6 2022.xlsx
01/01/2022–31/12/2022
365
product 7 2020.xlsx
01/01/2020–31/12/2020
362
product 7 2021.xlsx
01/01/2021–31/12/2021
364
product 7 2022.xlsx
01/01/2022–31/12/2022
365
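The sample counts in the file listing above are lower than the number of calendar days in some years, so a time-series model may need to handle gaps. The sketch below, which uses the product 1 counts from that listing, shows one way to quantify the gap per year; it does not say which dates are missing, since that can only be determined from the files themselves.

```python
import calendar

# Sample counts for product 1, copied from the file listing.
samples = {2020: 363, 2021: 364, 2022: 365}


def days_in_year(year):
    """Number of calendar days in the given year."""
    return 366 if calendar.isleap(year) else 365


# Calendar days without a sample, per year.
missing = {year: days_in_year(year) - count for year, count in samples.items()}

for year, gap in missing.items():
    print(year, "missing days:", gap)
```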
3.2 Dataset Overview
The following table lists and explains the features included in all of the files.
| Feature | Description | Unit |
| --- | --- | --- |
| Day | Day of the month | - |
| Month | Month | - |
| Year | Year | - |
| daily_unit_sales | Daily sales: the number of products, measured in units, sold during that specific day | units |
| previous_year_daily_unit_sales | Previous year’s sales: the number of products, measured in units, sold during that specific day the previous year | units |
| percentage_difference_daily_unit_sales | The percentage difference between the two above values | % |
| daily_unit_sales_kg | The amount of products, measured in kilograms, sold during that specific day | kg |
| previous_year_daily_unit_sales_kg | Previous year’s sales: the amount of products, measured in kilograms, sold during that specific day the previous year | kg |
| percentage_difference_daily_unit_sales_kg | The percentage difference between the two above values | % |
| daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned | % |
| previous_year_daily_unit_returns_kg | The percentage of the products that were shipped to selling points and were returned the previous year | % |
| points_of_distribution | The number of sales representatives through which the product was sold to the market for this year | |
| previous_year_points_of_distribution | The number of sales representatives through which the product was sold to the market for the same day for the previous year | |
Table 1 – Dataset Feature Description
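The documentation does not state how the percentage-difference features are computed. A minimal sketch, assuming the difference is taken relative to the previous year's value, is shown below; the function name and the zero-baseline handling are illustrative choices, not part of the dataset.

```python
def percentage_difference(current, previous):
    """Percentage change of `current` relative to `previous`.

    Assumption: the previous year's value is the baseline. Returns None
    when the baseline is zero, because the change is then undefined.
    """
    if previous == 0:
        return None
    return (current - previous) / previous * 100.0


# Example: 120 units sold today versus 100 units on the same day last year.
print(percentage_difference(120, 100))
```

Recomputing this value from daily_unit_sales and previous_year_daily_unit_sales and comparing it with the published column is a quick way to check whether the assumed baseline matches the data.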
4.1 Dataset Structure
The provided dataset has the following structure:
Where:
| Name | Type | Property |
| --- | --- | --- |
| Readme.docx | Report | A file that contains the documentation of the dataset. |
| product X | Folder | A folder containing the data of product X. |
| product X YYYY.xlsx | Data file | An Excel file containing the sales data of product X for year YYYY. |
Table 2 - Dataset File Description
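Given the naming scheme described above, the expected data files can be enumerated programmatically. The sketch below assumes seven product folders and the years 2020 to 2022, as stated elsewhere in this description; the root folder name `dataset` is a placeholder.

```python
from pathlib import PurePosixPath


def data_files(root="dataset", products=range(1, 8), years=range(2020, 2023)):
    """Build the expected 'product X/product X YYYY.xlsx' paths.

    `root` is a hypothetical location; adjust it to where the archive
    was extracted.
    """
    return [
        PurePosixPath(root) / f"product {p}" / f"product {p} {y}.xlsx"
        for p in products
        for y in years
    ]


paths = data_files()
print(len(paths), "files expected, first:", paths[0])
```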
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 957406 (TERMINET).
References
[1] MEVGAL is a Greek dairy production company
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Sample Sales Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kyanyoga/sample-sales-data on 28 January 2022.
--- Dataset description provided by original source is as follows ---
Sample Sales Data, Order Info, Sales, Customer, Shipping, etc., used for Segmentation, Customer Analytics, Clustering and more. Inspired by retail analytics. This was originally used for Pentaho DI Kettle, but I found the set could be useful for Sales Simulation training.
Originally Written by María Carina Roldán, Pentaho Community Member, BI consultant (Assert Solutions), Argentina. This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. Modified by Gus Segura June 2014.
--- Original source retains full ownership of the source dataset ---
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is based on the Sample Leads Dataset and is intended to allow some simple filtering by lead source. I modified this dataset to support an upcoming Towards Data Science article walking through the process. Link to be shared once published.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Standard error reference tables for the Retail Sales Index in Great Britain.
Attribution 1.0 (CC BY 1.0): https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
Nothing ever becomes real till it is experienced.
-John Keats
While we don't know the context in which John Keats said this, we are sure about its implication for data science. While you will have enjoyed and gained exposure to real-world problems in this challenge, here is another opportunity to get your hands dirty with this practice problem.
Problem Statement :
The data scientists at BigMart have collected 2013 sales data for 1559 products across 10 stores in different cities. Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and find out the sales of each product at a particular store.
Using this model, BigMart will try to understand the properties of products and stores which play a key role in increasing sales.
Please note that the data may have missing values as some stores might not report all the data due to technical glitches. Hence, it will be required to treat them accordingly.
Data :
The data set contains 14,204 samples.
Variable Description
Item Identifier: A code provided for the item of sale
Item Weight: Weight of item
Item Fat Content: A categorical column of how much fat is present in the item: ‘Low Fat’, ‘Regular’, ‘low fat’, ‘LF’, ‘reg’
Item Visibility: Numeric value for how visible the item is
Item Type: The category the item belongs to: ‘Dairy’, ‘Soft Drinks’, ‘Meat’, ‘Fruits and Vegetables’, ‘Household’, ‘Baking Goods’, ‘Snack Foods’, ‘Frozen Foods’, ‘Breakfast’, ‘Health and Hygiene’, ‘Hard Drinks’, ‘Canned’, ‘Breads’, ‘Starchy Foods’, ‘Others’, ‘Seafood’.
Item MRP: The maximum retail price (MRP) of the item
Outlet Identifier: The outlet at which the item was sold. This is a categorical column
Outlet Establishment Year: Which year was the outlet established
Outlet Size: A categorical column to explain size of outlet: ‘Medium’, ‘High’, ‘Small’.
Outlet Location Type: A categorical column to describe the location of the outlet: ‘Tier 1’, ‘Tier 2’, ‘Tier 3’
Outlet Type: Categorical column for type of outlet: ‘Supermarket Type1’, ‘Supermarket Type2’, ‘Supermarket Type3’, ‘Grocery Store’
Item Outlet Sales: The number of sales for an item.
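The Item Fat Content values listed above mix several spellings of what appear to be two categories. A minimal cleaning sketch follows; the mapping of ‘LF’ and ‘low fat’ to ‘Low Fat’ and of ‘reg’ to ‘Regular’ is an assumption based on the labels alone.

```python
# Assumed mapping from raw labels (lower-cased) to canonical categories.
FAT_CONTENT_ALIASES = {
    "low fat": "Low Fat",
    "lf": "Low Fat",
    "regular": "Regular",
    "reg": "Regular",
}


def normalize_fat_content(label):
    """Map a raw Item Fat Content label to its canonical spelling.

    Unknown labels are returned unchanged so that unexpected values
    remain visible instead of being silently merged.
    """
    return FAT_CONTENT_ALIASES.get(label.strip().lower(), label)


raw = ["Low Fat", "Regular", "low fat", "LF", "reg"]
cleaned = [normalize_fat_content(value) for value in raw]
print(sorted(set(cleaned)))
```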
Evaluation Metric:
We will use the Root Mean Square Error (RMSE) to judge your response.
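For reference, the Root Mean Square Error is the square root of the mean squared difference between actual and predicted values. The sketch below is a plain-Python version with made-up numbers; the competition's own scoring code is not shown in this description.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error between two equally long, non-empty sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical sales values and predictions.
print(rmse([2000.0, 1500.0, 800.0], [1900.0, 1600.0, 800.0]))
```

Because the errors are squared before averaging, a few large misses on high-selling items weigh more heavily than many small ones.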
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by selvam mts
Released under MIT
The annual Retail store data CD-ROM is an easy-to-use tool for quickly discovering retail trade patterns and trends. The current product presents results from the 1999 and 2000 Annual Retail Store and Annual Retail Chain surveys. This product contains numerous cross-classified data tables using the North American Industry Classification System (NAICS). The data tables provide access to a wide range of financial variables, such as revenues, expenses, inventory, sales per square footage (chain stores only) and the number of stores. Most data tables contain detailed information on industry (as low as 5-digit NAICS codes), geography (Canada, provinces and territories) and store type (chains, independents, franchises). The electronic product also contains survey metadata, questionnaires, information on industry codes and definitions, and the list of retail chain store respondents.
Note:- Only publicly available data can be worked upon
In today's ever-evolving Ecommerce landscape, success hinges on the ability to harness the power of data. APISCRAPY is your strategic ally, dedicated to providing a comprehensive solution for extracting critical Ecommerce data, including Ecommerce market data, Ecommerce product data, and Ecommerce datasets. With the Ecommerce arena being more competitive than ever, having a data-driven approach is no longer a luxury but a necessity.
APISCRAPY's forte lies in its ability to unearth valuable Ecommerce market data. We recognize that understanding the market dynamics, trends, and fluctuations is essential for making informed decisions.
APISCRAPY's AI-driven ecommerce data scraping service presents several advantages for individuals and businesses seeking comprehensive insights into the ecommerce market. Here are key benefits associated with their advanced data extraction technology:
Ecommerce Product Data: APISCRAPY's AI-driven approach ensures the extraction of detailed Ecommerce Product Data, including product specifications, images, and pricing information. This comprehensive data is valuable for market analysis and strategic decision-making.
Data Customization: APISCRAPY enables users to customize the data extraction process, ensuring that the extracted ecommerce data aligns precisely with their informational needs. This customization option adds versatility to the service.
Efficient Data Extraction: APISCRAPY's technology streamlines the data extraction process, saving users time and effort. The efficiency of the extraction workflow ensures that users can obtain relevant ecommerce data swiftly and consistently.
Realtime Insights: Businesses can gain real-time insights into the dynamic Ecommerce Market by accessing rapidly extracted data. This real-time information is crucial for staying ahead of market trends and making timely adjustments to business strategies.
Scalability: The technology behind APISCRAPY allows scalable extraction of ecommerce data from various sources, accommodating evolving data needs and handling increased volumes effortlessly.
Beyond the broader market, a deeper dive into specific products can provide invaluable insights. APISCRAPY excels in collecting Ecommerce product data, enabling businesses to analyze product performance, pricing strategies, and customer reviews.
To navigate the complexities of the Ecommerce world, you need access to robust datasets. APISCRAPY's commitment to providing comprehensive Ecommerce datasets ensures businesses have the raw materials required for effective decision-making.
Our primary focus is on Amazon data, offering businesses a wealth of information to optimize their Amazon presence. By doing so, we empower our clients to refine their strategies, enhance their products, and make data-backed decisions.
[Tags: Ecommerce data, Ecommerce Data Sample, Ecommerce Product Data, Ecommerce Datasets, Ecommerce market data, Ecommerce Market Datasets, Ecommerce Sales data, Ecommerce Data API, Amazon Ecommerce API, Ecommerce scraper, Ecommerce Web Scraping, Ecommerce Data Extraction, Ecommerce Crawler, Ecommerce data scraping, Amazon Data, Ecommerce web data]
This dataset was created by Waqas Ali Naqvi
Envestnet®| Yodlee®'s Consumer Behavior Data (Aggregate/Row) Panels consist of de-identified, near-real time (T+1) USA credit/debit/ACH transaction level data – offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of the Envestnet®| Yodlee®'s financial technology platform.
Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.
We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.
Investors, corporate researchers, and corporates can use our data to answer some key business questions such as:
- How much are consumers spending with specific merchants/brands and how is that changing over time?
- Is the share of consumer spend at a specific merchant increasing or decreasing?
- How are consumers reacting to new products or services launched by merchants?
- For loyal customers, how is the share of spend changing over time?
- What is the company’s market share in a region for similar customers?
- Is the company’s loyal user base increasing or decreasing?
- Is the lifetime customer value increasing or decreasing?
Additional Use Cases:
- Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
- Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
- Study the effects of inflation rates via such metrics as increased total spend, ticket size, and number of transactions.
- Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.
Use Cases Categories (Our data provides an innumerable amount of use cases, and we look forward to working with new ones):
1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking
2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)
3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence
4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis
Company Datasets for valuable business insights!
Discover new business prospects, identify investment opportunities, track competitor performance, and streamline your sales efforts with comprehensive Company Datasets.
These datasets are sourced from top industry providers, ensuring you have access to high-quality information:
We provide fresh and ready-to-use company data, eliminating the need for complex scraping and parsing. Our data includes crucial details such as:
You can choose your preferred data delivery method, including various storage options, delivery frequency, and input/output formats.
Receive datasets in CSV, JSON, and other formats, with storage options like AWS S3 and Google Cloud Storage. Opt for one-time, monthly, quarterly, or bi-annual data delivery.
With Oxylabs Datasets, you can count on:
Pricing Options:
Standard Datasets: choose from various ready-to-use datasets with standardized data schemas, priced from $1,000/month.
Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.
Experience a seamless journey with Oxylabs:
Unlock the power of data with Oxylabs' Company Datasets and supercharge your business insights today!
http://opendatacommons.org/licenses/dbcl/1.0/
The given dataset appears to be a sales dataset containing information about different orders. Here is a description of the data:
The dataset provides detailed information about each order, including customer details, product details, sales information, and shipping information. It can be used to analyze various aspects of the sales data, such as profitability, customer segments, product categories, and regional sales performance.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was generated for use with Nile's Sales Assistant example: https://github.com/niledatabase/niledatabase/tree/main/examples/ai/sales_insight It includes:
Simulated sales conversations for 5 different fictional companies. Chunked and embedded version of these conversations (embeddings use OpenAI's text-embedding-3-small model).
The chunks and embeddings can be loaded directly into a vector database and searched using vector similarity methods. The example's ./ingest directory… See the full description on the dataset page: https://huggingface.co/datasets/gwenshap/sales-transcripts.
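To illustrate what a vector similarity search over embedded chunks involves, here is a small in-memory sketch using cosine similarity. The chunk texts and the three-dimensional vectors are made up for readability; real embeddings are far higher-dimensional, and a vector database would replace the linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query, vector)) for text, vector in chunks]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical chunks with toy embeddings.
chunks = [
    ("pricing discussion", [1.0, 0.0, 0.0]),
    ("onboarding call", [0.0, 1.0, 0.0]),
    ("discount request", [0.9, 0.1, 0.0]),
]

results = top_k([1.0, 0.0, 0.0], chunks, k=2)
print(results)
```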
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
📊 Supplement Sales Data (2020–2025)
Overview
This dataset contains weekly sales data for a variety of health and wellness supplements from January 2020 to April 2025. The data includes products in categories like Protein, Vitamins, Omega, and Amino Acids, among others, and covers multiple e-commerce platforms such as Amazon, Walmart, and iHerb. The dataset also tracks sales in several locations including the USA, UK, and Canada.
Dataset Details
Time Range: January 2020 to April 2025
Frequency: Weekly (Every Monday)
Number of Rows: 4,384
Columns:
Date: The week of the sale.
Product Name: The name of the supplement (e.g., Whey Protein, Vitamin C, etc.).
Category: The category of the supplement (e.g., Protein, Vitamin, Omega).
Units Sold: The number of units sold in that week.
Price: The selling price of the product.
Revenue: The total revenue generated (Units Sold * Price).
Discount: The discount applied on the product (as a percentage of original price).
Units Returned: The number of units returned in that week.
Location: The location of the sale (USA, UK, or Canada).
Platform: The e-commerce platform (Amazon, Walmart, iHerb).
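Since Revenue is defined above as Units Sold * Price, that relationship can serve as a basic consistency check, and Units Returned can be turned into a return rate. The sketch below uses two invented rows keyed by the column names listed above; how discounts interact with Price in the real data is not specified, so the tolerance is an arbitrary choice.

```python
# Hypothetical rows using the documented column names.
rows = [
    {"Product Name": "Whey Protein", "Units Sold": 100, "Price": 25.0,
     "Revenue": 2500.0, "Units Returned": 2},
    {"Product Name": "Vitamin C", "Units Sold": 40, "Price": 10.0,
     "Revenue": 410.0, "Units Returned": 0},
]


def revenue_mismatches(rows, tol=0.01):
    """Rows whose Revenue differs from Units Sold * Price by more than `tol`."""
    return [
        row for row in rows
        if abs(row["Units Sold"] * row["Price"] - row["Revenue"]) > tol
    ]


def return_rate(row):
    """Share of sold units that were returned, or None if nothing was sold."""
    if row["Units Sold"] == 0:
        return None
    return row["Units Returned"] / row["Units Sold"]


print([row["Product Name"] for row in revenue_mismatches(rows)])
print(return_rate(rows[0]))
```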
Use Cases
This dataset is ideal for:
Time-series forecasting and sales trend analysis 📈
Price vs. demand analysis and revenue prediction 📊
Sentiment analysis and impact of promotions (Discounts) on sales 🛍️
Product performance tracking across different platforms and locations 🛒
Business optimization in the health and wellness e-commerce sector 💼
Potential Applications
Build predictive models to forecast future sales 📅
Analyze the effectiveness of discounts and promotions 💸
Create recommendation systems for supplement products 🧠
Perform exploratory data analysis (EDA) and uncover trends 🔍
Model return rates and their effect on overall revenue 📉
Why This Dataset?
This dataset provides an excellent starting point for those interested in building business intelligence tools, e-commerce forecasting models, or exploring health & wellness trends. It also serves as a perfect dataset for data science learners looking to apply regression, time-series analysis, and predictive modeling techniques.
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Sales data for all Islanders Board Of Industry & Service (IBIS) stores.
Dataset Card for "sales-conversations"
This dataset was created for the purpose of training a sales agent chatbot that can convince people. The initial idea came from "Textbooks Are All You Need" (https://arxiv.org/abs/2306.11644). gpt-3.5-turbo was used for the generation.
Structure
Each conversation has a customer and a salesman who always alternate: customer, salesman, customer, salesman, etc. The customer always starts the conversation. Who ends the… See the full description on the dataset page: https://huggingface.co/datasets/goendalf666/sales-conversations.
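Because the speakers alternate and the customer always speaks first, speaker roles can be recovered from turn order alone. The sketch below assumes a conversation is available as an ordered list of utterance strings, which may not match the dataset's actual column layout; the example utterances are invented.

```python
ROLES = ("customer", "salesman")


def label_turns(utterances):
    """Attach a speaker role to each utterance based on its position.

    Even positions (0, 2, ...) are the customer, odd positions the salesman.
    """
    return [(ROLES[i % 2], text) for i, text in enumerate(utterances)]


# Hypothetical conversation.
conversation = [
    "Hi, I'm looking for a new laptop.",
    "Great, what will you mainly use it for?",
    "Mostly video editing.",
]

for role, text in label_turns(conversation):
    print(f"{role}: {text}")
```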
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Turnover and volume of sales in wholesale and retail trade - monthly data