49 datasets found
  1. Data from: Online Retail Store Dataset

    • kaggle.com
    Updated May 2, 2024
    Cite
    Ankush Kashyap (2024). Online Retail Store Dataset [Dataset]. https://www.kaggle.com/datasets/kashyapankush/online-retail-store-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 2, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ankush Kashyap
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Ankush Kashyap

    Released under Apache 2.0

    Contents

  2. Warehouse and Retail Sales

    • catalog.data.gov
    • data.montgomerycountymd.gov
    • +3 more
    Updated Oct 11, 2025
    Cite
    data.montgomerycountymd.gov (2025). Warehouse and Retail Sales [Dataset]. https://catalog.data.gov/dataset/warehouse-and-retail-sales
    Dataset updated
    Oct 11, 2025
    Dataset provided by
    data.montgomerycountymd.gov
    Description

    This dataset contains a list of sales and movement data by item and department, appended monthly. Update frequency: Monthly.

  3. Retail Analysis on Large Dataset

    • kaggle.com
    Updated Jun 14, 2024
    Cite
    Sahil Prajapati (2024). Retail Analysis on Large Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/8693643
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 14, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sahil Prajapati
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset Description:

    • The dataset represents retail transactional data. It contains information about customers, their purchases, products, and transaction details. The data includes various attributes such as customer ID, name, email, phone, address, city, state, zipcode, country, age, gender, income, customer segment, last purchase date, total purchases, amount spent, product category, product brand, product type, feedback, shipping method, payment method, and order status.

    Key Points:

    Customer Information:

    • Includes customer details like ID, name, email, phone, address, city, state, zipcode, country, age, and gender. Customer segments are categorized into Premium, Regular, and New.

    Transaction Details:

    • Transaction-specific data such as transaction ID, last purchase date, total purchases, amount spent, total purchase amount, feedback, shipping method, payment method, and order status.

    Product Information:

    • Contains product-related details such as product category, brand, and type. Products are categorized into electronics, clothing, grocery, books, and home decor.

    Geographic Information:

    • Contains location details including city, state, and country. Available for various countries including USA, UK, Canada, Australia, and Germany.

    Temporal Information:

    • Last purchase date is provided along with separate columns for year, month, date, and time. Allows analysis based on temporal patterns and trends.

    Data Quality:

    • Some rows contain null values, and others are duplicates, which may need to be handled during data preprocessing. Null values are randomly distributed across rows. Duplicate rows appear in different parts of the dataset.

    Potential Analysis:

    • Customer segmentation analysis based on demographics, purchase behavior, and feedback. Sales trend analysis over time to identify peak seasons or trends. Product performance analysis to determine popular categories, brands, or types. Geographic analysis to understand regional preferences and trends. Payment and shipping method analysis to optimize services. Customer satisfaction analysis based on feedback and order status.

    Data Preprocessing:

    • Handling null values and duplicates. Parsing and formatting temporal data. Encoding categorical variables. Scaling numerical variables if required. Splitting data into training and testing sets for modeling.
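
    The first preprocessing steps listed above (removing duplicates, handling null values, and parsing dates) could look roughly like the sketch below. It uses only the Python standard library. The column names (customer_id, amount, date), the date format, and the sample rows are assumptions made for illustration; they are not the dataset's actual headers or values.

```python
from datetime import datetime

# Hypothetical sample rows; column names and values are illustrative assumptions.
rows = [
    {"customer_id": "1", "amount": "120.50", "date": "2023-09-18"},
    {"customer_id": "1", "amount": "120.50", "date": "2023-09-18"},  # exact duplicate
    {"customer_id": "2", "amount": "", "date": "2023-12-31"},        # missing amount
    {"customer_id": "3", "amount": "75.00", "date": "2024-01-05"},
]

def clean(rows):
    """Drop exact duplicates and rows with empty fields, then parse types."""
    seen = set()
    cleaned = []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            continue  # skip a row already seen verbatim
        seen.add(key)
        if any(value == "" for value in row.values()):
            continue  # skip rows with a missing value
        cleaned.append({
            "customer_id": row["customer_id"],
            "amount": float(row["amount"]),
            "date": datetime.strptime(row["date"], "%Y-%m-%d").date(),
        })
    return cleaned

cleaned = clean(rows)
```

    Dropping incomplete rows is only one option; imputing missing values may be preferable when nulls are frequent.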
  4. Retail Sales Forecasting

    • kaggle.com
    Updated Jul 31, 2017
    Cite
    TEVEC Systems (2017). Retail Sales Forecasting [Dataset]. https://www.kaggle.com/datasets/tevecsystems/retail-sales-forecasting
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 31, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    TEVEC Systems
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    This dataset contains a lot of historical sales data. It was extracted from a top Brazilian retailer and has many SKUs and many stores. The data was transformed to protect the identity of the retailer.

    Content

    [TBD]

    Acknowledgements

    This data would not be available without the full collaboration of our customers, who understand that sharing their core and strategic information has more advantages than possible hazards. They also support our continuous development of innovative ML systems across their value chain.

    Inspiration

    Every retail business in the world faces a fundamental question: how much inventory should I carry? On one hand, too much inventory means working capital costs, operational costs, and a complex operation. On the other hand, a lack of inventory leads to lost sales, unhappy customers, and a damaged brand.

    Current inventory management models have many solutions to place the correct order, but they are all based on a single unknown factor: the demand for the next periods.

    This is why short-term forecasting is so important in the retail and consumer goods industry.

    We encourage you to seek the best demand forecasting model for the next 2-3 weeks. This valuable insight can help many supply chain practitioners to correctly manage their inventory levels.
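
    As a point of comparison for any forecasting model over a 2-3 week horizon, a seasonal-naive baseline is a common starting point. The sketch below is not part of the dataset or its description. It assumes daily sales with a weekly pattern, and the sample numbers are invented.

```python
def seasonal_naive(history, horizon, season=7):
    """Forecast by repeating the most recent full season (default: one week)."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]

# Invented daily sales for two weeks (illustrative only).
daily_sales = [10, 12, 9, 14, 20, 25, 18, 11, 13, 10, 15, 21, 27, 19]

# Forecast the next 14 days by repeating the last observed week.
forecast = seasonal_naive(daily_sales, horizon=14)
```

    A candidate model is generally only worth keeping if it beats a baseline like this on held-out data.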

  5. Data from: Retail Sales Analysis

    • kaggle.com
    Updated Jun 23, 2024
    Cite
    Sahir Maharaj (2024). Retail Sales Analysis [Dataset]. https://www.kaggle.com/datasets/sahirmaharajj/retail-sales-analysis
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 23, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sahir Maharaj
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains a list of sales and movement data by item and department appended monthly.

    It is rich in information that can be leveraged for various data science applications. For instance, analyzing this dataset can offer insights into consumer behavior, such as preferences for specific types of beverages (e.g., wine, beer) during different times of the year. Furthermore, the dataset can be used to identify trends in sales and transfers, highlighting seasonal effects or the impact of certain suppliers on the market.

    One could start with exploratory data analysis (EDA) to understand the basic distribution of sales and transfers across different item types and suppliers. Time series analysis can provide insights into seasonal trends and sales forecasts. Cluster analysis might reveal groups of suppliers or items with similar sales patterns, which can be useful for targeted marketing and inventory management.

  6. E-Commerce Sales Dataset

    • kaggle.com
    Updated Dec 3, 2022
    Cite
    The Devastator (2022). E-Commerce Sales Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-profits-with-e-commerce-sales-data/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 3, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    E-Commerce Sales Dataset

    Analyzing and Maximizing Online Business Performance

    By ANil [source]

    About this dataset

    This dataset provides an in-depth look at the profitability of e-commerce sales. It contains data on a variety of sales channels, including Shiprocket and INCREFF, as well as financial information on related expenses and profits. The columns contain data such as SKU codes, design numbers, stock levels, product categories, sizes, and colors. It also includes the MRPs across multiple stores (Ajio MRP, Amazon MRP, Amazon FBA MRP, Flipkart MRP, Limeroad MRP, Myntra MRP, and Paytm MRP), along with other key parameters such as the amount paid by the customer for the purchase and the rate per piece for each individual transaction. Transactional parameters are included as well: date of sale, months, category, fulfilledby, B2B, status, qty, currency, and gross amt. This is a must-have dataset for anyone trying to uncover the profitability of e-commerce sales in today's marketplace.

    How to use the dataset

    This dataset provides a comprehensive overview of e-commerce sales data from different channels covering a variety of products. Using this dataset, retailers and digital marketers can measure the performance of their campaigns more accurately and efficiently.

    The following steps help users make the most out of this dataset:

    - Analyze the general sales trends by examining info such as month, category, currency, stock level, and customer for each sale. This will give you an idea about how your e-commerce business is performing in each channel.
    - Review the Shiprocket and INCREFF data to compare and analyze profitability via different fulfilment methods. This comparison would enable you to make better decisions towards maximizing profit while minimizing the costs associated with each method's referral fees and fulfillment rates.
    - Compare prices between various channels such as Amazon FBA MRP, Myntra MRP, Ajio MRP, etc., using the corresponding columns for each store (Amazon MRP etc.). You can judge which stores offer more profitable margins without compromising on quality by analyzing these pricing points in combination with other information related to product sales (TP1/TP2 - cost per piece).
    - Look at customer-specific data, such as Gross Amount or Rate by TP 1/TP 2 combination (price per piece, or the total gross amount generated by any SKU across multiple customers), together with the relevant dates, to track individual item performance relative to others within its category over appropriately filtered time periods. Keep an eye on items commonly sold under offers or promotional discounts when crafting strategies for inventory optimization and up-selling.
    - Finally, use the overall 'Stock' details along with all the P & L data, including the Yearly Expenses_IIGF information, for takeaways aimed at essential cost-cutting measures, such as choosing carefully between the Shiprocket and INCREFF delivery options, moving away from manual inspections, and finding savings in support-personnel outsourcing.

    By employing a comprehensive understanding on how our internal subsidiaries perform globally unless attached respective audits may provide us remarkably lower operational costs servicing confidence; costing far lesser than being incurred taking into account entire pallet shipments tracking sheets representing current level supply chains efficiencies achieved internally., then one may finally scale profits exponentially increases cut down unseen losses followed up introducing newer marketing campaigns necessarily tailored according playing around multiple goods based spectrums due powerful backing suitable transportation boundaries set carefully

    Research Ideas

    • Analysing the difference in profitability between sales made through Shiprocket and INCREFF. This data can be used to see where the biggest profit margins lie, and strategize accordingly.
    • Examining the Complete Cost structure of a product with all its components and their contribution towards revenue or profitability, i.e., TP 1 & 2, MRP Old & Final MRP Old together with Platform based MRP - Amazon, Myntra and Paytm etc., Currency based Profit Margin etc.
    • Building a predictive model using Machine Learning by leveraging historical data to predict future sales volume and profits for e-commerce products across multiple categories/devices/platforms such as Amazon, Flipkart, Myntra etc as well providing m...
  7. Evaluating FAIR Models for Rossmann Store Sales Prediction: Insights and...

    • test.researchdata.tuwien.at
    bin, csv, json +1
    Updated Apr 28, 2025
    Cite
    Dilara Çakmak (2025). Evaluating FAIR Models for Rossmann Store Sales Prediction: Insights and Performance Analysis [Dataset]. http://doi.org/10.70124/f5t2d-xt904
    Explore at:
    Available download formats: bin, json, text/markdown, csv
    Dataset updated
    Apr 28, 2025
    Dataset provided by
    TU Wien
    Authors
    Dilara Çakmak
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 2025
    Description

    Context and Methodology

    Research Domain:
    The dataset is part of a project focused on retail sales forecasting. Specifically, it is designed to predict daily sales for Rossmann, a chain of over 3,000 drug stores operating across seven European countries. The project falls under the broader domain of time series analysis and machine learning applications for business optimization. The goal is to apply machine learning techniques to forecast future sales based on historical data, which includes factors like promotions, competition, holidays, and seasonal trends.

    Purpose:
    The primary purpose of this dataset is to help Rossmann store managers predict daily sales for up to six weeks in advance. By making accurate sales predictions, Rossmann can improve inventory management, staffing decisions, and promotional strategies. This dataset serves as a training set for machine learning models aimed at reducing forecasting errors and supporting decision-making processes across the company’s large network of stores.

    How the Dataset Was Created:
    The dataset was compiled from several sources, including historical sales data from Rossmann stores, promotional calendars, holiday schedules, and external factors such as competition. The data is split into multiple features, such as the store's location, promotion details, whether the store was open or closed, and weather information. The dataset is publicly available on platforms like Kaggle and was initially created for the Kaggle Rossmann Store Sales competition. The data is made accessible via an API for further analysis and modeling, and it is structured to help machine learning models predict future sales based on various input variables.

    Technical Details

    Dataset Structure:

    The dataset consists of three main files, each with its specific role:

    1. Train:
      This file contains the historical sales data, which is used to train machine learning models. It includes daily sales information for each store, as well as various features that could influence the sales (e.g., promotions, holidays, store type, etc.).

      https://handle.test.datacite.org/10.82556/yb6j-jw41
      PID: b1c59499-9c6e-42c2-af8f-840181e809db
    2. Test2:
      The test dataset mirrors the structure of train.csv but does not include the actual sales values (i.e., the target variable). This file is used for making predictions using the trained machine learning models. It is used to evaluate the accuracy of predictions when the true sales data is unknown.

      https://handle.test.datacite.org/10.82556/jerg-4b84
      PID: 7cbb845c-21dd-4b60-b990-afa8754a0dd9
    3. Store:
      This file provides metadata about each store, including information such as the store’s location, type, and assortment level. This data is essential for understanding the context in which the sales data is gathered.

      https://handle.test.datacite.org/10.82556/nqeg-gy34
      PID: 9627ec46-4ee6-4969-b14a-bda555fe34db
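
    Because the Store file holds per-store metadata and the Train file holds per-day sales, the two are typically joined on the Store identifier before modeling. The sketch below shows one way to do that join with the standard library only. The inline CSV content is invented for illustration, and the real files contain more columns than shown.

```python
import csv
import io

# Invented stand-ins for the Train and Store files (illustrative values only).
train_csv = io.StringIO(
    "Store,Date,Sales\n"
    "1,2015-07-31,5263\n"
    "2,2015-07-31,6064\n"
    "1,2015-07-30,5020\n"
)
store_csv = io.StringIO(
    "Store,StoreType,Assortment\n"
    "1,c,a\n"
    "2,a,a\n"
)

# Index store metadata by its Store identifier for constant-time lookup.
stores = {row["Store"]: row for row in csv.DictReader(store_csv)}

# Left join: every sales row keeps its fields and gains the store metadata.
merged = []
for row in csv.DictReader(train_csv):
    meta = stores.get(row["Store"], {})
    merged.append({**meta, **row})
```

    With pandas installed, the same join is usually written as a merge on the Store column.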

    Data Fields Description:

    • Id: A unique identifier for each (Store, Date) combination within the test set.

    • Store: A unique identifier for each store.

    • Sales: The daily turnover (target variable) for each store on a specific day (this is what you are predicting).

    • Customers: The number of customers visiting the store on a given day.

    • Open: An indicator of whether the store was open (1 = open, 0 = closed).

    • StateHoliday: Indicates if the day is a state holiday, with values like:

      • 'a' = public holiday,

      • 'b' = Easter holiday,

      • 'c' = Christmas,

      • '0' = no holiday.

    • SchoolHoliday: Indicates whether the store is affected by school closures (1 = yes, 0 = no).

    • StoreType: Differentiates between four types of stores: 'a', 'b', 'c', 'd'.

    • Assortment: Describes the level of product assortment in the store:

      • 'a' = basic,

      • 'b' = extra,

      • 'c' = extended.

    • CompetitionDistance: Distance (in meters) to the nearest competitor store.

    • CompetitionOpenSince[Month/Year]: The month and year when the nearest competitor store opened.

    • Promo: Indicates whether the store is running a promotion on a particular day (1 = yes, 0 = no).

    • Promo2: Indicates whether the store is participating in Promo2, a continuing promotion for some stores (1 = participating, 0 = not participating).

    • Promo2Since[Year/Week]: The year and calendar week when the store started participating in Promo2.

    • PromoInterval: Describes the months when Promo2 is active, e.g., "Feb,May,Aug,Nov" means the promotion starts in February, May, August, and November.
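
    The PromoInterval field is a comma-separated string, so it usually needs to be parsed before it can serve as a model feature. The sketch below turns it into a set of month numbers and checks whether a given date falls in a Promo2 start month. The helper names are hypothetical, and accepting both "Sep" and "Sept" as abbreviations is a defensive assumption about how the data spells September.

```python
from datetime import date

# Month abbreviations to month numbers; "Sept" is accepted defensively (assumption).
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Sept": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def promo2_start_months(promo2, promo_interval):
    """Return the month numbers in which Promo2 starts, or an empty set."""
    if promo2 != 1 or not promo_interval:
        return set()
    return {MONTHS[name.strip()] for name in promo_interval.split(",")}

def promo2_starts_in_month_of(day, promo2, promo_interval):
    """True if the given date falls in a month where Promo2 starts."""
    return day.month in promo2_start_months(promo2, promo_interval)

months = promo2_start_months(1, "Feb,May,Aug,Nov")
in_may = promo2_starts_in_month_of(date(2015, 5, 1), 1, "Feb,May,Aug,Nov")
in_june = promo2_starts_in_month_of(date(2015, 6, 1), 1, "Feb,May,Aug,Nov")
```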

    Software Requirements

    To work with this dataset, you will need to have specific software installed, including:

    • DBRepo Authorization: This is required to access the datasets via the DBRepo API. You may need to authenticate with an API key or login credentials to retrieve the datasets.

    • Python Libraries: Key libraries for working with the dataset include:

      • pandas for data manipulation,

      • numpy for numerical operations,

      • matplotlib and seaborn for data visualization,

      • scikit-learn for machine learning algorithms.

    Additional Resources

    Several additional resources are available for working with the dataset:

    1. Presentation:
      A presentation summarizing the exploratory data analysis (EDA), feature engineering process, and key insights from the analysis is provided. This presentation also includes visualizations that help in understanding the dataset’s trends and relationships.

    2. Jupyter Notebook:
      A Jupyter notebook, titled Retail_Sales_Prediction_Capstone_Project.ipynb, is provided, which details the entire machine learning pipeline, from data loading and cleaning to model training and evaluation.

    3. Model Evaluation Results:
      The project includes a detailed evaluation of various machine learning models, including their performance metrics like training and testing scores, Mean Absolute Percentage Error (MAPE), and Root Mean Squared Error (RMSE). This allows for a comparison of model effectiveness in forecasting sales.

    4. Trained Models (.pkl files):
      The models trained during the project are saved as .pkl files. These files contain the trained machine learning models (e.g., Random Forest, Linear Regression, etc.) that can be loaded and used to make predictions without retraining the models from scratch.

    5. sample_submission.csv:
      This file is a sample submission file that demonstrates the format of predictions expected when using the trained model. The sample_submission.csv contains predictions made on the test dataset using the trained Random Forest model. It provides an example of how the output should be structured for submission.

    These resources provide a comprehensive guide to implementing and analyzing the sales forecasting model, helping you understand the data, methods, and results in greater detail.
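
    The two error metrics named in the model evaluation results, MAPE and RMSE, can be computed as in the sketch below. This is a plain-Python illustration rather than the project's own evaluation code. Skipping days with zero actual sales in MAPE (to avoid division by zero, for example on days a store is closed) is an assumption, and the sample numbers are invented.

```python
import math

def rmse(actual, predicted):
    """Root Mean Squared Error."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, predicted):
    """Mean Absolute Percentage Error in percent, skipping zero actuals (assumption)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)

# Invented daily sales and predictions; the third day has zero sales.
actual = [100, 200, 0, 400]
predicted = [110, 180, 5, 400]

rmse_value = rmse(actual, predicted)
mape_value = mape(actual, predicted)
```

    RMSE is expressed in sales units and penalizes large misses more heavily, while MAPE is scale-free, which makes it easier to compare across stores of different sizes.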

  8. store-sales-time-series-forecasting

    • huggingface.co
    Cite
    Tiana, store-sales-time-series-forecasting [Dataset]. https://huggingface.co/datasets/t4tiana/store-sales-time-series-forecasting
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Authors
    Tiana
    Description

    Taken from this Kaggle competition:

    Dataset Description

    In this competition, you will predict sales for the thousands of product families sold at Favorita stores located in Ecuador. The training data includes dates, store and product information, whether that item was being promoted, as well as the sales numbers. Additional files include supplementary information that may be useful in building your models.

    File Descriptions and Data Field Information

    train.csv… See the full description on the dataset page: https://huggingface.co/datasets/t4tiana/store-sales-time-series-forecasting.

  9. Grocery Store Locations

    • catalog.data.gov
    • opendata.dc.gov
    • +2 more
    Updated Jul 30, 2025
    Cite
    Office of the Chief Technology Officer (2025). Grocery Store Locations [Dataset]. https://catalog.data.gov/dataset/grocery-store-locations
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Office of the Chief Technology Officer
    Description

    This dataset was originally created in 2012 by the Office of the Chief Technology Officer. OCTO staff used the Alcoholic Beverage and Cannabis Administration's (ABCA) definition of Full-Service Grocery Stores, which outlines criteria for a business to obtain licenses to sell beer, wine, and spirits. Visit abca.dc.gov for the full definition. OCTO staff then reviewed the Office of Planning DC Food Policy's 2018 Food System Assessment, which lists grocery stores in Appendix D, and compared these to the ABCA definition. This led to additional locations that meet, or come very close to, the full-service grocery store criteria. The criteria in section one of ABCA's full-service grocery store definition determined the initial locations included in this dataset. View the full assessment at dcfoodpolicycouncil.org. Since the initial creation of this dataset, OCTO and the Deputy Mayor for Planning and Economic Development (DMPED) staff confirm grocery store operations by comparing datasets from DLCP, media outlets, commercially licensed datasets, and onsite visits. Please review supplemental metadata for more details.

  10. ‘Big Mart Sales’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 12, 2021
    + more versions
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Big Mart Sales’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-big-mart-sales-132a/55ae27c6/?iid=037-342&v=presentation
    Dataset updated
    Nov 12, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Big Mart Sales’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/akashdeepkuila/big-mart-sales on 12 November 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    The data scientists at Big Mart have collected 2013 sales data for 1559 products across 10 stores in different cities. Also, certain attributes of each product and store have been defined. The aim is to build a predictive model and predict the sales of each product at a particular outlet.

    Using this model, Big Mart will try to understand the properties of products and outlets which play a key role in increasing sales.

    Please note that the data may have missing values as some stores might not report all the data due to technical glitches. Hence, it will be required to treat them accordingly.

    Content

    The dataset provides the product details and the outlet information of the products purchased, with their sales value, split into a train set (8523) and a test set (5681). Train file: CSV containing the item outlet information with sales value. Test file: CSV containing item outlet combinations for which sales need to be forecasted.

    Variable Description

    • ProductID : unique product ID
    • Weight : weight of products
    • FatContent : specifies whether the product is low on fat or not
    • Visibility : percentage of total display area of all products in a store allocated to the particular product
    • ProductType : the category to which the product belongs
    • MRP : Maximum Retail Price (listed price) of the products
    • OutletID : unique store ID
    • EstablishmentYear : year of establishment of the outlets
    • OutletSize : the size of the store in terms of ground area covered
    • LocationType : the type of city in which the store is located
    • OutletType : specifies whether the outlet is just a grocery store or some sort of supermarket
    • OutletSales : (target variable) sales of the product in the particular store
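
    Since the description warns that some values may be missing, one simple treatment is median imputation for a numeric field such as Weight. The sketch below is illustrative only. The records are invented, the helper name is hypothetical, and the median is just one of several reasonable fill strategies.

```python
from statistics import median

# Invented records using field names from the variable description above.
records = [
    {"ProductID": "P1", "Weight": 9.3, "OutletID": "O1"},
    {"ProductID": "P2", "Weight": None, "OutletID": "O1"},  # missing weight
    {"ProductID": "P3", "Weight": 17.5, "OutletID": "O2"},
    {"ProductID": "P4", "Weight": 5.9, "OutletID": "O2"},
]

def impute_median(records, field):
    """Return copies of the records with missing `field` values set to the median."""
    known = [r[field] for r in records if r[field] is not None]
    fill = median(known)
    return [{**r, field: fill if r[field] is None else r[field]} for r in records]

filled = impute_median(records, "Weight")
```

    The function returns new records, so the original list stays unchanged and can still be used to count missing entries.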

    Inspiration

    Sales of a given product at a retail store can depend both on store attributes as well as product attributes. The dataset is ideal to explore and build a data science model to predict the future sales.

    --- Original source retains full ownership of the source dataset ---

  11. ZARA UK Fashion dataset

    • crawlfeeds.com
    csv, zip
    Updated Feb 18, 2025
    Cite
    Crawl Feeds (2025). ZARA UK Fashion dataset [Dataset]. https://crawlfeeds.com/datasets/zara-uk-fashion-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    ZARA UK Fashion Dataset offers an extensive collection of fashion product data from ZARA's UK online store, providing a detailed overview of available items. This dataset is valuable for analyzing the European fashion retail market, particularly in the UK, and includes fields such as product titles, URLs, SKUs, MPNs, brands, prices, currency, images, breadcrumbs, country, availability, unique IDs, and timestamps for when the data was scraped.

    Key Features:

    • Product Details: Includes title, URL, SKU (Stock Keeping Unit), MPN (Manufacturer Part Number), and brand for each product, helping to uniquely identify and differentiate items.
    • Pricing Information: Features the price of each product along with the currency used (GBP) to understand the pricing strategies of ZARA in the UK market.
    • Visual Data: High-quality images of each product, essential for visual merchandising analysis and online consumer behavior studies.
    • Categorical Information: Breadcrumbs data provide context on the product's placement within ZARA's website structure, helping to analyze navigation and product hierarchy.
    • Geographical Focus: Specific to the UK market, making it relevant for studies on British fashion retail and consumer trends.
    • Availability Status: Includes real-time availability data, which is crucial for understanding stock levels, popular products, and restocking practices.
    • Unique Identifiers: Each product is tagged with a uniq_id, ensuring data integrity and making it easier to track and analyze over time.
    • Data Collection Timestamp: The scraped_at field records the exact time and date when the data was collected, aiding in time-based analysis of inventory and pricing.

    Potential Use Cases:

    • Market Research: Analyze UK and European fashion trends, consumer preferences, and competitive positioning within the fast fashion sector.
    • E-commerce Analysis: Study ZARA's product placement, pricing, and availability to optimize online retail strategies.
    • Stock Management: Use SKU and availability data to predict inventory needs and enhance supply chain efficiency.
    • Brand Analysis: Examine the impact of brand identity on consumer choices and product performance in the UK market.
    • Academic Research: Ideal for research projects focused on fashion retail, marketing strategies, and consumer behavior in Europe.

    Data Sources:

    The data is meticulously collected from ZARA's official UK website and other reliable retail databases, reflecting the latest product offerings and market dynamics specific to the UK and European fashion markets.

    • ZARA US Retail Products Dataset: Explore over 10,000 product records from ZARA's USA online store, including titles, prices, images, and availability.

    • Fashion Products Dataset from GAP.com: Access detailed product information from GAP's online store, featuring over 4,500 fashion items with attributes like price, brand, color, reviews, and images.

    • Myntra Fashion Products Dataset: A comprehensive dataset from Myntra.com, offering over 12,000 fashion products with detailed attributes for in-depth analysis.
  12. Zara UK Products Dataset - Complete Fashion E-commerce Data

    • crawlfeeds.com
    csv, zip
    Updated Aug 17, 2025
    Cite
    Crawl Feeds (2025). Zara UK Products Dataset - Complete Fashion E-commerce Data [Dataset]. https://crawlfeeds.com/datasets/zara-uk-products-dataset-complete-fashion-e-commerce-data
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 17, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    16,000 Zara UK Fashion Products in CSV Format

    Unlock fashion retail intelligence with our comprehensive Zara UK products dataset. This premium collection contains 16,000 products from Zara's UK online store, providing detailed insights into one of the world's leading fast-fashion retailers. Perfect for fashion trend analysis, pricing strategies, competitive research, and machine learning applications.

    Dataset Overview

    • Language: English
    • Coverage: Men's, women's, and children's fashion
    • File Size: ~30MB
    • Data Freshness: Recently collected (2025)

    Complete Data Fields Included

    Product Information

    • name: Complete product titles and descriptions
    • brand: Brand identification (Zara)
    • category: Product categories (tops, bottoms, dresses, accessories)
    • description: Detailed item descriptions and features
    • composition: Fabric composition and material details
    • breadcrumbs: Navigation path and product hierarchy

    Pricing and Promotions

    • price: Current prices in GBP
    • old_price: Original prices before discounts
    • discount: Discount percentages and savings
    • promotions: Active promotional campaigns
    • currency: GBP for UK market analysis
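
    As a rough illustration of how the price, old_price, and discount fields relate, the sketch below recomputes a discount percentage for one record. The field names come from the list above; the sample values and the helper name discount_percent are invented for illustration, and the real file may store prices as strings with currency symbols that need parsing first.

```python
from decimal import Decimal, ROUND_HALF_UP


def discount_percent(price: str, old_price: str) -> Decimal:
    """Return the percentage saved relative to old_price, rounded to one decimal place."""
    current = Decimal(price)
    original = Decimal(old_price)
    if original <= 0:
        raise ValueError("old_price must be positive")
    saved = (original - current) / original * Decimal(100)
    return saved.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Hypothetical record using the documented field names; the values are invented.
record = {"name": "Example shirt", "price": "19.99", "old_price": "29.99", "currency": "GBP"}
print(discount_percent(record["price"], record["old_price"]))
```

    Decimal is used instead of float so that currency amounts are not subject to binary rounding noise.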

    Product Attributes

    • color: Available color variations
    • sizes: Size ranges and availability
    • images: High-resolution product image URLs
    • url: Direct product page links

    Technical Fields

    • uniq_id: Unique product identifiers
    • scraped_at: Data collection timestamps

    Key Use Cases

    Fashion Trend Analysis

    • Track seasonal trends and popular styles
    • Analyze color preferences and combinations
    • Monitor fashion trend evolution
    • Predict upcoming fashion movements

    Competitive Intelligence

    • Study Zara's pricing strategies
    • Analyze product mix and category focus
    • Monitor inventory and availability patterns
    • Compare market positioning

    E-commerce Analytics

    • Category performance analysis
    • Price optimization strategies
    • Inventory planning insights
    • Customer preference mapping

    Machine Learning Applications

    • Fashion recommendation systems
    • Price prediction models
    • Trend forecasting algorithms
    • Image recognition training data

    Data Quality Features

    • Clean, Validated Data: Pre-processed and error-checked
    • Consistent Formatting: Standardized structure across records
    • No Duplicates: Unique products only
    • Complete Coverage: Entire Zara UK catalog included
    • Fresh Collection: Recently scraped for current relevance

    Target Industries

    Fashion Retailers

    • Competitive benchmarking
    • Trend adoption strategies
    • Pricing optimization
    • Product development insights

    Technology Companies

    • AI training datasets
    • Fashion analytics platforms
    • E-commerce enhancement
    • Style recommendation engines

    Market Research

    • Industry analysis reports
    • Brand performance tracking
    • Consumer behavior studies
    • Trend forecasting services

    Academic Research

    • Fashion industry studies
    • Business case studies
    • Data science applications
    • Sustainability research

    Licensing Options

    Commercial License

    • Full business usage rights
    • Team sharing permissions
    • Resale of processed insights
    • API integration allowed

    Academic License

    • Non-commercial research use
    • Educational institution sharing
    • Publication rights included
    • Discounted pricing available

    Delivery Methods

    • Instant

  13. Grocery Sales Database

    • kaggle.com
    Updated Jan 31, 2025
    Cite
    Andrex Ibiza, MBA (2025). Grocery Sales Database [Dataset]. https://www.kaggle.com/datasets/andrexibiza/grocery-sales-dataset
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 31, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Andrex Ibiza, MBA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Grocery Sales Database - Data Card

    Overview

    The Grocery Sales Database is a structured relational dataset designed for analyzing sales transactions, customer demographics, product details, employee records, and geographical information across multiple cities and countries. This dataset is ideal for data analysts, data scientists, and machine learning practitioners looking to explore sales trends, customer behaviors, and business insights.

    Database Schema

    The dataset consists of seven interconnected tables:

    File Name      | Description
    categories.csv | Defines the categories of the products.
    cities.csv     | Contains city-level geographic data.
    countries.csv  | Stores country-related metadata.
    customers.csv  | Contains information about the customers who make purchases.
    employees.csv  | Stores details of employees handling sales transactions.
    products.csv   | Stores details about the products being sold.
    sales.csv      | Contains transactional data for each sale.

    Table Descriptions

    1. categories

    Key | Column Name  | Data Type   | Description
    PK  | CategoryID   | INT         | Unique identifier for each product category.
        | CategoryName | VARCHAR(45) | Name of the product category.

    2. cities

    Key | Column Name | Data Type    | Description
    PK  | CityID      | INT          | Unique identifier for each city.
        | CityName    | VARCHAR(45)  | Name of the city.
        | Zipcode     | DECIMAL(5,0) | Zip code of the city.
    FK  | CountryID   | INT          | Reference to the corresponding country.

    3. countries

    Key | Column Name | Data Type   | Description
    PK  | CountryID   | INT         | Unique identifier for each country.
        | CountryName | VARCHAR(45) | Name of the country.
        | CountryCode | VARCHAR(2)  | Two-letter country code.

    4. customers

    Key | Column Name   | Data Type   | Description
    PK  | CustomerID    | INT         | Unique identifier for each customer.
        | FirstName     | VARCHAR(45) | First name of the customer.
        | MiddleInitial | VARCHAR(1)  | Middle initial of the customer.
        | LastName      | VARCHAR(45) | Last name of the customer.
    FK  | cityID        | INT         | City of the customer.
        | Address       | VARCHAR(90) | Residential address of the customer.

    5. employees

    Key | Column Name   | Data Type   | Description
    PK  | EmployeeID    | INT         | Unique identifier for each employee.
        | FirstName     | VARCHAR(45) | First name of the employee.
        | MiddleInitial | VARCHAR(1)  | Middle initial of the employee.
        | LastName      | VARCHAR(45) | Last name of the employee.
        | BirthDate     | DATE        | Date of birth of the employee.
        | Gender        | VARCHAR(10) | Gender of the employee.
    FK  | CityID        | INT         | Unique identifier for the city.
        | HireDate      | DATE        | Date when the employee was hired.

    6. products

    Key | Column Name | Data Type    | Description
    PK  | ProductID   | INT          | Unique identifier for each product.
        | ProductName | VARCHAR(45)  | Name of the product.
        | Price       | DECIMAL(4,0) | Price per unit of the product.
        | CategoryID  | INT          | Unique category identifier.
    Class ...
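
    The foreign keys documented above (customers to cities via cityID, cities to countries via CountryID) can be followed with ordinary joins. The sketch below is only an illustration: it builds three of the documented tables in an in-memory SQLite database, with SQLite types standing in for the listed INT/VARCHAR/DECIMAL types, and every inserted row is invented. Loading the real CSV files is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table and column names follow the data card; the SQLite types are an approximation.
conn.executescript("""
CREATE TABLE countries (CountryID INTEGER PRIMARY KEY, CountryName TEXT, CountryCode TEXT);
CREATE TABLE cities (CityID INTEGER PRIMARY KEY, CityName TEXT, Zipcode INTEGER,
                     CountryID INTEGER REFERENCES countries(CountryID));
CREATE TABLE customers (CustomerID INTEGER PRIMARY KEY, FirstName TEXT, MiddleInitial TEXT,
                        LastName TEXT, cityID INTEGER REFERENCES cities(CityID), Address TEXT);
""")

# Invented sample rows, not taken from the dataset.
conn.execute("INSERT INTO countries VALUES (1, 'Exampleland', 'EX')")
conn.execute("INSERT INTO cities VALUES (10, 'Sampleville', 12345, 1)")
conn.execute("INSERT INTO customers VALUES (100, 'Ada', 'B', 'Carter', 10, '1 Test Street')")

# Resolve each customer's city and country by following the two foreign keys.
rows = conn.execute("""
SELECT cu.CustomerID, cu.LastName, ci.CityName, co.CountryCode
FROM customers AS cu
JOIN cities AS ci ON ci.CityID = cu.cityID
JOIN countries AS co ON co.CountryID = ci.CountryID
""").fetchall()
print(rows)
```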
  14. Food and drinks items dataset from Tesco uk

    • crawlfeeds.com
    csv, zip
    Updated Aug 1, 2025
    Cite
    Crawl Feeds (2025). Food and drinks items dataset from Tesco uk [Dataset]. https://crawlfeeds.com/datasets/food-and-drinks-items-dataset-from-tesco-uk
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Aug 1, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Tesco UK Food & Drinks Dataset (CSV Format)

    Structured Grocery Data – 10,000+ Products

    The Tesco UK Food and Drinks Dataset delivers over 10,000 records of food, beverage, and household grocery items available from Tesco’s UK online store. This dataset is ideal for businesses and researchers looking to analyze product listings, monitor pricing, or build grocery-related tools and applications.

    Included Fields

    Each record includes:

    • Core Details: Name, SKU, GTIN13, price (GBP and USD), currency

    • Product Info: Pack size, ounces, serving size, availability

    • Categorization: Category, allergy information, meat-based flag

    • Metadata: Product URL for verification and linking

    Total Fields: 14
    Format: CSV (ZIP-compressed)

    💡 Use Cases

    • Nutrition & Allergen Apps – Leverage serving size and allergy data

    • Price Intelligence – Monitor Tesco grocery pricing over time

    • AI Product Matching – Match SKU/GTIN13 across multiple retailers

    • Supply Chain Enrichment – Add product context to grocery inventory systems

    • Retail Analytics – Study trends in pack sizes, availability, and product types
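
    Before matching records across retailers on GTIN13, it can help to discard malformed codes. The sketch below applies the standard GTIN-13 check-digit rule (weights 1 and 3 alternating over the first twelve digits). The function name is hypothetical, and it assumes the field is read as a 13-character string with leading zeros preserved, which the listing does not confirm.

```python
def is_valid_gtin13(code: str) -> bool:
    """Return True if code is 13 ASCII digits and its last digit is a correct check digit."""
    if len(code) != 13 or not (code.isascii() and code.isdigit()):
        return False
    digits = [int(ch) for ch in code]
    # Weights alternate 1, 3, 1, 3, ... over the first twelve digits, left to right.
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    expected_check = (10 - total % 10) % 10
    return expected_check == digits[12]


print(is_valid_gtin13("4006381333931"))
print(is_valid_gtin13("4006381333932"))
```

    A CSV reader that infers numeric types may strip leading zeros, so reading the column as text is the safer choice.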

  15. Point of Interest (POI) Data | Indonesia | Track Store Openings and Closures...

    • datarade.ai
    .csv
    + more versions
    Cite
    GapMaps, Point of Interest (POI) Data | Indonesia | Track Store Openings and Closures for Leading Retail Brands | Business Location Data | Location Data [Dataset]. https://datarade.ai/data-products/gapmaps-indonesia-point-of-interest-data-all-categories-2-gapmaps
    Explore at:
    Available download formats: .csv
    Dataset authored and provided by
    GapMaps
    Area covered
    Indonesia
    Description

    GapMaps Point of Interest (POI) Data for Indonesia includes the most up-to-date view of business location data for over 200 leading retail brands across 16 categories including fast food, cafe, fitness, supermarket/grocery. Detailed Point of Interest (POI) attributes provided for each location include:

    • Category
    • Business Name
    • Postal code
    • Latitude and longitude
    • Street address, city, and state
    • Location Status (open or closed)
    • Location Updated Date
    • Location Validated Date

    Leading brands across Fast Food, Cafe, Fitness, Supermarket/Grocery sectors are updated monthly.

    The dataset can be supplied as a CSV file or via API.
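
    A minimal sketch of how the Category and Location Status attributes might be used to count open locations per category. The column headers (category, business_name, location_status) and all sample rows are assumptions; the listing names the attributes but not the exact CSV headers or status values.

```python
import csv
import io
from collections import Counter

# Invented sample in an assumed layout; the real headers may differ.
SAMPLE_CSV = """category,business_name,location_status
Fast Food,Example Burger,open
Cafe,Sample Coffee,closed
Fast Food,Demo Chicken,open
"""


def open_locations_by_category(csv_text: str) -> Counter:
    """Count rows whose status is 'open', grouped by category."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["location_status"].strip().lower() == "open":
            counts[row["category"]] += 1
    return counts


counts = open_locations_by_category(SAMPLE_CSV)
print(counts)
```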

  16. Number of Retail Sector Non Domestic Rateable Properties - Dataset - York...

    • data.yorkopendata.org
    Updated Nov 29, 2019
    + more versions
    Cite
    (2019). Number of Retail Sector Non Domestic Rateable Properties - Dataset - York Open Data [Dataset]. https://data.yorkopendata.org/dataset/kpi-bur05
    Explore at:
    Dataset updated
    Nov 29, 2019
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Area covered
    York
    Description

    Number of Retail Sector Non Domestic Rateable Properties

  17. TESCO products dataset

    • crawlfeeds.com
    csv, zip
    Updated Aug 1, 2025
    Cite
    Crawl Feeds (2025). TESCO products dataset [Dataset]. https://crawlfeeds.com/datasets/tesco-products-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 1, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Access our extensive Tesco products dataset, featuring detailed information on a wide array of products available at Tesco.

    This comprehensive dataset includes product names, categories, descriptions, prices, and availability, offering a thorough view of Tesco's product range.

    Ideal for market analysis, competitive research, and business intelligence, this dataset enables businesses and analysts to track pricing trends, monitor inventory levels, and gain valuable insights into consumer preferences.

    Enhance your understanding of the retail landscape with this valuable collection of Tesco product data.

  18. Point of Interest (POI) Data | Asia / MENA | Monitor Store Openings and...

    • datarade.ai
    .csv
    + more versions
    Cite
    GapMaps, Point of Interest (POI) Data | Asia / MENA | Monitor Store Openings and Closures for Leading Retail Brands | Business Location Data | Location Data [Dataset]. https://datarade.ai/data-products/gapmaps-asia-and-mena-point-of-interest-poi-data-all-cat-gapmaps
    Explore at:
    Available download formats: .csv
    Dataset authored and provided by
    GapMaps
    Area covered
    India, Philippines, Singapore, Indonesia, Malaysia, Saudi Arabia
    Description

    GapMaps premium Point of Interest (POI) Data for Asia includes the most up-to-date view of store locations for over 850 leading retail brands covering 700k+ locations across Asia and MENA including Indonesia, India, Philippines, Malaysia, Singapore and Saudi Arabia.

    Point of Interest (POI) categories include Fast Food, Cafe, Fitness, Supermarket/grocery sectors which are updated monthly. Brands in other sectors are updated quarterly.

    Detailed attributes provided for each Point of Interest (POI) location include:

    • Category
    • Business Name
    • Postal code
    • Latitude and longitude
    • Street address, city, and state
    • Location Status (open or closed)
    • Location Updated Date
    • Location Validated Date

    Point of Interest (POI) datasets will be supplied as a CSV file.

  19. ALDI groceries data in CSV format

    • crawlfeeds.com
    csv, zip
    Updated Jun 25, 2024
    Cite
    Crawl Feeds (2024). ALDI groceries data in CSV format [Dataset]. https://crawlfeeds.com/datasets/aldi-groceries-data-in-csv-format
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Jun 25, 2024
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Aldi is a supermarket chain operating over 10,000 stores. The Crawl Feeds team extracted information on more than 11K grocery items from Aldi.

    Available data format: CSV

    18 data points

    The dataset is updated on request.

    Last extracted on 17 Jun 2022

    ---

    Site complexity: Difficult

  20. Data from: Retail Sales Analysis:

    • kaggle.com
    Updated Aug 6, 2024
    Cite
    Talha khalid (2024). Retail Sales Analysis: [Dataset]. https://www.kaggle.com/datasets/talhachoudary/sales-of-company/code
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Talha khalid
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    This collection of datasets is designed to provide a comprehensive overview of a retail business's operations, focusing on calendar information, customer demographics, order details, and product information. These datasets are ideal for performing in-depth sales analysis, customer segmentation, demand forecasting, and inventory management.

    Dataset Descriptions

    Calendar.csv

    Description: This file contains detailed calendar information to assist with time-based analysis. It includes important dates, such as holidays, weekends, and fiscal periods, which can be critical for analyzing sales trends, seasonality, and promotional impacts.

    Key Columns:

    • Date: The specific date.
    • Day of Week: The day of the week (e.g., Monday, Tuesday).
    • Month: The month corresponding to the date.
    • Quarter: The fiscal quarter (Q1, Q2, etc.).
    • Year: The year of the date.
    • Holiday Flag: Indicates if the date is a public holiday.

    Customer.csv

    Description: This dataset contains demographic information about the customers. It's useful for customer segmentation, lifetime value analysis, and targeted marketing campaigns.

    Key Columns:

    • Customer ID: A unique identifier for each customer.
    • Name: The full name of the customer.
    • Age: The age of the customer.
    • Gender: The gender of the customer.
    • Location: The geographic location (city/state) of the customer.
    • Loyalty Tier: The loyalty program tier of the customer (e.g., Bronze, Silver, Gold).

    Order.csv

    Description: This dataset tracks individual customer orders, including transaction details. It is essential for sales analysis, order fulfillment tracking, and revenue analysis.

    Key Columns:

    • Order ID: A unique identifier for each order.
    • Customer ID: The ID of the customer who placed the order (linking to Customer.csv).
    • Order Date: The date the order was placed.
    • Product ID: The ID of the product ordered (linking to Product.csv).
    • Quantity: The quantity of the product ordered.
    • Total Price: The total price of the order.

    Product.csv

    Description: This dataset provides detailed information on the products available in the retail store. It includes categories, pricing, and supplier information, making it useful for inventory management and product performance analysis.

    Key Columns:

    • Product ID: A unique identifier for each product.
    • Product Name: The name of the product.
    • Category: The category under which the product falls (e.g., Electronics, Clothing).
    • Supplier ID: The ID of the supplier providing the product.
    • Unit Price: The price per unit of the product.
    • Stock Quantity: The number of units available in stock.

    Usability

    These datasets can be utilized for various business analytics tasks, including:

    • Sales and Revenue Analysis: By linking the Order.csv and Product.csv, one can analyze sales performance by product category, identify best-sellers, and determine revenue drivers.
    • Customer Segmentation: Using Customer.csv, segment customers based on demographics or purchase behavior to tailor marketing efforts.
    • Demand Forecasting: Integrate Calendar.csv to model seasonality effects and predict future sales trends.

    Provenance

    These datasets are typically generated from an ERP system or CRM and are structured to support a variety of business intelligence applications. Users may need to perform data cleaning or transformation depending on the specific use case.

    Licensing and Coverage

    The datasets are provided without a specific license. Users are encouraged to verify and attribute the source as needed. Coverage typically includes the entire operational history of the retail business, though users should check for any specific time range covered.
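
    The linkage between Order.csv and Product.csv described under Usability can be sketched as a lookup on Product ID followed by a sum of Total Price per Category. The header spellings are assumed to match the key-column names listed in the description, and the sample rows are invented; the actual files may use different headers.

```python
import csv
import io
from collections import defaultdict

# Invented sample data using the documented key columns as headers (an assumption).
ORDER_CSV = """Order ID,Customer ID,Order Date,Product ID,Quantity,Total Price
1,C1,2024-01-05,P1,2,40.00
2,C2,2024-01-06,P2,1,15.50
3,C1,2024-01-07,P1,1,20.00
"""
PRODUCT_CSV = """Product ID,Product Name,Category,Supplier ID,Unit Price,Stock Quantity
P1,Sample Headphones,Electronics,S1,20.00,50
P2,Sample T-Shirt,Clothing,S2,15.50,80
"""


def revenue_by_category(order_text: str, product_text: str) -> dict:
    """Sum Total Price per product Category by joining orders to products on Product ID."""
    category_of = {
        row["Product ID"]: row["Category"]
        for row in csv.DictReader(io.StringIO(product_text))
    }
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(order_text)):
        # Orders whose product is missing from Product.csv are grouped as "Unknown".
        category = category_of.get(row["Product ID"], "Unknown")
        totals[category] += float(row["Total Price"])
    return dict(totals)


revenue = revenue_by_category(ORDER_CSV, PRODUCT_CSV)
print(revenue)
```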

Cite
Ankush Kashyap (2024). Online Retail Store Dataset [Dataset]. https://www.kaggle.com/datasets/kashyapankush/online-retail-store-dataset

Data from: Online Retail Store Dataset

Related Article
Explore at:
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 2, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ankush Kashyap
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

Dataset

This dataset was created by Ankush Kashyap

Released under Apache 2.0

Contents
