https://cdla.io/sharing-1-0/
The Superstore Sales Data dataset, available in Excel format as "Superstore.xlsx," is a comprehensive collection of sales and customer-related information from a retail superstore. This dataset comprises three distinct tables, each providing specific insights into the store's operations and customer interactions.
This dataset was created by Truong Dai
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
The dataset contains sales transaction records from a retail business specializing in office supplies, furniture, and technology products. It includes 999 transactions with detailed information on orders, customers, products, and profitability.
This dataset is commonly used for data visualization, sales analysis, and business intelligence training, particularly in tools like Tableau, Excel, and Power BI.
This dataset is ideal for practicing sales analytics, dashboard creation, and business insights. It provides a structured way to explore retail performance across products, customers, and regions.
With growing demand and cut-throat competition in the market, a Superstore Giant is seeking your knowledge in understanding what works best for them. They would like to understand which products, regions, categories and customer segments they should target or avoid.
Retail dataset of a global superstore for 4 years.
You can even take this a step further and try and build a Regression model to predict Sales or Profit.
Go crazy with the dataset, but also make sure to provide some business insights to improve.
Order ID => Unique Order ID for each Customer.
Order Date => Order Date of the product.
Ship Date => Shipping Date of the Product.
Ship Mode => Shipping Mode specified by the Customer.
Customer Name => Name of the Customer.
Segment => The segment where the Customer belongs.
State => State of residence of the Customer.
Country => Country of residence of the Customer.
Market => The market place of the product.
Region => Region where the Customer belongs.
Product ID => Unique ID of the Product.
Category => Category of the product ordered.
Sub-Category => Sub-Category of the product ordered.
Product Name => Name of the Product
Unit Price => The price for one unit.
Quantity => Quantity of the Product.
Discount => Discount provided.
Shipping Cost => The cost for shipping
Order Priority => Items shipped via priority are shipped by air which results in faster delivery times.
Sales => Sales of the Product.
Expenses => The expense is the cost of operations that a company incurs to generate revenue.
Revenue => The Revenue refers to the total earnings.
Year => Year of the Sales.
I do not own this data. I found it on the Tableau website and added some rows. All credits to the original authors/creators. For educational purposes only.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset is an expanded version of the popular "Sample - Superstore Sales" dataset, commonly used for introductory data analysis and visualization. It contains detailed transactional data for a US-based retail company, covering orders, products, and customer information.
This version is specifically designed for practicing Data Quality (DQ) and Data Wrangling skills, featuring a unique set of real-world "dirty data" problems (like those encountered in tools like SPSS Modeler, Tableau Prep, or Alteryx) that must be cleaned before any analysis or machine learning can begin.
This dataset combines the original Superstore data with 15,000 plausibly generated synthetic records, totaling 25,000 rows of transactional data. It includes 21 columns detailing:
- Order Information: Order ID, Order Date, Ship Date, Ship Mode.
- Customer Information: Customer ID, Customer Name, Segment.
- Geographic Information: Country, City, State, Postal Code, Region.
- Product Information: Product ID, Category, Sub-Category, Product Name.
- Financial Metrics: Sales, Quantity, Discount, and Profit.
This dataset is intentionally corrupted to provide a robust practice environment for data cleaning. Challenges include:
Missing/Inconsistent Values: Deliberate gaps in Profit and Discount, and multiple inconsistent entries (-- or blank) in the Region column.
Data Type Mismatches: Order Date and Ship Date are stored as text strings, and the Profit column is polluted with comma-formatted strings (e.g., "1,234.56"), forcing the entire column to be read as an object (string) type.
Categorical Inconsistencies: The Category field contains variations and typos like "Tech", "technologies", "Furni", and "OfficeSupply" that require standardization.
Outliers and Invalid Data: Extreme outliers have been added to the Sales and Profit fields, alongside a subset of transactions with an invalid Sales value of 0.
Duplicate Records: Over 200 rows are duplicated (with slight financial variations) to test your deduplication logic.
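The challenges listed above can be approached in many ways. The following is a minimal sketch using only the Python standard library, not the dataset author's method. The date format `%m/%d/%Y`, the `CATEGORY_MAP` lookup, the deduplication key (Order ID plus Product ID), and the sample rows are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical mapping from the variants named above to canonical labels.
CATEGORY_MAP = {
    "tech": "Technology",
    "technologies": "Technology",
    "technology": "Technology",
    "furni": "Furniture",
    "furniture": "Furniture",
    "officesupply": "Office Supplies",
    "office supplies": "Office Supplies",
}


def parse_profit(raw):
    """Turn a string such as "1,234.56" into a float; blank or "--" becomes None."""
    text = (raw or "").strip()
    if text in ("", "--"):
        return None
    return float(text.replace(",", ""))


def parse_date(raw, fmt="%m/%d/%Y"):
    """Parse a text date. The format string is an assumption about the file."""
    return datetime.strptime(raw.strip(), fmt).date()


def clean_category(raw):
    """Map known variants to a canonical label; unknown values pass through."""
    text = raw.strip()
    return CATEGORY_MAP.get(text.lower(), text)


def clean_region(raw):
    """Treat blank and "--" entries as missing."""
    text = (raw or "").strip()
    return None if text in ("", "--") else text


def drop_duplicates(rows, key_fields=("Order ID", "Product ID")):
    """Keep the first row per key. The key choice is an assumption."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def clean_row(row):
    """Return a copy of the row with typed dates, profit, category, and region."""
    return {
        **row,
        "Order Date": parse_date(row["Order Date"]),
        "Ship Date": parse_date(row["Ship Date"]),
        "Profit": parse_profit(row["Profit"]),
        "Category": clean_category(row["Category"]),
        "Region": clean_region(row["Region"]),
    }


# Hypothetical rows; the second duplicates the first with a slightly different profit.
raw_rows = [
    {"Order ID": "CA-1", "Product ID": "P-1", "Order Date": "03/05/2017",
     "Ship Date": "03/09/2017", "Category": "Furni", "Region": "--", "Profit": "1,234.56"},
    {"Order ID": "CA-1", "Product ID": "P-1", "Order Date": "03/05/2017",
     "Ship Date": "03/09/2017", "Category": "Furniture", "Region": "West", "Profit": "1,234.60"},
    {"Order ID": "CA-2", "Product ID": "P-2", "Order Date": "04/01/2017",
     "Ship Date": "04/03/2017", "Category": "technologies", "Region": "East", "Profit": ""},
]

cleaned = [clean_row(row) for row in drop_duplicates(raw_rows)]
```

Because the duplicated rows carry slight financial variations, an exact row comparison would miss them, which is why this sketch deduplicates on identifying fields rather than on the whole row. Which copy to keep is a separate decision that depends on the analysis.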
This dataset is ideal for:
Data Wrangling/Cleaning (Primary Focus): Fix all the intentional data quality issues before proceeding.
Exploratory Data Analysis (EDA): Analyze sales distribution by region, segment, and category.
Regression: Predict the Profit based on Sales, Discount, and product features.
Classification: Build an RFM Model (Recency, Frequency, Monetary) and create a target variable (HighValueCustomer = 1 if total sales are > $1000) to be predicted by logistic regression or decision trees.
Time Series Analysis: Aggregate sales by month/year to perform forecasting.
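As an illustration of the classification task above, here is a small sketch of how the RFM features and the HighValueCustomer label could be derived with the standard library. The sample orders, the `rfm_table` helper, and the reference date `as_of` are hypothetical; only the rule "total sales > $1000" comes from the description.

```python
from collections import defaultdict
from datetime import date

# Hypothetical cleaned rows (dates already parsed, sales already numeric).
orders = [
    {"Customer ID": "C1", "Order ID": "O1", "Order Date": date(2017, 1, 10), "Sales": 800.0},
    {"Customer ID": "C1", "Order ID": "O2", "Order Date": date(2017, 6, 2), "Sales": 450.0},
    {"Customer ID": "C2", "Order ID": "O3", "Order Date": date(2017, 3, 15), "Sales": 120.0},
]


def rfm_table(rows, as_of, threshold=1000.0):
    """Compute recency, frequency, monetary value, and the target label per customer."""
    last_order = {}
    order_ids = defaultdict(set)
    monetary = defaultdict(float)
    for row in rows:
        cid = row["Customer ID"]
        previous = last_order.get(cid, row["Order Date"])
        last_order[cid] = max(previous, row["Order Date"])
        order_ids[cid].add(row["Order ID"])  # count distinct orders, not line items
        monetary[cid] += row["Sales"]
    return {
        cid: {
            "recency_days": (as_of - last_order[cid]).days,
            "frequency": len(order_ids[cid]),
            "monetary": monetary[cid],
            "HighValueCustomer": int(monetary[cid] > threshold),
        }
        for cid in monetary
    }


table = rfm_table(orders, as_of=date(2017, 12, 31))
```

Note that the label is a direct function of the monetary feature, so a model given `monetary` as an input would learn the rule trivially; in practice one would likely predict the label from recency, frequency, and other attributes instead.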
This dataset is an expanded and corrupted derivative of the original Sample Superstore dataset, credited to Tableau and widely shared for educational purposes. All synthetic records were generated to follow the plausible distribution of the original data.
The Superstore data set comes with Tableau. It contains information about products, sales, profits, and so on that you can use to identify key areas for improvement within this fictitious company.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is a cleaned version of the Superstore Excel file, containing only the primary worksheet. It includes sales data from a fictional superstore, covering details like order ID, product categories, shipping information, profit, and region-wise performance. It is widely used for practicing data analysis, data visualization, and machine learning tasks such as forecasting and classification.
Features include:
Order Details (Order ID, Order Date, Ship Date)
Customer Information (Customer ID, Segment)
Geographic Data (City, State, Region)
Product Categories
Sales, Profit, Quantity, Discount
Shipping Mode
Ideal for learning and practicing:
Data Cleaning
EDA (Exploratory Data Analysis)
Data Visualization
Dashboarding (Tableau, Power BI)
Machine Learning Projects
https://creativecommons.org/publicdomain/zero/1.0/
This is the revised dataset based on the Excel file from tableau.com, which includes the sales data for a superstore from January 2014 to December 2017 (4 years' data).
This dataset can be used to learn time series analysis, cohort analysis, etc.
This data set contains sales data from a retail store. It is often used with Tableau.
The data set was downloaded from the Tableau data set website. It represents one calendar year.
Thank you to the Tableau Public Resources
What are the projected sales and profit for the coming year?
http://opendatacommons.org/licenses/dbcl/1.0/
Context of the Dataset
The "Superstore" dataset is a fictional retail dataset designed to simulate real-world business operations. It includes data on:
- Sales, profit, and quantity across product categories
- Customer segments and regions
- Order dates and shipping methods
- Geographic distribution of performance
Source & Structure
- Origin: Tableau's sample dataset, often bundled with Tableau Desktop
- Format: CSV or Excel file with ~10,000 rows of transactional data
- Fields: Order ID, Customer Name, Segment, Category, Sub-Category, Sales, Profit, Region, Ship Date, etc.
Inspiration & Application
- Inspired by: Tableau's training materials and real-world retail analytics
- Used for: Skill demonstration in data visualization, dashboard design, and executive reporting
- Potential Application: Retail strategy, inventory optimization, regional sales planning
https://creativecommons.org/publicdomain/zero/1.0/
This dataset represents a Snowflake Schema model built from the popular Tableau Superstore dataset which exists primarily in a denormalized (flat) format.
This version is fully structured into fact and dimension tables, making it ready for data warehouse design, SQL analytics, and BI visualization projects.
The dataset was modeled to demonstrate dimensional modeling best practices, showing how the original flat Superstore data can be normalized into related dimensions and a central fact table.
Use this dataset to:
- Practice SQL joins and schema design
- Build ETL pipelines or dbt models
- Design Power BI dashboards
- Learn data warehouse normalization (3NF to Snowflake) concepts
- Simulate enterprise data warehouse reporting environments
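To make the flat-to-snowflake idea concrete, here is a minimal sketch in plain Python that splits flat rows into a product dimension, a sub-category dimension, a category dimension, and a fact table. The table names, key scheme, and sample rows are hypothetical and are not taken from the dataset's dbt project.

```python
# Hypothetical flat (denormalized) rows in the style of the Superstore sheet.
flat_rows = [
    {"Order ID": "O1", "Product ID": "P1", "Product Name": "Desk",
     "Category": "Furniture", "Sub-Category": "Tables", "Sales": 300.0, "Quantity": 1},
    {"Order ID": "O2", "Product ID": "P2", "Product Name": "Chair",
     "Category": "Furniture", "Sub-Category": "Chairs", "Sales": 120.0, "Quantity": 2},
    {"Order ID": "O3", "Product ID": "P1", "Product Name": "Desk",
     "Category": "Furniture", "Sub-Category": "Tables", "Sales": 600.0, "Quantity": 2},
    {"Order ID": "O4", "Product ID": "P3", "Product Name": "Phone",
     "Category": "Technology", "Sub-Category": "Phones", "Sales": 500.0, "Quantity": 1},
]


def surrogate_key(table, natural_key, record):
    """Return the surrogate id for natural_key, inserting the record on first sight."""
    if natural_key not in table:
        table[natural_key] = {"id": len(table) + 1, **record}
    return table[natural_key]["id"]


def build_snowflake(rows):
    """Split flat rows into normalized dimensions and a central fact table."""
    dim_category, dim_sub_category, dim_product, fact_sales = {}, {}, {}, []
    for row in rows:
        category_id = surrogate_key(
            dim_category, row["Category"], {"name": row["Category"]})
        # The sub-category references its category: this chain is what makes
        # the schema a snowflake rather than a star.
        sub_category_id = surrogate_key(
            dim_sub_category,
            (row["Category"], row["Sub-Category"]),
            {"name": row["Sub-Category"], "category_id": category_id})
        product_id = surrogate_key(
            dim_product,
            row["Product ID"],
            {"name": row["Product Name"], "sub_category_id": sub_category_id})
        fact_sales.append({
            "order_id": row["Order ID"],
            "product_id": product_id,
            "sales": row["Sales"],
            "quantity": row["Quantity"],
        })
    return dim_category, dim_sub_category, dim_product, fact_sales


dim_category, dim_sub_category, dim_product, fact_sales = build_snowflake(flat_rows)
```

The fact table keeps only measures and foreign keys, so a report on sales by category has to join through product and sub-category, which is the kind of multi-join query this dataset is meant to exercise.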
I'm open to suggestions or improvements from the community. Feel free to share ideas on additional dimensions, measures, or transformations that could make this dataset even more useful for learning and analysis.
Transformation was done using dbt; check out the models and the entire project.
https://creativecommons.org/publicdomain/zero/1.0/
Superstore Sales Dataset (Synthetic)
Description
This dataset contains synthetic sales transaction records for a fictional retail company, modeled after the popular "Superstore" dataset commonly used in data analytics and visualization projects.
It includes 1,000 orders placed between 2018 and 2020, covering multiple product categories, customer segments, shipping modes, and geographic regions.
This dataset is ideal for practicing:
- SQL queries (aggregations, joins, window functions)
- Data visualization in Power BI, Tableau, or Excel
- Business analytics techniques such as sales trend analysis, customer segmentation, and profitability studies
Note: All data is synthetically generated and does not contain any real customer or company information.
Manufacturing dataset of a furniture superstore covering 2 years. Contains dummy data and global superstore data from http://www.tableau.com/sites/default/files/training/global_superstore.zip. Perform EDA and predict sales using the training dataset!
The dataset is easy to understand and is self-explanatory
Financial data from a superstore located in many different regions. Good for training on Tableau to build different charts, dashboards, filters, and actions.
http://opendatacommons.org/licenses/dbcl/1.0/
In the case study titled "Blinkit: Grocery Product Analysis," a dataset called 'Grocery Sales' contains 12 columns with information on sales of grocery items across different outlets. Using Tableau, you as a data analyst can uncover customer behavior insights, track sales trends, and gather feedback. These insights will drive operational improvements, enhance customer satisfaction, and optimize product offerings and store layout. Tableau enables data-driven decision-making for positive outcomes at Blinkit.
The table Grocery Sales is a .CSV file and has the following columns, details of which are as follows:
- Item_Identifier: A unique ID for each product in the dataset.
- Item_Weight: The weight of the product.
- Item_Fat_Content: Indicates whether the product is low fat or not.
- Item_Visibility: The percentage of the total display area in the store that is allocated to the specific product.
- Item_Type: The category or type of product.
- Item_MRP: The maximum retail price (list price) of the product.
- Outlet_Identifier: A unique ID for each store in the dataset.
- Outlet_Establishment_Year: The year in which the store was established.
- Outlet_Size: The size of the store in terms of ground area covered.
- Outlet_Location_Type: The type of city or region in which the store is located.
- Outlet_Type: Indicates whether the store is a grocery store or a supermarket.
- Item_Outlet_Sales: The sales of the product in the particular store. This is the outcome variable that we want to predict.
Dataset Overview
This fictional dataset, generated by ChatGPT, is designed for those interested in learning and practicing data visualization, dashboard creation, and data analysis. It contains 10,000 rows of data reflecting the inventory and sales patterns of a typical supermarket, spanning a timeframe from January 1, 2024, to June 30, 2024.
The dataset aims to mimic real-world inventory dynamics and includes product details, stock levels, sales data, supplier performance, and restocking schedules. It's perfect for creating interactive dashboards in tools like Excel, Tableau, or Power BI or for practicing data cleaning and exploratory data analysis (EDA).
Key Features
Comprehensive Columns:
- Date: Record date.
- ProductID: Unique identifier for products.
- ProductName: Product names across diverse supermarket categories.
- Category: Categories like Dairy, Meat, Produce, etc.
- Supplier: Fictional supplier names for products.
- UnitPrice: Realistic product pricing.
- StockQuantity: Current stock levels.
- StockValue: Total value of inventory for each product.
- ReorderLevel: Threshold for triggering a reorder.
- ReorderQuantity: Recommended reorder quantity.
- UnitsSold: Number of units sold.
- SalesValue: Total sales value for each product.
- LastSoldDate: Last date of sale.
- LastRestockDate: Date of the last restock.
- NextRestockDate: Scheduled date for the next restock.
- DeliveryTimeDays: Delivery lead time from suppliers.
- DeliveryStatus: Status of the latest delivery (e.g., On Time, Delayed).
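As a rough illustration of how the stock columns above might be used together, the sketch below flags products that need restocking. It assumes that StockValue equals UnitPrice times StockQuantity and that a reorder triggers when stock is at or below ReorderLevel; neither rule is stated in the description, and the sample rows are hypothetical.

```python
# Hypothetical rows using the column names listed above.
inventory = [
    {"ProductID": "P001", "ProductName": "Milk", "UnitPrice": 2.5,
     "StockQuantity": 40, "ReorderLevel": 50, "ReorderQuantity": 100},
    {"ProductID": "P002", "ProductName": "Bread", "UnitPrice": 3.0,
     "StockQuantity": 120, "ReorderLevel": 60, "ReorderQuantity": 80},
]


def stock_value(row):
    """Assumption: StockValue = UnitPrice * StockQuantity."""
    return row["UnitPrice"] * row["StockQuantity"]


def needs_reorder(row):
    """Assumption: a reorder triggers when stock is at or below the threshold."""
    return row["StockQuantity"] <= row["ReorderLevel"]


# Pair each product that needs restocking with its recommended quantity.
reorder_list = [
    (row["ProductID"], row["ReorderQuantity"])
    for row in inventory
    if needs_reorder(row)
]
```

Comparing `stock_value(row)` against the dataset's own StockValue column is one quick way to check whether the first assumption holds for the actual file.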
Realistic Data Generation:
Products include 50 common supermarket items across 9 categories (Dairy, Bakery, Beverages, Meat, Produce, Frozen, Snacks, Cleaning Supplies, Health & Beauty). Reflects seasonal trends and realistic stock replenishment behaviors. Randomized yet logical patterns for pricing, sales, and stock levels.
Versatile Use Cases:
Ideal for data visualization projects. Suitable for inventory management simulation. Can be used to practice time-series analysis.
Why Use This Dataset?
This dataset is a learning resource, crafted to provide aspiring data enthusiasts and professionals with a sandbox to hone their skills in:
Building dashboards in Tableau, Power BI, or Excel. Analyzing inventory trends and forecasting demand. Visualizing data insights using tools like Matplotlib, Seaborn, or Plotly.
Disclaimer
This dataset is entirely fictional and was generated by ChatGPT, a large language model created by OpenAI. While the data reflects patterns of a real supermarket, it is not based on any actual business or proprietary data.
Shoutout to ChatGPT for generating this comprehensive dataset and making it available to the Kaggle community!
Acknowledgments
If you find this dataset helpful, feel free to share your visualizations and insights! Let's make learning data visualization engaging and fun.