45 datasets found

Tableau Sample Superstore
kaggle.com
zip
Updated Dec 16, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Truong Dai (2021). Tableau Sample Superstore [Dataset]. https://www.kaggle.com/datasets/truongdai/tableau-sample-superstore
Explore at:
zip(1017586 bytes)Available download formats
Dataset updated
Dec 16, 2021
Authors
Truong Dai
Description
Dataset

This dataset was created by Truong Dai

Contents
Superstore Dataset
kaggle.com
zip
Updated Sep 25, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shivam Amrutkar (2023). Superstore Dataset [Dataset]. https://www.kaggle.com/datasets/yesshivam007/superstore-dataset
Explore at:
zip(2119716 bytes)Available download formats
Dataset updated
Sep 25, 2023
Authors
Shivam Amrutkar
License
https://cdla.io/sharing-1-0/https://cdla.io/sharing-1-0/
Description
The Superstore Sales Data dataset, available in an Excel format as "Superstore.xlsx," is a comprehensive collection of sales and customer-related information from a retail superstore. This dataset comprises* three distinct tables*, each providing specific insights into the store's operations and customer interactions.
Sales Data
kaggle.com
zip
Updated Jan 17, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
LenaPiter (2018). Sales Data [Dataset]. https://www.kaggle.com/lenapiter/sales-data
Explore at:
zip(241448 bytes)Available download formats
Dataset updated
Jan 17, 2018
Authors
LenaPiter
Description
Acknowledgements

Click here for original dataset: https://community.tableau.com/docs/DOC-1236
Tableau Dashboard on HR Dataset
kaggle.com
Updated Aug 24, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rupesh Patil (2023). Tableau Dashboard on HR Dataset [Dataset]. https://www.kaggle.com/datasets/rupeshpatil1997/tableau-hr-dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 24, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Rupesh Patil
Description
The HR dataset contains employee-related information, such as personal details, job roles, salaries, and performance metrics. It's used by organizations to manage human resources, make informed staffing decisions, and analyze workforce trends. The dataset aids in optimizing employee satisfaction, productivity, and organizational growth. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15826402%2F6f621dd7a72a2d8c6d0df659c6604189%2FHR%20Dashboard.jpg?generation=1692882310646646&alt=media" alt="">
Tableau: Access Socrata Data using OData
data.edmonton.ca
csv, xlsx, xml
Updated Aug 25, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Socrata Client Center (2021). Tableau: Access Socrata Data using OData [Dataset]. https://data.edmonton.ca/dataset/Tableau-Access-Socrata-Data-using-OData/jqmu-nqmc
Explore at:
csv, xlsx, xmlAvailable download formats
Dataset updated
Aug 25, 2021
Dataset provided by
Socratahttp://www.blist.com/
Authors
Socrata Client Center
Description
Socrata datasets, including private datasets, can be accessed through a unique OData endpoint, allowing users to seamlessly connect to their data through a number of different tools, including Tableau Desktop. ↙︎↙︎CLICK THE "MORE" LINK↙︎↙︎

See the links below for relevant documentation.

Note that Socrata OData endpoints support basic filtering, for example to retrieve just the currently active Edmonton Public School (EPSB) ward boundaries from the dataset that contains the historical data as well:

https://data.edmonton.ca/OData.svc/y5qu-dj6t?$filter=effdt_type eq 'Current'

To retrieve the EPSB ward boundaries as they were in 2014:

https://data.edmonton.ca/OData.svc/y5qu-dj6t?$filter=year(effective_start_date) Le 2014 and year(effective_end_date) Gt 2014

This kind of filtering may be better achieved in Tableau though.
Streaming Service Data
kaggle.com
Updated Dec 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Chad Wambles (2024). Streaming Service Data [Dataset]. https://www.kaggle.com/datasets/chadwambles/streaming-service-data
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 19, 2024
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Chad Wambles
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
A dataset I generated to showcase a sample set of user data for a fictional streaming service. This data is great for practicing SQL, Excel, Tableau, or Power BI.

1000 rows and 25 columns of connected data.

See below for column descriptions.

Enjoy :)
Tableau Dummy Dataset for Practice
kaggle.com
Updated Aug 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Piush Dave (2025). Tableau Dummy Dataset for Practice [Dataset]. https://www.kaggle.com/datasets/piyushdave/tableau-dummy-dataset-for-practice
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 21, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Piush Dave
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Domain-Specific Dataset and Visualization Guide

This package contains 20 realistic datasets in CSV format across different industries, along with 20 text files suggesting visualization ideas. Each dataset includes about 300 rows of synthetic but domain-appropriate data. They are designed for data analysis, visualization practice, machine learning projects, and dashboard building.

What’s inside

20 CSV files, one for each domain:

Education

E-Commerce

Healthcare

Finance

Retail

Social Media

Manufacturing

Sports

Transport

Hospitality

Telecom

Banking

Real Estate

Gaming

Agriculture

Automobile

Energy

Insurance

Government

Entertainment

20 TXT files, each listing 10 relevant graphing options for the dataset.

MASTER_INDEX.csv, which summarizes all domains with their column names.

Use cases

Practice data cleaning, exploration, and visualization in Excel, Tableau, Power BI, or Python.

Build dashboards for specific industries.

Train beginner-level machine learning models such as classification and regression.

Use in classroom teaching or workshops as ready-made datasets.

Example

Education dataset has columns like StudentName, Class, Subject, Marks, AttendancePercent. Suggested graphs: bar chart of average marks by subject, scatter plot of marks vs attendance percent, line chart of attendance over time.

E-Commerce dataset has columns like OrderDate, Product, Category, Price, Quantity, Total. Suggested graphs: line chart of revenue trend, bar chart of revenue by category, pie chart of payment mode share.
D
Replication Data for: Explainable reasoning with a controlled natural...
dataverse.nl
csv, pdf
Updated Aug 28, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sofia Gavasi; Nico Roos; Sofia Gavasi; Nico Roos (2025). Replication Data for: Explainable reasoning with a controlled natural language [Dataset]. http://doi.org/10.34894/0ANEPG
Explore at:
csv(1811), csv(5672), pdf(56780), csv(4932), csv(8259), csv(6372), csv(5558)Available download formats
Unique identifier
https://doi.org/10.34894/0ANEPG
Dataset updated
Aug 28, 2025
Dataset provided by
DataverseNL
Authors
Sofia Gavasi; Nico Roos; Sofia Gavasi; Nico Roos
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Six datasets that have been used to evaluate reasoning with a Controlled Natural Language. A seventh file contains the grammar specification of the Controlled Natural Language. * 'BNF Grammar.pdf' contains the grammar specification of the Controlled Natural Language in Backus-Naur Form. * 'syllogism_dataset.csv' contains the adapted Kaggle dataset. * 'puzzles_dataset.csv' contains logical puzzles found on the internet and reformulated in the Controlled Natural Language if necessary. * 'gpt_dataset_easy.csv', 'gpt_dataset_hard.csv' and 'deepseek_dataset.csv' contain reasoning examples generated by Large Language Models. * 'interface_examples.csv' contains the inference example used in the user tests.

Netflix Data: Cleaning, Analysis and Visualization

kaggle.com

zip

Updated Aug 26, 2022

Facebook

Twitter

Click to copy link

Link copied

Cite

Abdulrasaq Ariyo (2022). Netflix Data: Cleaning, Analysis and Visualization [Dataset]. https://www.kaggle.com/datasets/ariyoomotade/netflix-data-cleaning-analysis-and-visualization

Explore at:

zip(276607 bytes)Available download formats

Dataset updated

Aug 26, 2022

Authors

Abdulrasaq Ariyo

License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Description

Netflix is a popular streaming service that offers a vast catalog of movies, TV shows, and original contents. This dataset is a cleaned version of the original version which can be found here. The data consist of contents added to Netflix from 2008 to 2021. The oldest content is as old as 1925 and the newest as 2021. This dataset will be cleaned with PostgreSQL and visualized with Tableau. The purpose of this dataset is to test my data cleaning and visualization skills. The cleaned data can be found below and the Tableau dashboard can be found here .

Data Cleaning

We are going to: 1. Treat the Nulls 2. Treat the duplicates 3. Populate missing rows 4. Drop unneeded columns 5. Split columns Extra steps and more explanation on the process will be explained through the code comments

--View dataset

SELECT * 
FROM netflix;

--The show_id column is the unique id for the dataset, therefore we are going to check for duplicates
                                  
SELECT show_id, COUNT(*)                                                                                      
FROM netflix 
GROUP BY show_id                                                                                              
ORDER BY show_id DESC;

--No duplicates

--Check null values across columns

SELECT COUNT(*) FILTER (WHERE show_id IS NULL) AS showid_nulls,
    COUNT(*) FILTER (WHERE type IS NULL) AS type_nulls,
    COUNT(*) FILTER (WHERE title IS NULL) AS title_nulls,
    COUNT(*) FILTER (WHERE director IS NULL) AS director_nulls,
    COUNT(*) FILTER (WHERE movie_cast IS NULL) AS movie_cast_nulls,
    COUNT(*) FILTER (WHERE country IS NULL) AS country_nulls,
    COUNT(*) FILTER (WHERE date_added IS NULL) AS date_addes_nulls,
    COUNT(*) FILTER (WHERE release_year IS NULL) AS release_year_nulls,
    COUNT(*) FILTER (WHERE rating IS NULL) AS rating_nulls,
    COUNT(*) FILTER (WHERE duration IS NULL) AS duration_nulls,
    COUNT(*) FILTER (WHERE listed_in IS NULL) AS listed_in_nulls,
    COUNT(*) FILTER (WHERE description IS NULL) AS description_nulls
FROM netflix;

We can see that there are NULLS. 
director_nulls = 2634
movie_cast_nulls = 825
country_nulls = 831
date_added_nulls = 10
rating_nulls = 4
duration_nulls = 3

The director column nulls is about 30% of the whole column, therefore I will not delete them. I will rather find another column to populate it. To populate the director column, we want to find out if there is relationship between movie_cast column and director column

-- Below, we find out if some directors are likely to work with particular cast

WITH cte AS
(
SELECT title, CONCAT(director, '---', movie_cast) AS director_cast 
FROM netflix
)

SELECT director_cast, COUNT(*) AS count
FROM cte
GROUP BY director_cast
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

With this, we can now populate NULL rows in directors 
using their record with movie_cast

UPDATE netflix 
SET director = 'Alastair Fothergill'
WHERE movie_cast = 'David Attenborough'
AND director IS NULL ;

--Repeat this step to populate the rest of the director nulls
--Populate the rest of the NULL in director as "Not Given"

UPDATE netflix 
SET director = 'Not Given'
WHERE director IS NULL;

--When I was doing this, I found a less complex and faster way to populate a column which I will use next

Just like the director column, I will not delete the nulls in country. Since the country column is related to director and movie, we are going to populate the country column with the director column

--Populate the country using the director column

SELECT COALESCE(nt.country,nt2.country) 
FROM netflix AS nt
JOIN netflix AS nt2 
ON nt.director = nt2.director 
AND nt.show_id <> nt2.show_id
WHERE nt.country IS NULL;
UPDATE netflix
SET country = nt2.country
FROM netflix AS nt2
WHERE netflix.director = nt2.director and netflix.show_id <> nt2.show_id 
AND netflix.country IS NULL;


--To confirm if there are still directors linked to country that refuse to update

SELECT director, country, date_added
FROM netflix
WHERE country IS NULL;

--Populate the rest of the NULL in director as "Not Given"

UPDATE netflix 
SET country = 'Not Given'
WHERE country IS NULL;

The date_added rows nulls is just 10 out of over 8000 rows, deleting them cannot affect our analysis or visualization

--Show date_added nulls

SELECT show_id, date_added
FROM netflix_clean
WHERE date_added IS NULL;

--DELETE nulls

DELETE F...

Sample Super Store - Tableau
kaggle.com
zip
Updated Aug 16, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Srikanth Regadi (2021). Sample Super Store - Tableau [Dataset]. https://www.kaggle.com/srikanthregadi/sample-super-store-tableau
Explore at:
zip(7005767 bytes)Available download formats
Dataset updated
Aug 16, 2021
Authors
Srikanth Regadi
Description
Context

This data set is from Tableau which is the default sample data for all tableau dashboard development. It is extensively helpful for all analyses.

Content

Can have orders, Returns, and Customer details.

Acknowledgements

This is from Tableau

Inspiration

Combine all Orders, Returns, and customers' data and how to improve the avg. revenue on MoM and how to reduce returns. Predict the return products and losses.
Sales & Customer Analytics – Interactive dashboard
kaggle.com
zip
Updated Feb 3, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Grace Egbe (2025). Sales & Customer Analytics – Interactive dashboard [Dataset]. https://www.kaggle.com/datasets/graceegbe12/sales-and-customer-analytics-interactive-dashboard
Explore at:
zip(106656 bytes)Available download formats
Dataset updated
Feb 3, 2025
Authors
Grace Egbe
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
📊 Sales & Customer Analytics – Tableau Dashboard (PDF & Interactive) 🔍 Overview This dataset includes a Tableau project analysing sales trends & customer insights with an interactive dashboard switch.

The dashboards provide actionable insights into: ✅ Sales performance & revenue trends 📈 ✅ Top-performing products & regions 🌍 ✅ Customer segmentation & behavior analysis 🛍️ ✅ Retention strategies & marketing impact 🎯

📂 Files Included 📄 Sales & Customer Analytics Dashboard (PDF Report) – A full summary of insights. 🎨 Tableau Workbook (.twbx) – The interactive dashboards (requires Tableau). 🖼️ Screenshots – Previews of the dashboards.

🔗 Explore the Interactive Dashboards on Tableau Public :

Sales Dashboard:[https://public.tableau.com/app/profile/egbe.grace/viz/SalesCustomerDashboardsDynamic_17385906491570/CustomerDashboard] Customer Dashboard: [https://public.tableau.com/app/profile/egbe.grace/viz/SalesCustomerDashboardsDynamic_17385906491570/CustomerDashboard]

📌 Key Insights from the Dashboards ✅ Revenue trends show peak sales periods & seasonal demand shifts. ✅ Top-selling products & regions help businesses optimize their strategies. ✅ Customer segmentation identifies high-value buyers for targeted marketing. ✅ Retention analysis provides insights into repeat customer behaviour.

💡 How This Can Help: This dataset and Tableau project can help businesses & analysts uncover key patterns in sales and customer behavior, allowing them to make data-driven decisions to improve growth and customer retention.

💬 Would love to hear your feedback! Let’s discuss the impact of sales analytics in business strategy.

📢 #DataAnalytics #Tableau #SalesAnalysis #CustomerInsights #BusinessIntelligence #DataVisualization
Scooter Sales - Excel Project
kaggle.com
Updated Jun 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ann Truong (2023). Scooter Sales - Excel Project [Dataset]. https://www.kaggle.com/datasets/bvanntruong/scooter-sales-excel-project
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 8, 2023
Dataset provided by
Kaggle
Authors
Ann Truong
Description
The link for the Excel project to download can be found on GitHub here. It includes the raw data, Pivot Tables, and an interactive dashboard with Pivot Charts and Slicers. The project also includes business questions and the formulas I used to answer. The image below is included for ease. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2F61e460b5f6a1fa73cfaaa33aa8107bd5%2FBusinessQuestions.png?generation=1686190703261971&alt=media" alt=""> The link for the Tableau adjusted dashboard can be found here.

A screenshot of the interactive Excel dashboard is also included below for ease. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12904052%2Fe581f1fce8afc732f7823904da9e4cce%2FScooter%20Dashboard%20Image.png?generation=1686190815608343&alt=media" alt="">
Bank Loan Analysis Project in Tableau. twbx
kaggle.com
zip
Updated Jul 4, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sanjana Murthy (2024). Bank Loan Analysis Project in Tableau. twbx [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/bank-loan-analysis-project-in-tableau-twbx
Explore at:
zip(2258932 bytes)Available download formats
Dataset updated
Jul 4, 2024
Authors
Sanjana Murthy
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
About Datasets:

Domain : Finance Project: Bank loan of customers Datasets: Finance_1.xlsx & Finance_2.xlsx Dataset Type: Excel Data Dataset Size: Each Excel file has 39k+ records

KPI's: 1. Year wise loan amount Stats 2. Grade and sub grade wise revol_bal 3. Total Payment for Verified Status Vs Total Payment for Non Verified Status 4. State wise loan status 5. Month wise loan status 6. Get more insights based on your understanding of the data

Process: 1. Understanding the problem 2. Data Collection 3. Data Cleaning 4. Exploring and analyzing the data 5. Interpreting the results

This data contains bar chart, text, stacked bar chart, dashboard, horizontal bars, donut chart, area chart, treemap, slicers, table, image.
Car Sales Analysis Dashboard|Tableau Visualization
kaggle.com
zip
Updated Feb 4, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Safae Ahb (2025). Car Sales Analysis Dashboard|Tableau Visualization [Dataset]. https://www.kaggle.com/datasets/safaeahb/car-sales-analysis-dashboard
Explore at:
zip(1000907 bytes)Available download formats
Dataset updated
Feb 4, 2025
Authors
Safae Ahb
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
This interactive Tableau dashboard provides a detailed analysis of car sales trends from 2022 to 2023. It explores key metrics such as total sales, average car prices, and sales distribution by car type, color, and region.

Key Features: 📊 Sales Overview: Total sales, quantity, and price analysis. 📈 Monthly Trends: A time-series visualization of sales growth. 🎨 Car Color Preferences: Pie chart showing distribution by color. 🌍 Regional Sales Breakdown: Geospatial analysis of sales across the U.S. 🏆 Model-wise Performance: Sales comparison across different car brands. ⚙️ Engine & Transmission Impact: Filtering options to analyze impact by car type. This dashboard is ideal for automotive industry analysts, data enthusiasts, and business decision-makers interested in sales performance insights.

📌 Tools Used: Tableau, Data Cleaning & Preparation.
AdventureWorks Sample Mfg Database Tables
kaggle.com
zip
Updated Feb 24, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Michael Brown (2023). AdventureWorks Sample Mfg Database Tables [Dataset]. https://www.kaggle.com/datasets/universalanalyst/adventureworks-sample-mfg-database-tables
Explore at:
zip(3689556 bytes)Available download formats
Dataset updated
Feb 24, 2023
Authors
Michael Brown
Description
In order to practice writing SQL queries in a semi-realistic database, I discovered and imported Microsoft's AdventureWorks sample database into Microsoft SQL Server Express. The Adventure Works [fictious] company represents a bicycle manufacturer that sells bicycles and accessories to global markets. Queries were written for developing and testing a Tableau dashboard.

The dataset presented here represents a fraction of the entire manufacturing relational database. Tables within the dataset include product, purchasing, work order, and transaction data.

The full database sample can be found on Microsoft SQL Docs website: https://learn.microsoft.com/en-us/sql/samples/ and additionally on Github: https://github.com/microsoft/sql-server-samples
Blinkit dataset
kaggle.com
zip
Updated Jul 18, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
mukesh gadri (2024). Blinkit dataset [Dataset]. https://www.kaggle.com/datasets/mukeshgadri/blinkit-dataset
Explore at:
zip(695160 bytes)Available download formats
Dataset updated
Jul 18, 2024
Authors
mukesh gadri
License
http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/
Description
In the case study titled "Blinkit: Grocery Product Analysis," a dataset called 'Grocery Sales' contains 12 columns with information on sales of grocery items across different outlets. Using Tableau, you as a data analyst can uncover customer behavior insights, track sales trends, and gather feedback. These insights will drive operational improvements, enhance customer satisfaction, and optimize product offerings and store layout. Tableau enables data-driven decision-making for positive outcomes at Blinkit.

The table Grocery Sales is a .CSV file and has the following columns, details of which are as follows:

• Item_Identifier: A unique ID for each product in the dataset. • Item_Weight: The weight of the product. • Item_Fat_Content: Indicates whether the product is low fat or not. • Item_Visibility: The percentage of the total display area in the store that is allocated to the specific product. • Item_Type: The category or type of product. • Item_MRP: The maximum retail price (list price) of the product. • Outlet_Identifier: A unique ID for each store in the dataset. • Outlet_Establishment_Year: The year in which the store was established. • Outlet_Size: The size of the store in terms of ground area covered. • Outlet_Location_Type: The type of city or region in which the store is located. • Outlet_Type: Indicates whether the store is a grocery store or a supermarket. • Item_Outlet_Sales: The sales of the product in the particular store. This is the outcome variable that we want to predict.
Store Sales Dataset
kaggle.com
zip
Updated Sep 22, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nimisha Davis (2025). Store Sales Dataset [Dataset]. https://www.kaggle.com/datasets/drnimishadavis/store-sales-dataset
Explore at:
zip(562846 bytes)Available download formats
Dataset updated
Sep 22, 2025
Authors
Nimisha Davis
Description
This dataset contains retail sales records from a superstore, including detailed information on orders, products, categories, sales, discounts, profits, customers, and regions.

It is widely used for business intelligence, data visualization, and machine learning projects. With features such as order date, ship mode, customer segment, and geographic region, the dataset is excellent for:

Sales forecasting

Profitability analysis

Market basket analysis

Customer segmentation

Data visualization practice (Tableau, Power BI, Excel, Python, R)

Inspiration:

Great dataset for learning how to build dashboards.

Commonly used in case studies for predictive analytics and decision-making.

Source: Originally inspired by a sample dataset frequently used in Tableau training and BI case studies.
Visualizing Chicago Crime Data
kaggle.com
zip
Updated Jul 1, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Elijah Toumoua (2022). Visualizing Chicago Crime Data [Dataset]. https://www.kaggle.com/datasets/elijahtoumoua/chicago-analysis-of-crime-data-dashboard
Explore at:
zip(94861784 bytes)Available download formats
Dataset updated
Jul 1, 2022
Authors
Elijah Toumoua
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Chicago
Description
Prelude

This dataset is a cleaned version of the Chicago Crime Dataset, which can be found here. All rights for the dataset go to the original owners. The purpose of this dataset is to display my skills in visualizations and creating dashboards. To be specific, I will attempt to create a dashboard that will allow users to see metrics for a specific crime within a given year using filters and metrics. Due to this, there will not be much of a focus on the analysis of the data, but there will be portions discussing the validity of the dataset, the steps I took to clean the data, and how I organized it. The cleaned datasets can be found below, the Query (which utilized BigQuery) can be found here and the Tableau dashboard can be found here.

About the Dataset

Important Facts

The dataset comes directly from the City of Chicago's website under the page "City Data Catalog." The data is gathered directly from the Chicago Police's CLEAR (Citizen Law Enforcement Analysis and Reporting) and is updated daily to present the information accurately. This means that a crime on a specific date may be changed to better display the case. The dataset represents crimes starting all the way from 2001 to seven days prior to today's date.

Reliability

Using the ROCCC method, we can see that: * The data has high reliability: The data covers the entirety of Chicago from a little over 2 decades. It covers all the wards within Chicago and even gives the street names. While we may not have an idea for how big the sample size is, I do believe that the dataset has high reliability since it geographically covers the entirety of Chicago. * The data has high originality: The dataset was gained directly from the Chicago Police Dept. using their database, so we can say this dataset is original. * The data is somewhat comprehensive: While we do have important information such as the types of crimes committed and their geographic location, I do not think this gives us proper insights as to why these crimes take place. We can pinpoint the location of the crime, but we are limited by the information we have. How hot was the day of the crime? Did the crime take place in a neighborhood with low-income? I believe that these key factors prevent us from getting proper insights as to why these crimes take place, so I would say that this dataset is subpar with how comprehensive it is. * The data is current: The dataset is updated frequently to display crimes that took place seven days prior to today's date and may even update past crimes as more information comes to light. Due to the frequent updates, I do believe the data is current. * The data is cited: As mentioned prior, the data is collected directly from the polices CLEAR system, so we can say that the data is cited.

Processing the Data

Cleaning the Dataset

The purpose of this step is to clean the dataset such that there are no outliers in the dashboard. To do this, we are going to do the following: * Check for any null values and determine whether we should remove them. * Update any values where there may be typos. * Check for outliers and determine if we should remove them.

The following steps will be explained in the code segments below. (I used BigQuery for this so the coding will follow BigQuery's syntax) ```

Examining the dataset

There are over 7.5 million rows of data

Putting a limit so it does not take a long time to run

SELECT * FROM portfolioproject-350601.ChicagoCrime.Crime LIMIT 1000;

Seeing which points are null

There are 85,000 null points so we can exclude them as it's not a significant amount since it is only ~1.3% of the dataset

Most of the null points are in the lat and long, which we will need later

Because we don't have the full address, we can't estimate the lat and long in SQL so we will have to delete the rows with Null Data

SELECT * FROM portfolioproject-350601.ChicagoCrime.Crime WHERE unique_key IS NULL OR case_number IS NULL OR date IS NULL OR primary_type IS NULL OR location_description IS NULL OR arrest IS NULL OR longitude IS NULL OR latitude IS NULL;

Deleting all null rows

DELETE FROM portfolioproject-350601.ChicagoCrime.Crime WHERE
unique_key IS NULL OR case_number IS NULL OR date IS NULL OR primary_type IS NULL OR location_description IS NULL OR arrest IS NULL OR longitude IS NULL OR latitude IS NULL;

Checking for any duplicates in the unique keys

None to be found

SELECT unique_key, COUNT(unique_key) FROM `portfolioproject-350601.ChicagoCrime....
Advanced Superstore Project
kaggle.com
zip
Updated Nov 7, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mohammad (2025). Advanced Superstore Project [Dataset]. https://www.kaggle.com/datasets/mohammawajedali/advanced-superstore-project
Explore at:
zip(1435431 bytes)Available download formats
Dataset updated
Nov 7, 2025
Authors
Mohammad
License
http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/
Description
📊 Context of the Dataset The “Superstore” dataset is a fictional retail dataset designed to simulate real-world business operations. It includes data on: Sales, profit, and quantity across product categories Customer segments and regions Order dates and shipping methods Geographic distribution of performance

📌 Source & Structure Origin: Tableau’s sample dataset, often bundled with Tableau Desktop Format: CSV or Excel file with ~10,000 rows of transactional data Fields: Order ID, Customer Name, Segment, Category, Sub-Category, Sales, Profit, Region, Ship Date, etc.

💡 Inspiration & Application Inspired by: Tableau’s training materials and real-world retail analytics Used for: Skill demonstration in data visualization, dashboard design, and executive reporting Potential Application: Retail strategy, inventory optimization, regional sales planning
Superstore Snowflake Schema Modeling Dataset
kaggle.com
zip
Updated Oct 30, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Chik0di (2025). Superstore Snowflake Schema Modeling Dataset [Dataset]. https://www.kaggle.com/datasets/chik0di/superstore-snowflake-schema-modeling-dataset
Explore at:
zip(474167 bytes)Available download formats
Dataset updated
Oct 30, 2025
Authors
Chik0di
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset represents a Snowflake Schema model built from the popular Tableau Superstore dataset which exists primarily in a denormalized (flat) format.

This version is fully structured into fact and dimension tables, making it ready for data warehouse design, SQL analytics, and BI visualization projects.

The dataset was modeled to demonstrate dimensional modeling best practices, showing how the original flat Superstore data can be normalized into related dimensions and a central fact table.

Use this dataset to: - Practice SQL joins and schema design - Build ETL pipelines or dbt models - Design Power BI dashboards - Learn data warehouse normalization (3NF → Snowflake) concepts - Simulate enterprise data warehouse reporting environments

I’m open to suggestions or improvements from the community — feel free to share ideas on additional dimensions, measures, or transformations that could improve and make this dataset even more useful for learning and analysis.

Transformation was done using dbt, check out the models and the entire project.

Facebook

Twitter

Click to copy link

Link copied

Cite

Truong Dai (2021). Tableau Sample Superstore [Dataset]. https://www.kaggle.com/datasets/truongdai/tableau-sample-superstore

Tableau Sample Superstore

Tableau Sample Superstore Original Dataset

Explore at:

zip(1017586 bytes)Available download formats

Dataset updated

Dec 16, 2021

Authors

Truong Dai

Description

Dataset

This dataset was created by Truong Dai

Clear search

Close search

Google apps

Main menu

Tableau Sample Superstore

Dataset

Contents

Superstore Dataset

Sales Data

Acknowledgements

Tableau Dashboard on HR Dataset

Tableau: Access Socrata Data using OData

Streaming Service Data

Tableau Dummy Dataset for Practice

Replication Data for: Explainable reasoning with a controlled natural...

Netflix Data: Cleaning, Analysis and Visualization

Data Cleaning

Sample Super Store - Tableau

Context

Content

Acknowledgements

Inspiration

Sales & Customer Analytics – Interactive dashboard

Scooter Sales - Excel Project

Bank Loan Analysis Project in Tableau. twbx

Car Sales Analysis Dashboard|Tableau Visualization

AdventureWorks Sample Mfg Database Tables

Blinkit dataset

Store Sales Dataset

Visualizing Chicago Crime Data

Prelude

About the Dataset

Important Facts

Reliability

Processing the Data

Cleaning the Dataset

Examining the dataset

There are over 7.5 million rows of data

Putting a limit so it does not take a long time to run

Seeing which points are null

There are 85,000 null points so we can exclude them as it's not a significant amount since it is only ~1.3% of the dataset

Most of the null points are in the lat and long, which we will need later

Because we don't have the full address, we can't estimate the lat and long in SQL so we will have to delete the rows with Null Data

Deleting all null rows

Checking for any duplicates in the unique keys

None to be found

Advanced Superstore Project

Superstore Snowflake Schema Modeling Dataset

Tableau Sample SuperstoreSee More Versions

Tableau Sample Superstore Original Dataset

Dataset

Contents

Tableau Sample Superstore