80 datasets found

Tableau Dashboard on HR Dataset
kaggle.com
Updated Aug 24, 2023
Cite
Rupesh Patil (2023). Tableau Dashboard on HR Dataset [Dataset]. https://www.kaggle.com/datasets/rupeshpatil1997/tableau-hr-dataset
Dataset updated
Aug 24, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Rupesh Patil
Description
The HR dataset contains employee-related information, such as personal details, job roles, salaries, and performance metrics. It's used by organizations to manage human resources, make informed staffing decisions, and analyze workforce trends. The dataset aids in optimizing employee satisfaction, productivity, and organizational growth. Dashboard preview image: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15826402%2F6f621dd7a72a2d8c6d0df659c6604189%2FHR%20Dashboard.jpg?generation=1692882310646646&alt=media

Netflix Data: Cleaning, Analysis and Visualization

kaggle.com

zip

Updated Aug 26, 2022

Cite

Abdulrasaq Ariyo (2022). Netflix Data: Cleaning, Analysis and Visualization [Dataset]. https://www.kaggle.com/datasets/ariyoomotade/netflix-data-cleaning-analysis-and-visualization

Available download formats: zip (276607 bytes)

Dataset updated

Aug 26, 2022

Authors

Abdulrasaq Ariyo

License

CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/

Description

Netflix is a popular streaming service that offers a vast catalog of movies, TV shows, and original content. This dataset is a cleaned version of the original dataset, which can be found here. The data consists of titles added to Netflix from 2008 to 2021. The oldest title was released in 1925 and the newest in 2021. The dataset is cleaned with PostgreSQL and visualized with Tableau. The purpose of this project is to test my data cleaning and visualization skills. The cleaned data can be found below and the Tableau dashboard can be found here.

Data Cleaning

We are going to:
1. Treat the nulls
2. Treat the duplicates
3. Populate missing rows
4. Drop unneeded columns
5. Split columns

Extra steps and further explanation of the process are given in the code comments.

--View dataset

SELECT * 
FROM netflix;

--The show_id column is the unique id for the dataset, so we check it for duplicates
--(only ids that appear more than once are returned)

SELECT show_id, COUNT(*)
FROM netflix
GROUP BY show_id
HAVING COUNT(*) > 1
ORDER BY show_id DESC;

--No duplicates

--Check null values across columns

SELECT COUNT(*) FILTER (WHERE show_id IS NULL) AS showid_nulls,
    COUNT(*) FILTER (WHERE type IS NULL) AS type_nulls,
    COUNT(*) FILTER (WHERE title IS NULL) AS title_nulls,
    COUNT(*) FILTER (WHERE director IS NULL) AS director_nulls,
    COUNT(*) FILTER (WHERE movie_cast IS NULL) AS movie_cast_nulls,
    COUNT(*) FILTER (WHERE country IS NULL) AS country_nulls,
    COUNT(*) FILTER (WHERE date_added IS NULL) AS date_added_nulls,
    COUNT(*) FILTER (WHERE release_year IS NULL) AS release_year_nulls,
    COUNT(*) FILTER (WHERE rating IS NULL) AS rating_nulls,
    COUNT(*) FILTER (WHERE duration IS NULL) AS duration_nulls,
    COUNT(*) FILTER (WHERE listed_in IS NULL) AS listed_in_nulls,
    COUNT(*) FILTER (WHERE description IS NULL) AS description_nulls
FROM netflix;

We can see that there are NULLS. 
director_nulls = 2634
movie_cast_nulls = 825
country_nulls = 831
date_added_nulls = 10
rating_nulls = 4
duration_nulls = 3

Nulls make up about 30% of the director column, so I will not delete those rows. Instead, I will look for another column that can be used to populate it. To do that, we first check whether there is a relationship between the movie_cast column and the director column.

-- Below, we find out if some directors are likely to work with particular cast

WITH cte AS
(
SELECT title, CONCAT(director, '---', movie_cast) AS director_cast 
FROM netflix
)

SELECT director_cast, COUNT(*) AS count
FROM cte
GROUP BY director_cast
HAVING COUNT(*) > 1
ORDER BY COUNT(*) DESC;

With this, we can now populate NULL rows in the director column using the movie_cast values they are paired with.

UPDATE netflix 
SET director = 'Alastair Fothergill'
WHERE movie_cast = 'David Attenborough'
AND director IS NULL ;

--Repeat this step to populate the rest of the director nulls
--Populate the rest of the NULL in director as "Not Given"

UPDATE netflix 
SET director = 'Not Given'
WHERE director IS NULL;

--While doing this, I found a simpler and faster way to populate a column, which I will use next

As with the director column, I will not delete the nulls in country. Since country is related to director, we populate missing countries from other rows by the same director.

--Populate the country using the director column

SELECT COALESCE(nt.country,nt2.country) 
FROM netflix AS nt
JOIN netflix AS nt2 
ON nt.director = nt2.director 
AND nt.show_id <> nt2.show_id
WHERE nt.country IS NULL;

--Apply the same match as an UPDATE, copying only non-null countries

UPDATE netflix
SET country = nt2.country
FROM netflix AS nt2
WHERE netflix.director = nt2.director
AND netflix.show_id <> nt2.show_id
AND nt2.country IS NOT NULL
AND netflix.country IS NULL;
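
For readers who want to check the self-join logic outside the database, the following is a minimal Python sketch of the same idea: a row with no country borrows one from another row by the same director. The sample rows are hypothetical, and taking the first known country per director is an assumption made here so the result is deterministic; it is not necessarily what the SQL above selects when a director has several countries.

```python
# Hypothetical rows standing in for the netflix table (not real data).
rows = [
    {"show_id": "s1", "director": "Director A", "country": "India"},
    {"show_id": "s2", "director": "Director A", "country": None},
    {"show_id": "s3", "director": "Director B", "country": None},
]


def fill_country_by_director(rows):
    """Return new rows where a missing country is copied from the same director."""
    # First pass: remember the first known country seen for each director.
    known = {}
    for row in rows:
        if row["director"] is not None and row["country"] is not None:
            known.setdefault(row["director"], row["country"])

    # Second pass: fill only the rows whose country is missing.
    filled = []
    for row in rows:
        country = row["country"]
        if country is None:
            country = known.get(row["director"])  # stays None if nothing is known
        filled.append({**row, "country": country})
    return filled


result = fill_country_by_director(rows)
```

Rows whose director has no known country stay empty, which corresponds to the rows later set to 'Not Given'.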


--Check whether any rows still have no country after the update

SELECT director, country, date_added
FROM netflix
WHERE country IS NULL;

--Populate the rest of the NULLs in country as "Not Given"

UPDATE netflix 
SET country = 'Not Given'
WHERE country IS NULL;

Only 10 of the more than 8,000 rows have a null date_added, so deleting them will not materially affect our analysis or visualization.

--Show date_added nulls

SELECT show_id, date_added
FROM netflix_clean
WHERE date_added IS NULL;

--DELETE nulls

DELETE F...

Sales & Customer Analytics – Interactive dashboard
kaggle.com
zip
Updated Feb 3, 2025
Cite
Grace Egbe (2025). Sales & Customer Analytics – Interactive dashboard [Dataset]. https://www.kaggle.com/datasets/graceegbe12/sales-and-customer-analytics-interactive-dashboard
Available download formats: zip (106656 bytes)
Dataset updated
Feb 3, 2025
Authors
Grace Egbe
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
📊 Sales & Customer Analytics – Tableau Dashboard (PDF & Interactive)

🔍 Overview
This dataset includes a Tableau project analysing sales trends & customer insights with an interactive dashboard switch.

The dashboards provide actionable insights into:
✅ Sales performance & revenue trends 📈
✅ Top-performing products & regions 🌍
✅ Customer segmentation & behavior analysis 🛍️
✅ Retention strategies & marketing impact 🎯

📂 Files Included
📄 Sales & Customer Analytics Dashboard (PDF Report) – A full summary of insights.
🎨 Tableau Workbook (.twbx) – The interactive dashboards (requires Tableau).
🖼️ Screenshots – Previews of the dashboards.

🔗 Explore the Interactive Dashboards on Tableau Public:

Sales Dashboard: https://public.tableau.com/app/profile/egbe.grace/viz/SalesCustomerDashboardsDynamic_17385906491570/CustomerDashboard
Customer Dashboard: https://public.tableau.com/app/profile/egbe.grace/viz/SalesCustomerDashboardsDynamic_17385906491570/CustomerDashboard

📌 Key Insights from the Dashboards
✅ Revenue trends show peak sales periods & seasonal demand shifts.
✅ Top-selling products & regions help businesses optimize their strategies.
✅ Customer segmentation identifies high-value buyers for targeted marketing.
✅ Retention analysis provides insights into repeat customer behaviour.

💡 How This Can Help: This dataset and Tableau project can help businesses & analysts uncover key patterns in sales and customer behavior, allowing them to make data-driven decisions to improve growth and customer retention.

💬 Would love to hear your feedback! Let’s discuss the impact of sales analytics in business strategy.

📢 #DataAnalytics #Tableau #SalesAnalysis #CustomerInsights #BusinessIntelligence #DataVisualization
Bank Loan Analysis Project in Tableau. twbx
kaggle.com
zip
Updated Jul 4, 2024
Cite
Sanjana Murthy (2024). Bank Loan Analysis Project in Tableau. twbx [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/bank-loan-analysis-project-in-tableau-twbx
Available download formats: zip (2258932 bytes)
Dataset updated
Jul 4, 2024
Authors
Sanjana Murthy
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
About Datasets:

Domain: Finance
Project: Bank loan of customers
Datasets: Finance_1.xlsx & Finance_2.xlsx
Dataset Type: Excel Data
Dataset Size: Each Excel file has 39k+ records

KPIs:
1. Year-wise loan amount stats
2. Grade and sub-grade wise revol_bal
3. Total payment for Verified status vs. total payment for Non-Verified status
4. State-wise loan status
5. Month-wise loan status
6. Get more insights based on your understanding of the data

Process:
1. Understanding the problem
2. Data collection
3. Data cleaning
4. Exploring and analyzing the data
5. Interpreting the results

The workbook contains the following elements: bar chart, text, stacked bar chart, dashboard, horizontal bars, donut chart, area chart, treemap, slicers, table, image.
USA Bank Financial Data
kaggle.com
zip
Updated Jun 28, 2024
Cite
VISHAL SINGH SANGRAL (2024). USA Bank Financial Data [Dataset]. https://www.kaggle.com/datasets/vishalsinghsangral/usa-bank-financial-data
Available download formats: zip (20684 bytes)
Dataset updated
Jun 28, 2024
Authors
VISHAL SINGH SANGRAL
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset Description:

The myusabank.csv dataset contains daily financial data for a fictional bank (MyUSA Bank) over a two-year period. It includes various key financial metrics such as interest income, interest expense, average earning assets, net income, total assets, shareholder equity, operating expenses, operating income, market share, and stock price. The data is structured to simulate realistic scenarios in the banking sector, including outliers, duplicates, and missing values for educational purposes.

Potential Student Tasks:

Data Cleaning and Preprocessing:

Handle missing values, duplicates, and outliers to ensure data integrity.

Normalize or scale data as needed for analysis.

Exploratory Data Analysis (EDA):

Visualize trends and distributions of financial metrics over time.

Identify correlations between different financial indicators.

Calculating Key Performance Indicators (KPIs):

Compute metrics such as Net Interest Margin (NIM), Return on Assets (ROA), Return on Equity (ROE), and Cost-to-Income Ratio using calculated fields.

Analyze the financial health and performance of MyUSA Bank based on these KPIs.
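
The four KPIs named above can be sketched in a few lines of Python using their common textbook definitions. The dictionary keys below are hypothetical stand-ins for the metrics listed in the dataset description (the actual column names in myusabank.csv may differ), the sample values are made up, and the ratios are computed per row without any annualization.

```python
def bank_kpis(row):
    """Compute common banking ratios from one row of financial metrics."""
    # Net Interest Margin: net interest income relative to average earning assets.
    nim = (row["interest_income"] - row["interest_expense"]) / row["average_earning_assets"]
    # Return on Assets and Return on Equity: net income over the respective base.
    roa = row["net_income"] / row["total_assets"]
    roe = row["net_income"] / row["shareholder_equity"]
    # Cost-to-Income Ratio: operating expenses relative to operating income.
    cost_to_income = row["operating_expenses"] / row["operating_income"]
    return {"nim": nim, "roa": roa, "roe": roe, "cost_to_income": cost_to_income}


# Made-up sample row, for illustration only.
sample = {
    "interest_income": 500.0,
    "interest_expense": 200.0,
    "average_earning_assets": 10000.0,
    "net_income": 150.0,
    "total_assets": 15000.0,
    "shareholder_equity": 1500.0,
    "operating_expenses": 300.0,
    "operating_income": 600.0,
}

kpis = bank_kpis(sample)
```

In Tableau, each of these would typically be a calculated field over aggregated measures; the Python version is only meant to make the arithmetic explicit.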

Building Tableau Dashboards:

Design interactive dashboards to present insights and trends.

Include summary cards, bar charts, line charts, and pie charts to visualize financial performance metrics.

Forecasting and Predictive Modeling:

Use historical data to forecast future financial performance.

Apply regression or time series analysis to predict market share or stock price movements.

Business Insights and Reporting:

Interpret findings to derive actionable insights for bank management.

Prepare reports or presentations summarizing key findings and recommendations.

Educational Goals:

The dataset aims to provide hands-on experience in data preprocessing, analysis, and visualization within the context of banking and finance. It encourages students to apply data science techniques to real-world financial data, enhancing their skills in data-driven decision-making and strategic analysis.
world's librareis data - dashboard with tableau
kaggle.com
Updated Jan 20, 2023
Cite
prajapati meet (2023). world's librareis data - dashboard with tableau [Dataset]. https://www.kaggle.com/datasets/prajapatimeet/worlds-librareis-data-dashboard-with-tableau
Dataset updated
Jan 20, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
prajapati meet
Area covered
World
Description
In this project, I have made a dashboard about the world's libraries and their expenses
🎬 IMDB 2020 Top Movies – Tableau Dashboard
kaggle.com
zip
Updated Jul 30, 2025
Cite
Sahil Raj (2025). 🎬 IMDB 2020 Top Movies – Tableau Dashboard [Dataset]. https://www.kaggle.com/datasets/ssrai7/imdb-2020-top-movies-tableau-dashboard
Available download formats: zip (499974 bytes)
Dataset updated
Jul 30, 2025
Authors
Sahil Raj
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
🎬 IMDB 2020 – Tableau Dashboard Project

Dashboard Preview: https://github.com/ssrAiLab/IMDB-2020-Tableau-Dashboard/blob/main/Dashboard%20Screenshot.png?raw=true

📊 Project Overview

The IMDB Top 1000 Movies of 2020 dataset provides a rich canvas for exploring the world of cinema — and this Tableau project transforms that data into stunning visuals and insights.

I’ve designed a dynamic and visually appealing dashboard using Tableau that highlights movie trends, ratings, genres, and key metrics from 2020’s cinematic landscape.

🧠 Key Insights Covered

✅ Top 20 Movies by IMDB Rating
✅ Distribution of Movies by Genre
✅ Top Directors with Most Hits
✅ Language & Country-wise Movie Count
✅ Gross Earnings vs Ratings
✅ Runtime Distribution Analysis
✅ Certificate-wise Movie Breakdown
✅ Year-wise Trend in Popularity

🛠️ Tools & Technologies Used

Tableau Public – for creating the interactive dashboard

Excel – for data cleaning and transformation

Kaggle & GitHub – for hosting and sharing the project

Design Thinking – for dashboard layout and visual balance

🗂️ Files Included

File: Description
IMDB_2020_Dashboard.twb: Tableau workbook file
imdb_top_1000.csv: Cleaned dataset used
Dashboard Screenshot.png: Snapshot of the final dashboard
archive.zip: Contains all the files in one place

🚀 How to Use

Download the .twb file from this dataset

Open it in Tableau Desktop or Tableau Public (free version)

Explore the dashboard and insights interactively

Customize or expand the analysis with your own creativity

👨‍💻 About the Creator

Sahil Raj
Data Analyst | Tableau Storyteller | Movie Enthusiast 🎥
🔗 LinkedIn | GitHub | Kaggle

“Cinema is more than entertainment — it’s culture, storytelling, and data waiting to be visualized.”

⭐ Show Some Love

If you like the project, give it an upvote 💖

Share your feedback or forks

Connect on LinkedIn or GitHub for collaborations

📌 This project is for educational and portfolio purposes only. IMDB data is publicly available and curated for non-commercial use.
Superstore Dataset
kaggle.com
zip
Updated Sep 25, 2023
Cite
Shivam Amrutkar (2023). Superstore Dataset [Dataset]. https://www.kaggle.com/datasets/yesshivam007/superstore-dataset
Available download formats: zip (2119716 bytes)
Dataset updated
Sep 25, 2023
Authors
Shivam Amrutkar
License
CDLA-Sharing-1.0: https://cdla.io/sharing-1-0/
Description
The Superstore Sales Data dataset, available in Excel format as "Superstore.xlsx," is a comprehensive collection of sales and customer-related information from a retail superstore. This dataset comprises three distinct tables, each providing specific insights into the store's operations and customer interactions.
Car Sales Analysis Dashboard|Tableau Visualization
kaggle.com
zip
Updated Feb 4, 2025
Cite
Safae Ahb (2025). Car Sales Analysis Dashboard|Tableau Visualization [Dataset]. https://www.kaggle.com/datasets/safaeahb/car-sales-analysis-dashboard
Available download formats: zip (1000907 bytes)
Dataset updated
Feb 4, 2025
Authors
Safae Ahb
License
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Description
This interactive Tableau dashboard provides a detailed analysis of car sales trends from 2022 to 2023. It explores key metrics such as total sales, average car prices, and sales distribution by car type, color, and region.

Key Features:
📊 Sales Overview: Total sales, quantity, and price analysis.
📈 Monthly Trends: A time-series visualization of sales growth.
🎨 Car Color Preferences: Pie chart showing distribution by color.
🌍 Regional Sales Breakdown: Geospatial analysis of sales across the U.S.
🏆 Model-wise Performance: Sales comparison across different car brands.
⚙️ Engine & Transmission Impact: Filtering options to analyze impact by car type.

This dashboard is ideal for automotive industry analysts, data enthusiasts, and business decision-makers interested in sales performance insights.

📌 Tools Used: Tableau, Data Cleaning & Preparation.
Atlix - Data Cleaning to Data Viz
kaggle.com
zip
Updated Apr 8, 2023
Cite
vikram amin (2023). Atlix - Data Cleaning to Data Viz [Dataset]. https://www.kaggle.com/datasets/vikramamin/atlix-data-cleaning-to-data-viz
Available download formats: zip (177969 bytes)
Dataset updated
Apr 8, 2023
Authors
vikram amin
License
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Description
The dataset and the dashboard are based on material from the YouTube channel codebasics. I have used a fictitious company called Atlix, where the Sales Director wants the sales data in a proper format that can help with decision-making.

We have a total of 5 tables: customers, products, markets, date, and transactions. The data is exported from MySQL to Tableau.

In Tableau, inner joins were used.

In the transactions table, we notice that some sales amount figures are either negative or zero while the sales qty is 1 or more. This cannot be right. Therefore, we filter the sales amount in Tableau by setting its minimum to 1.

When the currency column from the transactions table was grouped in MySQL, both ‘USD’ and ‘INR’ showed up. We cannot have sales data in two currencies. This was rectified by converting the USD sales amounts into INR at the latest exchange rate of Rs. 81.

We make the above change in Tableau by creating a new calculated field called ‘Normalised Sales Amount’: IF [Currency] = 'USD' THEN [Sales Amount] * 81 ELSE [Sales Amount] END.
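
The two cleaning rules described above (keep only sales amounts of at least 1, and convert USD amounts to INR at a rate of 81) can be sketched in Python as follows. The field names `sales_amount` and `currency` are hypothetical, the sample transactions are made up, and reading the currency from a separate currency field is an assumption based on the description of the transactions table.

```python
USD_TO_INR = 81  # exchange rate quoted in the description


def normalised_sales_amount(amount, currency):
    """Express a sales amount in INR, converting only USD rows."""
    return amount * USD_TO_INR if currency == "USD" else amount


# Made-up transactions, for illustration only.
transactions = [
    {"sales_amount": 100, "currency": "USD"},
    {"sales_amount": 500, "currency": "INR"},
    {"sales_amount": 0, "currency": "INR"},   # dropped: amount below 1
    {"sales_amount": -1, "currency": "INR"},  # dropped: amount below 1
]

# Keep rows with a sales amount of at least 1 and add the normalised value.
cleaned = [
    {**t, "normalised_sales_amount": normalised_sales_amount(t["sales_amount"], t["currency"])}
    for t in transactions
    if t["sales_amount"] >= 1
]
```

A fixed rate keeps the example simple; a real conversion would normally use the rate in effect on each transaction date.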

Conclusion: The dashboard is interactive, with filters. For example, clicking on Mumbai under “Sales by Markets” changes the other charts so that they show only the results for Mumbai. The same can be done by year, month, customer, product, etc. A parameter with a filter has also been created for top customers and top products. This produces a slider that can be used to view the top 10 customers and products and adjust the number shown.

The following information can be passed on to the sales team or director.

Total Sales: Sales from Jun’17 to Feb’20 were INR 12.83 million. Sales revenue dropped 57% from 2018 to 2019. The year 2020 has not been considered, as it accounts for only 2 months of data.
Markets: Mumbai, the top-performing market, accounts for 51% of total sales and saw a drop in sales of almost 64% from 2018 to 2019.
Top Customers: Path was in 2nd position in terms of sales in 2018. It accounted for 19% of total sales, after Electricalslytical, which accounted for 21%. But in 2019, Electricalslytical and Path were the 2nd and 4th highest customers by sales, respectively.
By targeting specific markets and customers through new ideas such as promotions and discounts, we can look to reverse the trend of decreasing sales.
AirBnb NYC Storytelling
kaggle.com
zip
Updated Jan 4, 2023
Cite
Priyanka Akavaram (2023). AirBnb NYC Storytelling [Dataset]. https://www.kaggle.com/datasets/priyankaakavaram/airbnb-nyc-storytelling
Available download formats: zip (4793817 bytes)
Dataset updated
Jan 4, 2023
Authors
Priyanka Akavaram
Area covered
New York
Description
The different leaders at Airbnb want to understand some important insights, based on various attributes in the dataset, so as to increase revenue, such as:

- Which type of hosts to acquire more of, and where?
- The categorization of customers based on their preferences.
- What are the neighborhoods they need to target?
- What are the pricing ranges preferred by customers?
- The various kinds of properties that exist w.r.t. customer preferences.
- Adjustments in the existing properties to make them more customer-oriented.
- What are the most famous localities and properties in New York currently?
- How to get unpopular properties more traction?
- and so on...

To prepare for the next best steps Airbnb needs to take as a business, you have been asked to analyze a dataset of various Airbnb listings in New York. Based on this analysis, two presentations need to be given to the following groups:
1. Data Analysis Managers and Lead Data Analyst
2. Head of Acquisitions and Operations, NYC, and Head of User Experience, NYC.
Brand Affiliate Dataset (TABLEAU)
kaggle.com
zip
Updated Apr 2, 2024
Cite
Kehinde Y Adediran (2024). Brand Affiliate Dataset (TABLEAU) [Dataset]. https://www.kaggle.com/datasets/deremmy/brand-affiliate-dataset
Available download formats: zip (239419 bytes)
Dataset updated
Apr 2, 2024
Authors
Kehinde Y Adediran
License
CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/
Description
The Brand Affiliate Dataset is a comprehensive collection that focuses on a 3-month feedback analysis, from January to March 2024, for five (5) brands (Nesine, Bilyoner, Idda, Betboo, and CommissionLounge).

These brands provided various insights into affiliate impressions, clicks, signups, and earnings, offering valuable analysis of a paid advertising/marketing campaign.

Several measures were generated to ascertain the performance KPIs of each of the products:
- Return on Investment (ROI)
- New Customer Acquisition Rate
- Earnings per Click (EPC)
- Net Revenue per Click (RPC)
- Conversion Rate (CR)
- Click-Through Rate (CTR)
- Net Revenue

Metrics/Columns: Brand
Brand ID
Month and Year
Affiliate
Impressions Clicks
Signups NDC fdt Net Revenue Earnings
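
Several of the KPIs listed above are simple ratios of the dataset's columns. The sketch below uses commonly cited definitions and is not taken from the author's workbook: treating conversion rate as signups per click is an assumption, the parameter names are hypothetical, and the sample figures are made up. ROI is left out because it needs a cost figure that the column list does not show.

```python
def safe_div(numerator, denominator):
    """Divide, returning None instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else None


def affiliate_kpis(impressions, clicks, signups, earnings, net_revenue):
    """Compute click-based affiliate ratios for one brand and month."""
    return {
        "ctr": safe_div(clicks, impressions),        # Click-Through Rate
        "conversion_rate": safe_div(signups, clicks),  # assumed: signups per click
        "epc": safe_div(earnings, clicks),           # Earnings per Click
        "net_rpc": safe_div(net_revenue, clicks),    # Net Revenue per Click
    }


# Made-up monthly figures, for illustration only.
kpis = affiliate_kpis(
    impressions=10000, clicks=500, signups=50, earnings=250.0, net_revenue=1000.0
)
no_traffic = affiliate_kpis(
    impressions=0, clicks=0, signups=0, earnings=0.0, net_revenue=0.0
)
```

Returning None for a zero denominator avoids division errors for affiliates with no impressions or clicks in a given month.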
Human Resources Data Set
kaggle.com
zip
Updated Oct 19, 2020
Cite
Dr. Rich (2020). Human Resources Data Set [Dataset]. https://www.kaggle.com/datasets/rhuebner/human-resources-data-set/discussion
Available download formats: zip (17041 bytes)
Dataset updated
Oct 19, 2020
Authors
Dr. Rich
Description
Updated 30 January 2023

Version 14 of Dataset

License Update:

There has been some confusion around licensing for this data set. Dr. Carla Patalano and Dr. Rich Huebner are the original authors of this dataset.

We provide a license to anyone who wishes to use this dataset for learning or teaching. For the purposes of sharing, please follow this license:

CC-BY-NC-ND This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Codebook

https://rpubs.com/rhuebner/hrd_cb_v14

PLEASE NOTE -- I recently updated the codebook - please use the above link. A few minor discrepancies were identified between the codebook and the dataset. Please feel free to contact me through LinkedIn (www.linkedin.com/in/RichHuebner) to report discrepancies and make requests.

Context

HR data can be hard to come by, and HR professionals generally lag behind with respect to analytics and data visualization competency. Thus, Dr. Carla Patalano and I set out to create our own HR-related dataset, which is used in one of our graduate MSHRM courses called HR Metrics and Analytics, at New England College of Business. We created this data set ourselves. We use the data set to teach HR students how to use and analyze the data in Tableau Desktop - a data visualization tool that's easy to learn.

This version provides a variety of features that are useful for both data visualization AND creating machine learning / predictive analytics models. We are working on expanding the data set even further by generating even more records and a few additional features. We will be keeping this as one file/one data set for now. There is a possibility of creating a second file perhaps down the road where you can join the files together to practice SQL/joins, etc.

Note that this dataset isn't perfect. By design, there are some issues that are present. It is primarily designed as a teaching data set - to teach human resources professionals how to work with data and analytics.

Content

We have reduced the complexity of the dataset down to a single data file (v14). The CSV revolves around a fictitious company and the core data set contains names, DOBs, age, gender, marital status, date of hire, reasons for termination, department, whether they are active or terminated, position title, pay rate, manager name, and performance score.

Recent additions to the data include:
- Absences
- Most Recent Performance Review Date
- Employee Engagement Score

Acknowledgements

Dr. Carla Patalano provided the baseline idea for creating this synthetic data set, which has been used now by over 200 Human Resource Management students at the college. Students in the course learn data visualization techniques with Tableau Desktop and use this data set to complete a series of assignments.

Inspiration

We've included some open-ended questions that you can explore and try to address through creating Tableau visualizations, or R or Python analyses. Good luck and enjoy the learning!

Is there any relationship between who a person works for and their performance score?

What is the overall diversity profile of the organization?

What are our best recruiting sources if we want to ensure a diverse organization?

Can we predict who is going to terminate and who isn't? What level of accuracy can we achieve on this?

Are there areas of the company where pay is not equitable?

There are so many other interesting questions that could be addressed through this interesting data set. Dr. Patalano and I look forward to seeing what we can come up with.

If you have any questions or comments about the dataset, please do not hesitate to reach out to me on LinkedIn: http://www.linkedin.com/in/RichHuebner

You can also reach me via email at: Richard.Huebner@go.cambridgecollege.edu
Blinkit dataset
kaggle.com
zip
Updated Jul 18, 2024
Cite
mukesh gadri (2024). Blinkit dataset [Dataset]. https://www.kaggle.com/datasets/mukeshgadri/blinkit-dataset
Available download formats: zip (695160 bytes)
Dataset updated
Jul 18, 2024
Authors
mukesh gadri
License
Open Data Commons Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/
Description
In the case study titled "Blinkit: Grocery Product Analysis," a dataset called 'Grocery Sales' contains 12 columns with information on sales of grocery items across different outlets. Using Tableau, you as a data analyst can uncover customer behavior insights, track sales trends, and gather feedback. These insights will drive operational improvements, enhance customer satisfaction, and optimize product offerings and store layout. Tableau enables data-driven decision-making for positive outcomes at Blinkit.

The table Grocery Sales is a .CSV file and has the following columns, details of which are as follows:

• Item_Identifier: A unique ID for each product in the dataset.
• Item_Weight: The weight of the product.
• Item_Fat_Content: Indicates whether the product is low fat or not.
• Item_Visibility: The percentage of the total display area in the store that is allocated to the specific product.
• Item_Type: The category or type of product.
• Item_MRP: The maximum retail price (list price) of the product.
• Outlet_Identifier: A unique ID for each store in the dataset.
• Outlet_Establishment_Year: The year in which the store was established.
• Outlet_Size: The size of the store in terms of ground area covered.
• Outlet_Location_Type: The type of city or region in which the store is located.
• Outlet_Type: Indicates whether the store is a grocery store or a supermarket.
• Item_Outlet_Sales: The sales of the product in the particular store. This is the outcome variable that we want to predict.
World Hapiness Report Analysis (Py, SPSS, Tableau)
kaggle.com
zip
Updated May 3, 2023
Cite
Abdullah Muhammad Al Kamal (2023). World Hapiness Report Analysis (Py, SPSS, Tableau) [Dataset]. https://www.kaggle.com/datasets/abdullahalkamal/world-hapiness-report-2015-2019
Available download formats: zip (34758 bytes)
Dataset updated
May 3, 2023
Authors
Abdullah Muhammad Al Kamal
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Area covered
World
Description
Context The World Happiness Report is a landmark survey of the state of global happiness. The first report was published in 2012, the second in 2013, the third in 2015, and the fourth in the 2016 Update. The World Happiness 2017, which ranks 155 countries by their happiness levels, was released at the United Nations at an event celebrating International Day of Happiness on March 20th. The report continues to gain global recognition as governments, organizations and civil society increasingly use happiness indicators to inform their policy-making decisions. Leading experts across fields – economics, psychology, survey analysis, national statistics, health, public policy and more – describe how measurements of well-being can be used effectively to assess the progress of nations. The reports review the state of happiness in the world today and show how the new science of happiness explains personal and national variations in happiness.

Content The happiness scores and rankings use data from the Gallup World Poll. The scores are based on answers to the main life evaluation question asked in the poll. This question, known as the Cantril ladder, asks respondents to think of a ladder with the best possible life for them being a 10 and the worst possible life being a 0 and to rate their own current lives on that scale. The scores are from nationally representative samples for the years 2013-2016 and use the Gallup weights to make the estimates representative. The columns following the happiness score estimate the extent to which each of six factors – economic production, social support, life expectancy, freedom, absence of corruption, and generosity – contribute to making life evaluations higher in each country than they are in Dystopia, a hypothetical country that has values equal to the world’s lowest national averages for each of the six factors. They have no impact on the total score reported for each country, but they do explain why some countries rank higher than others.

Indicators/Factors explained:
1. Rank: the country ranking
2. Score: the happiness score of the country
3. GDP: the gross domestic product of the country
4. Family: an indicator of the family support available to each citizen in the country
5. Life Expectancy: shows the healthiness level of the country
6. Freedom: an indicator of citizens' freedom to choose their life path, job, etc.
7. Trust: the level of trust citizens have in the government (influenced by the corruption level and performance of the government)
8. Generosity: an indicator of the generosity level of the country's citizens

Source: The World Happiness Report is a publication of the Sustainable Development Solutions Network, powered by the Gallup World Poll data.
Hotel Reservations Data
kaggle.com
zip
Updated Mar 4, 2024
Dimitris Angelides (2024). Hotel Reservations Data [Dataset]. https://www.kaggle.com/datasets/dimitrisangelide/hotel-reservations-data
Explore at:
Available download formats: zip (2567615 bytes)
Dataset updated
Mar 4, 2024
Authors
Dimitris Angelides
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Tourism and travel account for more than 10% of global GDP, and the sector's share is trending upward. At the same time, the industry generates a huge volume of data, and taking advantage of it can help businesses stand out from the crowd.

Content

The dataset provides reservations data for two consecutive seasons (2021 - 2023) of a luxury hotel.

Source

ChatGPT 3.5 (OpenAI) is the main creator of the dataset. I made minor adjustments to ensure that the dataset contains the desired fields and values.

Inspiration

• How effectively is the hotel performing across key metrics?
• How are bookings distributed across different channels (e.g., Booking Platform, Phone, Walk-in, and Website)?
• What is the current occupancy rate and how does it compare to the same period last year?
• What are the demographics of the current guests (e.g., nationality)?
• What is the average daily rate (ADR) per room?

These are examples of interesting questions that could be answered by analyzing this dataset.
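As a rough illustration of how some of these questions could be answered, the Python sketch below computes bookings per channel, ADR, and occupancy rate. It uses the common definitions (ADR as room revenue divided by room nights sold, occupancy as room nights sold divided by room nights available). The `Reservation` fields and the sample values are hypothetical and do not reflect the dataset's actual columns.

```python
from dataclasses import dataclass


@dataclass
class Reservation:
    channel: str         # hypothetical field, e.g. "Website" or "Phone"
    nights: int          # room nights booked
    room_revenue: float  # total room revenue for the stay


def bookings_by_channel(reservations):
    """Count reservations per booking channel."""
    counts = {}
    for r in reservations:
        counts[r.channel] = counts.get(r.channel, 0) + 1
    return counts


def average_daily_rate(reservations):
    """ADR = total room revenue / total room nights sold."""
    nights = sum(r.nights for r in reservations)
    revenue = sum(r.room_revenue for r in reservations)
    return revenue / nights if nights else 0.0


def occupancy_rate(reservations, rooms_available, days):
    """Occupancy = room nights sold / room nights available in the period."""
    capacity = rooms_available * days
    sold = sum(r.nights for r in reservations)
    return sold / capacity if capacity else 0.0


# Invented sample data for demonstration only.
SAMPLE = [
    Reservation("Website", 3, 600.0),
    Reservation("Phone", 2, 500.0),
    Reservation("Website", 5, 900.0),
]

channels = bookings_by_channel(SAMPLE)
adr = average_daily_rate(SAMPLE)
occupancy = occupancy_rate(SAMPLE, rooms_available=10, days=2)
print(channels, adr, occupancy)
```

A year-over-year comparison would call `occupancy_rate` once per period and compare the two results.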

If you are interested, please have a look at the Tableau dashboard that I have created to help answer the above questions. Tableau dashboard: https://public.tableau.com/app/profile/dimitris.angelides/viz/HotelExecutiveDashboards/HotelExecutiveSummaryReport?publish=yes
Visualizing Chicago Crime Data
kaggle.com
zip
Updated Jul 1, 2022
Elijah Toumoua (2022). Visualizing Chicago Crime Data [Dataset]. https://www.kaggle.com/datasets/elijahtoumoua/chicago-analysis-of-crime-data-dashboard
Explore at:
Available download formats: zip (94861784 bytes)
Dataset updated
Jul 1, 2022
Authors
Elijah Toumoua
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Chicago
Description
Prelude

This dataset is a cleaned version of the Chicago Crime Dataset, which can be found here. All rights for the dataset go to the original owners. The purpose of this dataset is to display my skills in visualizations and creating dashboards. To be specific, I will attempt to create a dashboard that will allow users to see metrics for a specific crime within a given year using filters and metrics. Due to this, there will not be much of a focus on the analysis of the data, but there will be portions discussing the validity of the dataset, the steps I took to clean the data, and how I organized it. The cleaned datasets can be found below, the Query (which utilized BigQuery) can be found here and the Tableau dashboard can be found here.

About the Dataset

Important Facts

The dataset comes directly from the City of Chicago's website under the page "City Data Catalog." The data is gathered directly from the Chicago Police's CLEAR (Citizen Law Enforcement Analysis and Reporting) and is updated daily to present the information accurately. This means that a crime on a specific date may be changed to better display the case. The dataset represents crimes starting all the way from 2001 to seven days prior to today's date.

Reliability

Using the ROCCC method, we can see that:
* The data has high reliability: The data covers the entirety of Chicago over a little more than two decades. It covers all the wards within Chicago and even gives the street names. While we may not have an idea of how big the sample size is, I do believe that the dataset has high reliability since it geographically covers the entirety of Chicago.
* The data has high originality: The dataset was obtained directly from the Chicago Police Dept. using their database, so we can say this dataset is original.
* The data is somewhat comprehensive: While we do have important information such as the types of crimes committed and their geographic location, I do not think this gives us proper insights as to why these crimes take place. We can pinpoint the location of the crime, but we are limited by the information we have. How hot was the day of the crime? Did the crime take place in a low-income neighborhood? I believe that the absence of these key factors prevents us from getting proper insights as to why these crimes take place, so I would say that this dataset is subpar in how comprehensive it is.
* The data is current: The dataset is updated frequently to display crimes that took place seven days prior to today's date and may even update past crimes as more information comes to light. Due to the frequent updates, I do believe the data is current.
* The data is cited: As mentioned prior, the data is collected directly from the police's CLEAR system, so we can say that the data is cited.

Processing the Data

Cleaning the Dataset

The purpose of this step is to clean the dataset such that there are no outliers in the dashboard. To do this, we are going to do the following:
* Check for any null values and determine whether we should remove them.
* Update any values where there may be typos.
* Check for outliers and determine if we should remove them.
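For readers who prefer to prototype outside BigQuery, the null, outlier, and duplicate checks can be sketched in plain Python on a small in-memory sample. This is an illustrative alternative, not the author's method. The column names follow the dataset's fields, while the sample rows and the rough Chicago bounding box are assumptions made for this sketch. Typo correction is not shown.

```python
# Rough bounding box for Chicago; these limits are an assumption for this sketch.
LAT_RANGE = (41.6, 42.1)
LON_RANGE = (-88.0, -87.5)
REQUIRED = ("unique_key", "primary_type", "latitude", "longitude")


def drop_null_rows(rows, required=REQUIRED):
    """Keep only rows where every required field is present."""
    return [r for r in rows if all(r.get(f) is not None for f in required)]


def is_location_outlier(row):
    """True if the coordinates fall outside the assumed bounding box."""
    lat_ok = LAT_RANGE[0] <= row["latitude"] <= LAT_RANGE[1]
    lon_ok = LON_RANGE[0] <= row["longitude"] <= LON_RANGE[1]
    return not (lat_ok and lon_ok)


def duplicate_keys(rows, key="unique_key"):
    """Return the set of key values that appear more than once."""
    seen, dupes = set(), set()
    for row in rows:
        if row[key] in seen:
            dupes.add(row[key])
        seen.add(row[key])
    return dupes


# Invented sample rows for demonstration only.
SAMPLE_ROWS = [
    {"unique_key": 1, "primary_type": "THEFT", "latitude": 41.88, "longitude": -87.63},
    {"unique_key": 2, "primary_type": "BATTERY", "latitude": None, "longitude": None},
    {"unique_key": 3, "primary_type": "THEFT", "latitude": 36.62, "longitude": -91.69},
    {"unique_key": 1, "primary_type": "ASSAULT", "latitude": 41.75, "longitude": -87.60},
]

clean = drop_null_rows(SAMPLE_ROWS)
outliers = [r["unique_key"] for r in clean if is_location_outlier(r)]
dupes = duplicate_keys(clean)
print(len(clean), outliers, dupes)
```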

The following steps will be explained in the code segments below. (I used BigQuery, so the code follows BigQuery's syntax.)

```
-- Examining the dataset
-- There are over 7.5 million rows of data
-- Putting a limit so it does not take a long time to run
SELECT *
FROM `portfolioproject-350601.ChicagoCrime.Crime`
LIMIT 1000;

-- Seeing which points are null
-- There are 85,000 null points so we can exclude them as it's not a significant amount since it is only ~1.3% of the dataset
-- Most of the null points are in the lat and long, which we will need later
-- Because we don't have the full address, we can't estimate the lat and long in SQL so we will have to delete the rows with Null Data
SELECT *
FROM `portfolioproject-350601.ChicagoCrime.Crime`
WHERE unique_key IS NULL
   OR case_number IS NULL
   OR date IS NULL
   OR primary_type IS NULL
   OR location_description IS NULL
   OR arrest IS NULL
   OR longitude IS NULL
   OR latitude IS NULL;

-- Deleting all null rows
DELETE FROM `portfolioproject-350601.ChicagoCrime.Crime`
WHERE unique_key IS NULL
   OR case_number IS NULL
   OR date IS NULL
   OR primary_type IS NULL
   OR location_description IS NULL
   OR arrest IS NULL
   OR longitude IS NULL
   OR latitude IS NULL;

-- Checking for any duplicates in the unique keys
-- None to be found
SELECT unique_key, COUNT(unique_key) FROM `portfolioproject-350601.ChicagoCrime....
```
Life Expectancy data analysis project
kaggle.com
zip
Updated Mar 20, 2023
Anuja Dixit12 (2023). Life Expectancy data analysis project [Dataset]. https://www.kaggle.com/datasets/anujadixit12/life-expectancy-data-analysis-project
Explore at:
Available download formats: zip (2775109 bytes)
Dataset updated
Mar 20, 2023
Authors
Anuja Dixit12
Description
INTRODUCTION This is my first data analysis project. The project aims to find the average life expectancy in each country. The dataset used is the life expectancy dataset, which is freely available on Kaggle. I used R and Tableau in this project.
eCommerce Transactions
kaggle.com
zip
Updated Jan 3, 2025
Chad Wambles (2025). eCommerce Transactions [Dataset]. https://www.kaggle.com/datasets/chadwambles/ecommerce-transactions
Explore at:
Available download formats: zip (245430 bytes)
Dataset updated
Jan 3, 2025
Authors
Chad Wambles
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This data set is perfect for practicing your analytical skills in Power BI, Tableau, or Excel, or you can transform it into a CSV to practice SQL.

This use case mimics transactions for a fictional eCommerce website named EverMart Online. The 3 tables in this data set are all logically connected together with IDs.

My Power BI Use Case Explanation - Using Microsoft Power BI, I made dynamic data visualizations for revenue reporting and customer behavior reporting.

Revenue Reporting Visuals
- Data Card Visual that dynamically shows Total Products Listed, Total Unique Customers, Total Transactions, and Total Revenue by Total Sales, Product Sales, or Categorical Sales.
- Line Graph Visual that shows Total Revenue by Month of the entire year. This graph also changes to calculate Total Revenue by Month for the Total Sales by Product and Total Sales by Category if selected.
- Bar Graph Visual showcasing Total Sales by Product.
- Donut Chart Visual showcasing Total Sales by Category of Product.

Customer Behavior Reporting Visuals
- Data Card Visual that dynamically shows Total Products Listed, Total Unique Customers, Total Transactions, and Total Revenue by Total or by continent selected on the map.
- Interactive Map Visual showing key statistics for the continent selected.
- The key statistics are presented on the tool tip when you select a continent, and the following statistics show for that continent:
  - Continent Name
  - Customer Total
  - Percentage of Products Sold
  - Percentage of Total Customers
  - Percentage of Total Transactions
  - Percentage of Total Revenue
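The aggregations behind visuals like these can be sketched in Python by following the IDs across the three tables. The table layouts, column names, and values below are hypothetical, since the description does not list the actual schema; the sketch only shows the general join-and-sum pattern.

```python
from collections import defaultdict

# Hypothetical tables keyed by ID (invented for illustration).
CUSTOMERS = {1: {"continent": "Europe"}, 2: {"continent": "Asia"}}
PRODUCTS = {
    10: {"category": "Books", "price": 12.0},
    11: {"category": "Toys", "price": 30.0},
}
TRANSACTIONS = [
    {"customer_id": 1, "product_id": 10, "quantity": 2, "month": 1},
    {"customer_id": 2, "product_id": 11, "quantity": 1, "month": 1},
    {"customer_id": 1, "product_id": 11, "quantity": 1, "month": 2},
]


def revenue_by(transactions, key_fn):
    """Sum price * quantity per group, where key_fn picks the group."""
    totals = defaultdict(float)
    for t in transactions:
        price = PRODUCTS[t["product_id"]]["price"]  # join on product ID
        totals[key_fn(t)] += price * t["quantity"]
    return dict(totals)


by_month = revenue_by(TRANSACTIONS, lambda t: t["month"])
by_category = revenue_by(
    TRANSACTIONS, lambda t: PRODUCTS[t["product_id"]]["category"]
)
by_continent = revenue_by(
    TRANSACTIONS, lambda t: CUSTOMERS[t["customer_id"]]["continent"]
)
total_revenue = sum(by_month.values())
print(by_month, by_category, by_continent, total_revenue)
```

Dividing a continent's total by `total_revenue` gives a percentage-of-total-revenue figure like the one shown in the tool tip.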
Streaming Service Data
kaggle.com
Updated Dec 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Chad Wambles (2024). Streaming Service Data [Dataset]. https://www.kaggle.com/datasets/chadwambles/streaming-service-data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 19, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Chad Wambles
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
A dataset I generated to showcase a sample set of user data for a fictional streaming service. This data is great for practicing SQL, Excel, Tableau, or Power BI.

1000 rows and 25 columns of connected data.

See below for column descriptions.

Enjoy :)

File	Description
`IMDB_2020_Dashboard.twb`	Tableau workbook file
`imdb_top_1000.csv`	Cleaned dataset used
`Dashboard Screenshot.png`	Snapshot of the final dashboard
`archive.zip`	Contains all the files in one place

