This dataset was created by Deepali Sukhdeve.
License: CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Data Cleaning from Public Nashville Housing Data (a hedged SQL sketch follows the list):
Standardize the Date Format
Populate Property Address data
Break out Addresses into Individual Columns (Address, City, State)
Change Y and N to Yes and No in the "Sold as Vacant" field
Remove Duplicates
Delete Unused Columns
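A minimal sketch of the first two steps, assuming a SQL Server table named NashvilleHousing with SaleDate and PropertyAddress columns (the names are illustrative, not confirmed by the project files):

```sql
-- Standardize the date format by storing SaleDate without its time component.
ALTER TABLE NashvilleHousing ADD SaleDateConverted DATE;
UPDATE NashvilleHousing
SET SaleDateConverted = CONVERT(DATE, SaleDate);

-- Break the property address into separate street and city columns.
ALTER TABLE NashvilleHousing ADD PropertySplitAddress NVARCHAR(255), PropertySplitCity NVARCHAR(255);
UPDATE NashvilleHousing
SET PropertySplitAddress = SUBSTRING(PropertyAddress, 1, CHARINDEX(',', PropertyAddress) - 1),
    PropertySplitCity    = LTRIM(SUBSTRING(PropertyAddress, CHARINDEX(',', PropertyAddress) + 1, LEN(PropertyAddress)))
WHERE CHARINDEX(',', PropertyAddress) > 0;   -- skip rows without a comma-separated city
```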
License: CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Data exploration, cleaning, and arrangement of Covid death and Covid vaccination data, covering the following (a hedged SQL sketch follows the list):
Select the data to be used
Show the likelihood of dying if you contract Covid in your country
Show what percentage of the population got Covid
Look at countries with the highest infection rate compared to the population
Show the country with the highest death count per population
Break things down by continent
Show the continents with the highest death count per population
Look at total population vs. vaccinations
Use a CTE and a temp table
Create a view to store data for later visualizations
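A hedged sketch of the CTE and view steps, assuming tables named CovidDeaths and CovidVaccinations joined on location and date, with population and new_vaccinations columns (names follow the commonly used Our World in Data files but are assumptions here):

```sql
-- CTE: rolling count of people vaccinated per country, then percent of population.
WITH PopVsVac AS (
    SELECT d.continent, d.location, d.date, d.population, v.new_vaccinations,
           SUM(CAST(v.new_vaccinations AS BIGINT)) OVER (
               PARTITION BY d.location ORDER BY d.date
           ) AS rolling_people_vaccinated
    FROM CovidDeaths d
    JOIN CovidVaccinations v
      ON d.location = v.location AND d.date = v.date
    WHERE d.continent IS NOT NULL
)
SELECT *, rolling_people_vaccinated * 100.0 / population AS pct_population_vaccinated
FROM PopVsVac;

-- View: store the same logic for later visualizations.
CREATE VIEW PercentPopulationVaccinated AS
SELECT d.continent, d.location, d.date, d.population, v.new_vaccinations,
       SUM(CAST(v.new_vaccinations AS BIGINT)) OVER (
           PARTITION BY d.location ORDER BY d.date
       ) AS rolling_people_vaccinated
FROM CovidDeaths d
JOIN CovidVaccinations v
  ON d.location = v.location AND d.date = v.date
WHERE d.continent IS NOT NULL;
```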
The dataset contained information on housing data in the Nashville, TN area. I used SQL Server to clean the data and make it easier to use. For example, I converted dates to remove unnecessary timestamps; I populated data for null values; I split the combined address column into separate address, city, and state columns; I made a column that had different representations of the same value consistent; I removed duplicate rows; and I deleted unused columns.
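As one hedged example, the duplicate-removal step can be done with a ROW_NUMBER() CTE in SQL Server; the table name and the columns used to define a duplicate (ParcelID, PropertyAddress, SaleDate, SalePrice, UniqueID) are assumptions here:

```sql
-- Delete duplicate rows, keeping one row per logical record.
WITH RowNumCTE AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY ParcelID, PropertyAddress, SaleDate, SalePrice
               ORDER BY UniqueID
           ) AS row_num
    FROM NashvilleHousing
)
DELETE FROM RowNumCTE
WHERE row_num > 1;
```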
This is an in-depth analysis I created using data pulled from an open-source (ODbL) data project provided on Kaggle:
Pavansubhash. (2017). IBM HR Analytics Employee Attrition & Performance, Version 1. Retrieved August 3rd, 2023 from https://www.kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset.
Problem: The VP of People Operations/HR at [Company] wants to better understand what efforts they can make to retain more employees every year.
Question: How do education, job involvement, and work-life balance affect employee attrition?
Metrics
A survey was sent to 2,068 current and past employees, asking a series of clear and consistent questions about different workplace variables. The surveys were anonymous to encourage truthful answers and protect the integrity of the data collected. The rating scales were as follows (a query sketch follows the scales):
Education: 1) Below College 2) Some College 3) Bachelor 4) Master 5) Doctor
Job Involvement: 1) Low 2) Medium 3) High 4) Very High
Work Life Balance: 1) Bad 2) Good 3) Better 4) Best
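A sketch of how these ordinal fields can be related to attrition in SQL; the table name HR_Attrition is hypothetical, and the Education and Attrition column names are assumed from the Kaggle dataset:

```sql
-- Attrition rate by education level (1 = Below College ... 5 = Doctor).
SELECT Education,
       COUNT(*) AS employees,
       SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) AS leavers,
       ROUND(100.0 * SUM(CASE WHEN Attrition = 'Yes' THEN 1 ELSE 0 END) / COUNT(*), 1) AS attrition_rate_pct
FROM HR_Attrition
GROUP BY Education
ORDER BY Education;
```

The same grouping can be repeated for JobInvolvement and WorkLifeBalance to compare the three factors side by side.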
Title: Practical Exploration of SQL Constraints: Building a Foundation in Data Integrity

Introduction: Welcome to my Data Analysis project, where I focus on mastering SQL constraints, a pivotal aspect of database management. This project centers on hands-on experience with SQL's Data Definition Language (DDL) commands, emphasizing constraints such as PRIMARY KEY, FOREIGN KEY, UNIQUE, CHECK, and DEFAULT. In this project, I aim to demonstrate my foundational understanding of enforcing data integrity and maintaining a structured database environment.

Purpose: The primary purpose of this project is to showcase my proficiency in implementing and managing SQL constraints for robust data governance. By delving into the realm of constraints, you'll gain insights into my SQL skills and how I utilize constraints to ensure data accuracy, consistency, and reliability within relational databases.

What to Expect: Within this project, you will find a series of exercises that focus on the implementation and utilization of SQL constraints. They highlight my command over the following key constraint types:
NOT NULL: Ensuring the presence of essential data in a column.
PRIMARY KEY: Ensuring unique identification of records for data integrity.
FOREIGN KEY: Establishing relationships between tables to maintain referential integrity.
UNIQUE: Guaranteeing the uniqueness of values within specified columns.
CHECK: Implementing custom conditions to validate data entries.
DEFAULT: Setting default values for columns to enhance data reliability.

Each exercise is accompanied by clear and concise SQL scripts, explanations of the intended outcomes, and practical insights into the application of these constraints. My goal is to showcase how SQL constraints serve as crucial tools for creating a structured and dependable database foundation. I invite you to explore these exercises in detail, where I provide hands-on examples that highlight the importance and utility of SQL constraints. Together, they underscore my commitment to upholding data quality, ensuring data accuracy, and harnessing the power of SQL constraints for informed decision-making in data analysis.

3.1 CONSTRAINT - ENFORCING NOT NULL CONSTRAINT WHILE CREATING A NEW TABLE.
3.2 CONSTRAINT - ENFORCING NOT NULL CONSTRAINT ON AN EXISTING COLUMN.
3.3 CONSTRAINT - ENFORCING PRIMARY KEY CONSTRAINT WHILE CREATING A NEW TABLE.
3.4 CONSTRAINT - ENFORCING PRIMARY KEY CONSTRAINT ON AN EXISTING COLUMN.
3.5 CONSTRAINT - ENFORCING FOREIGN KEY CONSTRAINT WHILE CREATING A NEW TABLE.
3.6 CONSTRAINT - ENFORCING FOREIGN KEY CONSTRAINT ON AN EXISTING COLUMN.
3.7 CONSTRAINT - ENFORCING UNIQUE CONSTRAINT WHILE CREATING A NEW TABLE.
3.8 CONSTRAINT - ENFORCING UNIQUE CONSTRAINT IN AN EXISTING TABLE.
3.9 CONSTRAINT - ENFORCING CHECK CONSTRAINT IN A NEW TABLE.
3.10 CONSTRAINT - ENFORCING CHECK CONSTRAINT IN AN EXISTING TABLE.
3.11 CONSTRAINT - ENFORCING DEFAULT CONSTRAINT IN A NEW TABLE.
3.12 CONSTRAINT - ENFORCING DEFAULT CONSTRAINT IN AN EXISTING TABLE.
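The sketch below illustrates the constraint types listed above in one place; the table and column names are made up for illustration and do not come from the project scripts (SQL Server syntax assumed):

```sql
CREATE TABLE departments (
    dept_id   INT PRIMARY KEY,               -- PRIMARY KEY: unique identification of records
    dept_name VARCHAR(100) NOT NULL UNIQUE   -- NOT NULL + UNIQUE on the same column
);

CREATE TABLE employees (
    emp_id    INT PRIMARY KEY,
    emp_name  VARCHAR(100) NOT NULL,              -- NOT NULL: essential data must be present
    salary    DECIMAL(10,2) CHECK (salary > 0),   -- CHECK: custom validation condition
    hire_date DATE DEFAULT GETDATE(),             -- DEFAULT: fallback value when none is supplied
    dept_id   INT,
    CONSTRAINT fk_emp_dept FOREIGN KEY (dept_id)  -- FOREIGN KEY: referential integrity
        REFERENCES departments (dept_id)
);

-- Enforcing a constraint on an existing table (the pattern used by exercises 3.2, 3.4, 3.6, 3.8, 3.10, and 3.12).
ALTER TABLE employees
    ADD CONSTRAINT uq_emp_name UNIQUE (emp_name);
```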
License: CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
I generated a database of supermarket sales data in order to practice determining KPIs and making data visualizations; a sample KPI query appears after the file descriptions below.
This data set includes:
- Unique sales id for each row.
- Branch of the supermarket (New York, Chicago, and Los Angeles).
- City of the supermarket (New York, Chicago, and Los Angeles).
- Customer type (Member or Normal); members receive reward points.
- Gender (Male or Female).
- Product name of the product sold.
- Product category of the product sold.
- Unit price of each product sold.
- Quantity of the product sold.
- 7% sales tax on each product.
- Total price of the product after tax.
- Reward points (members only).
The Creation Queries.sql file will have the creation query for the Sales table and Insert queries. The data provided here is the same as what is found in the sales.csv file.
The Sales and Revenue KPIs.sql file will have the queries I used to perform my analysis on key performance indicators relating to sales and revenue of this fictional company.
The Customer Behavior KPIs.sql file will have the queries I used to perform my analysis on key performance indicators relating to customer behavior of this fictional company.
The Product Performance KPIs.sql file will have the queries I used to perform my analysis on key performance indicators relating to product performance of this fictional company.
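A hedged example of the kind of query in the Sales and Revenue KPIs.sql file, assuming the table is named Sales with columns sale_id, branch, and total (these names are guesses based on the field list above):

```sql
-- Revenue KPIs by branch: sales count, total revenue, and average sale value.
SELECT branch,
       COUNT(sale_id)       AS number_of_sales,
       SUM(total)           AS total_revenue,
       ROUND(AVG(total), 2) AS avg_sale_value
FROM Sales
GROUP BY branch
ORDER BY total_revenue DESC;
```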
The Practical Exercise in SQL Data Definition Language (DDL) Commands is a hands-on project designed to help you gain a deep understanding of fundamental DDL commands in SQL. It aims to enhance your proficiency in using SQL to create, modify, and manage database structures effectively. The exercises cover the following (a short sketch follows the list):
1.1 DDL - CREATE TABLE
1.2 DDL - ALTER TABLE (ADD COLUMN)
1.3 DDL - ALTER TABLE (RENAME COLUMN)
1.4 DDL - ALTER TABLE (RENAME TABLE)
1.5 DDL - ALTER TABLE (DROP COLUMN)
1.6 DDL - DROP TABLE
1.7 DDL - TRUNCATE TABLE
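A compact sketch of the commands covered by exercises 1.1-1.7, using SQL Server syntax and made-up object names (other dialects rename columns with ALTER TABLE ... RENAME instead of sp_rename):

```sql
CREATE TABLE staff (staff_id INT PRIMARY KEY, full_name VARCHAR(100));  -- 1.1 create a table
ALTER TABLE staff ADD hire_date DATE;                                   -- 1.2 add a column
EXEC sp_rename 'staff.full_name', 'employee_name', 'COLUMN';            -- 1.3 rename a column
EXEC sp_rename 'staff', 'employees';                                    -- 1.4 rename the table
ALTER TABLE employees DROP COLUMN hire_date;                            -- 1.5 drop a column
TRUNCATE TABLE employees;                                               -- 1.7 remove all rows, keep the structure
DROP TABLE employees;                                                   -- 1.6 drop the table entirely
```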
License: CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
This project is built on the AdventureWorks dataset, originally provided by Microsoft for SQL Server samples. This comprehensive dataset models a bicycle manufacturer and its sales to global markets, offering a realistic foundation for a data analytics portfolio.
The raw data can be accessed and downloaded directly from the official Microsoft GitHub repository: https://github.com/microsoft/sql-server-samples/tree/master/samples/databases/adventure-works
The work presented in this portfolio project demonstrates my end-to-end data analysis skills, from initial data cleaning and modeling to creating an interactive, insight-driven dashboard. Within this project, you will find examples of various data visualizations and a dashboard layout that follows the F-pattern for optimized user experience.
I encourage you to download the dataset and follow along with my analysis. Feel free to replicate my work, critique my methods, or build upon it with your own creative insights and improvements. Your feedback and engagement are highly welcomed!
License: CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Cyclistic Bike-Share Dataset (2022–2024) – Cleaned & Merged
This dataset contains three full years (2022, 2023, and 2024) of publicly available Cyclistic bike-share trip data. All yearly files have been cleaned, standardized, and merged into a single high-quality master dataset for easy analysis.
🔹 Key Cleaning & Processing Steps
- Removed duplicate records
- Handled missing values
- Standardized column names
- Converted date-time formats
- Created calculated columns (ride length, day, month, etc.)
- Merged yearly datasets into one master CSV file (3.17 GB)

🔹 What You Can Analyze
- Member vs Casual rider behavior
- Peak riding hours and days
- Monthly & seasonal trends
- Trip duration patterns
- Station usage & demand forecasting

The dataset is ideal for data analyst portfolio projects and technical interview preparation; a hedged query sketch follows.
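The sketch below shows a typical summary query on the merged file, assuming columns named started_at, ended_at, and member_casual (these follow the public Cyclistic/Divvy trip files but are not guaranteed here) and a hypothetical table name trips_master; SQL Server syntax:

```sql
-- Average ride length and ride counts by rider type and weekday.
SELECT member_casual,
       DATENAME(WEEKDAY, started_at)                      AS day_of_week,
       COUNT(*)                                           AS rides,
       AVG(DATEDIFF(MINUTE, started_at, ended_at) * 1.0)  AS avg_ride_length_min
FROM trips_master
GROUP BY member_casual, DATENAME(WEEKDAY, started_at)
ORDER BY member_casual, rides DESC;
```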
License: MIT (https://opensource.org/licenses/MIT)
This dataset provides a comprehensive view of retail operations, combining sales transactions, return records, and shipping cost details into one analysis-ready package. It's ideal for data analysts, business intelligence professionals, and students looking to practice Power BI, Tableau, or SQL projects focusing on sales performance, profitability, and operational cost analysis.
Dataset Structure
Orders Table – Detailed transactional data
Row ID
Order ID
Order Date, Ship Date, Delivery Duration
Ship Mode
Customer ID, Customer Name, Segment, Country, City, State, Postal Code, Region
Product ID, Category, Sub-Category, Product Name
Sales, Quantity, Discount, Discount Value, Profit, COGS
Returns Table – Return records by Order ID
Returned (Yes/No)
Order ID
Shipping Cost Table – State-level shipping expenses
State
Shipping Cost Per Unit
Potential Use Cases
Calculate gross vs. net profit after considering returns and shipping costs (see the query sketch after this list).
Perform regional sales and profit analysis.
Identify high-return products and loss-making categories.
Visualize KPIs in Power BI or Tableau.
Build predictive models for returns or shipping costs.
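A hedged sketch of the gross vs. net profit use case, assuming the three tables are loaded as Orders, Returns, and ShippingCosts and that spaces in the field names above become underscores (all object names here are assumptions):

```sql
-- Net profit by region after subtracting profit on returned orders and per-unit shipping cost.
SELECT o.Region,
       SUM(o.Profit)                                              AS gross_profit,
       SUM(CASE WHEN r.Returned = 'Yes' THEN o.Profit ELSE 0 END) AS profit_lost_to_returns,
       SUM(o.Quantity * s.Shipping_Cost_Per_Unit)                 AS shipping_cost,
       SUM(o.Profit)
         - SUM(CASE WHEN r.Returned = 'Yes' THEN o.Profit ELSE 0 END)
         - SUM(o.Quantity * s.Shipping_Cost_Per_Unit)             AS net_profit
FROM Orders o
LEFT JOIN [Returns] r     ON o.Order_ID = r.Order_ID
LEFT JOIN ShippingCosts s ON o.State = s.State
GROUP BY o.Region
ORDER BY net_profit DESC;
```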
Source & Context: The dataset is designed for educational and analytical purposes. It is inspired by retail and e-commerce operations data and was prepared for data analytics portfolio projects.
License: Open for use in learning, analytics projects, and data visualization practice.
License: MIT (https://opensource.org/licenses/MIT)
A complete operational database from a fictional Class 8 trucking company spanning three years. This isn't scraped web data or simplified tutorial content; it's a realistic simulation built from 12 years of real-world logistics experience, designed specifically for analysts transitioning into supply chain and transportation domains.
The dataset contains 85,000+ records across 14 interconnected tables covering everything from driver assignments and fuel purchases to maintenance schedules and delivery performance. Each table maintains proper foreign key relationships, making this ideal for practicing complex SQL queries, building data pipelines, or developing operational dashboards.
SQL Learners: Master window functions, CTEs, and multi-table JOINs using realistic business scenarios rather than contrived examples.
Data Analysts: Build portfolio projects that demonstrate understanding of operational metrics: cost-per-mile analysis, fleet utilization optimization, driver performance scorecards.
Aspiring Supply Chain Analysts: Work with authentic logistics data patterns (seasonal freight volumes, equipment utilization rates, route profitability calculations) without NDA restrictions.
Data Science Students: Develop predictive models for maintenance scheduling, driver retention, or route optimization using time-series data with actual business context.
Career Changers: If you're moving from operations into analytics (like the dataset creator), this provides a bridge: your domain knowledge becomes a competitive advantage rather than a gap to explain.
Most logistics datasets are either proprietary (unavailable) or overly simplified (unrealistic). This fills the gap: operational complexity without confidentiality concerns. The data reflects real industry patterns:
Core Entities (Reference Tables):
- Drivers (150 records) - Demographics, employment history, CDL info
- Trucks (120 records) - Fleet specs, acquisition dates, status
- Trailers (180 records) - Equipment types, current assignments
- Customers (200 records) - Shipper accounts, contract terms, revenue potential
- Facilities (50 records) - Terminals and warehouses with geocoordinates
- Routes (60+ records) - City pairs with distances and rate structures

Operational Transactions:
- Loads (57,000+ records) - Shipment details, revenue, booking type
- Trips (57,000+ records) - Driver-truck assignments, actual performance
- Fuel Purchases (131,000+ records) - Transaction-level data with pricing
- Maintenance Records (6,500+ records) - Service history, costs, downtime
- Delivery Events (114,000+ records) - Pickup/delivery timestamps, detention
- Safety Incidents (114 records) - Accidents, violations, claims

Aggregated Analytics:
- Driver Monthly Metrics (5,400+ records) - Performance summaries
- Truck Utilization Metrics (3,800+ records) - Equipment efficiency
Temporal Coverage: January 2022 through December 2024 (3 years)
Geographic Scope: National operations across 25+ major US cities
Realistic Patterns:
- Seasonal freight fluctuations (Q4 peaks)
- Historical fuel price accuracy
- Equipment lifecycle modeling
- Driver retention dynamics
- Service level variations

Data Quality:
- Complete foreign key integrity
- No orphaned records
- Intentional 2% null rate in driver/truck assignments (reflects reality)
- All timestamps properly sequenced
- Financial calculations verified
Business Intelligence: Create executive dashboards showing revenue per truck, cost per mile, driver efficiency rankings, maintenance spend by equipment age, customer concentration risk.
Predictive Analytics: Build models forecasting equipment failures based on maintenance history, predict driver turnover using performance metrics, estimate route profitability for new lanes.
Operations Optimization: Analyze route efficiency, identify underutilized assets, optimize maintenance scheduling, calculate ideal fleet size, evaluate driver-to-truck ratios.
SQL Mastery: Practice window functions for running totals and rankings, write complex JOINs across 6+ tables, implement CTEs for hierarchical queries, perform cohort analysis on driver retention.
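As a hedged illustration of the cost-per-mile and ranking ideas above, the query below joins trip, fuel, and maintenance totals per truck; the table and column names (trips.miles, fuel_purchases.total_cost, maintenance_records.cost) are assumptions about the schema, not its documented names:

```sql
-- Cost per mile and an efficiency ranking for each truck, using CTEs and a window function.
WITH truck_miles AS (
    SELECT truck_id, SUM(miles) AS total_miles
    FROM trips
    GROUP BY truck_id
),
truck_costs AS (
    SELECT tm.truck_id,
           COALESCE(f.fuel_cost, 0) + COALESCE(m.maint_cost, 0) AS total_cost
    FROM truck_miles tm
    LEFT JOIN (SELECT truck_id, SUM(total_cost) AS fuel_cost
               FROM fuel_purchases GROUP BY truck_id) f ON f.truck_id = tm.truck_id
    LEFT JOIN (SELECT truck_id, SUM(cost) AS maint_cost
               FROM maintenance_records GROUP BY truck_id) m ON m.truck_id = tm.truck_id
)
SELECT tm.truck_id,
       tm.total_miles,
       tc.total_cost,
       ROUND(tc.total_cost * 1.0 / NULLIF(tm.total_miles, 0), 3)              AS cost_per_mile,
       RANK() OVER (ORDER BY tc.total_cost * 1.0 / NULLIF(tm.total_miles, 0)) AS efficiency_rank
FROM truck_miles tm
JOIN truck_costs tc ON tc.truck_id = tm.truck_id
ORDER BY cost_per_mile;
```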
License: CC0 1.0 Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Data Science Careers in 2025: Jobs and Salary Trends in Pakistan
Data Science is one of the fastest-growing fields, and by 2025, the demand for skilled professionals in Pakistan will only increase. If you're considering a career in Data Science, here's what you need to know about the top jobs and salary trends.
Top Data Science Jobs in 2025
1) Data Scientist - Avg Salary: PKR 1.2M - 2.5M/year (Entry-Level), PKR 3M - 6M/year (Experienced). Skills: Python, R, Machine Learning, Data Visualization
2) Data Analyst - Avg Salary: PKR 800K - 1.5M/year (Entry-Level), PKR 2M - 3.5M/year (Experienced). Skills: SQL, Excel, Tableau, Power BI
3) Machine Learning Engineer - Avg Salary: PKR 1.5M - 3M/year (Entry-Level), PKR 4M - 7M/year (Experienced). Skills: TensorFlow, PyTorch, Deep Learning, NLP
4) Business Intelligence Analyst - Avg Salary: PKR 1M - 2M/year (Entry-Level), PKR 2.5M - 4M/year (Experienced). Skills: Data Warehousing, ETL, Dashboarding
5) AI Research Scientist - Avg Salary: PKR 2M - 4M/year (Entry-Level), PKR 5M - 10M/year (Experienced). Skills: AI Algorithms, Research, Advanced Mathematics
Why Choose Data Science?
High Demand: Every industry in Pakistan needs data professionals.
Attractive Salaries: Competitive pay based on technical expertise.
Growth Opportunities: Unlimited career growth in this field.

Salary Trends
Entry-Level: PKR 800K - 1.5M/year
Mid-Level: PKR 2M - 4M/year
Senior-Level: PKR 5M+ (depending on expertise and industry)

How to Get Started?
Learn Skills: Focus on Python, SQL, Machine Learning, and Data Visualization.
Build Projects: Work on real-world datasets to create a strong portfolio.
Network: Connect with industry professionals and join Data Science communities.
work_year: The year in which the data was recorded. This field indicates the temporal context of the data, important for understanding salary trends over time.
job_title: The specific title of the job role, like 'Data Scientist', 'Data Engineer', or 'Data Analyst'. This column is crucial for understanding the salary distribution across various specialized roles within the data field.
job_category: A classification of the job role into broader categories for easier analysis. This might include areas like 'Data Analysis', 'Machine Learning', 'Data Engineering', etc.
salary_currency: The currency in which the salary is paid, such as USD, EUR, etc. This is important for currency conversion and understanding the actual value of the salary in a global context.
salary: The annual gross salary of the role in the local currency. This raw salary figure is key for direct regional salary comparisons.
salary_in_usd: The annual gross salary converted to United States Dollars (USD). This uniform currency conversion aids in global salary comparisons and analyses.
employee_residence: The country of residence of the employee. This data point can be used to explore geographical salary differences and cost-of-living variations.
experience_level: Classifies the professional experience level of the employee. Common categories might include 'Entry-level', 'Mid-level', 'Senior', and 'Executive', providing insight into how experience influences salary in data-related roles.
employment_type: Specifies the type of employment, such as 'Full-time', 'Part-time', 'Contract', etc. This helps in analyzing how different employment arrangements affect salary structures.
work_setting: The work setting or environment, like 'Remote', 'In-person', or 'Hybrid'. This column reflects the impact of work settings on salary levels in the data industry.
company_location: The country where the company is located. It helps in analyzing how the location of the company affects salary structures.
company_size: The size of the employer company, often categorized into small (S), medium (M), and large (L) sizes. This allows for analysis of how company size influences salary.
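A short sketch using the columns documented above to compare salaries across experience levels and work settings; the table name salaries is an assumption:

```sql
-- Average USD salary by year, experience level, and work setting.
SELECT work_year,
       experience_level,
       work_setting,
       COUNT(*)                     AS roles,
       ROUND(AVG(salary_in_usd), 0) AS avg_salary_usd
FROM salaries
GROUP BY work_year, experience_level, work_setting
ORDER BY work_year, experience_level, avg_salary_usd DESC;
```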