84 datasets found

Capstone Project TikTok - EDA
kaggle.com
zip
Updated Nov 15, 2023
Cite
Sohail K. Nikouzad (2023). Capstone Project TikTok - EDA [Dataset]. https://www.kaggle.com/datasets/sohailnikouzad/capstone-pr0ject-tiktok-eda
Explore at:
zip (52324 bytes)
Dataset updated
Nov 15, 2023
Authors
Sohail K. Nikouzad
Description
Using the Pandas package in Python for exploratory data analysis (EDA)

Dataset

This dataset was created by Sohail K. Nikouzad
Electronics Store Sales Dataset for EDA
kaggle.com
zip
Updated Feb 13, 2021
Cite
Sinjoy Saha (2021). Electronics Store Sales Dataset for EDA [Dataset]. https://www.kaggle.com/sinjoysaha/sales-analysis-dataset
Explore at:
zip (2505035 bytes)
Dataset updated
Feb 13, 2021
Authors
Sinjoy Saha
Description
Content

This is transaction data from an electronics store chain in the US. The data consists of 12 CSV files, one for each month of 2019, named as follows: Sales_[MONTH_NAME]_2019. Each file contains roughly 9,000 to 26,000 rows and 6 columns: Order ID, Product, Quantity Ordered, Price Each, Order Date, Purchase Address. Combined, the 12 monthly files hold around 186,851 data points. Some rows may contain null values.
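One way to work with the monthly files is to combine them into a single DataFrame. The sketch below is only an illustration: the `.csv` extension, the `load_sales` helper, and the two tiny stand-in files with made-up rows are assumptions, not part of the dataset.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Column names as listed in the dataset description
COLUMNS = ["Order ID", "Product", "Quantity Ordered",
           "Price Each", "Order Date", "Purchase Address"]


def load_sales(folder):
    """Concatenate every Sales_*_2019.csv file in `folder` (hypothetical helper)."""
    files = sorted(Path(folder).glob("Sales_*_2019.csv"))
    frames = [pd.read_csv(path) for path in files]
    combined = pd.concat(frames, ignore_index=True)
    # The description warns of null values; drop rows that are entirely empty
    return combined.dropna(how="all")


# Demo with two stand-in files containing made-up rows (not the real data)
with tempfile.TemporaryDirectory() as tmp:
    for month in ("January", "February"):
        pd.DataFrame(
            [["1", "USB-C Cable", 2, 11.95, "01/05/19 10:00", "1 Main St, Boston, MA 02215"],
             [None, None, None, None, None, None]],
            columns=COLUMNS,
        ).to_csv(Path(tmp) / f"Sales_{month}_2019.csv", index=False)
    sales = load_sales(tmp)

print(sales.shape)
```

With the real files, the same helper would be pointed at the folder where the archive was extracted.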

Inspiration

Keith Galli

Acknowledgements

Keith Galli's YouTube video - Solving real world data science tasks with Python Pandas!

Keith Galli's GitHub Repo - Pandas-Data-Science-Tasks
Pandas Practice Dataset
kaggle.com
zip
Updated Jan 27, 2023
Cite
Mrityunjay Pathak (2023). Pandas Practice Dataset [Dataset]. https://www.kaggle.com/datasets/themrityunjaypathak/pandas-practice-dataset/discussion
Explore at:
zip (493 bytes)
Dataset updated
Jan 27, 2023
Authors
Mrityunjay Pathak
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
What is Pandas?

Pandas is a Python library used for working with data sets.

It has functions for analyzing, cleaning, exploring, and manipulating data.

The name "Pandas" refers to both "Panel Data" and "Python Data Analysis". The library was created by Wes McKinney in 2008.

Why Use Pandas?

Pandas allows us to analyze big data and draw conclusions based on statistical theories.

Pandas can clean messy data sets, and make them readable and relevant.

Relevant data is very important in data science.

What Can Pandas Do?

Pandas gives you answers about the data, such as:

Is there a correlation between two or more columns?

What is the average value?

Max value?

Min value?
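The questions above map onto a few standard pandas calls. A minimal sketch, assuming a small made-up table (the column names `hours_studied` and `exam_score` are illustrative only):

```python
import pandas as pd

# Made-up example data; any two numeric columns would do
df = pd.DataFrame({
    "hours_studied": [1, 2, 3, 4, 5],
    "exam_score": [52, 58, 65, 71, 80],
})

# Is there a correlation between two columns? (Pearson by default)
correlation = df["hours_studied"].corr(df["exam_score"])

# Average, maximum, and minimum of one column
average = df["exam_score"].mean()
highest = df["exam_score"].max()
lowest = df["exam_score"].min()

print(round(correlation, 3), average, highest, lowest)
```

`df.describe()` reports the mean, min, and max (among other statistics) for every numeric column in one call.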
watches
huggingface.co
Updated Nov 17, 2025
Cite
gil (2025). watches [Dataset]. https://huggingface.co/datasets/yotam22/watches
Explore at:
Dataset updated
Nov 17, 2025
Authors
gil
Description
🕰️ Exploratory Data Analysis of Luxury Watch Prices

Overview

This project analyzes a large dataset of luxury watches to understand which factors influence price. We focus on brand, movement type, case material, size, gender, and production year. All work was done in Python (Pandas, NumPy, Matplotlib/Seaborn) on Google Colab.

Dataset

Rows: ~172,000
Columns: 14
Unit of observation: one watch listing

Main columns

name – watch/listing title
price – listed… See the full description on the dataset page: https://huggingface.co/datasets/yotam22/watches.
Cleaned Netflix Dataset for EDA
kaggle.com
zip
Updated Jul 7, 2025
Cite
Nikhil raman K (2025). Cleaned Netflix Dataset for EDA [Dataset]. https://www.kaggle.com/datasets/nikhilramank/cleaned-netflix-dataset-for-eda
Explore at:
zip (750797 bytes)
Dataset updated
Jul 7, 2025
Authors
Nikhil raman K
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This is a cleaned version of a Netflix movies dataset prepared for exploratory data analysis (EDA). Missing values have been handled, invalid rows removed, and numerical + categorical columns cleaned for analysis using Python and Pandas.
Keith Galli's Sales Analysis Exercise
kaggle.com
zip
Updated Jan 28, 2022
Cite
Zulkhairee Sulaiman (2022). Keith Galli's Sales Analysis Exercise [Dataset]. https://www.kaggle.com/datasets/zulkhaireesulaiman/sales-analysis-2019-excercise/discussion
Explore at:
zip (2505083 bytes)
Dataset updated
Jan 28, 2022
Authors
Zulkhairee Sulaiman
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

This is the dataset required for Keith Galli's 'Solving real world data science tasks with Python Pandas!' video, in which he analyzes and answers business questions for 12 months' worth of business data. The data contains hundreds of thousands of electronics store purchases broken down by month, product type, cost, purchase address, etc.

I decided to upload the data here so that I can carry out the exercise directly in Kaggle Notebooks, making it ready for viewing as a portfolio project.

Content

12 .csv files containing sales data for each month of 2019.

Acknowledgements

Of course, all thanks go to Keith Galli and the great work he does with his tutorials. He has several other amazing tutorials that you can follow, and you can subscribe to his channel.
singapore
kaggle.com
zip
Updated Jul 30, 2020
Cite
saibharath (2020). singapore [Dataset]. https://www.kaggle.com/saibharath12/singapore
Explore at:
zip (116322 bytes)
Dataset updated
Jul 30, 2020
Authors
saibharath
Area covered
Singapore
Description
This dataset contains the total population of Singapore by ethnicity and gender. It is raw data with mixed entities in its columns. Population data is given for the years 1957 to 2018. The main aim in uploading this data is to build skill in Python pandas for exploratory data analysis.
Play Store Data Analysis By Vaishnavi
kaggle.com
zip
Updated Apr 30, 2021
Cite
Vaishnavi Sahu (2021). Play Store Data Analysis By Vaishnavi [Dataset]. https://www.kaggle.com/vaishnavisahu/play-store-data-analysis-by-vaishnavi
Explore at:
zip (597350 bytes)
Dataset updated
Apr 30, 2021
Authors
Vaishnavi Sahu
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

EDA using numpy and pandas

Content

In this task I have to predict which factors make an app perform well: its size, price, category, or several factors together. What makes an app rank at the top of the Google Play Store?

Column description:

App: name of the application
Category: category of the application
Rating: rating of the application
Reviews: reviews of the application
Size: size of the application
Installs: how many users installed the application
Type: type of application
Price: price of the application
Content rating: content rating of the application
Startup_India_EDA
kaggle.com
zip
Updated Apr 30, 2022
Cite
Aryan Mahabhoi (2022). Startup_India_EDA [Dataset]. https://www.kaggle.com/datasets/aryanmahabhoi/startup-india-eda
Explore at:
zip (97006 bytes)
Dataset updated
Apr 30, 2022
Authors
Aryan Mahabhoi
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Startup India - Exploratory Data Analysis

1- The dataset contains an updated record of all startups from 1963 to 2021.
2- An exploratory data analysis is performed on the record with different types of data visualizations.

Technologies Used: Python, NumPy, Pandas, Matplotlib, Seaborn
Aviation EDA - on plane accidents
kaggle.com
zip
Updated Nov 27, 2024
Cite
victor munyaradzi (2024). Aviation EDA - on plane accidents [Dataset]. https://www.kaggle.com/datasets/victormunyaradzi/aviation-eda-on-plane-accidents
Explore at:
zip (628563 bytes)
Dataset updated
Nov 27, 2024
Authors
victor munyaradzi
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This is my first EDA. I took the data from Kaggle, took a sample of all accidents since 1919, and analyzed it using Matplotlib, Python, Pandas, and NumPy.

As an aspiring data analyst/scientist I am not yet familiar with Git or Kaggle, so please forgive any GitHub errors.
ZOMATO BANGALORE EDA
kaggle.com
zip
Updated Sep 15, 2025
Cite
Anshika Srivastava (2025). ZOMATO BANGALORE EDA [Dataset]. https://www.kaggle.com/datasets/anshikasri62/zomato-banglore-eda
Explore at:
zip (1246927 bytes)
Dataset updated
Sep 15, 2025
Authors
Anshika Srivastava
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Bengaluru
Description
Exploratory Data Analysis (EDA) of the ZOMATO BANGALORE DATASET using Python and its libraries (Pandas, Matplotlib, and Seaborn). Analyzed restaurant distribution, top cuisines, rating distribution, cost for two, and other interesting insights.

Included files:
- NOTEBOOK: ZOMATO_EDA.ipynb
- IMAGES: Visualizations of key insights
- requirement.txt: Python dependencies
Shopping Mall Customer Data Segmentation Analysis
kaggle.com
zip
Updated Aug 4, 2024
Cite
DataZng (2024). Shopping Mall Customer Data Segmentation Analysis [Dataset]. https://www.kaggle.com/datasets/datazng/shopping-mall-customer-data-segmentation-analysis
Explore at:
zip (5890828 bytes)
Dataset updated
Aug 4, 2024
Authors
DataZng
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Demographic Analysis of Shopping Behavior: Insights and Recommendations

Dataset Information: The Shopping Mall Customer Segmentation Dataset comprises 15,079 unique entries, featuring Customer ID, age, gender, annual income, and spending score. This dataset assists in understanding customer behavior for strategic marketing planning.

Cleaned Data Details: Data cleaned and standardized, 15,079 unique entries with attributes including - Customer ID, age, gender, annual income, and spending score. Can be used by marketing analysts to produce a better strategy for mall specific marketing.

Challenges Faced:
1. Data Cleaning: Overcoming inconsistencies and missing values required meticulous attention.
2. Statistical Analysis: Interpreting demographic data accurately demanded collaborative effort.
3. Visualization: Crafting informative visuals to convey insights effectively posed design challenges.

Research Topics:
1. Consumer Behavior Analysis: Exploring psychological factors driving purchasing decisions.
2. Market Segmentation Strategies: Investigating effective targeting based on demographic characteristics.

Suggestions for Project Expansion:
1. Incorporate External Data: Integrate social media analytics or geographic data to enrich customer insights.
2. Advanced Analytics Techniques: Explore advanced statistical methods and machine learning algorithms for deeper analysis.
3. Real-Time Monitoring: Develop tools for agile decision-making through continuous customer behavior tracking.

This summary outlines the demographic analysis of shopping behavior, highlighting key insights, dataset characteristics, team contributions, challenges, research topics, and suggestions for project expansion. Leveraging these insights can enhance marketing strategies and drive business growth in the retail sector.

References
OpenAI. (2022). ChatGPT [Computer software]. Retrieved from https://openai.com/chatgpt
Mustafa, Z. (2022). Shopping Mall Customer Segmentation Data [Data set]. Kaggle. Retrieved from https://www.kaggle.com/datasets/zubairmustafa/shopping-mall-customer-segmentation-data
Donkeys. (n.d.). Kaggle Python API [Jupyter Notebook]. Kaggle. Retrieved from https://www.kaggle.com/code/donkeys/kaggle-python-api/notebook
Pandas-Datareader. (n.d.). Retrieved from https://pypi.org/project/pandas-datareader/
Cyclistic Bike - Data Analysis (Python)
kaggle.com
zip
Updated Jun 19, 2023
Cite
Amirthavarshini (2023). Cyclistic Bike - Data Analysis (Python) [Dataset]. https://www.kaggle.com/datasets/amirthavarshini12/cyclistic-bike-data-analysis-python/code
Explore at:
zip (211278092 bytes)
Dataset updated
Jun 19, 2023
Authors
Amirthavarshini
Description
Conducted an in-depth analysis of Cyclistic bike-share data to uncover customer usage patterns and trends. Cleaned and processed raw data using Python libraries such as pandas and NumPy to ensure data quality. Performed exploratory data analysis (EDA) to identify insights, including peak usage times, customer demographics, and trip duration patterns. Created visualizations using Matplotlib and Seaborn to effectively communicate findings. Delivered actionable recommendations to enhance customer engagement and optimize operational efficiency.
Classicmodels
kaggle.com
zip
Updated Dec 15, 2024
Cite
Javier Landaeta (2024). Classicmodels [Dataset]. https://www.kaggle.com/datasets/javierlandaeta/classicmodels
Explore at:
zip (65751 bytes)
Dataset updated
Dec 15, 2024
Authors
Javier Landaeta
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Abstract

This project presents a comprehensive analysis of a company's annual sales, using the classic dataset classicmodels as the database. Python is used as the main programming language, along with the Pandas, NumPy and SQLAlchemy libraries for data manipulation and analysis, and PostgreSQL as the database management system.

The main objective of the project is to answer key questions related to the company's sales performance, such as: Which were the most profitable products and customers? Were sales goals met? The results obtained serve as input for strategic decision making in future sales campaigns.

Methodology

1. Data Extraction:

A connection is established with the PostgreSQL database to extract the relevant data from the orders, orderdetails, customers, products and employees tables.

A reusable function is created to read each table and load it into a Pandas DataFrame.

2. Data Cleansing and Transformation:

An exploratory analysis of the data is performed to identify missing values, inconsistencies, and outliers.

New variables are calculated, such as the total value of each sale, cost, and profit.

Different DataFrames are joined using primary and foreign keys to obtain a complete view of sales.

3. Exploratory Data Analysis (EDA):

Key metrics such as total sales, number of unique customers, and average order value are calculated.

Data is grouped by different dimensions (products, customers, dates) to identify patterns and trends.

Results are visualized using relevant graphics (histograms, bar charts, etc.).
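Steps 1 to 3 can be sketched in a few lines. This is not the author's code: to stay self-contained it uses an in-memory SQLite database in place of PostgreSQL, two tiny made-up tables, and a hypothetical `read_table` helper. The column names (`productCode`, `buyPrice`, `quantityOrdered`, `priceEach`) are assumed for illustration.

```python
import sqlite3

import pandas as pd


def read_table(conn, table_name):
    """Load one whole table into a DataFrame (hypothetical reusable helper)."""
    return pd.read_sql(f"SELECT * FROM {table_name}", conn)


# Stand-in database: SQLite in memory instead of PostgreSQL, made-up rows
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (productCode TEXT PRIMARY KEY, productName TEXT, buyPrice REAL);
CREATE TABLE orderdetails (orderNumber INTEGER, productCode TEXT,
                           quantityOrdered INTEGER, priceEach REAL);
INSERT INTO products VALUES ('P1', 'Model Car', 50.0), ('P2', 'Model Plane', 30.0);
INSERT INTO orderdetails VALUES (1, 'P1', 2, 80.0), (1, 'P2', 1, 45.0), (2, 'P1', 1, 85.0);
""")

# 1. Extraction: one DataFrame per table
products = read_table(conn, "products")
details = read_table(conn, "orderdetails")
conn.close()

# 2. Transformation: join on the shared key, then derive sale value, cost, profit
sales = details.merge(products, on="productCode", how="left")
sales["sale_value"] = sales["quantityOrdered"] * sales["priceEach"]
sales["cost"] = sales["quantityOrdered"] * sales["buyPrice"]
sales["profit"] = sales["sale_value"] - sales["cost"]

# 3. EDA: group by a dimension and compute a key metric
profit_by_product = sales.groupby("productName")["profit"].sum().sort_values(ascending=False)
average_order_value = sales.groupby("orderNumber")["sale_value"].sum().mean()

print(profit_by_product)
print(average_order_value)
```

Against a real PostgreSQL instance, the connection would typically be an SQLAlchemy engine passed to `pd.read_sql` in the same way.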

4. Modeling and Prediction:

Although the main focus of the project is descriptive, predictive modeling techniques (e.g., time series) could be explored to forecast future sales.

5. Report Generation:

Detailed reports are created in Pandas DataFrames format that answer specific business questions.

These reports are stored in new PostgreSQL tables for further analysis and visualization.

Results

- Identification of top products and customers: The best-selling products and the customers that generate the most revenue are identified.
- Analysis of sales trends: Sales trends over time are analyzed and possible factors that influence sales behavior are identified.
- Calculation of key metrics: Metrics such as average profit margin and sales growth rate are calculated.

Conclusions

This project demonstrates how Python and PostgreSQL can be effectively used to analyze large data sets and obtain valuable insights for business decision making. The results obtained can serve as a starting point for future research and development in the area of sales analysis.

Technologies Used

- Python: Pandas, NumPy, SQLAlchemy, Matplotlib/Seaborn
- Database: PostgreSQL
- Tools: Jupyter Notebook
- Keywords: data analysis, Python, PostgreSQL, Pandas, NumPy, SQLAlchemy, EDA, sales, business intelligence
💥 Data-cleaning-for-beginner-using-pandas💢💥
kaggle.com
zip
Updated Oct 16, 2022
Cite
Pavan Tanniru (2022). 💥 Data-cleaning-for-beginner-using-pandas💢💥 [Dataset]. https://www.kaggle.com/datasets/pavantanniru/-datacleaningforbeginnerusingpandas/code
Explore at:
zip (654 bytes)
Dataset updated
Oct 16, 2022
Authors
Pavan Tanniru
Description
This dataset helps you practice the data-cleaning process using only the Python pandas library.

Indicators

Age

Salary

Rating

Location

Established

Easy Apply
Customer Sale Dataset for Data Visualization
kaggle.com
Updated Jun 6, 2025
Cite
Atul (2025). Customer Sale Dataset for Data Visualization [Dataset]. https://www.kaggle.com/datasets/atulkgoyl/customer-sale-dataset-for-visualization
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jun 6, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Atul
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
This synthetic dataset is designed specifically for practicing data visualization and exploratory data analysis (EDA) using popular Python libraries like Seaborn, Matplotlib, and Pandas.

Unlike most public datasets, this one includes a diverse mix of column types:

📅 Date columns (for time series and trend plots)
🔢 Numerical columns (for histograms, boxplots, scatter plots)
🏷️ Categorical columns (for bar charts, group analysis)

Whether you are a beginner learning how to visualize data or an intermediate user testing new charting techniques, this dataset offers a versatile playground.

Feel free to:

Create EDA notebooks
Practice plotting techniques
Experiment with filtering, grouping, and aggregations

🛠️ No missing values, no data cleaning needed: just download and start exploring!
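As a rough idea of the filtering, grouping, and aggregation practice mentioned above, here is a minimal sketch. The `orders` table and its columns (`order_date`, `category`, `amount`) are made up and are not the dataset's actual schema.

```python
import pandas as pd

# Made-up stand-in with one date, one categorical, and one numerical column
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2025-01-05", "2025-01-20", "2025-02-03", "2025-02-14"]),
    "category": ["Toys", "Books", "Toys", "Books"],
    "amount": [20.0, 35.0, 15.0, 50.0],
})

# Filtering: keep rows that meet a condition
large_orders = orders[orders["amount"] >= 20]

# Grouping: total amount per category
total_by_category = orders.groupby("category")["amount"].sum()

# Aggregation over time: sum and mean per calendar month
monthly_summary = (
    orders.groupby(orders["order_date"].dt.to_period("M"))["amount"]
    .agg(["sum", "mean"])
)

print(large_orders)
print(total_by_category)
print(monthly_summary)
```

Each result is a regular DataFrame or Series, so it can be passed straight to a plotting call such as `total_by_category.plot(kind="bar")`.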

Hope you find this helpful. Looking forward to hearing from you all.
IMDb Top 4070: Explore the Cinema Data
kaggle.com
zip
Updated Aug 13, 2023
Cite
K.T.S. Prabhu (2023). IMDb Top 4070: Explore the Cinema Data [Dataset]. https://www.kaggle.com/datasets/ktsprabhu/imdb-top-4070-explore-the-cinema-data/discussion
Explore at:
zip (1449581 bytes)
Dataset updated
Aug 13, 2023
Authors
K.T.S. Prabhu
Description
Description: Dive into the world of exceptional cinema with our meticulously curated dataset, "IMDb's Gems Unveiled." This dataset is a result of an extensive data collection effort based on two critical criteria: IMDb ratings exceeding 7 and a substantial number of votes, surpassing 10,000. The outcome? A treasure trove of 4070 movies meticulously selected from IMDb's vast repository.

What sets this dataset apart is its richness and diversity. With more than 20 data points meticulously gathered for each movie, this collection offers a comprehensive insight into each cinematic masterpiece. Our data collection process leveraged the power of Selenium and Pandas modules, ensuring accuracy and reliability.

Cleaning this vast dataset was a meticulous task, combining both Excel and Python for optimum precision. Analysis is powered by Pandas, Matplotlib, and NLTK, enabling us to uncover hidden patterns, trends, and themes within the realm of cinema.

Note: The data was collected as of April 2023. Future versions of this analysis will include a movie recommendation system. Please do connect for any queries. All Love, No Hate.
Convert Text to Pandas
kaggle.com
zip
Updated Sep 22, 2024
Cite
Zeyad Usf (2024). Convert Text to Pandas [Dataset]. https://www.kaggle.com/datasets/zeyadusf/convert-text-to-pandas
Explore at:
zip (4333134 bytes)
Dataset updated
Sep 22, 2024
Authors
Zeyad Usf
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
kaggle notebook
Github Repo

I found two datasets about converting text with context to pandas code on Hugging Face, but the challenge is in the context. The context in both datasets is different which reduces the results of the model. First let's mention the data I found and then show examples, solution and some other problems.

Rahima411/text-to-pandas:

The data is divided into Train with 57.5k and Test with 19.2k.

The data has two columns as you can see in the example:

"Input": Contains the context and the question together, in the context it shows the metadata about the data frame.

"Pandas Query": Pandas code

```txt
Input                                                      | Pandas Query
-----------------------------------------------------------|-------------------------------------------
Table Name: head (age (object), head_id (object))          | result = management['head.age'].unique()
Table Name: management (head_id (object),                  |
temporary_acting (object))                                 |
What are the distinct ages of the heads who are acting?    |
```

hiltch/pandas-create-context:

It contains 17k rows with three columns:

question : text .

context : Code to create a data frame with column names, unlike the first data set which contains the name of the data frame, column names and data type.

answer : Pandas code.

question | context | answer
----------------------------------------|--------------------------------------------------------|---------------------------------------
What was the lowest # of total votes? | df = pd.DataFrame(columns=['_number_of_total_votes']) | df['_number_of_total_votes'].min()

As you can see, the problem is that the two datasets do not have the same inputs, and the structure of the context differs. My solution to this problem was:

- Convert the first dataset's context to match the second. I chose this direction because it is difficult to get the data types for the columns in the second dataset. It was easy to convert the structure of the context from the shape `Table Name: head (age (object), head_id (object))` to `head = pd.DataFrame(columns=['age','head_id'])` through the code below.
- Then separate the question from the context. This was easy because, if you look at the data, the context always ends with ")" followed by a blank and then the question. You will find all of this in the code.
- A context can also contain more than one table definition, and the code handles this.

```py
import re


def extract_table_creation(text: str) -> tuple[str, str]:
    """
    Extracts DataFrame creation statements and questions from the given text.

    Args:
        text (str): The input text containing table definitions and questions.

    Returns:
        tuple: A tuple containing a concatenated DataFrame creation string and a question.
    """
    # Define patterns
    table_pattern = r'Table Name: (\w+) \(([\w\s,()]+)\)'
    column_pattern = r'(\w+)\s*\((object|int64|float64)\)'

    # Find all table names and column definitions
    matches = re.findall(table_pattern, text)

    # Initialize a list to hold DataFrame creation statements
    df_creations = []

    for table_name, columns_str in matches:
        # Extract column names
        columns = re.findall(column_pattern, columns_str)
        column_names = [col[0] for col in columns]

        # Format DataFrame creation statement
        df_creation = f"{table_name} = pd.DataFrame(columns={column_names})"
        df_creations.append(df_creation)

    # Concatenate all DataFrame creation statements, one per line
    df_creation_concat = '\n'.join(df_creations)

    # Extract and clean the question: everything after the last ")"
    question = text[text.rindex(')')+1:].strip()

    return df_creation_concat, question
```

After both datasets were similar in structure, they were merged into one set and divided into _72.8K_ train and _18.6K_ test. We analyzed this dataset and you can see it all through the **[`notebook`](https://www.kaggle.com/code/zeyadusf/text-2-pandas-t5#Exploratory-Data-Analysis(EDA))**, but we found some problems in the dataset as well, such as:

> - `Answer` : `df['Id'].count()` has been repeated, but this is possible, so we do not need to dispense with these rows.
> - `Context` : We see that it contains `147` rows that do not contain any text. We will see through the experiment whether this affects the results negatively or positively.
> - `Question` : It is ...
DataScience for Work - Human Resources
kaggle.com
zip
Updated Apr 28, 2024
Cite
Beytullah Soylev (2024). DataScience for Work - Human Resources [Dataset]. https://www.kaggle.com/datasets/soylevbeytullah/ds4work-human-resources
Explore at:
zip (51278 bytes)
Dataset updated
Apr 28, 2024
Authors
Beytullah Soylev
Description
Case Study: Improving Human Resources with Data Science

Objective: Utilize data science to predict employee turnover and enhance the Human Resources department.

Key Learnings:

Leveraging Data Science for HR Transformation: Understand how data science can reduce employee turnover and revolutionize HR.

Logistic Regression and Random Forest Classifiers: Grasp the theory behind these classifiers and implement them using scikit-learn.

Sigmoid Functions and Pandas DataFrames: Extract probability values using sigmoid functions and manipulate datasets with Pandas.

Python Functions and Pandas Dataframe Applications: Develop and apply Python functions to Pandas dataframes.

Exploratory Data Analysis with Matplotlib and Seaborn: Perform EDA using Matplotlib and Seaborn, generating KDE plots, box plots, and count plots.

Categorical Variable Transformation and Data Set Division: Convert categorical variables into dummy variables and divide datasets into training and testing sets using scikit-learn.

Artificial Neural Networks for Classification: Understand the theory and application of artificial neural networks in classification tasks.

Classification Model Evaluation and Result Interpretation: Evaluate classification models using confusion matrices and classification reports, distinguishing between precision, recall, and F1 scores.
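Two of the learnings above, dummy-variable encoding and turning a score into a probability with a sigmoid, can be sketched with pandas and NumPy alone. This is only an illustration under stated assumptions: the `employees` table is made up, and the coefficients are arbitrary placeholders rather than values fitted by the scikit-learn classifiers the case study uses.

```python
import numpy as np
import pandas as pd


def sigmoid(z):
    """Map any real-valued score to a value between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-z))


# Made-up employee records (not taken from the dataset)
employees = pd.DataFrame({
    "department": ["Sales", "R&D", "Sales"],
    "years_at_company": [1, 8, 3],
})

# Convert the categorical column into 0/1 dummy variables
encoded = pd.get_dummies(employees, columns=["department"], dtype=int)

# Hypothetical linear score; in logistic regression these weights are learned
scores = 0.5 - 0.4 * encoded["years_at_company"] + 0.8 * encoded["department_Sales"]
encoded["turnover_probability"] = sigmoid(scores)

print(encoded)
```

A fitted logistic regression applies the same sigmoid to a weighted sum of the features; the difference is that its weights come from training data.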

Embark on this data-driven journey to transform Human Resources!
Road Accident Severity in India
kaggle.com
zip
Updated Jan 5, 2024
Cite
SHRIYANSHMESSI (2024). Road Accident Severity in India [Dataset]. https://www.kaggle.com/datasets/shriyanshmessi/road-accident-severity-in-india/code
Explore at:
zip (317927 bytes)
Dataset updated
Jan 5, 2024
Authors
SHRIYANSHMESSI
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
India
Description
The dataset offers data on a number of variables related to Road Accident Severity in India, such as the time of day, the day of the week, the age range of drivers, gender, educational attainment, car attributes, driving history, road conditions, and the seriousness of accidents. We can learn more about the trends, connections, and possible risk factors associated with auto accidents by examining this dataset. The dataset offers valuable insights into the dynamics of road accidents, enabling authorities, policymakers, and researchers to make informed decisions regarding road safety measures and interventions.
