100+ datasets found

Titanic Dataset
kaggle.com
Updated Apr 30, 2024
Cite
Sakshi Satre (2024). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/sakshisatre/titanic-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Apr 30, 2024
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Sakshi Satre
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
The dataset containing information about passengers aboard the Titanic is one of the most famous datasets used in data science and machine learning. It was created to analyze and understand the factors that influenced survival rates among passengers during the tragic sinking of the RMS Titanic on April 15, 1912.

Data Description:

The dataset is often used for predictive modeling and statistical analysis to determine which factors (such as socio-economic status, age, gender, etc.) were associated with a higher likelihood of survival. It contains 1309 rows and 14 columns.

Columns:

Pclass: Ticket class indicating the socio-economic status of the passenger. It is categorized into three classes: 1 = Upper, 2 = Middle, 3 = Lower.

Survived: A binary indicator that shows whether the passenger survived (1) or not (0) during the Titanic disaster. This is the target variable for analysis.

Name: The full name of the passenger, including title (e.g., Mr., Mrs., etc.).

Sex: The gender of the passenger, denoted as either male or female.

Age: The age of the passenger in years.

SibSp: The number of siblings or spouses aboard the Titanic for the respective passenger.

Parch: The number of parents or children aboard the Titanic for the respective passenger.

Ticket: The ticket number assigned to the passenger.

Fare: The fare paid by the passenger for the ticket.

Cabin: The cabin number assigned to the passenger, if available.

Embarked: The port of embarkation for the passenger. It can take one of three values: C = Cherbourg, Q = Queenstown, S = Southampton.

Boat: If the passenger survived, this column contains the identifier of the lifeboat they were rescued in.

Body: If the passenger did not survive, this column contains the identification number of their recovered body, if applicable.

Home.dest: The destination or place of residence of the passenger.

These descriptions provide a detailed understanding of each column in the Titanic dataset subset, offering insights into the demographic, travel, and survival-related information recorded for each passenger.
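A quick way to confirm that a downloaded file matches the 14 columns listed above is to compare its header against the expected names. The sketch below is illustrative only: it assumes the file is a CSV with a header row, and it compares names case-insensitively because the capitalization used in the actual file is not stated here. `check_columns` is a hypothetical helper, not part of the dataset.

```python
import csv
import io

# Column names as spelled in the description above; the header in the actual
# file may use different capitalization (assumption).
EXPECTED_COLUMNS = [
    "Pclass", "Survived", "Name", "Sex", "Age", "SibSp", "Parch",
    "Ticket", "Fare", "Cabin", "Embarked", "Boat", "Body", "Home.dest",
]


def check_columns(csv_text, expected=EXPECTED_COLUMNS):
    """Return (missing, unexpected) column names, compared case-insensitively."""
    header = next(csv.reader(io.StringIO(csv_text)))
    found = {name.strip().lower() for name in header}
    wanted = {name.lower() for name in expected}
    return sorted(wanted - found), sorted(found - wanted)


# Toy header only (no passenger rows), used to exercise the check.
sample = ",".join(name.lower() for name in EXPECTED_COLUMNS) + "\n"
missing, unexpected = check_columns(sample)
print("missing:", missing, "unexpected:", unexpected)
```

With a real file, the text passed to `check_columns` would come from reading the downloaded CSV instead of the toy header.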
Titanic Data set
kaggle.com
zip
Updated Mar 4, 2024
+ more versions
Cite
Hina Ismail (2024). Titanic Data set [Dataset]. https://www.kaggle.com/datasets/sonialikhan/titanic-data-set
Explore at:
Available download formats: zip (22544 bytes)
Dataset updated
Mar 4, 2024
Authors
Hina Ismail
License
https://cdla.io/sharing-1-0/
Description
Detail Description: The Titanic dataset offers a comprehensive glimpse into the passengers aboard the ill-fated RMS Titanic, which famously sank on its maiden voyage in April 1912 after colliding with an iceberg. This dataset contains a wealth of information about individual passengers, including demographics, ticket class, cabin information, family relationships, fare details, and most notably, survival outcomes.

Key attributes within the dataset include:

Passenger Class (Pclass): This categorical variable indicates the ticket class of each passenger, ranging from 1st class (wealthiest) to 3rd class (lower socioeconomic status).

Name: The names of passengers, providing insight into their identities.

Sex: Gender of passengers, categorized as male or female.

Age: Age of passengers, providing information about the demographic composition of the Titanic's passengers.

SibSp: Number of siblings/spouses aboard the Titanic, offering insight into family relationships.

Parch: Number of parents/children aboard the Titanic, indicating family size and composition.

Ticket: Ticket number, providing additional information about passenger accommodations and fare details.

Fare: Fare paid by each passenger, which can be indicative of their ticket class and economic status.

Cabin: Cabin number or location, offering insights into passenger accommodations.

Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton), providing information about passengers' embarkation points.

Survived: This binary variable indicates whether a passenger survived the disaster (1) or not (0), serving as the primary outcome variable for analyses.

Researchers and data analysts frequently utilize the Titanic dataset for various purposes, including:

Exploratory data analysis to understand the demographic composition of passengers and their survival outcomes.

Predictive modeling to develop algorithms that predict the likelihood of survival based on passenger characteristics.

Feature engineering to derive new variables that may enhance predictive accuracy.

Hypothesis testing to investigate factors associated with survival rates, such as passenger class, gender, age, and family size.

Overall, the Titanic dataset serves as a valuable resource for understanding historical events, exploring data analysis techniques, and teaching machine learning concepts. Its accessibility and rich contextual information make it a popular choice for both educational and research purposes within the data science community.
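As an illustration of the feature engineering mentioned above, the sketch below derives two commonly used variables: a family size from SibSp and Parch, and a title taken from the Name field. Both are assumptions made for illustration rather than features defined by this dataset card; in particular, the regular expression assumes names are written as "Surname, Title. Given names", and the example name is hypothetical.

```python
import re

# Assumes names look like "Surname, Title. Given names" (assumption).
TITLE_RE = re.compile(r",\s*([^.,]+)\.")


def family_size(sibsp, parch):
    """Siblings/spouses plus parents/children plus the passenger themselves."""
    return sibsp + parch + 1


def extract_title(name):
    """Return the title found between the comma and the first period, or None."""
    match = TITLE_RE.search(name)
    return match.group(1).strip() if match else None


# Hypothetical passenger, not a row from the dataset.
print(family_size(sibsp=1, parch=2))
print(extract_title("Doe, Mrs. Jane"))
```

Returning `None` for names that do not fit the pattern keeps malformed rows visible instead of silently assigning them a title.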
Titanic Dataset
kaggle.com
Updated Apr 20, 2023
Cite
Sajid (2023). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/dbdmobile/tita111
Explore at:
Croissant
Dataset updated
Apr 20, 2023
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Sajid
Description
The Titanic dataset is a widely used dataset that contains information on the passengers who were aboard the Titanic when it sank on its maiden voyage in 1912. The dataset includes features such as age, sex, passenger class, and fare paid, as well as whether or not the passenger survived the sinking. The dataset is often used for machine learning and data analysis tasks, such as predicting survival based on passenger characteristics or exploring patterns in the data. The Titanic dataset is a classic example of data analysis and is a great starting point for those new to data science.

The Titanic dataset is available in CSV format and contains two files, one for training and one for testing. The training file is used to build the machine learning model, while the testing file is used to test the performance of the model.

Column Description

PassengerId: unique identifier for each passenger
Survived: whether the passenger survived (1) or not (0)
Pclass: passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class)
Name: name of the passenger
Sex: gender of the passenger
Age: age of the passenger (in years)
SibSp: number of siblings or spouses aboard the Titanic
Parch: number of parents or children aboard the Titanic
Ticket: ticket number
Fare: passenger fare
Cabin: cabin number
Embarked: port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)

MIT License

Copyright (c) [2023] [Md Kazi Sajiduddin]

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
titanic
huggingface.co
Updated Mar 1, 2025
Cite
Paulo Martins (2025). titanic [Dataset]. https://huggingface.co/datasets/paulopontesm/titanic
Explore at:
Croissant
Dataset updated
Mar 1, 2025
Authors
Paulo Martins
License
https://choosealicense.com/licenses/other/
Description
Dataset Card for Titanic Survival Prediction

Dataset Details

Dataset Description

This dataset is a copy of the original Kaggle Titanic dataset made to explore the Hugging Face Datasets feature. The Titanic Survival Prediction dataset is widely used in machine learning and statistics. It originates from the Titanic: Machine Learning from Disaster competition on Kaggle. The dataset consists of passenger details from the RMS Titanic disaster, including demographic… See the full description on the dataset page: https://huggingface.co/datasets/paulopontesm/titanic.
Kaggle-Titanic
huggingface.co
Updated Nov 2, 2023
Cite
Ensalada (2023). Kaggle-Titanic [Dataset]. https://huggingface.co/datasets/Tomate/Kaggle-Titanic
Explore at:
Croissant
Dataset updated
Nov 2, 2023
Authors
Ensalada
Description
Tomate/Kaggle-Titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
Titanic Dataset
cubig.ai
zip
Updated May 29, 2025
Cite
CUBIG (2025). Titanic Dataset [Dataset]. https://cubig.ai/store/products/393/titanic-dataset
Explore at:
Available download formats: zip
Dataset updated
May 29, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• Based on passenger information from the Titanic, which sank in 1912, the Titanic Dataset is a representative binary classification dataset that includes demographic and boarding information such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.

2) Data Utilization
(1) Characteristics of the Titanic Dataset:
• It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical), and includes variables such as Age, Cabin, and Embarked with some missing values, making it suitable for preprocessing and feature engineering practice.
(2) The Titanic Dataset can be used for:
• Development of survival prediction models: Key characteristics such as passenger class, gender, age, and fare can be used to predict survival with machine learning classification models such as logistic regression, random forest, and SVM.
• Analysis of survival influencing factors: By analyzing the correlation between survival rates and variables such as gender, age, and socioeconomic status, you can statistically and visually explore which groups have a higher survival probability.
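The missing values mentioned above for Age, Cabin, and Embarked are usually the first thing to inspect. The sketch below counts empty cells per column and fills missing ages with the median, which is one common choice and not something this listing prescribes. The rows are toy values invented for the example, and `missing_counts` and `fill_age_with_median` are hypothetical helper names.

```python
import csv
import io
from statistics import median


def is_blank(value):
    """Treat None, empty strings, and whitespace-only strings as missing."""
    return not (value or "").strip()


def missing_counts(rows, columns):
    """Count missing cells per column."""
    return {col: sum(1 for row in rows if is_blank(row.get(col))) for col in columns}


def fill_age_with_median(rows):
    """Return new rows where a missing Age is replaced by the median known age."""
    known = [float(row["Age"]) for row in rows if not is_blank(row.get("Age"))]
    fill = str(median(known))
    return [dict(row, Age=fill if is_blank(row.get("Age")) else row["Age"]) for row in rows]


# Toy rows for illustration only, not real passengers.
TOY_CSV = """Age,Cabin,Embarked
20,,S
,C1,
40,,C
30,,Q
"""
rows = list(csv.DictReader(io.StringIO(TOY_CSV)))
print(missing_counts(rows, ["Age", "Cabin", "Embarked"]))
filled = fill_age_with_median(rows)
print([row["Age"] for row in filled])
```

Other strategies, such as imputing by passenger class or dropping the Cabin column, may fit better depending on the model.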
titanic-dataset
huggingface.co
Updated Sep 26, 2024
+ more versions
Cite
Brian Su (2024). titanic-dataset [Dataset]. https://huggingface.co/datasets/BrianSuToronto/titanic-dataset
Explore at:
Croissant
Dataset updated
Sep 26, 2024
Authors
Brian Su
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
BrianSuToronto/titanic-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
titanic_dataset
kaggle.com
zip
Updated Jun 7, 2024
Cite
SURENDHAN (2024). titanic_dataset [Dataset]. https://www.kaggle.com/datasets/surendhan/titanic-dataset
Explore at:
Available download formats: zip (11516 bytes)
Dataset updated
Jun 7, 2024
Authors
SURENDHAN
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
The Titanic dataset on Kaggle is a well-known dataset used for machine learning and data science projects, especially for binary classification tasks. It includes data on the passengers of the Titanic, which sank on its maiden voyage in 1912. This dataset is often used to predict the likelihood of a passenger's survival based on various features. Here is a detailed description of the dataset:

Overview

The Titanic dataset includes information about the passengers on the Titanic, such as their demographic information, class, fare, and whether they survived the disaster. The goal is to predict the survival of the passengers.

Files

The dataset typically includes three files:

train.csv: The training set, which includes the features and the target variable (Survived).
test.csv: The test set, which includes the features but not the target variable. You use this file to make predictions that can be submitted to Kaggle.
gender_submission.csv: An example of a submission file in the correct format.

Features

The dataset contains the following columns:

PassengerId: Unique ID for each passenger.
Survived: Target variable (0 = No, 1 = Yes) indicating if the passenger survived.
Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
Name: Name of the passenger.
Sex: Gender of the passenger (male or female).
Age: Age of the passenger in years. Fractional values indicate age in months for infants.
SibSp: Number of siblings or spouses aboard the Titanic.
Parch: Number of parents or children aboard the Titanic.
Ticket: Ticket number.
Fare: Passenger fare.
Cabin: Cabin number.
Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
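To make the role of the three files concrete, here is a minimal sketch that turns test rows into a submission-style CSV. It assumes the submission has exactly two columns, PassengerId and Survived, and it uses a simple rule (predict survival for female passengers) suggested by the file name gender_submission.csv; both points are assumptions, not details confirmed by this description. The test rows are toy values.

```python
import csv
import io


def baseline_predictions(test_rows):
    """Predict Survived=1 for female passengers and 0 otherwise (assumed baseline)."""
    return [
        {"PassengerId": row["PassengerId"], "Survived": 1 if row["Sex"] == "female" else 0}
        for row in test_rows
    ]


def to_submission_csv(predictions):
    """Serialize predictions as CSV text with a PassengerId,Survived header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["PassengerId", "Survived"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(predictions)
    return buffer.getvalue()


# Toy rows standing in for test.csv; the IDs are hypothetical.
toy_test_rows = [
    {"PassengerId": "1001", "Sex": "male"},
    {"PassengerId": "1002", "Sex": "female"},
]
submission = to_submission_csv(baseline_predictions(toy_test_rows))
print(submission)
```

A trained model would replace `baseline_predictions`, while the serialization step stays the same.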
Titanic_dataset
kaggle.com
zip
Updated Dec 28, 2023
Cite
Marouan daghmoumi (2023). Titanic_dataset [Dataset]. https://www.kaggle.com/datasets/marouandaghmoumi/titanic-dataset
Explore at:
Available download formats: zip (96969 bytes)
Dataset updated
Dec 28, 2023
Authors
Marouan daghmoumi
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Titanic dataset is a popular dataset in the field of data science and machine learning. It contains information about the passengers aboard the RMS Titanic, which sank on its maiden voyage in 1912 after hitting an iceberg. The dataset is often used for predictive modeling and classification tasks.

Here are the key features or columns in the Titanic dataset:

- PassengerId: A unique identifier assigned to each passenger.
- Survived: A binary variable indicating whether the passenger survived (1) or did not survive (0).
- Pclass (Passenger Class): The ticket class of the passenger, which can be 1st (1), 2nd (2), or 3rd (3) class.
- Name: The name of the passenger.
- Sex: The gender of the passenger (male or female).
- Age: The age of the passenger in years. It may contain missing values.
- SibSp: The number of siblings or spouses the passenger had aboard the Titanic.
- Parch: The number of parents or children the passenger had aboard the Titanic.
- Ticket: The ticket number.
- Fare: The amount of money the passenger paid for the ticket.
- Cabin: The cabin number where the passenger stayed. It may contain missing values.
- Embarked: The port at which the passenger boarded the Titanic (C for Cherbourg, Q for Queenstown, S for Southampton).
Data from: titanic-survival
huggingface.co
Updated Apr 20, 2023
Cite
Julien Chaumond (2023). titanic-survival [Dataset]. https://huggingface.co/datasets/julien-c/titanic-survival
Explore at:
Croissant
Dataset updated
Apr 20, 2023
Authors
Julien Chaumond
License
https://choosealicense.com/licenses/cc/
Description
Titanic Survival

from https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/problem12.html
Titanic-Dataset
huggingface.co
Updated Oct 21, 2024
+ more versions
Cite
Javier Tomás Vicente (2024). Titanic-Dataset [Dataset]. https://huggingface.co/datasets/Javitron4257/Titanic-Dataset
Explore at:
Croissant
Dataset updated
Oct 21, 2024
Authors
Javier Tomás Vicente
License
https://choosealicense.com/licenses/cc/
Description
Javitron4257/Titanic-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Titanic Dataset
kaggle.com
zip
Updated Jul 25, 2024
Cite
Waqar Ali (2024). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/waqi786/titanic-dataset
Explore at:
Available download formats: zip (41548 bytes)
Dataset updated
Jul 25, 2024
Authors
Waqar Ali
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset provides information on passengers aboard the RMS Titanic, including features that can be used for predicting survival. It contains various attributes related to passengers such as age, sex, ticket fare, and passenger class, which are crucial for understanding patterns and building predictive models.

Content:

PassengerID: Unique identifier for each passenger.
Pclass: Passenger class (1st, 2nd, 3rd).
Name: Full name of the passenger.
Sex: Gender of the passenger.
Age: Age of the passenger.
SibSp: Number of siblings/spouses aboard.
Parch: Number of parents/children aboard.
Ticket: Ticket number.
Fare: Ticket fare.
Cabin: Cabin number.
Embarked: Port of embarkation (C = Cherbourg; Q = Queenstown; S = Southampton).
Survived: Survival status (0 = No; 1 = Yes).

Usage:

This dataset is ideal for practice in classification tasks, particularly for predicting binary outcomes such as survival status. It is commonly used for various machine learning challenges, including exploratory data analysis and feature engineering.
Titanic-json-format
kaggle.com
zip
Updated Sep 21, 2025
Cite
Abdul Basit AI (2025). Titanic-json-format [Dataset]. https://www.kaggle.com/datasets/engrbasit62/titanic-json-format
Explore at:
Available download formats: zip (33844 bytes)
Dataset updated
Sep 21, 2025
Authors
Abdul Basit AI
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
🛳️ Titanic Dataset (JSON Format)

📌 Overview

This is the classic Titanic: Machine Learning from Disaster dataset, converted into JSON format for easier use in APIs, data pipelines, and Python projects. It contains the same passenger details as the original CSV version, but stored as JSON for convenience.

📂 Dataset Contents

File: titanic.json

Columns: PassengerId, Survived, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked

Use Cases: Exploratory Data Analysis (EDA), feature engineering, machine learning model training, web app backends, JSON parsing practice.

🛠️ How to Use

🔹 1. Load with kagglehub

```python
import kagglehub

# Download the latest version of the dataset
path = kagglehub.dataset_download("engrbasit62/titanic-json-format")
print("Path to dataset files:", path)
```

🔹 2. Load into Pandas

```python
import pandas as pd

# Read the JSON file into a DataFrame
df = pd.read_json(f"{path}/titanic.json")

print(df.head())
```

💡 Notes

Preview truncation: Kaggle may show only part of the JSON in the preview panel because of its size. ✅ Don’t worry — the full dataset is available when loaded via code.

Benefits of JSON format: Ideal for web apps, APIs, or projects that work with structured data. Easily convertible back to CSV if needed.
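The conversion back to CSV mentioned in the notes can be sketched with the standard library alone. This example assumes that titanic.json holds a top-level array of record objects keyed by the column names listed above; the actual layout of the file is not described here, so treat that as an assumption. The record shown is a toy value, and `json_records_to_csv` is a hypothetical helper.

```python
import csv
import io
import json

COLUMNS = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
]


def json_records_to_csv(json_text, columns=COLUMNS):
    """Convert a JSON array of records into CSV text with a fixed column order."""
    records = json.loads(json_text)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", extrasaction="ignore",
        lineterminator="\n",
    )
    writer.writeheader()
    for record in records:
        # Write JSON nulls as empty cells.
        writer.writerow({k: ("" if v is None else v) for k, v in record.items()})
    return buffer.getvalue()


# Toy record for illustration only, not a real passenger.
TOY_JSON = """[
  {"PassengerId": 1, "Survived": 0, "Pclass": 3, "Name": "Doe, Mr. John",
   "Sex": "male", "Age": null, "SibSp": 0, "Parch": 0, "Ticket": "T-0001",
   "Fare": 7.5, "Cabin": null, "Embarked": "S"}
]"""
csv_text = json_records_to_csv(TOY_JSON)
print(csv_text)
```

Names contain commas, so the CSV writer quotes them; parsing the output with `csv.DictReader` restores the original value.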
⛴️ Titanic dataset ⛴️
kaggle.com
zip
Updated Sep 13, 2024
Cite
Davi N. (2024). ⛴️ Titanic dataset ⛴️ [Dataset]. https://www.kaggle.com/datasets/davinascimento/titanic-dataset
Explore at:
Available download formats: zip (22548 bytes)
Dataset updated
Sep 13, 2024
Authors
Davi N.
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset used in the Titanic - Machine Learning from Disaster competition.

titanic.csv:
- passengerId - Passenger unique ID.
- survival - If the passenger survived, 0 = No, 1 = Yes.
- pclass - Ticket class, 1 = 1st, 2 = 2nd, 3 = 3rd.
- name - Name of the passenger.
- sex - Sex.
- age - Age in years.
- sibsp - # of siblings / spouses aboard the Titanic.
- parch - # of parents / children aboard the Titanic.
- ticket - Ticket number.
- fare - Passenger fare.
- cabin - Cabin number.
- embarked - Port of Embarkation, C = Cherbourg, Q = Queenstown, S = Southampton.
Titanic Dataset - EDA & Logistic Regression
kaggle.com
zip
Updated Feb 19, 2025
Cite
RabbiTheAnalyst (2025). Titanic Dataset - EDA & Logistic Regression [Dataset]. https://www.kaggle.com/datasets/mdrabbiali/titanic-data-set
Explore at:
Available download formats: zip (22566 bytes)
Dataset updated
Feb 19, 2025
Authors
RabbiTheAnalyst
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone on board, resulting in the death of 1502 out of 2224 passengers and crew. While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others. In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).

Objective:

Survival Prediction: To build a logistic regression model that accurately predicts the survival of passengers based on features such as age, gender, passenger class, and number of siblings/spouses aboard.

Data Cleaning and Preprocessing: To perform data cleaning by handling missing values, removing unnecessary columns, and encoding categorical variables to prepare the dataset for analysis.

Exploratory Data Analysis (EDA): To conduct a thorough exploratory data analysis to visualize survival rates and identify patterns based on various factors like gender, passenger class, and embarked location.

Feature Importance Analysis: To analyze the correlation between different features and their impact on survival rates, identifying which factors are the most significant predictors of survival.

Model Evaluation: To evaluate the performance of the logistic regression model using accuracy scores and classification reports, ensuring that the model generalizes well to unseen data.

ROC Curve Analysis: To create a ROC curve to assess the trade-off between the true positive rate and false positive rate, providing insights into the model's ability to distinguish between survivors and non-survivors.

Insights and Recommendations: To derive insights from the analysis that could inform future safety measures or policies related to passenger safety in maritime travel.
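The evaluation objectives above (accuracy and the ROC curve) can be illustrated without any modeling library. The sketch below takes true labels and predicted survival probabilities and computes accuracy at a threshold plus the (false positive rate, true positive rate) points of a ROC curve. The labels and scores are toy values rather than output from a real model, the function names are hypothetical, and the code assumes both classes are present.

```python
def confusion_counts(labels, scores, threshold):
    """Return (tp, fp, tn, fn) when predicting 1 for scores >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct predictions at the given threshold."""
    tp, _, tn, _ = confusion_counts(labels, scores, threshold)
    return (tp + tn) / len(labels)


def roc_points(labels, scores):
    """ROC points as (fpr, tpr), one per distinct score used as a threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives  # assumes both classes are present
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp, fp, _, _ = confusion_counts(labels, scores, threshold)
        points.append((fp / negatives, tp / positives))
    return points


# Toy labels and predicted probabilities, for illustration only.
labels = [1, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.2]
print(accuracy(labels, scores))
print(roc_points(labels, scores))
```

Lowering the threshold moves along the curve toward (1.0, 1.0): more survivors are caught, at the cost of more false alarms.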
Simplified Titanic Dataset
kaggle.com
zip
Updated Jun 16, 2023
Cite
Bhavleen Kaur (2023). Simplified Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/bhavkaur/simplified-titanic-dataset
Explore at:
Available download formats: zip (5195 bytes)
Dataset updated
Jun 16, 2023
Authors
Bhavleen Kaur
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset is a simplified version of the famous Titanic dataset, which contains information about passengers aboard the Titanic ship. It is designed specifically for beginners who are learning about data analysis and classification problems.

Note: This simplified dataset does not contain all the columns available in the original Titanic dataset, but it retains the essential features for introductory purposes.
Titanic Dataset
kaggle.com
zip
Updated Sep 29, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Prince Rajak (2025). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/prince7489/titanic-dataset
Explore at:
Available download formats: zip (1849548 bytes)
Dataset updated
Sep 29, 2025
Authors
Prince Rajak
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The Titanic Survival Prediction Project explores one of the most iconic datasets in data science. The goal is to predict whether a passenger survived the Titanic disaster based on key attributes such as age, gender, ticket class, family size, and fare.

Using a dataset of 100,000 synthetic records inspired by the original Titanic data, this project demonstrates a complete data science workflow — including data cleaning, exploratory data analysis (EDA), feature engineering, and predictive modeling.

By analyzing patterns (e.g., higher survival rates among women, children, and first-class passengers), the project showcases how machine learning can uncover meaningful insights from historical events.
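Pattern analysis of this kind usually starts with a survival rate per group. The sketch below is a minimal version, assuming rows are dictionaries with a Survived value of 0 or 1 and a grouping column such as Sex or Pclass. The rows are toy values, so the printed rates say nothing about the real data, and `survival_rate_by` is a hypothetical helper.

```python
from collections import defaultdict


def survival_rate_by(rows, key):
    """Return {group value: share of rows with Survived == 1} for the given column."""
    totals = defaultdict(int)
    survived = defaultdict(int)
    for row in rows:
        group = row[key]
        totals[group] += 1
        survived[group] += int(row["Survived"])
    return {group: survived[group] / totals[group] for group in totals}


# Toy rows for illustration only; the resulting rates carry no meaning.
toy_rows = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]
print(survival_rate_by(toy_rows, "Sex"))
print(survival_rate_by(toy_rows, "Pclass"))
```

The same function works for any categorical column, including derived ones such as an age band.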
titanic_dataset
kaggle.com
zip
Updated Nov 24, 2023
Cite
mahmoud shogaa (2023). titanic_dataset [Dataset]. https://www.kaggle.com/datasets/mahmoudshogaa/titanic-dataset
Explore at:
Available download formats: zip (22491 bytes)
Dataset updated
Nov 24, 2023
Authors
mahmoud shogaa
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
The dataset typically includes the following columns:

PassengerId: A unique identifier for each passenger.
Survived: This column indicates whether a passenger survived (1) or did not survive (0).
Pclass (Ticket class): A proxy for socio-economic status, with 1 being the highest class and 3 the lowest.
Name: The name of the passenger.
Sex: The gender of the passenger.
Age: The age of the passenger. (Note: There might be missing values in this column.)
SibSp: The number of siblings or spouses the passenger had aboard the Titanic.
Parch: The number of parents or children the passenger had aboard the Titanic.
Ticket: The ticket number.
Fare: The amount of money the passenger paid for the ticket.

The main goal of using this dataset is to predict whether a passenger survived or not based on various features. It serves as a popular introductory dataset for those learning data analysis, machine learning, and predictive modeling. Keep in mind that the dataset may be subject to variations and updates, so it's always a good idea to check the Kaggle website or dataset documentation for the most recent information.
Titanic Dataset Complete
kaggle.com
zip
Updated Jul 8, 2024
Cite
Waqas Ali Khan (2024). Titanic Dataset Complete [Dataset]. https://www.kaggle.com/datasets/waqasali51/titanic-dataset-complete
Explore at:
Available download formats: zip (22678 bytes)
Dataset updated
Jul 8, 2024
Authors
Waqas Ali Khan
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by Waqas Ali Khan

Released under Apache 2.0

Contents
Titanic Dataset
kaggle.com
zip
Updated Oct 15, 2024
Cite
Khalid Hussain (2024). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/arman9640/titanic-dataset
Explore at:
Available download formats: zip (22472 bytes)
Dataset updated
Oct 15, 2024
Authors
Khalid Hussain
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by Khalid Hussain

Released under Apache 2.0

Contents

Titanic Dataset CSV File

