25 datasets found

Titanic dataset
"Survival Prediction on the Titanic: A Machine Learning Approach"
kaggle.com
Updated Feb 29, 2024
Cite
Sidra Kousar (2024). Titanic dataset [Dataset]. https://www.kaggle.com/datasets/sidrakousar/titanic-dataset/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 29, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sidra Kousar
License
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Description
The Titanic dataset is a popular dataset used for data analysis and machine learning tasks. It contains various information about passengers aboard the Titanic, including whether they survived or not. Here's a brief description of each of the columns:

PassengerId: A unique identifier for each passenger.
Survived: Whether the passenger survived (0 = No, 1 = Yes).
Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
Name: Name of the passenger.
Sex: Gender of the passenger.
Age: Age of the passenger in years (fractional if less than 1).
SibSp: Number of siblings or spouses aboard the Titanic.
Parch: Number of parents or children aboard the Titanic.
Ticket: Ticket number.
Fare: Fare paid for the ticket.
Cabin: Cabin number.
Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).

This dataset is often used for tasks such as predicting survival based on various factors or analyzing demographics of passengers aboard the Titanic.
Titanic
rochester.figshare.com
application/csv
Updated Aug 12, 2024
Cite
Aabha Pandit; Alois Romanowski; Heather Owen (2024). Titanic [Dataset]. http://doi.org/10.60593/ur.d.26462215.v1
Explore at:
application/csv (available download formats)
Unique identifier
https://doi.org/10.60593/ur.d.26462215.v1
Dataset updated
Aug 12, 2024
Dataset provided by
University of Rochester
Authors
Aabha Pandit; Alois Romanowski; Heather Owen
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Titanic Dataset (for Machine Learning)

The Titanic dataset is a classic and widely used dataset for machine learning and data analysis. It contains information about the passengers of the RMS Titanic, which tragically sank on its maiden voyage on April 15, 1912. The dataset provides details about each passenger, including their demographics, ticket information, and survival status. This dataset is often used to demonstrate and practice various machine learning techniques, particularly classification.

This dataset is divided into two parts: a training set and a testing set.

Dataset Variables:
PassengerId: count for each passenger
Survived: 0 = No; 1 = Yes
Name: name of passenger
Sex: passenger's sex
Age: passenger's age
SibSp: number of siblings/spouses aboard the Titanic
Parch: number of parents/children aboard the Titanic
Ticket: ticket number
Fare: passenger fare
Cabin: cabin number
Embarked: port where passenger embarked (C = Cherbourg; Q = Queenstown; S = Southampton)
Titanic- Machine Learning from Disaster
kaggle.com
Updated Jan 7, 2025
Cite
ManishaPrajapati (2025). Titanic- Machine Learning from Disaster [Dataset]. https://www.kaggle.com/datasets/nitu1234444/titanic-machine-learning-from-disaster/suggestions
Explore at:
Croissant
Dataset updated
Jan 7, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
ManishaPrajapati
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by ManishaPrajapati

Released under MIT

Titanic Dataset
cubig.ai
Updated May 29, 2025
Cite
CUBIG (2025). Titanic Dataset [Dataset]. https://cubig.ai/store/products/393/titanic-dataset
Dataset updated
May 29, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• Based on passenger information from the Titanic, which sank in 1912, the Titanic Dataset is a representative binary classification dataset that includes demographic and boarding information such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.

2) Data Utilization
(1) Characteristics of the Titanic Dataset:
• It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical) and includes variables with missing values, such as Age, Cabin, and Embarked, making it suitable for practicing preprocessing and feature engineering.
(2) Uses of the Titanic Dataset:
• Development of survival prediction models: key characteristics such as passenger class, gender, age, and fare can be used to predict survival with machine learning classification models such as logistic regression, random forest, and SVM.
• Analysis of factors influencing survival: by analyzing the correlation between survival rates and variables such as gender, age, and socioeconomic status, you can statistically and visually explore which groups have a higher survival probability.
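
The description above notes that Age, Cabin, and Embarked contain missing values. As a rough illustration of how one might check this before preprocessing, the sketch below counts empty cells per column using only the Python standard library. The inline CSV is a hypothetical four-row excerpt with assumed Kaggle-style column names, not the real file, and `count_missing` is a helper name introduced here.

```python
import csv
import io

# Hypothetical four-row excerpt in an assumed Kaggle train.csv layout;
# the values are illustrative and not read from the real file.
SAMPLE_CSV = """PassengerId,Survived,Pclass,Sex,Age,Cabin,Embarked
1,0,3,male,22,,S
2,1,1,female,38,C85,C
3,1,3,female,,,S
4,0,2,male,35,,
"""


def count_missing(csv_text):
    """Return {column: number of empty cells} for a CSV given as text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value is None or value.strip() == "":
                missing[name] += 1
    return missing


missing_counts = count_missing(SAMPLE_CSV)
for column, count in missing_counts.items():
    if count:
        print(f"{column}: {count} missing")
```

On a real download, the same function could be applied to the text of the training file, assuming it uses empty cells for missing values.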
‘Titanic: Machine Learning from Disaster’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Titanic: Machine Learning from Disaster’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-machine-learning-from-disaster-235d/latest
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Titanic: Machine Learning from Disaster’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shuofxz/titanic-machine-learning-from-disaster on 28 January 2022.

--- No further description of dataset provided by original source ---

--- Original source retains full ownership of the source dataset ---
Titanic classification
figshare.com
txt
Updated Sep 19, 2020
Cite
Alvaro Rioboo (2020). Titanic classification [Dataset]. http://doi.org/10.6084/m9.figshare.12979220.v1
Explore at:
txt (available download formats)
Unique identifier
https://doi.org/10.6084/m9.figshare.12979220.v1
Dataset updated
Sep 19, 2020
Dataset provided by
Figshare (http://figshare.com/)
Authors
Alvaro Rioboo
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Titanic dataset for classification training.
titanic
tensorflow.org
Updated Feb 12, 2023
Cite
(2023). titanic [Dataset]. https://www.tensorflow.org/datasets/catalog/titanic
Dataset updated
Feb 12, 2023
Description
Dataset describing the survival status of individual passengers on the Titanic. Missing values in the original dataset are represented using ?. Float and int missing values are replaced with -1, string missing values are replaced with 'Unknown'.

To use this dataset:

import tensorflow_datasets as tfds

ds = tfds.load('titanic', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
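
The description above says that missing numeric values are replaced with -1 and missing strings with 'Unknown'. A minimal sketch of mapping those sentinels back to `None` is shown below for a plain Python dictionary. The record and its keys are hypothetical and may not match the feature names the catalog exposes; values returned by `tfds.load` would first need converting from tensors and byte strings to Python values.

```python
# Sentinels as described in the dataset summary above.
NUMERIC_MISSING = -1
STRING_MISSING = "Unknown"


def restore_missing(example):
    """Return a copy of `example` with sentinel values replaced by None."""
    cleaned = {}
    for key, value in example.items():
        if isinstance(value, bool):
            cleaned[key] = value  # booleans are never treated as sentinels
        elif isinstance(value, (int, float)) and value == NUMERIC_MISSING:
            cleaned[key] = None
        elif isinstance(value, str) and value == STRING_MISSING:
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


# Hypothetical record; the keys are illustrative only.
record = {"age": -1.0, "fare": 7.25, "cabin": "Unknown", "sex": "male"}
restored = restore_missing(record)
print(restored)
```

This assumes -1 never occurs as a genuine value in the numeric fields, which seems reasonable for ages and fares but should be checked for any other column.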
‘Titanic Solution for Beginner's Guide’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 14, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Titanic Solution for Beginner's Guide’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-solution-for-beginner-s-guide-03a8/ae3641d4/?iid=014-163&v=presentation
Dataset updated
Feb 14, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Titanic Solution for Beginner's Guide’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide on 14 February 2022.

--- Dataset description provided by original source is as follows ---

Overview

The data has been split into two groups:

training set (train.csv)
test set (test.csv)

The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.

The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.

We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
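
The rule behind gender_submission.csv (all and only female passengers survive) can be sketched in a few lines. The example below is an illustration under assumptions, not the competition's own code: the column names `PassengerId`, `Sex`, and `Survived` and the three inline rows are assumed for the sketch, and `gender_baseline` and `to_submission_csv` are hypothetical helper names.

```python
import csv
import io

# Hypothetical stand-in for test.csv; only the two columns used here appear.
TEST_CSV = """PassengerId,Sex
892,male
893,female
894,male
"""


def gender_baseline(test_csv_text):
    """Predict Survived = 1 for female passengers and 0 for everyone else."""
    reader = csv.DictReader(io.StringIO(test_csv_text))
    return [
        {"PassengerId": row["PassengerId"],
         "Survived": 1 if row["Sex"] == "female" else 0}
        for row in reader
    ]


def to_submission_csv(predictions):
    """Serialize predictions as CSV text with a PassengerId,Survived header."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["PassengerId", "Survived"],
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(predictions)
    return buffer.getvalue()


submission = to_submission_csv(gender_baseline(TEST_CSV))
print(submission)
```

A baseline like this gives a reference score that any trained model should be expected to beat.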

Data Dictionary

Variable  Definition                                  Key
survival  Survival                                    0 = No, 1 = Yes
pclass    Ticket class                                1 = 1st, 2 = 2nd, 3 = 3rd
sex       Sex
Age       Age in years
sibsp     # of siblings / spouses aboard the Titanic
parch     # of parents / children aboard the Titanic
ticket    Ticket number
fare      Passenger fare
cabin     Cabin number
embarked  Port of Embarkation                         C = Cherbourg, Q = Queenstown, S = Southampton

Variable Notes

pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5.

sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored).

parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.

--- Original source retains full ownership of the source dataset ---
‘Titanic: cleaned data’ analyzed by Analyst-2
analyst-2.ai
Updated Sep 30, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Titanic: cleaned data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-cleaned-data-cbf4/dc9cd7ff/?iid=055-046&v=presentation
Dataset updated
Sep 30, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Titanic: cleaned data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/jamesleslie/titanic-cleaned-data on 30 September 2021.

--- Dataset description provided by original source is as follows ---

Introduction

This dataset was created in this notebook as part of a three-part series. The data is in machine-learning-ready format, with all missing values for the Age, Fare and Embarked columns having been imputed.

Data imputation

Age: this column was imputed by using the median age for the passenger's title (Mr, Mrs, Dr etc).

Fare: the single missing value in this column was imputed using the median value for that passenger's class.

Embarked: the two missing values here were imputed using the Pandas backfill method.
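
The three imputation rules above could look roughly like the following in pandas. This is a sketch under assumptions rather than the notebook's actual code: the five-row frame is made up, the column names follow the usual Kaggle layout, and the regular expression used to pull a title out of `Name` is one plausible choice among several.

```python
import pandas as pd

# Tiny made-up frame in an assumed Kaggle column layout.
df = pd.DataFrame({
    "Name": ["Smith, Mr. John", "Smith, Mrs. Anna", "Doe, Mr. Alan",
             "Doe, Mrs. Jane", "Roe, Mr. Paul"],
    "Pclass": [3, 3, 1, 1, 3],
    "Age": [30.0, None, 40.0, 36.0, None],
    "Fare": [8.0, None, 80.0, 90.0, 10.0],
    "Embarked": ["S", None, "C", "C", "Q"],
})

# Title is the token between the comma and the first period, e.g. "Mr".
df["Title"] = df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()

# Age: fill with the median age of passengers sharing the same title.
df["Age"] = df["Age"].fillna(df.groupby("Title")["Age"].transform("median"))

# Fare: fill with the median fare of the passenger's class.
df["Fare"] = df["Fare"].fillna(df.groupby("Pclass")["Fare"].transform("median"))

# Embarked: backfill, i.e. copy the next non-missing value upward.
df["Embarked"] = df["Embarked"].bfill()

print(df[["Title", "Age", "Fare", "Embarked"]])
```

Note that backfilling depends on row order, so the result for Embarked would change if the rows were shuffled first.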

Usage

This data is used in both the second and third parts of the series.

--- Original source retains full ownership of the source dataset ---
Titanic_ML_Python
kaggle.com
Updated Dec 17, 2023
Cite
Jonathan Hernandez Mayen (2023). Titanic_ML_Python [Dataset]. https://www.kaggle.com/datasets/jonathanhernandez1/titanic-ml-python
Explore at:
Croissant
Dataset updated
Dec 17, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jonathan Hernandez Mayen
License
https://www.gnu.org/licenses/gpl-3.0.html
Description
Explore our machine learning project for predicting survival on the Titanic. With a perfect score of 1.0 and a flawless confusion matrix, we reveal striking patterns in the historical data.
Titanic Leaderboard March 2023
kaggle.com
Updated Apr 3, 2023
Cite
Lucas Antoine (2023). Titanic Leaderboard March 2023 [Dataset]. http://doi.org/10.34740/kaggle/dsv/5281032
Explore at:
Croissant
Unique identifier
https://doi.org/10.34740/kaggle/dsv/5281032
Dataset updated
Apr 3, 2023
Dataset provided by
Kaggle
Authors
Lucas Antoine
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset used in my 🛳️ Titanic - Top 1% with KNN [0.81818] notebook. It contains all the leaderboard's entries from the Titanic - Machine Learning from Disaster competition in March 2023.
Competition_Titanic_machine learning from disaster
kaggle.com
Updated Jan 20, 2023
Cite
mukti shukla (2023). Competition_Titanic_machine learning from disaster [Dataset]. https://www.kaggle.com/datasets/muktishukla/titanic-servival
Explore at:
Croissant
Dataset updated
Jan 20, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
mukti shukla
License
http://www.gnu.org/licenses/lgpl-3.0.html
Description
Dataset

This dataset was created by mukti shukla

Released under GNU Lesser General Public License 3.0

Preprocessed Titanic Survived Prediction Data
kaggle.com
Updated Feb 6, 2021
Cite
Fethiye (2021). Preprocessed Titanic Survived Prediction Data [Dataset]. https://www.kaggle.com/fethiye/titanic-preprocessed-train-data/discussion
Explore at:
Croissant
Dataset updated
Feb 6, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Fethiye
Description
Context

This data set was created by preprocessing (filling in missing data, extracting new features) the Titanic - Machine Learning from Disaster data set.

Machine learning models can be applied directly to this processed data set.

You can see the preprocessing steps in the notebook: https://www.kaggle.com/fethiye/titanic-predict-survival-prediction
Oceanographic data collected during the Titanic Expedition 2004...
catalog.data.gov
data.amerigeoss.org
Updated Jul 3, 2025
Cite
(Point of Contact) (2025). Oceanographic data collected during the Titanic Expedition 2004 (titanic2004) on NOAA Ship Ronald H. Brown in North Atlantic Ocean from 2004-05-27 to 2004-06-12 (NCEI Accession 0072311) [Dataset]. https://catalog.data.gov/dataset/oceanographic-data-collected-during-the-titanic-expedition-2004-titanic2004-on-noaa-ship-ronald2
Dataset updated
Jul 3, 2025
Dataset provided by
(Point of Contact)
Area covered
Atlantic Ocean
Description
Nearly 20 years after first finding the sunken remains of the RMS Titanic, marine explorer Robert Ballard returned in June 2004 to help the National Oceanic and Atmospheric Administration (NOAA) study the ship's rapid deterioration. A professor of oceanography at the University of Rhode Island (URI) and director of its Institute for Archaeological Oceanography, Dr. Ballard and his team of scientists from NOAA and other institutions spent 11 days at the site, mapping the ship and conducting scientific analyses of its deterioration. The team worked aboard NOAA Ship Ronald H. Brown from May 30 through June 9, and used remotely operated vehicles (ROVs) to conduct a sophisticated documentation of the state of Titanic that was not possible in the 1980s. This "Look, don't touch" mission utilized high-definition video and stereoscopic still images to provide an updated assessment of the wreck site. The science team included Dr. Dwight Coleman of URI and the Mystic Aquarium & Institute for Exploration (MAIFE), who was the expedition's research chief. As the marine archaeologist with NOAA's Office of Ocean Exploration, I oversaw the expedition's marine archaeology component. In addition to mapping the Titanic, expedition goals included the microbial research of scientist Roy Cullimore, who studied the natural deterioration of the ship's hull. Tiny microbes that feed on iron and create icicle-shaped formations called rusticles are responsible for this deterioration. While rusticles have been observed for many years, little is known about them. As the nation's ocean agency, NOAA has a vested interest in the scientific and cultural aspects of the Titanic, and in its appropriate treatment and preservation. NOAA's focus is to build a baseline of scientific information from which we can measure the shipwreck's processes and deterioration, and then apply the knowledge we gain to other deep-water shipwrecks and submerged cultural resources. 
The Guidelines for Research, Exploration and Salvage of RMS Titanic were issued under the authority of the RMS Titanic Maritime Memorial Act of 1986. On Monday, June 7, 2004, at 9 p.m. ET/PT, the National Geographic Channel gave audiences unprecedented access to the ongoing expedition by broadcasting a one-hour special, "Return to Titanic," which originated from NOAA Ship Ronald H. Brown and included a live underwater telecast from the Titanic. Simultaneous with the expedition, MAIFE enabled thousands of children to experience the Titanic mission as it occurred. From June 4 through 9, four shows a day were transmitted live from the expedition via satellite and Internet2 to participating sites. The JASON Foundation for Education has created a new middle-school math curriculum called "JASON Math Adventure: Geometry and Return to Titanic," which follows the work of researchers on the expedition. Students will learn how geometry concepts are used to position NOAA Ship Ronald H. Brown at the Titanic wreck and the ROV Hercules on the Titanic's bow. Technology partners on the expedition included EDS of Texas, which wired the mission, and VBrick Systems of Connecticut, which enabled the mission feed to be broadcast nationwide.
Children on the Titanic
encyclopedia-titanica.org
json
Updated Jan 28, 2025
Cite
Encyclopedia Titanica (2025). Children on the Titanic [Dataset]. https://www.encyclopedia-titanica.org/children-on-titanic/
Explore at:
json (available download formats)
Dataset updated
Jan 28, 2025
Dataset authored and provided by
Encyclopedia Titanica (http://www.encyclopedia-titanica.org/)
License
https://www.encyclopedia-titanica.org/copyright-and-permissions.html
Description
A comprehensive dataset containing detailed profiles of all children (14 and under) who were aboard the Titanic. This includes information on their names, ages, family relationships, cabin assignments, nationalities, and survival status. The dataset provides insights into the demographics and experiences of the youngest passengers on the Titanic.
Data from: Titanic Survival Prediction
kaggle.com
Updated Jan 1, 2018
Cite
Shabu KC (2018). Titanic Survival Prediction [Dataset]. https://www.kaggle.com/shabukc/titanic-survival-prediction
Explore at:
Croissant
Dataset updated
Jan 1, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shabu KC
Description
Context

This is an attempt to learn prediction from the given training and test sets.

Content

This is the sample from the Kaggle learners section.

Acknowledgements

Thanks to Kaggle for this sample data set and for allowing us to use it for learning.

Inspiration

Learn to use data to solve problems and provide solutions.
Titanic Dataset - cleaned
kaggle.com
Updated Aug 9, 2019
Cite
WinstonSDodson (2019). Titanic Dataset - cleaned [Dataset]. https://www.kaggle.com/datasets/winstonsdodson/titanic-dataset-cleaned/code
Explore at:
Croissant
Dataset updated
Aug 9, 2019
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
WinstonSDodson
Description
This is the classic Titanic dataset provided in the Kaggle competition and then cleaned in one of the most popular Kernels there. Please see the Kernel titled "A Data Science Framework: To Achieve 99% Accuracy" for a great lesson in data science. This Kernel gives a great explanation of the thinking behind this data cleaning, as well as a very professional demonstration of the technologies and skills needed to do so. It then provides an overview of many ML techniques, and it is copiously and meticulously documented with many useful citations.

Of course, data cleaning is an essential skill in data science but I wanted to use this data for a study of other machine learning techniques. So, I found and used this set of data that is well known and cleaned to a benchmark accepted by many.
Titanic Dataset Competition
kaggle.com
Updated Dec 19, 2022
Cite
Cynthia Barasa (2022). Titanic Dataset Competition [Dataset]. https://www.kaggle.com/datasets/cynthycynthy/titanicdataset/data
Explore at:
Croissant
Dataset updated
Dec 19, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Cynthia Barasa
Description
The Titanic dataset is a well-known dataset that provides information on the passengers who were onboard the fateful voyage of the RMS Titanic. The data includes details such as the passenger's name, age, gender, ticket class, fare paid, and information on their family members. The dataset also includes a column called "Survived" which indicates whether a passenger survived the disaster or not.

There are a total of 891 rows in the dataset, with 12 columns. Some of the key columns in the dataset include:

• PassengerId: a unique identifier for each passenger
• Survived: a binary variable that indicates whether the passenger survived (1) or did not survive (0) the disaster
• Pclass: the ticket class of the passenger (1 = first class, 2 = second class, 3 = third class)
• Name: the name of the passenger
• Sex: the gender of the passenger (male or female)
• Age: the age of the passenger (some values are missing)
• SibSp: the number of siblings or spouses the passenger had on board
• Parch: the number of parents or children the passenger had on board
• Ticket: the ticket number of the passenger
• Fare: the fare paid by the passenger
• Cabin: the cabin number of the passenger (some values are missing)
• Embarked: the port at which the passenger embarked (C = Cherbourg, Q = Queenstown, S = Southampton)

Overall, the key challenges I encountered when working on the Titanic dataset were: how to handle missing values and imbalanced classes, encode categorical variables, reduce the dimensionality of the dataset, and identify and handle noise in the data.

Here are a few tips that I found helpful when getting started in the Titanic dataset competition:
1. Get familiar with the dataset
2. Pre-process the data
3. Split the data into training and test sets
4. Try out a few different algorithms
5. Tune the hyperparameters
6. Evaluate the model

Here are a few resources that I found helpful as I started working on the competition:
• Kaggle's Titanic tutorial
• scikit-learn documentation
• Pandas documentation
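
The workflow described in this entry (pre-process, split, try an algorithm, tune, evaluate) might be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the author's notebook: the data is synthetic with assumed Kaggle-style column names, the survival rule is invented so that the model has some signal to learn, and logistic regression stands in for "a few different algorithms".

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data; it mimics the layout only, not real passengers.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Pclass": rng.integers(1, 4, n),
    "Sex": rng.choice(["male", "female"], n),
    "Age": rng.uniform(1, 70, n),
    "Fare": rng.uniform(5, 100, n),
})
# Invented rule: women and first-class passengers survive more often.
score = (df["Sex"] == "female") * 2.0 - (df["Pclass"] - 2) + rng.normal(0, 1, n)
df["Survived"] = (score > 0.5).astype(int)
df.loc[rng.choice(n, 40, replace=False), "Age"] = np.nan  # simulate missing ages

X, y = df.drop(columns="Survived"), df["Survived"]

# Pre-processing: impute and scale numeric columns, one-hot encode the rest.
numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                    ("scale", StandardScaler())])
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Pclass", "Sex"]),
])

# Hold out a test set before any fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# One candidate algorithm, tuned over its regularization strength.
model = Pipeline([("prep", preprocess),
                  ("clf", LogisticRegression(max_iter=1000))])
search = GridSearchCV(model, {"clf__C": [0.01, 0.1, 1, 10]}, cv=5)
search.fit(X_train, y_train)

# Evaluate on the held-out rows.
test_accuracy = accuracy_score(y_test, search.predict(X_test))
print(search.best_params_, round(test_accuracy, 3))
```

Keeping the imputer and encoder inside the pipeline means they are fitted only on the training folds during the grid search, which avoids leaking information from validation rows.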
Titanic Solution for Beginner's Guide
kaggle.com
Updated Mar 12, 2018
Cite
Harun-Ur-Rashid (2018). Titanic Solution for Beginner's Guide [Dataset]. https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide/activity
Explore at:
Croissant
Dataset updated
Mar 12, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Harun-Ur-Rashid
Description
Overview

The data has been split into two groups:

training set (train.csv)
test set (test.csv)

The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.

The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.

We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.

Data Dictionary

Variable  Definition                                  Key
survival  Survival                                    0 = No, 1 = Yes
pclass    Ticket class                                1 = 1st, 2 = 2nd, 3 = 3rd
sex       Sex
Age       Age in years
sibsp     # of siblings / spouses aboard the Titanic
parch     # of parents / children aboard the Titanic
ticket    Ticket number
fare      Passenger fare
cabin     Cabin number
embarked  Port of Embarkation                         C = Cherbourg, Q = Queenstown, S = Southampton

Variable Notes

pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5.

sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored).

parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.
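
The variable notes lend themselves to simple engineered features. The sketch below derives a family-size count from sibsp and parch and flags ages that look estimated under the xx.5 convention. Both features are common illustrations rather than anything this description prescribes, the three rows are hypothetical, and `family_size` and `age_is_estimated` are names introduced here.

```python
def family_size(sibsp, parch):
    """Passenger plus siblings/spouses plus parents/children aboard."""
    return sibsp + parch + 1


def age_is_estimated(age):
    """True for ages of the form xx.5 that are at least 1 year."""
    return age is not None and age >= 1 and age % 1 == 0.5


# Hypothetical rows using the data dictionary's variable names.
passengers = [
    {"age": 28.5, "sibsp": 0, "parch": 0},  # estimated age, travelling alone
    {"age": 0.42, "sibsp": 1, "parch": 2},  # infant with family
    {"age": 4.0, "sibsp": 0, "parch": 0},   # e.g. a child travelling with a nanny
]

for p in passengers:
    p["family_size"] = family_size(p["sibsp"], p["parch"])
    p["age_estimated"] = age_is_estimated(p["age"])
    print(p)
```

The third row shows the caveat from the notes: a family size of 1 does not always mean the passenger was an adult travelling alone.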
Titanic Data Simple EDA with Logistic Regression
kaggle.com
Updated Aug 12, 2020
Cite
Vicky Nayak (2020). Titanic Data Simple EDA with Logistic Regression [Dataset]. https://www.kaggle.com/vickynayak9/titanic-dataset/discussion
Explore at:
Croissant
Dataset updated
Aug 12, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vicky Nayak
Description
Dataset

This dataset was created by Vicky Nayak

