Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset containing information about passengers aboard the Titanic is one of the most famous datasets used in data science and machine learning. It was created to analyze and understand the factors that influenced survival rates among passengers during the tragic sinking of the RMS Titanic on April 15, 1912.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F19517213%2Fd4016c159f1ad17cb30d8905192fe9d7%2Ftitanic-ship_1027017-11.avif?generation=1711562371875068&alt=media" alt="">
The dataset is often used for predictive modeling and statistical analysis to determine which factors (such as socio-economic status, age, gender, etc.) were associated with a higher likelihood of survival. It contains 1309 rows and 14 columns.
Pclass: Ticket class indicating the socio-economic status of the passenger. It is categorized into three classes: 1 = Upper, 2 = Middle, 3 = Lower.
Survived: A binary indicator that shows whether the passenger survived (1) or not (0) during the Titanic disaster. This is the target variable for analysis.
Name: The full name of the passenger, including title (e.g., Mr., Mrs., etc.).
Sex: The gender of the passenger, denoted as either male or female.
Age: The age of the passenger in years.
SibSp: The number of siblings or spouses aboard the Titanic for the respective passenger.
Parch: The number of parents or children aboard the Titanic for the respective passenger.
Ticket: The ticket number assigned to the passenger.
Fare: The fare paid by the passenger for the ticket.
Cabin: The cabin number assigned to the passenger, if available.
Embarked: The port of embarkation for the passenger. It can take one of three values: C = Cherbourg, Q = Queenstown, S = Southampton.
Boat: If the passenger survived, this column contains the identifier of the lifeboat they were rescued in.
Body: If the passenger did not survive, this column contains the identification number of their recovered body, if applicable.
Home.dest: The destination or place of residence of the passenger.
These descriptions provide a detailed understanding of each column in the Titanic dataset subset, offering insights into the demographic, travel, and survival-related information recorded for each passenger.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F11299784%2F6530245ff6b6d097af8cb56c86b79943%2Fpxfuel.jpg?generation=1682007437079315&alt=media" alt="">The Titanic dataset is a widely used dataset that contains information on the passengers who were aboard the Titanic when it sank on its maiden voyage in 1912. The dataset includes features such as age, sex, passenger class, and fare paid, as well as whether or not the passenger survived the sinking. The dataset is often used for machine learning and data analysis tasks, such as predicting survival based on passenger characteristics or exploring patterns in the data. The Titanic dataset is a classic example of data analysis and is a great starting point for those new to data science.
The Titanic dataset is available in CSV format and contains two files, one for training and one for testing. The training file is used to build the machine learning model, while the testing file is used to test the performance of the model.
PassengerId: unique identifier for each passenger Survived: whether the passenger survived (1) or not (0) Pclass: passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class) Name: name of the passenger Sex: gender of the passenger Age: age of the passenger (in years) SibSp: number of siblings or spouses aboard the Titanic Parch: number of parents or children aboard the Titanic Ticket: ticket number Fare: passenger fare Cabin: cabin number Embarked: port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
Copyright (c) [2023] [Md Kazi Sajiduddin]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BrianSuToronto/titanic-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Titanic Dataset (for Machine Learning)The Titanic dataset is a classic and widely used dataset for machine learning and data analysis. It contains information about the passengers of the RMS Titanic, which tragically sank on its maiden voyage on April 15, 1912. The dataset provides details about each passenger, including their demographics, ticket information, and survival status. This dataset is often used to demonstrate and practice various machine learning techniques, particularly classification.This dataset is divided into two: training set & testing set.Dataset Variables:PassengerId: count for each passengerSurvived: 0 = No; 1 = YesName: name of passengerSex: passenger's sexAge: passenger's ageSibSp: number of siblings/spouses abroad the TitanicParch: number of parents/children abroad the TitanicTicket: ticket numberFare: passenger fareCabin: cabin numberEmbarked: port where passenger embarked (C = Cherbourg; Q = Queenstown; S = Southampton)
This dataset was created by Pushpraj Namdev
This dataset was created by Ryan Selesnik
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Version of the titanic dataset used in ggEDA manuscript.Can be loaded from the datarium R package (datarium::titanic.raw
). Originally published by the British Board of Trade in 1990. If you use, please cite:British Board of Trade. Report on the Loss of the āTitanicā (S.S.). Allan Sutton Publishing, Gloucester, UK, 1990. British Board of Trade Inquiry Report (reprint).Alboukadel Kassambara. datarium: Data Bank for Statistical Analysis and Visualization, 2019. URL https://CRAN.R-project.org/package=datarium. R package version 0.1.0.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of āTitanic-Dataset (train.csv)ā provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hesh97/titanicdataset-traincsv on 12 November 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
https://choosealicense.com/licenses/other/https://choosealicense.com/licenses/other/
nik20004/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
The Titanic Dataset for GDSC - AI Model
This dataset contains information about the passengers and crew members who were on board the RMS Titanic, a British passenger liner that sank in the North Atlantic Ocean in the early hours of April 15, 1912, after striking an iceberg during her maiden voyage from Southampton to New York City. The sinking of the Titanic resulted in a large loss of life and remains one of the deadliest commercial peacetime maritime disasters in modern history.
The dataset includes a variety of features about the passengers and crew members, such as:
Passenger Class: Indicates the class (1st, 2nd, 3rd) that the passenger traveled in. Name: The passenger's name. Sex: The passenger's sex. Age: The passenger's age. SibSp: The number of siblings or spouses aboard the Titanic with the passenger. Parch: The number of parents or children aboard the Titanic with the passenger. Ticket: The passenger's ticket number. Fare: The passenger's fare. Cabin: The passenger's cabin number. Embarked: The port where the passenger embarked the Titanic. Survived: Whether the passenger survived the sinking of the Titanic (1 = survived, 0 = did not survive). What You Can Do With The Dataset
The Titanic Dataset is a valuable resource for anyone interested in machine learning, data science, or the history of the Titanic. Here are some examples of what you can do with this dataset:
Predict passenger survival: You can use the dataset to train a machine learning model to predict whether a passenger was more likely to survive the sinking of the Titanic based on features such as their class, sex, age, and number of relatives on board. Analyze factors that influenced survival rates: You can use the dataset to analyze the factors that influenced passenger survival rates. For example, you could look at how factors such as class, sex, and age affected a passenger's chances of survival. Build a classification model to identify passengers who were more likely to survive: You can use the dataset to build a classification model that can identify passengers who were more likely to survive the sinking of the Titanic. This model could be used to help us understand the factors that influenced survival rates and could also be used to improve the safety of passengers in future maritime disasters. Overall, the Titanic Dataset is a rich and informative dataset that can be used for a variety of purposes. If you are interested in machine learning, data science, or the history of the Titanic, then this dataset is a great resource to explore.
https://choosealicense.com/licenses/cc/https://choosealicense.com/licenses/cc/
Javitron4257/Titanic-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Titanic dataset for classification training.
https://www.encyclopedia-titanica.org/copyright-and-permissions.htmlhttps://www.encyclopedia-titanica.org/copyright-and-permissions.html
Who travelled on the Titanic? When she reached the open Atlantic on 11 April 1912, the Titanic carried 2,208 people however many more travelled on her: on the delivery trip from Belfast to Southampton, and on the short journeys to Cherbourg and Queenstown. This dataset includes everyone that travelled on the maiden voyage but also the delivery and passengers who were fortunate enough to disembark.
**Dataset Overview ** The Titanic dataset is a widely used benchmark dataset for machine learning and data science tasks. It contains information about passengers who boarded the RMS Titanic in 1912, including their age, sex, social class, and whether they survived the sinking of the ship. The dataset is divided into two main parts:
Train.csv: This file contains information about 891 passengers who were used to train machine learning models. It includes the following features:
PassengerId: A unique identifier for each passenger Survived: Whether the passenger survived (1) or not (0) Pclass: The passenger's social class (1 = Upper, 2 = Middle, 3 = Lower) Name: The passenger's name Sex: The passenger's sex (Male or Female) Age: The passenger's age Sibsp: The number of siblings or spouses aboard the ship Parch: The number of parents or children aboard the ship Ticket: The passenger's ticket number Fare: The passenger's fare Cabin: The passenger's cabin number Embarked: The port where the passenger embarked (C = Cherbourg, Q = Queenstown, S = Southampton) Test.csv: This file contains information about 418 passengers who were not used to train machine learning models. It includes the same features as train.csv, but does not include the Survived label. The goal of machine learning models is to predict whether or not each passenger in the test.csv file survived.
**Data Preparation ** Before using the Titanic dataset for machine learning tasks, it is important to perform some data preparation steps. These steps may include:
Handling missing values: Some of the features in the dataset have missing values. These values can be imputed or removed, depending on the specific task. Encoding categorical variables: Some of the features in the dataset are categorical variables, such as Pclass, Sex, and Embarked. These variables need to be encoded numerically before they can be used by machine learning algorithms. Scaling numerical variables: Some of the features in the dataset are numerical variables, such as Age and Fare. These variables may need to be scaled to ensure that they are on the same scale. Data Visualization
Data visualization can be a useful tool for exploring the Titanic dataset and gaining insights into the data. Some common data visualization techniques that can be used with the Titanic dataset include:
Histograms: Histograms can be used to visualize the distribution of numerical variables, such as Age and Fare. Scatter plots: Scatter plots can be used to visualize the relationship between two numerical variables. Box plots: Box plots can be used to visualize the distribution of a numerical variable across different categories, such as Pclass and Sex. Machine Learning Tasks
The Titanic dataset can be used for a variety of machine learning tasks, including:
Classification: The most common task is to use the train.csv file to train a machine learning model to predict whether or not each passenger in the test.csv file survived. Regression: The dataset can also be used to train a machine learning model to predict the fare of a passenger based on their other features. Anomaly detection: The dataset can also be used to identify anomalies, such as passengers who are outliers in terms of their age, social class, or other features.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of āTitanic csvā provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/fossouodonald/titaniccsv on 13 November 2021.
--- Dataset description provided by original source is as follows ---
this dataset is the result of titanic csv
--- Original source retains full ownership of the source dataset ---
https://www.encyclopedia-titanica.org/copyright-and-permissions.htmlhttps://www.encyclopedia-titanica.org/copyright-and-permissions.html
The complete list of RMS Titanic passengers and crew, including detailed records of survivors and victims.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
titanic dataset
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of āTitanic Solution for Beginner's Guideā provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide on 30 September 2021.
--- Dataset description provided by original source is as follows ---
The data has been split into two groups:
training set (train.csv)
test set (test.csv)
The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the āground truthā) for each passenger. Your model will be based on āfeaturesā like passengersā gender and class. You can also use feature engineering to create new features.
The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.
We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
Variable Definition Key
survival Survival 0 = No, 1 = Yes
pclass Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
sex Sex
Age Age in years
sibsp # of siblings / spouses aboard the Titanic
parch # of parents / children aboard the Titanic
ticket Ticket number
fare Passenger fare
cabin Cabin number
embarked Port of Embarkation C = Cherbourg, Q = Queenstown, S = Southampton
pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, is it in the form of xx.5
sibsp: The dataset defines family relations in this way... Sibling = brother, sister, stepbrother, stepsister Spouse = husband, wife (mistresses and fiancƩs were ignored)
parch: The dataset defines family relations in this way... Parent = mother, father Child = daughter, son, stepdaughter, stepson Some children travelled only with a nanny, therefore parch=0 for them.
--- Original source retains full ownership of the source dataset ---
A partial passenger manifest for the fateful last trip of the Titanic
This dataset was created by fehu.zone
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset containing information about passengers aboard the Titanic is one of the most famous datasets used in data science and machine learning. It was created to analyze and understand the factors that influenced survival rates among passengers during the tragic sinking of the RMS Titanic on April 15, 1912.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F19517213%2Fd4016c159f1ad17cb30d8905192fe9d7%2Ftitanic-ship_1027017-11.avif?generation=1711562371875068&alt=media" alt="">
The dataset is often used for predictive modeling and statistical analysis to determine which factors (such as socio-economic status, age, gender, etc.) were associated with a higher likelihood of survival. It contains 1309 rows and 14 columns.
Pclass: Ticket class indicating the socio-economic status of the passenger. It is categorized into three classes: 1 = Upper, 2 = Middle, 3 = Lower.
Survived: A binary indicator that shows whether the passenger survived (1) or not (0) during the Titanic disaster. This is the target variable for analysis.
Name: The full name of the passenger, including title (e.g., Mr., Mrs., etc.).
Sex: The gender of the passenger, denoted as either male or female.
Age: The age of the passenger in years.
SibSp: The number of siblings or spouses aboard the Titanic for the respective passenger.
Parch: The number of parents or children aboard the Titanic for the respective passenger.
Ticket: The ticket number assigned to the passenger.
Fare: The fare paid by the passenger for the ticket.
Cabin: The cabin number assigned to the passenger, if available.
Embarked: The port of embarkation for the passenger. It can take one of three values: C = Cherbourg, Q = Queenstown, S = Southampton.
Boat: If the passenger survived, this column contains the identifier of the lifeboat they were rescued in.
Body: If the passenger did not survive, this column contains the identification number of their recovered body, if applicable.
Home.dest: The destination or place of residence of the passenger.
These descriptions provide a detailed understanding of each column in the Titanic dataset subset, offering insights into the demographic, travel, and survival-related information recorded for each passenger.