Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset containing information about passengers aboard the Titanic is one of the most famous datasets used in data science and machine learning. It was created to analyze and understand the factors that influenced survival rates among passengers during the tragic sinking of the RMS Titanic on April 15, 1912.
The dataset is often used for predictive modeling and statistical analysis to determine which factors (such as socio-economic status, age, gender, etc.) were associated with a higher likelihood of survival. It contains 1309 rows and 14 columns.
Pclass: Ticket class indicating the socio-economic status of the passenger. It is categorized into three classes: 1 = Upper, 2 = Middle, 3 = Lower.
Survived: A binary indicator that shows whether the passenger survived (1) or not (0) during the Titanic disaster. This is the target variable for analysis.
Name: The full name of the passenger, including title (e.g., Mr., Mrs., etc.).
Sex: The gender of the passenger, denoted as either male or female.
Age: The age of the passenger in years.
SibSp: The number of siblings or spouses aboard the Titanic for the respective passenger.
Parch: The number of parents or children aboard the Titanic for the respective passenger.
Ticket: The ticket number assigned to the passenger.
Fare: The fare paid by the passenger for the ticket.
Cabin: The cabin number assigned to the passenger, if available.
Embarked: The port of embarkation for the passenger. It can take one of three values: C = Cherbourg, Q = Queenstown, S = Southampton.
Boat: If the passenger survived, this column contains the identifier of the lifeboat they were rescued in.
Body: If the passenger did not survive, this column contains the identification number of their recovered body, if applicable.
Home.dest: The destination or place of residence of the passenger.
These descriptions provide a detailed understanding of each column in the Titanic dataset subset, offering insights into the demographic, travel, and survival-related information recorded for each passenger.
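As a minimal sketch of how the 14 columns described above might be read and checked for gaps, the following uses only the Python standard library. The three sample rows are invented for illustration and are not real passenger records; the header spelling follows the column names listed above, and the actual file may spell them differently.

```python
import csv
import io

# Invented rows for illustration only; they are NOT real passenger records.
# Header names follow the column list above (an assumption about the real file).
SAMPLE_CSV = """Pclass,Survived,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked,Boat,Body,Home.dest
1,1,"Doe, Miss. Jane",female,29,0,0,11111,211.5,B5,S,2,,"Springfield, XX"
3,0,"Roe, Mr. Richard",male,,1,0,22222,7.25,,S,,135,
2,1,"Poe, Master. Sam",male,0.75,1,2,33333,19.5,,C,10,,
"""

# Columns cast to numbers; everything else stays a string.
NUMERIC = {"Pclass": int, "Survived": int, "Age": float,
           "SibSp": int, "Parch": int, "Fare": float}


def parse_row(raw):
    """Convert one CSV row: empty strings become None, numeric columns are cast."""
    row = {}
    for key, value in raw.items():
        if value == "":
            row[key] = None
        elif key in NUMERIC:
            row[key] = NUMERIC[key](value)
        else:
            row[key] = value
    return row


def load(text):
    """Parse CSV text into a list of typed dictionaries."""
    return [parse_row(r) for r in csv.DictReader(io.StringIO(text))]


def missing_counts(rows):
    """Count None values per column."""
    counts = {key: 0 for key in rows[0]}
    for row in rows:
        for key, value in row.items():
            if value is None:
                counts[key] += 1
    return counts


rows = load(SAMPLE_CSV)
print(len(rows), "rows,", len(rows[0]), "columns")
print(missing_counts(rows))
```

Columns such as Cabin, Boat, and Body are empty for many passengers by design, so a per-column count like this is a reasonable first look before any modeling.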
The Titanic dataset is a widely used dataset that contains information on the passengers who were aboard the Titanic when it sank on its maiden voyage in 1912. The dataset includes features such as age, sex, passenger class, and fare paid, as well as whether or not the passenger survived the sinking. The dataset is often used for machine learning and data analysis tasks, such as predicting survival based on passenger characteristics or exploring patterns in the data. The Titanic dataset is a classic example of data analysis and is a great starting point for those new to data science.
The Titanic dataset is available in CSV format and contains two files, one for training and one for testing. The training file is used to build the machine learning model, while the testing file is used to test the performance of the model.
PassengerId: unique identifier for each passenger
Survived: whether the passenger survived (1) or not (0)
Pclass: passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class)
Name: name of the passenger
Sex: gender of the passenger
Age: age of the passenger (in years)
SibSp: number of siblings or spouses aboard the Titanic
Parch: number of parents or children aboard the Titanic
Ticket: ticket number
Fare: passenger fare
Cabin: cabin number
Embarked: port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
Copyright (c) [2023] [Md Kazi Sajiduddin]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• Based on passenger information from the Titanic, which sank in 1912, the Titanic Dataset is a representative binary classification dataset. It includes demographic and boarding information such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.
2) Data Utilization
(1) Characteristics of the Titanic Dataset:
• It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical). Variables such as Age, Cabin, and Embarked have some missing values, which makes the dataset suitable for practicing preprocessing and feature engineering.
(2) Uses of the Titanic Dataset:
• Development of survival prediction models: Key characteristics such as passenger class, gender, age, and fare can be used to predict survival with machine learning classification models such as logistic regression, random forest, and SVM.
• Analysis of survival influencing factors: By analyzing the relationship between survival rates and variables such as gender, age, and socioeconomic status, you can statistically and visually explore which groups had a higher survival probability.
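The analysis of survival influencing factors described above can be sketched as a simple per-group survival rate. This is only an illustration using the standard library: the records below are invented, not taken from the dataset, and the helper name `survival_rate_by` is hypothetical.

```python
from collections import defaultdict

# Invented records for illustration only; NOT real passenger data.
passengers = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 0},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]


def survival_rate_by(rows, key):
    """Return {group value: share of rows in that group with Survived == 1}."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survivors[row[key]] += row["Survived"]
    return {group: survivors[group] / totals[group] for group in totals}


rates_by_sex = survival_rate_by(passengers, "Sex")
rates_by_class = survival_rate_by(passengers, "Pclass")
print(rates_by_sex)
print(rates_by_class)
```

Group rates like these describe association only; they are a starting point for the statistical and visual exploration mentioned above, not a model.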
Tomate/Kaggle-Titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BrianSuToronto/titanic-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by Pushpraj Namdev
https://choosealicense.com/licenses/cc/
Javitron4257/Titanic-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/cc0-1.0/
A cleaned, feature-engineered, and split version of the classic Titanic survival dataset from OpenML. See https://github.com/jamieoliver/titanic-dataset-2410 for more details. Based on https://github.com/fastai/course22/blob/master/clean/05-linear-model-and-neural-net-from-scratch.ipynb, using the dataset from https://www.openml.org/search?type=data&sort=runs&id=40945&status=active.
This dataset was created by Pavithra Naidu
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic-Dataset (train.csv)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hesh97/titanicdataset-traincsv on 12 November 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
This dataset is a copy of the Titanic train.csv dataset used in the Kaggle Titanic competition. I removed all of the missing values from the Age and Embarked columns so that the dataset could be used by high school students that I teach. The following description is taken from the Kaggle competition dataset which can be found here.
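The cleaning step described above (removing rows with missing Age or Embarked values) might look roughly like the sketch below. It assumes rows are already loaded as dictionaries and that missing values appear as `None` or an empty string; the helper name `drop_missing` and the sample rows are hypothetical, and this is not the author's actual procedure.

```python
def drop_missing(rows, required=("Age", "Embarked")):
    """Keep only rows where every required column is present and non-empty."""
    return [
        row for row in rows
        if all(row.get(col) not in (None, "") for col in required)
    ]


# Invented rows for illustration only; NOT real passenger data.
rows = [
    {"PassengerId": 1, "Age": 22.0, "Embarked": "S"},
    {"PassengerId": 2, "Age": None, "Embarked": "C"},   # missing Age
    {"PassengerId": 3, "Age": 35.0, "Embarked": ""},    # missing Embarked
    {"PassengerId": 4, "Age": 4.0, "Embarked": "Q"},
]

cleaned = drop_missing(rows)
print([row["PassengerId"] for row in cleaned])
```

Dropping rows is the simplest option for teaching purposes; it shrinks the dataset, so imputing values is a common alternative when every row matters.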
https://choosealicense.com/licenses/cc/
Titanic Survival
from https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/problem12.html
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic Solution for Beginner's Guide’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide on 30 September 2021.
--- Dataset description provided by original source is as follows ---
The data has been split into two groups:
training set (train.csv)
test set (test.csv)
The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.
The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.
We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
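The rule behind gender_submission.csv (all and only female passengers survive) can be sketched as follows. The two-column `PassengerId`, `Survived` layout is an assumption about the submission format, and the test rows are invented for illustration.

```python
import csv
import io


def gender_baseline(test_rows):
    """Predict 1 (survived) for every female passenger and 0 for everyone else."""
    return [
        {"PassengerId": row["PassengerId"],
         "Survived": 1 if row["Sex"] == "female" else 0}
        for row in test_rows
    ]


def to_submission_csv(predictions):
    """Serialize predictions as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["PassengerId", "Survived"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(predictions)
    return buffer.getvalue()


# Invented rows for illustration only; NOT real test-set records.
test_rows = [
    {"PassengerId": 1001, "Sex": "male"},
    {"PassengerId": 1002, "Sex": "female"},
]

submission = to_submission_csv(gender_baseline(test_rows))
print(submission)
```

A baseline like this is useful mainly as a yardstick: a trained model should at least match it on unseen data.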
Variable | Definition | Key
survival | Survival | 0 = No, 1 = Yes
pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd
sex | Sex |
age | Age in years |
sibsp | # of siblings / spouses aboard the Titanic |
parch | # of parents / children aboard the Titanic |
ticket | Ticket number |
fare | Passenger fare |
cabin | Cabin number |
embarked | Port of Embarkation | C = Cherbourg, Q = Queenstown, S = Southampton
pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5
sibsp: The dataset defines family relations in this way... Sibling = brother, sister, stepbrother, stepsister Spouse = husband, wife (mistresses and fiancés were ignored)
parch: The dataset defines family relations in this way... Parent = mother, father Child = daughter, son, stepdaughter, stepson Some children travelled only with a nanny, therefore parch=0 for them.
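The notes on age, sibsp, and parch above can be turned into small helper functions. This is a sketch under stated assumptions: the function names and labels are hypothetical, and an age below 1 that also ends in .5 is ambiguous under these rules, so it is treated here as a fractional infant age.

```python
def describe_age(age):
    """Classify an age value using the conventions in the variable notes."""
    if age is None:
        return "missing"
    if age < 1:
        # Ages under 1 are recorded as fractions of a year.
        return "infant (fractional age)"
    if age % 1 == 0.5:
        # Estimated ages are recorded in the form xx.5.
        return "estimated"
    return "recorded"


def relatives_aboard(sibsp, parch):
    """Total siblings, spouses, parents, and children aboard.

    Children who travelled only with a nanny have parch = 0, so this
    count can be 0 even when the passenger was not travelling alone.
    """
    return sibsp + parch


for value in (0.75, 28.5, 30.0, None):
    print(value, "->", describe_age(value))
print(relatives_aboard(1, 2))
```

Derived values like these are one example of the feature engineering the description mentions; whether they help a given model is something to check empirically.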
--- Original source retains full ownership of the source dataset ---
The Titanic Dataset for GDSC - AI Model
This dataset contains information about the passengers and crew members who were on board the RMS Titanic, a British passenger liner that sank in the North Atlantic Ocean in the early hours of April 15, 1912, after striking an iceberg during her maiden voyage from Southampton to New York City. The sinking of the Titanic resulted in a large loss of life and remains one of the deadliest commercial peacetime maritime disasters in modern history.
The dataset includes a variety of features about the passengers and crew members, such as:
Passenger Class: Indicates the class (1st, 2nd, 3rd) that the passenger traveled in.
Name: The passenger's name.
Sex: The passenger's sex.
Age: The passenger's age.
SibSp: The number of siblings or spouses aboard the Titanic with the passenger.
Parch: The number of parents or children aboard the Titanic with the passenger.
Ticket: The passenger's ticket number.
Fare: The passenger's fare.
Cabin: The passenger's cabin number.
Embarked: The port where the passenger embarked the Titanic.
Survived: Whether the passenger survived the sinking of the Titanic (1 = survived, 0 = did not survive).
What You Can Do With The Dataset
The Titanic Dataset is a valuable resource for anyone interested in machine learning, data science, or the history of the Titanic. Here are some examples of what you can do with this dataset:
Predict passenger survival: You can use the dataset to train a machine learning model to predict whether a passenger was more likely to survive the sinking of the Titanic based on features such as their class, sex, age, and number of relatives on board.
Analyze factors that influenced survival rates: You can use the dataset to analyze the factors that influenced passenger survival rates. For example, you could look at how factors such as class, sex, and age affected a passenger's chances of survival.
Build a classification model to identify passengers who were more likely to survive: You can use the dataset to build a classification model that can identify passengers who were more likely to survive the sinking of the Titanic. This model could be used to help us understand the factors that influenced survival rates and could also be used to improve the safety of passengers in future maritime disasters.
Overall, the Titanic Dataset is a rich and informative dataset that can be used for a variety of purposes. If you are interested in machine learning, data science, or the history of the Titanic, then this dataset is a great resource to explore.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic.csv’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/garrettrlynch/titaniccsv on 30 September 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Muhammad Hassaan
Released under CC0: Public Domain
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Titanic dataset for classification training.
This dataset was created by Ankita Nain
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 2 rows and is filtered to the book The Titanic & the mystery ship. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.
This dataset was created by Stephen Andrew Lynch