https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Syed Hamza Ali
Released under CC0: Public Domain
This dataset was created by Mohammed taha Khamed
Dataset describing the survival status of individual passengers on the Titanic. Missing values in the original dataset are represented using ?. Float and int missing values are replaced with -1, string missing values are replaced with 'Unknown'.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('titanic', split='train')
for ex in ds.take(4):
print(ex)
See the guide for more informations on tensorflow_datasets.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F11299784%2F6530245ff6b6d097af8cb56c86b79943%2Fpxfuel.jpg?generation=1682007437079315&alt=media" alt="">The Titanic dataset is a widely used dataset that contains information on the passengers who were aboard the Titanic when it sank on its maiden voyage in 1912. The dataset includes features such as age, sex, passenger class, and fare paid, as well as whether or not the passenger survived the sinking. The dataset is often used for machine learning and data analysis tasks, such as predicting survival based on passenger characteristics or exploring patterns in the data. The Titanic dataset is a classic example of data analysis and is a great starting point for those new to data science.
The Titanic dataset is available in CSV format and contains two files, one for training and one for testing. The training file is used to build the machine learning model, while the testing file is used to test the performance of the model.
PassengerId: unique identifier for each passenger Survived: whether the passenger survived (1) or not (0) Pclass: passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class) Name: name of the passenger Sex: gender of the passenger Age: age of the passenger (in years) SibSp: number of siblings or spouses aboard the Titanic Parch: number of parents or children aboard the Titanic Ticket: ticket number Fare: passenger fare Cabin: cabin number Embarked: port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
Copyright (c) [2023] [Md Kazi Sajiduddin]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Titanic Dataset Description Overview The data is divided into two groups: - Training set (train.csv): Used to build machine learning models. It includes the outcome (also called the "ground truth") for each passenger, allowing models to predict survival based on “features” like gender and class. Feature engineering can also be applied to create new features. - Test set (test.csv): Used to evaluate model performance on unseen data. The ground truth is not provided; the task is to predict survival for each passenger in the test set using the trained model.
Additionally, gender_submission.csv is provided as an example submission file, containing predictions based on the assumption that all and only female passengers survive.
Data Dictionary | Variable | Definition | Key | |------------|------------------------------------------|-------------------------------------------------| | survival | Survival | 0 = No, 1 = Yes | | pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd | | sex | Sex | | | age | Age in years | | | sibsp | # of siblings/spouses aboard the Titanic | | | parch | # of parents/children aboard the Titanic | | | ticket | Ticket number | | | fare | Passenger fare | | | cabin | Cabin number | | | embarked | Port of Embarkation | C = Cherbourg, Q = Queenstown, S = Southampton |
Variable Notes
pclass: Proxy for socio-economic status (SES):
1st = Upper
2nd = Middle
3rd = Lower
age:
Fractional if less than 1 year.
Estimated ages are represented in the form xx.5.
sibsp: Defines family relations as:
Sibling: Brother, sister, stepbrother, stepsister.
Spouse: Husband, wife (excluding mistresses and fiancés).
parch: Defines family relations as:
Parent: Mother, father.
Child: Daughter, son, stepdaughter, stepson.
Some children traveled only with a nanny, so parch = 0 for them.
https://choosealicense.com/licenses/afl-3.0/https://choosealicense.com/licenses/afl-3.0/
victor/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BIT/titanic-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset typically includes the following columns:
PassengerId: A unique identifier for each passenger. Survived: This column indicates whether a passenger survived (1) or did not survive (0). Pclass (Ticket class): A proxy for socio-economic status, with 1 being the highest class and 3 the lowest. Name: The name of the passenger. Sex: The gender of the passenger. Age: The age of the passenger. (Note: There might be missing values in this column.) SibSp: The number of siblings or spouses the passenger had aboard the Titanic. Parch: The number of parents or children the passenger had aboard the Titanic. Ticket: The ticket number. Fare: The amount of money the passenger paid for the ticket.
The main goal of using this dataset is to predict whether a passenger survived or not based on various features. It serves as a popular introductory dataset for those learning data analysis, machine learning, and predictive modeling. Keep in mind that the dataset may be subject to variations and updates, so it's always a good idea to check the Kaggle website or dataset documentation for the most recent information.
https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
A cleaned, feature engineered, and split version of the classic Titanic survival dataset from OpenML. See https://github.com/jamieoliver/titanic-dataset-2410 for more details. Based on https://github.com/fastai/course22/blob/master/clean/05-linear-model-and-neural-net-from-scratch.ipynb using the dataset from using the dataset from https://www.openml.org/search?type=data&sort=runs&id=40945&status=active.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic-Dataset (train.csv)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hesh97/titanicdataset-traincsv on 28 January 2022.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dukuru/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Titanic dataset for classification training.
Sreya27/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
Dataset Card for "titanic"
More Information needed
This dataset was created by ossama outmani
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic Dataset Analysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/cities/titanic123 on 28 January 2022.
--- Dataset description provided by original source is as follows ---
There's a story behind every dataset and here's your opportunity to share yours.
What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.
We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.
Your data will be in front of the world's largest data science community. What questions do you want to see answered?
--- Original source retains full ownership of the source dataset ---
https://choosealicense.com/licenses/cc/https://choosealicense.com/licenses/cc/
jcrabasco04/jc-titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/unknown/https://choosealicense.com/licenses/unknown/
kkovacs/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by dj thuva
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Sandra Chelimo
Released under MIT
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Syed Hamza Ali
Released under CC0: Public Domain