Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
The dataset containing information about passengers aboard the Titanic is one of the most famous datasets used in data science and machine learning. It was created to analyze and understand the factors that influenced survival rates among passengers during the tragic sinking of the RMS Titanic on April 15, 1912.
The dataset is often used for predictive modeling and statistical analysis to determine which factors (such as socio-economic status, age, gender, etc.) were associated with a higher likelihood of survival. It contains 1309 rows and 14 columns.
Pclass: Ticket class indicating the socio-economic status of the passenger. It is categorized into three classes: 1 = Upper, 2 = Middle, 3 = Lower.
Survived: A binary indicator that shows whether the passenger survived (1) or not (0) during the Titanic disaster. This is the target variable for analysis.
Name: The full name of the passenger, including title (e.g., Mr., Mrs., etc.).
Sex: The gender of the passenger, denoted as either male or female.
Age: The age of the passenger in years.
SibSp: The number of siblings or spouses aboard the Titanic for the respective passenger.
Parch: The number of parents or children aboard the Titanic for the respective passenger.
Ticket: The ticket number assigned to the passenger.
Fare: The fare paid by the passenger for the ticket.
Cabin: The cabin number assigned to the passenger, if available.
Embarked: The port of embarkation for the passenger. It can take one of three values: C = Cherbourg, Q = Queenstown, S = Southampton.
Boat: If the passenger survived, this column contains the identifier of the lifeboat they were rescued in.
Body: If the passenger did not survive, this column contains the identification number of their recovered body, if applicable.
Home.dest: The destination or place of residence of the passenger.
These descriptions provide a detailed understanding of each column in the Titanic dataset subset, offering insights into the demographic, travel, and survival-related information recorded for each passenger.
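As a minimal sketch of the kind of analysis these columns support, the snippet below computes survival rates per group using only the Python standard library. The rows in `SAMPLE_CSV` are synthetic placeholders, not real passenger records, and the helper name `survival_rate_by` is a hypothetical choice for illustration; only the column names come from the descriptions above.

```python
import csv
import io
from collections import defaultdict

# Synthetic placeholder rows (NOT real passenger records); only the
# column names follow the dataset description.
SAMPLE_CSV = """Pclass,Survived,Sex,Age
1,1,female,29
1,0,male,40
2,1,female,24
3,0,male,22
3,0,male,
3,1,female,18
"""


def survival_rate_by(rows, key):
    """Return {group value: fraction of rows in that group with Survived == 1}."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        survivors[row[key]] += int(row["Survived"])
    return {group: survivors[group] / totals[group] for group in totals}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
rates_by_sex = survival_rate_by(rows, "Sex")
rates_by_class = survival_rate_by(rows, "Pclass")
print(rates_by_sex)
print(rates_by_class)
```

With a real copy of the file, the same function could be applied to `csv.DictReader(open(...))` in place of the in-memory sample; pandas `groupby` would be the more usual tool for larger analyses.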
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Detail Description:
The Titanic dataset offers a comprehensive glimpse into the passengers aboard the ill-fated RMS Titanic, which famously sank on its maiden voyage in April 1912 after colliding with an iceberg. This dataset contains a wealth of information about individual passengers, including demographics, ticket class, cabin information, family relationships, fare details, and most notably, survival outcomes.
Key attributes within the dataset include:
Passenger Class (Pclass): This categorical variable indicates the ticket class of each passenger, ranging from 1st class (wealthiest) to 3rd class (lower socioeconomic status).
Name: The names of passengers, providing insight into their identities.
Sex: Gender of passengers, categorized as male or female.
Age: Age of passengers, providing information about the demographic composition of the Titanic's passengers.
SibSp: Number of siblings/spouses aboard the Titanic, offering insight into family relationships.
Parch: Number of parents/children aboard the Titanic, indicating family size and composition.
Ticket: Ticket number, providing additional information about passenger accommodations and fare details.
Fare: Fare paid by each passenger, which can be indicative of their ticket class and economic status.
Cabin: Cabin number or location, offering insights into passenger accommodations.
Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton), providing information about passengers' embarkation points.
Survived: This binary variable indicates whether a passenger survived the disaster (1) or not (0), serving as the primary outcome variable for analyses.
Researchers and data analysts frequently use the Titanic dataset for a variety of purposes.
Overall, the Titanic dataset serves as a valuable resource for understanding historical events, exploring data analysis techniques, and teaching machine learning concepts. Its accessibility and rich contextual information make it a popular choice for both educational and research purposes within the data science community.
The Titanic dataset is a widely used dataset that contains information on the passengers who were aboard the Titanic when it sank on its maiden voyage in 1912. The dataset includes features such as age, sex, passenger class, and fare paid, as well as whether or not the passenger survived the sinking. The dataset is often used for machine learning and data analysis tasks, such as predicting survival based on passenger characteristics or exploring patterns in the data. The Titanic dataset is a classic example of data analysis and is a great starting point for those new to data science.
The Titanic dataset is available in CSV format and contains two files, one for training and one for testing. The training file is used to build the machine learning model, while the testing file is used to test the performance of the model.
PassengerId: unique identifier for each passenger
Survived: whether the passenger survived (1) or not (0)
Pclass: passenger class (1 = 1st class, 2 = 2nd class, 3 = 3rd class)
Name: name of the passenger
Sex: gender of the passenger
Age: age of the passenger (in years)
SibSp: number of siblings or spouses aboard the Titanic
Parch: number of parents or children aboard the Titanic
Ticket: ticket number
Fare: passenger fare
Cabin: cabin number
Embarked: port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
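The training/testing workflow described above can be sketched with a deliberately simple baseline: learn the majority outcome per group from the training rows, then measure accuracy on held-out rows. This is an illustration under stated assumptions, not the method of any particular notebook: the rows are synthetic placeholders, the held-out rows are assumed to carry `Survived` labels, and `fit_majority_by_group` and `accuracy` are hypothetical helper names.

```python
from collections import Counter, defaultdict

# Synthetic placeholder rows standing in for the training and testing files.
train_rows = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 1},
    {"Sex": "female", "Survived": 0},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
    {"Sex": "male", "Survived": 0},
]
test_rows = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "male", "Survived": 0},
    {"Sex": "male", "Survived": 1},
]


def fit_majority_by_group(rows, feature, target):
    """Map each feature value to the most common target value seen in training."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[feature]][row[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}


def accuracy(model, rows, feature, target):
    """Fraction of rows whose predicted target matches the recorded one."""
    correct = sum(model[row[feature]] == row[target] for row in rows)
    return correct / len(rows)


model = fit_majority_by_group(train_rows, "Sex", "Survived")
test_accuracy = accuracy(model, test_rows, "Sex", "Survived")
print(model, test_accuracy)
```

A baseline like this gives a reference point that a logistic regression or tree-based model should beat. Note that `accuracy` raises `KeyError` for a feature value never seen in training, which a fuller version would need to handle.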
Copyright (c) [2023] [Md Kazi Sajiduddin]
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
BrianSuToronto/titanic-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/other/
Dataset Card for Titanic Survival Prediction
Dataset Details
Dataset Description
This dataset is a copy of the original Kaggle Titanic dataset made to explore the Hugging Face Datasets feature. The Titanic Survival Prediction dataset is widely used in machine learning and statistics. It originates from the Titanic: Machine Learning from Disaster competition on Kaggle. The dataset consists of passenger details from the RMS Titanic disaster, including demographic… See the full description on the dataset page: https://huggingface.co/datasets/paulopontesm/titanic.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Titanic Dataset is based on passenger information from the Titanic, which sank in 1912. It is a representative binary classification dataset that includes demographic and boarding information such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.
2) Data Utilization
(1) Characteristics of the Titanic Dataset:
• It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical). Variables such as Age, Cabin, and Embarked contain some missing values, which makes the dataset suitable for practicing preprocessing and feature engineering.
(2) Uses of the Titanic Dataset:
• Development of survival prediction models: Key characteristics such as passenger class, gender, age, and fare can be used to predict survival with machine learning classification models such as logistic regression, random forest, and SVM.
• Analysis of factors influencing survival: By analyzing the correlation between survival and variables such as gender, age, and socioeconomic status, you can explore statistically and visually which groups had a higher survival probability.
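The preprocessing practice mentioned above often starts with filling in missing values. The sketch below shows one common approach, median imputation for Age and most-frequent-value imputation for Embarked, using only the standard library. The rows are synthetic placeholders, missing values are assumed to appear as empty strings (as they would after reading a CSV with the `csv` module), and `impute` is a hypothetical helper name.

```python
import statistics

# Synthetic placeholder rows; "" marks a missing value.
raw_rows = [
    {"Age": "22", "Embarked": "S"},
    {"Age": "", "Embarked": "C"},
    {"Age": "38", "Embarked": ""},
    {"Age": "26", "Embarked": "S"},
]


def impute(rows):
    """Fill missing Age with the median and missing Embarked with the mode."""
    known_ages = [float(r["Age"]) for r in rows if r["Age"] != ""]
    known_ports = [r["Embarked"] for r in rows if r["Embarked"] != ""]
    median_age = statistics.median(known_ages)
    common_port = statistics.mode(known_ports)
    return [
        {
            "Age": float(r["Age"]) if r["Age"] != "" else median_age,
            "Embarked": r["Embarked"] or common_port,
        }
        for r in rows
    ]


cleaned_rows = impute(raw_rows)
print(cleaned_rows)
```

Whether imputation is appropriate depends on the column: Cabin, for instance, is often treated as a "known / unknown" indicator instead, since a median or mode has no clear meaning there.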
Tomate/Kaggle-Titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
The Titanic dataset contains information about passengers of the Titanic ship, including demographic and survival data.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset provides information on passengers aboard the RMS Titanic, including features that can be used for predicting survival. It contains various attributes related to passengers such as age, sex, ticket fare, and passenger class, which are crucial for understanding patterns and building predictive models.
Content:
PassengerID: Unique identifier for each passenger.
Pclass: Passenger class (1st, 2nd, 3rd).
Name: Full name of the passenger.
Sex: Gender of the passenger.
Age: Age of the passenger.
SibSp: Number of siblings/spouses aboard.
Parch: Number of parents/children aboard.
Ticket: Ticket number.
Fare: Ticket fare.
Cabin: Cabin number.
Embarked: Port of embarkation (C = Cherbourg; Q = Queenstown; S = Southampton).
Survived: Survival status (0 = No; 1 = Yes).
Usage:
This dataset is ideal for practice in classification tasks, particularly for predicting binary outcomes such as survival status. It is commonly used for various machine learning challenges, including exploratory data analysis and feature engineering.
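Two feature-engineering steps commonly tried on these columns are pulling the honorific out of Name and combining SibSp and Parch into a family size. The sketch below assumes names follow a "Surname, Title. Given names" layout, which the description does not guarantee; the example names are made up, and `extract_title` and `family_size` are hypothetical helper names.

```python
import re

# Assumed layout: "Surname, Title. Given names". Captures the text between
# the first comma and the following period.
TITLE_PATTERN = re.compile(r",\s*([^.,]+)\.")


def extract_title(name):
    """Return the honorific found in the name, or None if the layout differs."""
    match = TITLE_PATTERN.search(name)
    return match.group(1).strip() if match else None


def family_size(sibsp, parch):
    """Siblings/spouses plus parents/children plus the passenger themselves."""
    return sibsp + parch + 1


# Made-up names for illustration, not real passenger records.
print(extract_title("Example, Mrs. Jane"))   # Mrs
print(extract_title("NoTitleHere"))          # None
print(family_size(1, 2))                     # 4
```

Rare titles are often grouped into a single "Other" category before modeling, since a category with only a handful of rows gives a classifier little to learn from.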
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Titanic Dataset (for Machine Learning)
The Titanic dataset is a classic and widely used dataset for machine learning and data analysis. It contains information about the passengers of the RMS Titanic, which tragically sank on its maiden voyage on April 15, 1912. The dataset provides details about each passenger, including their demographics, ticket information, and survival status. This dataset is often used to demonstrate and practice various machine learning techniques, particularly classification.
This dataset is divided into two parts: a training set and a testing set.
Dataset Variables:
PassengerId: running count for each passenger
Survived: 0 = No; 1 = Yes
Name: name of passenger
Sex: passenger's sex
Age: passenger's age
SibSp: number of siblings/spouses aboard the Titanic
Parch: number of parents/children aboard the Titanic
Ticket: ticket number
Fare: passenger fare
Cabin: cabin number
Embarked: port where passenger embarked (C = Cherbourg; Q = Queenstown; S = Southampton)
https://choosealicense.com/licenses/cc/
Titanic Survival
from https://web.stanford.edu/class/archive/cs/cs109/cs109.1166/problem12.html
https://choosealicense.com/licenses/cc/
Javitron4257/Titanic-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
https://choosealicense.com/licenses/afl-3.0/
victor/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
https://creativecommons.org/publicdomain/zero/1.0/
🛳️ Titanic Dataset (JSON Format)
📌 Overview
This is the classic Titanic: Machine Learning from Disaster dataset, converted into JSON format for easier use in APIs, data pipelines, and Python projects. It contains the same passenger details as the original CSV version, but stored as JSON for convenience.
📂 Dataset Contents
File: titanic.json
Columns: PassengerId, Survived, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked
Use Cases: Exploratory Data Analysis (EDA), feature engineering, machine learning model training, web app backends, JSON parsing practice.
🛠️ How to Use
🔹 1. Load with kagglehub
import kagglehub
path = kagglehub.dataset_download("engrbasit62/titanic-json-format")
print("Path to dataset files:", path)
🔹 2. Load into Pandas
import pandas as pd
df = pd.read_json(f"{path}/titanic.json")
print(df.head())
💡 Notes
Preview truncation: Kaggle may show only part of the JSON in the preview panel because of its size. ✅ The full dataset is still available when loaded via code.
Benefits of JSON format: Ideal for web apps, APIs, or projects that work with structured data. Easily convertible back to CSV if needed.
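Converting back to CSV can be sketched with the standard library alone. The example below assumes the JSON is a list of record objects, one per passenger; the actual layout of titanic.json is not stated here, so that is an assumption. The sample record is a made-up placeholder, and `records_to_csv` is a hypothetical helper name. The column order follows the Columns list above.

```python
import csv
import io
import json

COLUMNS = [
    "PassengerId", "Survived", "Pclass", "Name", "Sex", "Age",
    "SibSp", "Parch", "Ticket", "Fare", "Cabin", "Embarked",
]


def records_to_csv(records, columns=COLUMNS):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # JSON null (Python None) and absent keys become empty CSV fields.
        writer.writerow(
            {c: "" if record.get(c) is None else record[c] for c in columns}
        )
    return buffer.getvalue()


# Made-up placeholder record, not a real passenger.
sample_json = (
    '[{"PassengerId": 1, "Survived": 0, "Pclass": 3, '
    '"Name": "Placeholder, Mr. Example", "Sex": "male", "Age": null, '
    '"SibSp": 0, "Parch": 0, "Ticket": "X 0000", "Fare": 7.25, '
    '"Cabin": null, "Embarked": "S"}]'
)
csv_text = records_to_csv(json.loads(sample_json))
print(csv_text)
```

Because names contain commas, the `csv` module quotes the Name field automatically; building the CSV by joining strings with commas would corrupt those rows.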
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Analysis of ‘Titanic-Dataset (train.csv)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hesh97/titanicdataset-traincsv on 12 November 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Version of the titanic dataset used in the ggEDA manuscript. Can be loaded from the datarium R package (datarium::titanic.raw). Originally published by the British Board of Trade in 1990. If you use it, please cite:
British Board of Trade. Report on the Loss of the 'Titanic' (S.S.). Allan Sutton Publishing, Gloucester, UK, 1990. British Board of Trade Inquiry Report (reprint).
Alboukadel Kassambara. datarium: Data Bank for Statistical Analysis and Visualization, 2019. URL https://CRAN.R-project.org/package=datarium. R package version 0.1.0.
https://creativecommons.org/publicdomain/zero/1.0/
The Titanic dataset is a popular dataset in the field of data science and machine learning. It contains information about the passengers aboard the RMS Titanic, which sank on its maiden voyage in 1912 after hitting an iceberg. The dataset is often used for predictive modeling and classification tasks.
Here are the key features or columns in the Titanic dataset:
- PassengerId: A unique identifier assigned to each passenger.
- Survived: A binary variable indicating whether the passenger survived (1) or did not survive (0).
- Pclass (Passenger Class): The ticket class of the passenger, which can be 1st (1), 2nd (2), or 3rd (3) class.
- Name: The name of the passenger.
- Sex: The gender of the passenger (male or female).
- Age: The age of the passenger in years. It may contain missing values.
- SibSp: The number of siblings or spouses the passenger had aboard the Titanic.
- Parch: The number of parents or children the passenger had aboard the Titanic.
- Ticket: The ticket number.
- Fare: The amount of money the passenger paid for the ticket.
- Cabin: The cabin number where the passenger stayed. It may contain missing values.
- Embarked: The port at which the passenger boarded the Titanic (C for Cherbourg, Q for Queenstown, S for Southampton).
Hemant201/Titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Analysis of ‘Titanic.csv’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/garrettrlynch/titaniccsv on 30 September 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Sreya27/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community