100+ datasets found

Titanic - Labelled Test Set
kaggle.com
Updated May 30, 2023
Cite
Wesley Howe (2023). Titanic - Labelled Test Set [Dataset]. https://www.kaggle.com/datasets/wesleyhowe/titanic-labelled-test-set
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Wesley Howe
Description
The test set from "Titanic - Machine Learning from Disaster" doesn't include labels.

This is an augmented version of the test set with the correct labels, retrieved from the original Titanic dataset at: https://www.openml.org/search?type=data&sort=runs&id=40945&status=active

The accuracy of the labels was validated by getting a 1.0 score in the competition with them.

This dataset is provided for educational purposes, and is not intended to help people cheat in the competition. If the only reason you want to download this is so you can get a shiny 1.0 on the leaderboards, don't do it.
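
Because the labelled test set pairs each test passenger with a survival label, it can serve as a local check of a model's predictions. The following is a minimal sketch, assuming both files use the PassengerId and Survived columns of the competition's submission format (the column names of this particular dataset are not confirmed by the description). The rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io


def read_labels(csv_text: str) -> dict:
    """Map PassengerId -> Survived from CSV text with those two columns."""
    return {row["PassengerId"]: row["Survived"]
            for row in csv.DictReader(io.StringIO(csv_text))}


def accuracy(predictions_csv: str, labels_csv: str) -> float:
    """Fraction of labelled passengers whose prediction matches the label."""
    predictions = read_labels(predictions_csv)
    labels = read_labels(labels_csv)
    if not labels:
        raise ValueError("label file is empty")
    # A passenger missing from the predictions counts as wrong.
    correct = sum(1 for pid, y in labels.items() if predictions.get(pid) == y)
    return correct / len(labels)


# Invented rows, for illustration only.
LABELS = "PassengerId,Survived\n892,0\n893,1\n894,0\n895,0\n"
PREDICTIONS = "PassengerId,Survived\n892,0\n893,1\n894,1\n895,0\n"

print(accuracy(PREDICTIONS, LABELS))  # 0.75
```

Scoring locally this way is in keeping with the educational purpose stated above; it does not require submitting anything to the leaderboard.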
titanic_dataset
kaggle.com
Updated Jun 7, 2024
Cite
SURENDHAN (2024). titanic_dataset [Dataset]. https://www.kaggle.com/datasets/surendhan/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
SURENDHAN
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
The Titanic dataset on Kaggle is a well-known dataset used for machine learning and data science projects, especially for binary classification tasks. It includes data on the passengers of the Titanic, which sank on its maiden voyage in 1912. This dataset is often used to predict the likelihood of a passenger's survival based on various features. Here is a detailed description of the dataset:

Overview

The Titanic dataset includes information about the passengers on the Titanic, such as their demographic information, class, fare, and whether they survived the disaster. The goal is to predict the survival of the passengers.

Files

The dataset typically includes three files:

train.csv: The training set, which includes the features and the target variable (Survived).

test.csv: The test set, which includes the features but not the target variable. You use this file to make predictions that can be submitted to Kaggle.

gender_submission.csv: An example of a submission file in the correct format.

Features

The dataset contains the following columns:

PassengerId: Unique ID for each passenger.

Survived: Target variable (0 = No, 1 = Yes) indicating if the passenger survived.

Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).

Name: Name of the passenger.

Sex: Gender of the passenger (male or female).

Age: Age of the passenger in years. Fractional values below 1 indicate infants younger than one year.

SibSp: Number of siblings or spouses aboard the Titanic.

Parch: Number of parents or children aboard the Titanic.

Ticket: Ticket number.

Fare: Passenger fare.

Cabin: Cabin number.

Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
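
The column list above maps naturally onto a typed record. The following is a minimal sketch, assuming the CSV header uses exactly the column names listed and that an empty cell means a missing value. The Passenger class, the helper names, and the sample rows are hypothetical and invented for illustration.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def optional(value: Optional[str], cast: Callable[[str], T]) -> Optional[T]:
    """Convert a CSV cell, mapping an empty or absent cell to None."""
    return cast(value) if value not in (None, "") else None


@dataclass
class Passenger:
    passenger_id: int
    survived: Optional[int]  # not present in test.csv
    pclass: int
    name: str
    sex: str
    age: Optional[float]
    sibsp: int
    parch: int
    ticket: str
    fare: Optional[float]
    cabin: Optional[str]
    embarked: Optional[str]


def parse_row(row: dict) -> Passenger:
    """Build a Passenger from one csv.DictReader row."""
    return Passenger(
        passenger_id=int(row["PassengerId"]),
        survived=optional(row.get("Survived"), int),
        pclass=int(row["Pclass"]),
        name=row["Name"],
        sex=row["Sex"],
        age=optional(row["Age"], float),
        sibsp=int(row["SibSp"]),
        parch=int(row["Parch"]),
        ticket=row["Ticket"],
        fare=optional(row["Fare"], float),
        cabin=optional(row["Cabin"], str),
        embarked=optional(row["Embarked"], str),
    )


# Invented rows, for illustration only.
SAMPLE = (
    "PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked\n"
    '1,0,3,"Doe, Mr. John",male,22,1,0,X 00000,7.25,,S\n'
    '2,1,1,"Doe, Mrs. Jane",female,,1,0,X 00001,71.5,C1,C\n'
)

passengers = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(passengers[0].name, passengers[0].cabin)
print(passengers[1].age)
```

Using `row.get("Survived")` lets the same parser read a file without the target column, such as test.csv.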
Modified Titanic Dataset (train.csv)
kaggle.com
Updated Feb 16, 2025
Cite
Jason Kiarie (2025). Modified Titanic Dataset (train.csv) [Dataset]. https://www.kaggle.com/datasets/khaliban/modified-titanic-dataset-train-csv
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jason Kiarie
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset is a modified version of the original train.csv file provided for the Kaggle Titanic Competition. Missing passenger ages have been imputed using randomly generated values within a reasonable range. While these values may not reflect the actual ages, they facilitate a more structured classification of passengers into categories such as children and adults.

The imputation process was based on the following assumptions:

1. Passengers with 0 Parch (no parents or children aboard) were classified as adults, with their ages randomly assigned between 21 and 80, reflecting the age of majority at the time (21 years).

2. Passengers with 4 or more siblings/spouses (SibSp ≥ 4) were likely children (under 21).

3. Passengers with "Miss." in their name were assumed to be children. While this may not always be accurate, it provided a simple classification method.

4. Passengers with "Mr." in their name were assumed to be adults.

5. Passengers with "Master." in their name were assumed to be children.

6. Passengers with "Mrs." in their name were assumed to be married and, therefore, likely adults.

The imputation process was carried out in the following order: 1 → 2 → (3, 4, 5, 6 applied together).

It is important to note that while honorifics such as Mr., Miss., Mrs., and Master. were historically used with some flexibility, this dataset (Version 1) assumes a strict age classification based on a legal age of adulthood set at 21.

Version 2 Modifications:

The title "Master." was assumed to refer to males aged 0 to 16.

The title "Mr." was assumed to refer to males aged 17 and above.

These modifications aim to provide a structured approach to handling missing age data while maintaining reasonable historical assumptions.
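
The Version 1 assumptions can be sketched in code. This is one possible reading, not the author's script: it assumes the stated order means "first matching rule wins" (Parch, then SibSp, then title), that "under 21" means whole ages 0 to 20, and that ages are drawn uniformly. The function names and the sample names are hypothetical, and the Version 2 age bounds are not implemented here.

```python
import random
import re
from typing import Optional

ADULT_AGES = (21, 80)  # range stated in the description
CHILD_AGES = (0, 20)   # assumption: "under 21" read as 0 to 20 inclusive

TITLE_GROUPS = {"Miss": "child", "Master": "child", "Mr": "adult", "Mrs": "adult"}
TITLE_PATTERN = re.compile(r"\b(Miss|Master|Mrs|Mr)\.")


def age_group(name: str, sibsp: int, parch: int) -> Optional[str]:
    """Classify a passenger as 'adult' or 'child', or None if no rule applies."""
    if parch == 0:    # no parents or children aboard
        return "adult"
    if sibsp >= 4:    # four or more siblings/spouses aboard
        return "child"
    match = TITLE_PATTERN.search(name)  # honorific in the name
    return TITLE_GROUPS[match.group(1)] if match else None


def impute_age(age: Optional[float], name: str, sibsp: int, parch: int,
               rng: random.Random) -> Optional[float]:
    """Return the known age, or a random age within the group's range."""
    if age is not None:
        return age
    group = age_group(name, sibsp, parch)
    if group is None:
        return None  # leave the value missing when no rule applies
    low, high = ADULT_AGES if group == "adult" else CHILD_AGES
    return float(rng.randint(low, high))


rng = random.Random(0)  # seeded so repeated runs give the same values
print(age_group("Doe, Master. Tim", 1, 2))         # child
print(impute_age(None, "Doe, Mr. John", 0, 0, rng))
```

Passing the random generator in explicitly keeps the imputation reproducible, which matters because the imputed ages are random rather than estimated from the data.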
Spaceship Titanic Solution
kaggle.com
Updated May 13, 2022
Cite
Nabarungos (2022). Spaceship Titanic Solution [Dataset]. https://www.kaggle.com/datasets/nabarungos/spaceship-titanic-solution
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nabarungos
Description
Dataset

This dataset was created by Nabarungos

Contents
Titanic dataset
kaggle.com
Updated Aug 11, 2024
Cite
Khamed Mohammed Taha (2024). Titanic dataset [Dataset]. https://www.kaggle.com/datasets/khamedmohammedtaha/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Khamed Mohammed Taha
Description
Dataset

This dataset was created by Mohammed taha Khamed

Contents
Titanic Dataset
kaggle.com
Updated Jul 3, 2024
Cite
MatteoD83 (2024). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/matteod83/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
MatteoD83
Description
Dataset

This dataset was created by MatteoD83

Contents
Spaceship Titanic Dataset
kaggle.com
Updated Feb 6, 2024
Cite
Sanchi Batra (2024). Spaceship Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/sanchibatra/spaceship-titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sanchi Batra
Description
Dataset

This dataset was created by Sanchi Batra

Contents
titanic dataset
kaggle.com
Updated Feb 15, 2022
Cite
KSanjana2001 (2022). titanic dataset [Dataset]. https://www.kaggle.com/datasets/ksanjana2001/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
KSanjana2001
Description
Dataset

This dataset was created by KSanjana2001

Contents
titanic dataset
kaggle.com
Updated Sep 20, 2024
Cite
dj thuva (2024). titanic dataset [Dataset]. https://www.kaggle.com/datasets/djthuva/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
dj thuva
Description
Dataset

This dataset was created by dj thuva

Contents
Titanic dataset
kaggle.com
Updated Apr 6, 2025
Cite
Ashish Jung Basnet (2025). Titanic dataset [Dataset]. https://www.kaggle.com/datasets/ashishjungbasnet/titanic-dataset/versions/1
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ashish Jung Basnet
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Dataset

This dataset was created by Ashish Jung Basnet

Released under MIT

Contents
Titanic Dataset - Machine Learning from Disaster
kaggle.com
Updated Sep 20, 2022
Cite
Aman Chauhan (2022). Titanic Dataset - Machine Learning from Disaster [Dataset]. https://www.kaggle.com/datasets/whenamancodes/titanic-dataset-machine-learning-from-disaster
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Aman Chauhan
License
CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Overview

The data has been split into two groups:

training set (train.csv)

test set (test.csv)

The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.

The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.

We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
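
The rule behind gender_submission.csv (all and only female passengers survive) is simple enough to reproduce. The following is a minimal sketch, assuming the test file has PassengerId and Sex columns and that the submission format is a PassengerId,Survived CSV; the function name and the sample rows are invented for illustration.

```python
import csv
import io


def gender_baseline(test_csv: str) -> str:
    """Return submission CSV text predicting survival for female passengers only."""
    reader = csv.DictReader(io.StringIO(test_csv))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for row in reader:
        survived = 1 if row["Sex"] == "female" else 0
        writer.writerow([row["PassengerId"], survived])
    return out.getvalue()


# Invented rows, for illustration only.
TEST_ROWS = "PassengerId,Pclass,Sex\n892,3,male\n893,3,female\n"

print(gender_baseline(TEST_ROWS))
```

A baseline like this gives a reference score that any trained model should be expected to beat.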

Data Dictionary:

| Variable | Definition | Key |
| --- | --- | --- |
| survival | Survival | 0 = No, 1 = Yes |
| pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd |
| sex | Sex | |
| Age | Age in years | |
| sibsp | # of siblings / spouses aboard the Titanic | |
| parch | # of parents / children aboard the Titanic | |
| ticket | Ticket number | |
| fare | Passenger fare | |
| cabin | Cabin number | |
| embarked | Port of Embarkation | C = Cherbourg, Q = Queenstown, S = Southampton |

Variable Notes

pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5.

sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister. Spouse = husband, wife (mistresses and fiancés were ignored).

parch: The dataset defines family relations in this way: Parent = mother, father. Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.
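
The age and family notes above lend themselves to small derived features. The following is a minimal sketch; the function names and category labels are hypothetical. Ages below 1 are labelled as infants first, so an age of exactly 0.5 is treated as an infant rather than as an estimate, which is an assumption where the two notes overlap.

```python
def describe_age(age: float) -> str:
    """Label an age value using the conventions in the variable notes."""
    if age < 1:
        return "infant"      # fractional ages are used below one year
    if age % 1 == 0.5:
        return "estimated"   # estimated ages take the form xx.5
    return "recorded"


def relatives_aboard(sibsp: int, parch: int) -> int:
    """Count siblings, spouses, parents, and children aboard."""
    return sibsp + parch


print(describe_age(0.42))      # infant
print(describe_age(28.5))      # estimated
print(describe_age(30.0))      # recorded
print(relatives_aboard(1, 2))  # 3
```

As the parch note points out, a count of 0 does not always mean the passenger travelled alone, since children travelling with a nanny also have parch=0.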

titanic dataset
kaggle.com
Updated Sep 22, 2018
Cite
Priyanka (2018). titanic dataset [Dataset]. https://www.kaggle.com/datasets/priyanka2018/titanic-dataset/data
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Priyanka
Description
Dataset

This dataset was created by Priyanka

Contents
titanic-dataset-25
kaggle.com
Updated Apr 7, 2025
Cite
nguyenthanhktdt (2025). titanic-dataset-25 [Dataset]. https://www.kaggle.com/datasets/nguyenthanhktdt/titanic-dataset-25/discussion
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
nguyenthanhktdt
Description
Dataset

This dataset was created by nguyenthanhktdt

Released under Other (specified in description)

Contents
Titanic Dataset
kaggle.com
Updated Sep 18, 2018
Cite
Trisa Biswas (2018). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/biswastrisa2345/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Trisa Biswas
Description
Dataset

This dataset was created by Trisa Biswas

Contents
Titanic Dataset
kaggle.com
Updated Nov 28, 2017
Cite
MArco (2017). Titanic Dataset [Dataset]. https://www.kaggle.com/ninjavsdev/titanic-dataset/discussion
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
MArco
Description
Dataset

This dataset was created by MArco

Contents
titanic dataset
kaggle.com
Updated Feb 7, 2018
Cite
kordasg (2018). titanic dataset [Dataset]. https://www.kaggle.com/kordasg/titanic-dataset/code
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
kordasg
Description
Dataset

This dataset was created by kordasg

Contents
titanic-dataset
kaggle.com
Updated Jan 10, 2025
Cite
Ehsaan (2025). titanic-dataset [Dataset]. https://www.kaggle.com/datasets/ehsaanali/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ehsaan
License
CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Dataset

This dataset was created by Ehsaan

Released under CC0: Public Domain

Contents
Titanic Dataset
kaggle.com
Updated Nov 27, 2017
Cite
RobinReni (2017). Titanic Dataset [Dataset]. https://www.kaggle.com/robinreni/titanic-dataset/activity
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
RobinReni
License
CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)
Description
Dataset

This dataset was created by RobinReni

Released under CC0: Public Domain

Contents
Titanic Dataset
kaggle.com
Updated Mar 21, 2023
Cite
Parth Gajmal (2023). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/parthgajmal/titanic-dataset
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Parth Gajmal
Description
Dataset

This dataset was created by Parth Gajmal

Contents
Titanic Dataset
kaggle.com
Updated Oct 19, 2020
Cite
Nisha Kushwaha (2020). Titanic Dataset [Dataset]. https://www.kaggle.com/nishakumari95/titanic-dataset/code
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nisha Kushwaha
Description
Dataset

This dataset was created by Nisha Kushwaha

Contents
