100+ datasets found
  1. Titanic - Labelled Test Set

    • kaggle.com
    Updated May 30, 2023
    Cite
    Wesley Howe (2023). Titanic - Labelled Test Set [Dataset]. https://www.kaggle.com/datasets/wesleyhowe/titanic-labelled-test-set
    Dataset updated
    May 30, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Wesley Howe
    Description

    The test set from "Titanic - Machine Learning from Disaster" doesn't include labels.

    This is an augmented version of the test set with the correct labels, retrieved from the original Titanic dataset at: https://www.openml.org/search?type=data&sort=runs&id=40945&status=active

    The accuracy of the labels was validated by getting a 1.0 score in the competition with them.

    This dataset is provided for educational purposes, and is not intended to help people cheat in the competition. If the only reason you want to download this is so you can get a shiny 1.0 on the leaderboards, don't do it.
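    The validation step described above can be illustrated with a minimal sketch, assuming the competition score is plain classification accuracy (the fraction of passengers whose predicted label matches the true one). The function name `accuracy` and the sample IDs and labels are hypothetical and are not taken from the dataset.

```python
def accuracy(predicted, actual):
    """Fraction of passengers whose predicted label equals the true label."""
    if predicted.keys() != actual.keys():
        raise ValueError("predictions and labels must cover the same passengers")
    if not actual:
        raise ValueError("no passengers to score")
    matches = sum(predicted[pid] == actual[pid] for pid in actual)
    return matches / len(actual)


# Illustrative values only, not rows from the real dataset.
labels = {892: 0, 893: 1, 894: 0, 895: 0}
perfect = dict(labels)            # submitting the labels themselves
one_wrong = {**labels, 895: 1}    # one of four predictions flipped

print(accuracy(perfect, labels))    # 1.0
print(accuracy(one_wrong, labels))  # 0.75
```

    Under that assumption, a score of 1.0 is only reachable when every submitted label agrees with the hidden ground truth, which is why it serves as a check on the labels.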

  2. titanic_dataset

    • kaggle.com
    Updated Jun 7, 2024
    Cite
    SURENDHAN (2024). titanic_dataset [Dataset]. https://www.kaggle.com/datasets/surendhan/titanic-dataset
    Dataset updated
    Jun 7, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SURENDHAN
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The Titanic dataset on Kaggle is a well-known dataset used for machine learning and data science projects, especially for binary classification tasks. It includes data on the passengers of the Titanic, which sank on its maiden voyage in 1912. This dataset is often used to predict the likelihood of a passenger's survival based on various features. Here is a detailed description of the dataset:

    Overview

    The Titanic dataset includes information about the passengers on the Titanic, such as their demographic information, class, fare, and whether they survived the disaster. The goal is to predict the survival of the passengers.

    Files

    The dataset typically includes three files:

    • train.csv: The training set, which includes the features and the target variable (Survived).
    • test.csv: The test set, which includes the features but not the target variable. You use this file to make predictions that can be submitted to Kaggle.
    • gender_submission.csv: An example of a submission file in the correct format.

    Features

    The dataset contains the following columns:

    • PassengerId: Unique ID for each passenger.
    • Survived: Target variable (0 = No, 1 = Yes) indicating if the passenger survived.
    • Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
    • Name: Name of the passenger.
    • Sex: Gender of the passenger (male or female).
    • Age: Age of the passenger in years. Ages below 1 are given as fractions of a year.
    • SibSp: Number of siblings or spouses aboard the Titanic.
    • Parch: Number of parents or children aboard the Titanic.
    • Ticket: Ticket number.
    • Fare: Passenger fare.
    • Cabin: Cabin number.
    • Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).
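    As a rough illustration of the column list above, the sketch below parses CSV text into typed records using only the Python standard library. It assumes the file header uses exactly the column names listed in the description and that missing values appear as empty fields. The `Passenger` class, the helper names, and the sample row are hypothetical.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Passenger:
    passenger_id: int
    survived: Optional[int]  # None when the column is absent, as in test.csv
    pclass: int
    name: str
    sex: str
    age: Optional[float]     # None when the field is empty
    sibsp: int
    parch: int
    ticket: str
    fare: Optional[float]    # None when the field is empty
    cabin: str
    embarked: str


def _opt(value, cast):
    """Cast a CSV field, mapping a missing or empty field to None."""
    return cast(value) if value else None


def parse_passengers(text: str) -> List[Passenger]:
    """Turn CSV text with the described header into Passenger records."""
    rows = []
    for r in csv.DictReader(io.StringIO(text)):
        rows.append(Passenger(
            passenger_id=int(r["PassengerId"]),
            survived=_opt(r.get("Survived"), int),
            pclass=int(r["Pclass"]),
            name=r["Name"],
            sex=r["Sex"],
            age=_opt(r["Age"], float),
            sibsp=int(r["SibSp"]),
            parch=int(r["Parch"]),
            ticket=r["Ticket"],
            fare=_opt(r["Fare"], float),
            cabin=r["Cabin"],
            embarked=r["Embarked"],
        ))
    return rows


# Made-up row for illustration; it is not taken from the real file.
sample = (
    "PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked\n"
    '1,0,3,"Doe, Mr. John",male,,1,0,A/5 0000,7.25,,S\n'
)
passenger = parse_passengers(sample)[0]
print(passenger.name, passenger.age, passenger.fare)
```

    Keeping `survived` optional lets the same parser read both the training file and the unlabelled test file.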

  3. Modified Titanic Dataset (train.csv)

    • kaggle.com
    Updated Feb 16, 2025
    Cite
    Jason Kiarie (2025). Modified Titanic Dataset (train.csv) [Dataset]. https://www.kaggle.com/datasets/khaliban/modified-titanic-dataset-train-csv
    Dataset updated
    Feb 16, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jason Kiarie
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset is a modified version of the original train.csv file provided for the Kaggle Titanic Competition. Missing passenger ages have been imputed using randomly generated values within a reasonable range. While these values may not reflect the actual ages, they facilitate a more structured classification of passengers into categories such as children and adults.

    The imputation process was based on the following assumptions:

    1. Passengers with 0 Parch (no parents or children aboard) were classified as adults, with their ages randomly assigned between 21 and 80, reflecting the Age of Majority at the time (21 years).
    2. Passengers with 4 or more siblings/spouses (SibSp ≥ 4) were likely children (under 21).
    3. Passengers with "Miss." in their name were assumed to be children. While this may not always be accurate, it provided a simple classification method.
    4. Passengers with "Mr." in their name were assumed to be adults.
    5. Passengers with "Master." in their name were assumed to be children.
    6. Passengers with "Mrs." in their name were assumed to be married and, therefore, likely adults.

    The imputation process was carried out in the following order: 1 → 2 → (3, 4, 5, 6 applied together).

    It is important to note that while honorifics such as Mr., Miss., Mrs., and Master. were historically used with some flexibility, this dataset (Version 1) assumes a strict age classification based on a legal age of adulthood set at 21.
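    The Version 1 rules above can be sketched roughly as follows. This is not the author's code: it assumes that the stated order means the first matching rule wins, that imputed ages are whole numbers, that children are drawn from 0 to 20, and that adults identified by a title use the same 21 to 80 range given in rule 1. The names `classify` and `impute_age` and the sample passengers are hypothetical.

```python
import random

ADULT_AGE = 21  # age of majority stated in the description
MAX_AGE = 80    # upper bound given in rule 1

# Rules 3-6: honorific found in the name -> assumed group.
TITLE_GROUPS = (
    ("Miss.", "child"),
    ("Mr.", "adult"),
    ("Master.", "child"),
    ("Mrs.", "adult"),
)


def classify(parch, sibsp, name):
    """Return 'adult', 'child', or None, checking rule 1, then 2, then 3-6."""
    if parch == 0:        # rule 1: no parents or children aboard
        return "adult"
    if sibsp >= 4:        # rule 2: four or more siblings/spouses
        return "child"
    for title, group in TITLE_GROUPS:  # rules 3-6
        if title in name:
            return group
    return None           # no rule applies


def impute_age(age, parch, sibsp, name, rng):
    """Keep a known age; otherwise draw one from the assumed range for the group."""
    if age is not None:
        return age
    group = classify(parch, sibsp, name)
    if group == "adult":
        return rng.randint(ADULT_AGE, MAX_AGE)   # inclusive bounds
    if group == "child":
        return rng.randint(0, ADULT_AGE - 1)     # assumed range: under 21
    return None


rng = random.Random(0)  # seeded so repeated runs give the same draws
adult_age = impute_age(None, 0, 0, "Doe, Mr. John", rng)
child_age = impute_age(None, 2, 5, "Doe, Master. Tom", rng)
known_age = impute_age(34.0, 0, 0, "Doe, Mrs. Jane", rng)
print(adult_age, child_age, known_age)
```

    Because rule 1 is checked first in this sketch, a passenger with "Miss." or "Master." in their name and no parents or children aboard is treated as an adult, which is one consequence of the stated ordering under the first-match assumption.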

    Version 2 Modifications:

    • The title "Master." was assumed to refer to males aged 0 to 16.
    • The title "Mr." was assumed to refer to males aged 17 and above.

    These modifications aim to provide a structured approach to handling missing age data while maintaining reasonable historical assumptions.

  4. Spaceship Titanic Solution

    • kaggle.com
    Updated May 13, 2022
    Cite
    Nabarungos (2022). Spaceship Titanic Solution [Dataset]. https://www.kaggle.com/datasets/nabarungos/spaceship-titanic-solution
    Dataset updated
    May 13, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nabarungos
    Description

    Dataset

    This dataset was created by Nabarungos

    Contents

  5. Titanic dataset

    • kaggle.com
    Updated Aug 11, 2024
    + more versions
    Cite
    Khamed Mohammed Taha (2024). Titanic dataset [Dataset]. https://www.kaggle.com/datasets/khamedmohammedtaha/titanic-dataset
    Dataset updated
    Aug 11, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Khamed Mohammed Taha
    Description

    Dataset

    This dataset was created by Mohammed taha Khamed

    Contents

  6. Titanic Dataset

    • kaggle.com
    Updated Jul 3, 2024
    + more versions
    Cite
    MatteoD83 (2024). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/matteod83/titanic-dataset
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    MatteoD83
    Description

    Dataset

    This dataset was created by MatteoD83

    Contents

  7. Spaceship Titanic Dataset

    • kaggle.com
    Updated Feb 6, 2024
    + more versions
    Cite
    Sanchi Batra (2024). Spaceship Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/sanchibatra/spaceship-titanic-dataset
    Dataset updated
    Feb 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sanchi Batra
    Description

    Dataset

    This dataset was created by Sanchi Batra

    Contents

  8. titanic dataset

    • kaggle.com
    Updated Feb 15, 2022
    + more versions
    Cite
    KSanjana2001 (2022). titanic dataset [Dataset]. https://www.kaggle.com/datasets/ksanjana2001/titanic-dataset
    Dataset updated
    Feb 15, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    KSanjana2001
    Description

    Dataset

    This dataset was created by KSanjana2001

    Contents

  9. titanic dataset

    • kaggle.com
    Updated Sep 20, 2024
    + more versions
    Cite
    dj thuva (2024). titanic dataset [Dataset]. https://www.kaggle.com/datasets/djthuva/titanic-dataset
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    dj thuva
    Description

    Dataset

    This dataset was created by dj thuva

    Contents

  10. Titanic dataset

    • kaggle.com
    Updated Apr 6, 2025
    + more versions
    Cite
    Ashish Jung Basnet (2025). Titanic dataset [Dataset]. https://www.kaggle.com/datasets/ashishjungbasnet/titanic-dataset/versions/1
    Dataset updated
    Apr 6, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ashish Jung Basnet
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Ashish Jung Basnet

    Released under MIT

    Contents

  11. Titanic Dataset - Machine Learning from Disaster

    • kaggle.com
    Updated Sep 20, 2022
    Cite
    Aman Chauhan (2022). Titanic Dataset - Machine Learning from Disaster [Dataset]. https://www.kaggle.com/datasets/whenamancodes/titanic-dataset-machine-learning-from-disaster
    Dataset updated
    Sep 20, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aman Chauhan
    License

    CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Overview

    The data has been split into two groups:

    • training set (train.csv)
    • test set (test.csv)

    The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.

    The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.

    We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
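    The rule behind gender_submission.csv (all and only female passengers survive) can be reproduced with a short sketch. It assumes the test file has `PassengerId` and `Sex` columns and that the submission file has a `PassengerId,Survived` header. The function name `gender_baseline` and the two sample rows are hypothetical.

```python
import csv
import io


def gender_baseline(test_csv_text):
    """Build submission text that predicts survival for female passengers only."""
    reader = csv.DictReader(io.StringIO(test_csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for row in reader:
        survived = 1 if row["Sex"] == "female" else 0
        writer.writerow([row["PassengerId"], survived])
    return out.getvalue()


# Made-up rows for illustration; other test.csv columns are omitted.
sample = "PassengerId,Pclass,Sex\n892,3,male\n893,3,female\n"
submission = gender_baseline(sample)
print(submission)
```

    A baseline like this needs no training at all, which makes it a convenient reference point when judging whether a trained model adds anything.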

    Data Dictionary:

    | Variable | Definition | Key |
    | --- | --- | --- |
    | survival | Survival | 0 = No, 1 = Yes |
    | pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd |
    | sex | Sex | |
    | Age | Age in years | |
    | sibsp | # of siblings / spouses aboard the Titanic | |
    | parch | # of parents / children aboard the Titanic | |
    | ticket | Ticket number | |
    | fare | Passenger fare | |
    | cabin | Cabin number | |
    | embarked | Port of Embarkation | C = Cherbourg, Q = Queenstown, S = Southampton |

    Variable Notes

    pclass: A proxy for socio-economic status (SES).

    • 1st = Upper
    • 2nd = Middle
    • 3rd = Lower

    age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5.

    sibsp: The dataset defines family relations in this way:

    • Sibling = brother, sister, stepbrother, stepsister
    • Spouse = husband, wife (mistresses and fiancés were ignored)

    parch: The dataset defines family relations in this way:

    • Parent = mother, father
    • Child = daughter, son, stepdaughter, stepson

    Some children travelled only with a nanny, therefore parch=0 for them.
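    The data dictionary and variable notes above can be turned into a small decoding helper, sketched here under a few assumptions: each row is a dict of strings keyed by the lowercase variable names from the table, and `age` and `embarked` are present. The function name `describe` and the output keys are hypothetical.

```python
# Codes taken from the data dictionary and variable notes.
PCLASS_SES = {1: "Upper", 2: "Middle", 3: "Lower"}
EMBARKED = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


def describe(row):
    """Decode the coded fields of one passenger row into readable values."""
    age = float(row["age"])
    return {
        "ses": PCLASS_SES[int(row["pclass"])],
        "port": EMBARKED[row["embarked"]],
        # Ages below 1 are fractional, so they mark infants.
        "infant": age < 1,
        # Estimated ages have the form xx.5; ages below 1 are excluded
        # because their fractions are not a sign of estimation.
        "age_estimated": age >= 1 and age % 1 == 0.5,
    }


print(describe({"pclass": "3", "age": "28.5", "embarked": "S"}))
print(describe({"pclass": "1", "age": "0.42", "embarked": "C"}))
```

    Rows with a missing age or port would raise an error in this sketch, so a real pipeline would need to handle empty fields before decoding.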


  12. titanic dataset

    • kaggle.com
    Updated Sep 22, 2018
    Cite
    Priyanka (2018). titanic dataset [Dataset]. https://www.kaggle.com/datasets/priyanka2018/titanic-dataset/data
    Dataset updated
    Sep 22, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Priyanka
    Description

    Dataset

    This dataset was created by Priyanka

    Contents

  13. titanic-dataset-25

    • kaggle.com
    Updated Apr 7, 2025
    Cite
    nguyenthanhktdt (2025). titanic-dataset-25 [Dataset]. https://www.kaggle.com/datasets/nguyenthanhktdt/titanic-dataset-25/discussion
    Dataset updated
    Apr 7, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    nguyenthanhktdt
    Description

    Dataset

    This dataset was created by nguyenthanhktdt

    Released under Other (specified in description)

    Contents

  14. Titanic Dataset

    • kaggle.com
    Updated Sep 18, 2018
    Cite
    Trisa Biswas (2018). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/biswastrisa2345/titanic-dataset
    Dataset updated
    Sep 18, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Trisa Biswas
    Description

    Dataset

    This dataset was created by Trisa Biswas

    Contents

  15. Titanic Dataset

    • kaggle.com
    Updated Nov 28, 2017
    Cite
    MArco (2017). Titanic Dataset [Dataset]. https://www.kaggle.com/ninjavsdev/titanic-dataset/discussion
    Dataset updated
    Nov 28, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    MArco
    Description

    Dataset

    This dataset was created by MArco

    Contents

  16. titanic dataset

    • kaggle.com
    Updated Feb 7, 2018
    Cite
    kordasg (2018). titanic dataset [Dataset]. https://www.kaggle.com/kordasg/titanic-dataset/code
    Dataset updated
    Feb 7, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    kordasg
    Description

    Dataset

    This dataset was created by kordasg

    Contents

  17. titanic-dataset

    • kaggle.com
    Updated Jan 10, 2025
    Cite
    Ehsaan (2025). titanic-dataset [Dataset]. https://www.kaggle.com/datasets/ehsaanali/titanic-dataset
    Dataset updated
    Jan 10, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ehsaan
    License

    CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Dataset

    This dataset was created by Ehsaan

    Released under CC0: Public Domain

    Contents

  18. Titanic Dataset

    • kaggle.com
    Updated Nov 27, 2017
    Cite
    RobinReni (2017). Titanic Dataset [Dataset]. https://www.kaggle.com/robinreni/titanic-dataset/activity
    Dataset updated
    Nov 27, 2017
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    RobinReni
    License

    CC0: Public Domain (https://creativecommons.org/publicdomain/zero/1.0/)

    Description

    Dataset

    This dataset was created by RobinReni

    Released under CC0: Public Domain

    Contents

  19. Titanic Dataset

    • kaggle.com
    Updated Mar 21, 2023
    Cite
    Parth Gajmal (2023). Titanic Dataset [Dataset]. https://www.kaggle.com/datasets/parthgajmal/titanic-dataset
    Dataset updated
    Mar 21, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Parth Gajmal
    Description

    Dataset

    This dataset was created by Parth Gajmal

    Contents

  20. Titanic Dataset

    • kaggle.com
    Updated Oct 19, 2020
    Cite
    Nisha Kushwaha (2020). Titanic Dataset [Dataset]. https://www.kaggle.com/nishakumari95/titanic-dataset/code
    Dataset updated
    Oct 19, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nisha Kushwaha
    Description

    Dataset

    This dataset was created by Nisha Kushwaha

    Contents

Titanic - Labelled Test Set

The testing set for "Titanic - Machine Learning from Disaster" with labels.
