100+ datasets found
  1. Titanic Dataset

    • paperswithcode.com
    Updated Nov 25, 2024
    (2024). Titanic Dataset [Dataset]. https://paperswithcode.com/dataset/titanic
    Description

    Titanic Dataset Description

    Overview

    The data is divided into two groups:

    - Training set (train.csv): Used to build machine learning models. It includes the outcome (also called the "ground truth") for each passenger, allowing models to predict survival based on “features” like gender and class. Feature engineering can also be applied to create new features.
    - Test set (test.csv): Used to evaluate model performance on unseen data. The ground truth is not provided; the task is to predict survival for each passenger in the test set using the trained model.

    Additionally, gender_submission.csv is provided as an example submission file, containing predictions based on the assumption that all and only female passengers survive.
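
    The baseline behind gender_submission.csv can be sketched in a few lines. This is a minimal illustration, not the competition's own code: the column names (PassengerId, Sex, Survived) are assumed to match the Kaggle files, and the three sample rows are made up purely so the example runs on its own.

```python
import csv
import io

# Hypothetical three-row stand-in for test.csv (made-up passengers).
# Column names are an assumption based on the Kaggle file layout.
SAMPLE_TEST_CSV = """PassengerId,Pclass,Sex,Age
1,3,male,30
2,1,female,41
3,2,male,19
"""


def gender_baseline(rows):
    """Predict Survived = 1 for female passengers and 0 for everyone else."""
    return [
        {
            "PassengerId": row["PassengerId"],
            "Survived": 1 if row["Sex"] == "female" else 0,
        }
        for row in rows
    ]


def to_submission_csv(predictions):
    """Serialize predictions as a two-column CSV string."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["PassengerId", "Survived"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(predictions)
    return out.getvalue()


rows = list(csv.DictReader(io.StringIO(SAMPLE_TEST_CSV)))
predictions = gender_baseline(rows)
submission = to_submission_csv(predictions)
print(submission)
```

    With the real files, the sample string would be replaced by the contents of test.csv.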

    Data Dictionary

    | Variable | Definition                               | Key                                            |
    |----------|------------------------------------------|------------------------------------------------|
    | survival | Survival                                 | 0 = No, 1 = Yes                                |
    | pclass   | Ticket class                             | 1 = 1st, 2 = 2nd, 3 = 3rd                      |
    | sex      | Sex                                      |                                                |
    | age      | Age in years                             |                                                |
    | sibsp    | # of siblings/spouses aboard the Titanic |                                                |
    | parch    | # of parents/children aboard the Titanic |                                                |
    | ticket   | Ticket number                            |                                                |
    | fare     | Passenger fare                           |                                                |
    | cabin    | Cabin number                             |                                                |
    | embarked | Port of Embarkation                      | C = Cherbourg, Q = Queenstown, S = Southampton |

    Variable Notes

    pclass: Proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

    age: Fractional if less than 1 year. Estimated ages are represented in the form xx.5.

    sibsp: Defines family relations as:
    - Sibling: Brother, sister, stepbrother, stepsister.
    - Spouse: Husband, wife (excluding mistresses and fiancés).

    parch: Defines family relations as:
    - Parent: Mother, father.
    - Child: Daughter, son, stepdaughter, stepson.
    Some children traveled only with a nanny, so parch = 0 for them.
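
    The variable notes suggest a few simple derived features. The sketch below is illustrative only: the helper names are hypothetical, and treating any age of 1 or more that ends in .5 as "estimated" is an assumption drawn from the xx.5 convention described above (fractional ages below 1 are treated as infants instead).

```python
def family_size(sibsp, parch):
    """The passenger plus siblings/spouses plus parents/children aboard."""
    return sibsp + parch + 1


def is_infant(age):
    """True for fractional ages below one year."""
    return age is not None and age < 1


def is_estimated_age(age):
    """True when an age of 1 or more follows the xx.5 convention.

    Assumption: ages below 1 are genuine fractional infant ages,
    so they are never flagged as estimates here.
    """
    if age is None:
        return False
    return age >= 1 and age % 1 == 0.5


print(family_size(sibsp=1, parch=2))
print(is_estimated_age(34.5), is_estimated_age(34.0), is_infant(0.42))
```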

  2. Titanic - Labelled Test Set

    • kaggle.com
    Updated May 30, 2023
    Wesley Howe (2023). Titanic - Labelled Test Set [Dataset]. https://www.kaggle.com/datasets/wesleyhowe/titanic-labelled-test-set
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Wesley Howe
    Description

    The test set from "Titanic - Machine Learning from Disaster" doesn't include labels.

    This is an augmented version of the test set with the correct labels, retrieved from the original Titanic dataset at: https://www.openml.org/search?type=data&sort=runs&id=40945&status=active

    The accuracy of the labels was validated by getting a 1.0 score in the competition with them.
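
    A score of 1.0 here means every prediction matched its label. As a rough sketch, assuming the score is plain classification accuracy (the function name and the sample lists are illustrative only):

```python
def accuracy(predicted, actual):
    """Fraction of positions where the predicted label equals the true label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set of labels")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up labels for illustration.
print(accuracy([1, 0, 1], [1, 0, 1]))
print(accuracy([1, 0, 1, 0], [1, 1, 1, 0]))
```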

    This dataset is provided for educational purposes, and is not intended to help people cheat in the competition. If the only reason you want to download this is so you can get a shiny 1.0 on the leaderboards, don't do it.

  3. titanic_dataset

    • kaggle.com
    Updated Jun 7, 2024
    SURENDHAN (2024). titanic_dataset [Dataset]. https://www.kaggle.com/datasets/surendhan/titanic-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    SURENDHAN
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The Titanic dataset on Kaggle is a well-known dataset used for machine learning and data science projects, especially for binary classification tasks. It includes data on the passengers of the Titanic, which sank on its maiden voyage in 1912. This dataset is often used to predict the likelihood of a passenger's survival based on various features. Here is a detailed description of the dataset:

    Overview

    The Titanic dataset includes information about the passengers on the Titanic, such as their demographic information, class, fare, and whether they survived the disaster. The goal is to predict the survival of the passengers.

    Files

    The dataset typically includes three files:

    • train.csv: The training set, which includes the features and the target variable (Survived).
    • test.csv: The test set, which includes the features but not the target variable. You use this file to make predictions that can be submitted to Kaggle.
    • gender_submission.csv: An example of a submission file in the correct format.

    Features

    The dataset contains the following columns:

    • PassengerId: Unique ID for each passenger.
    • Survived: Target variable (0 = No, 1 = Yes) indicating if the passenger survived.
    • Pclass: Ticket class (1 = 1st, 2 = 2nd, 3 = 3rd).
    • Name: Name of the passenger.
    • Sex: Gender of the passenger (male or female).
    • Age: Age of the passenger in years. Fractional values below 1 are used for infants under one year old.
    • SibSp: Number of siblings or spouses aboard the Titanic.
    • Parch: Number of parents or children aboard the Titanic.
    • Ticket: Ticket number.
    • Fare: Passenger fare.
    • Cabin: Cabin number.
    • Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton).

  4. Titanic Dataset

    • cubig.ai
    Updated May 29, 2025
    CUBIG (2025). Titanic Dataset [Dataset]. https://cubig.ai/store/products/393/titanic-dataset
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • Based on passenger information from the Titanic, which sank in 1912, the Titanic Dataset is a representative binary classification dataset that includes demographic and boarding information such as Survived, Passenger Class, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, and Embarked.

    2) Data Utilization
    (1) Characteristics of the Titanic Dataset:
    • It consists of 891 training samples and 12 to 15 columns (a mix of numerical and categorical). Variables such as Age, Cabin, and Embarked have some missing values, making the dataset suitable for practicing preprocessing and feature engineering.
    (2) Uses of the Titanic Dataset:
    • Development of survival prediction models: Key characteristics such as passenger class, gender, age, and fare can be used to predict survival with machine learning classification models such as logistic regression, random forest, and SVM.
    • Analysis of factors influencing survival: By analyzing the correlation between survival rates and variables such as gender, age, and socioeconomic status, you can statistically and visually explore which groups have a higher survival probability.
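
    The preprocessing practice mentioned above usually starts with counting missing values and filling them in. The following is a small sketch under stated assumptions: the four rows are made up, empty strings stand for missing values, and median imputation is one common choice rather than anything this dataset prescribes.

```python
import statistics

# Made-up rows; an empty string marks a missing value (assumption).
rows = [
    {"Age": "22", "Cabin": "", "Embarked": "S"},
    {"Age": "", "Cabin": "C85", "Embarked": "C"},
    {"Age": "38", "Cabin": "", "Embarked": ""},
    {"Age": "30", "Cabin": "", "Embarked": "S"},
]


def missing_counts(rows, columns):
    """Count empty or absent values per column."""
    return {col: sum(1 for r in rows if not r.get(col)) for col in columns}


def impute_age_with_median(rows):
    """Return copies of the rows with Age as a float, missing ages set to the median."""
    known = [float(r["Age"]) for r in rows if r["Age"]]
    median_age = statistics.median(known)
    return [
        {**r, "Age": float(r["Age"]) if r["Age"] else median_age}
        for r in rows
    ]


counts = missing_counts(rows, ["Age", "Cabin", "Embarked"])
filled = impute_age_with_median(rows)
print(counts)
print([r["Age"] for r in filled])
```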

  5. titanic-dataset

    • huggingface.co
    Updated Sep 26, 2024
    + more versions
    Brian Su (2024). titanic-dataset [Dataset]. https://huggingface.co/datasets/BrianSuToronto/titanic-dataset
    Authors
    Brian Su
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    BrianSuToronto/titanic-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. titanic

    • tensorflow.org
    Updated Feb 12, 2023
    (2023). titanic [Dataset]. https://www.tensorflow.org/datasets/catalog/titanic
    Description

    Dataset describing the survival status of individual passengers on the Titanic. Missing values in the original dataset are represented using ?. Float and int missing values are replaced with -1, string missing values are replaced with 'Unknown'.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four examples.
    ds = tfds.load('titanic', split='train')
    for ex in ds.take(4):
        print(ex)

    See the guide for more information on tensorflow_datasets.
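
    Because missing values are encoded as -1 and 'Unknown', it can help to map them back to an explicit "missing" marker before analysis. The sketch below works on plain Python dictionaries rather than on the dataset object itself; the helper name and the example keys are hypothetical, and decoding byte strings is an assumption about how string features may arrive after conversion to NumPy.

```python
def restore_missing(example):
    """Map the -1 / 'Unknown' placeholders in one example back to None."""
    restored = {}
    for key, value in example.items():
        # Assumption: string features may arrive as bytes.
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        if value == -1 or value == "Unknown":
            restored[key] = None
        else:
            restored[key] = value
    return restored


# Hypothetical example record; the keys are illustrative only.
record = {"age": -1.0, "cabin": b"Unknown", "fare": 7.25}
print(restore_missing(record))
```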

  7. Modified Titanic Dataset (train.csv)

    • kaggle.com
    Updated Feb 16, 2025
    Jason Kiarie (2025). Modified Titanic Dataset (train.csv) [Dataset]. https://www.kaggle.com/datasets/khaliban/modified-titanic-dataset-train-csv
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jason Kiarie
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset is a modified version of the original train.csv file provided for the Kaggle Titanic Competition. Missing passenger ages have been imputed using randomly generated values within a reasonable range. While these values may not reflect the actual ages, they facilitate a more structured classification of passengers into categories such as children and adults.

    The imputation process was based on the following assumptions:

    1. Passengers with 0 Parch (no parents or children aboard) were classified as adults, with their ages randomly assigned between 21 and 80, reflecting the Age of Majority at the time (21 years).
    2. Passengers with 4 or more siblings/spouses (SibSp ≥ 4) were likely children (under 21).
    3. Passengers with "Miss." in their name were assumed to be children. While this may not always be accurate, it provided a simple classification method.
    4. Passengers with "Mr." in their name were assumed to be adults.
    5. Passengers with "Master." in their name were assumed to be children.
    6. Passengers with "Mrs." in their name were assumed to be married and, therefore, likely adults.

    The imputation process was carried out in the following order: 1 → 2 → (3, 4, 5, 6 applied together).

    It is important to note that while honorifics such as Mr., Miss., Mrs., and Master. were historically used with some flexibility, this dataset (Version 1) assumes a strict age classification based on a legal age of adulthood set at 21.

    Version 2 Modifications:

    • The title "Master." was assumed to refer to males aged 0 to 16.
    • The title "Mr." was assumed to refer to males aged 17 and above.

    These modifications aim to provide a structured approach to handling missing age data while maintaining reasonable historical assumptions.
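
    The Version 1 rules can be sketched as a small classifier. This is one possible reading, not the dataset author's code: it assumes the first matching rule wins in the stated order, that "under 21" means integer ages 0 to 20, and that ages are drawn uniformly. The names in the usage lines are made up.

```python
import random

ADULT_AGE_RANGE = (21, 80)  # stated in rule 1
CHILD_AGE_RANGE = (0, 20)   # assumption: "under 21" read as 0 to 20 inclusive


def classify(name, sibsp, parch):
    """Return 'adult', 'child', or None. First matching rule wins (assumed)."""
    if parch == 0:       # rule 1
        return "adult"
    if sibsp >= 4:       # rule 2
        return "child"
    # Rules 3 to 6, applied together on the honorific in the name.
    if "Mrs." in name or "Mr." in name:
        return "adult"
    if "Miss." in name or "Master." in name:
        return "child"
    return None


def impute_age(name, sibsp, parch, rng):
    """Draw a random integer age for the passenger's group, or None if no rule applies."""
    group = classify(name, sibsp, parch)
    if group is None:
        return None
    low, high = ADULT_AGE_RANGE if group == "adult" else CHILD_AGE_RANGE
    return rng.randint(low, high)


rng = random.Random(0)  # seeded so repeated runs give the same draws
print(classify("Doe, Miss. Jane", sibsp=0, parch=0))
print(impute_age("Doe, Master. Tim", sibsp=1, parch=2, rng=rng))
```

    Under this reading, a "Miss." travelling with no parents or children is classified as an adult, because rule 1 is checked before the honorific rules.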

  8. ‘Titanic dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Titanic dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-dataset-7d6b/0f1d826e/?iid=009-782&v=presentation
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Titanic dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ibrahimelsayed182/titanic-dataset on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Overview

    This is Titanic dataset

    Data Dictionary

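    | Attribute | Definition                            | Key                                         |
    |-----------|---------------------------------------|---------------------------------------------|
    | sex       | Sex/Gender                            | male/female                                 |
    | age       | Age                                   |                                             |
    | sibsp     | siblings of the passenger             | 0/1/2 ...                                   |
    | parch     | parents / children aboard the Titanic | 0/1/2 ...                                   |
    | fare      | Passenger fare                        |                                             |
    | embarked  | Port of Embarkation                   | C : Cherbourg, Q : Queenstown, S : Southampton |
    | class     | Ticket class                          | First / Second / Third                      |
    | who       | categories to passengers              | male, female, child                         |
    | alone     | he was alone in ship or no            | 0/1                                         |
    | survived  |                                       | 0/1                                         |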

    --- Original source retains full ownership of the source dataset ---

  9. titanic

    • huggingface.co
    Updated Dec 11, 2024
    K Kovacs (2024). titanic [Dataset]. https://huggingface.co/datasets/kkovacs/titanic
    Authors
    K Kovacs
    License

    https://choosealicense.com/licenses/unknown/

    Description

    kkovacs/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community

  10. Titanic

    • huggingface.co
    Updated Jun 18, 2023
    Rahul Saraf (2023). Titanic [Dataset]. https://huggingface.co/datasets/rahuketu86/Titanic
    Authors
    Rahul Saraf
    License

    https://choosealicense.com/licenses/gpl-3.0/

    Description

    rahuketu86/Titanic dataset hosted on Hugging Face and contributed by the HF Datasets community

  11. titanic

    • huggingface.co
    Updated Jun 28, 2024
    Alexey Kislyakov (2024). titanic [Dataset]. https://huggingface.co/datasets/ankislyakov/titanic
    Authors
    Alexey Kislyakov
    Description

    ankislyakov/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. Machine Learning on Titanic Data Set

    • kaggle.com
    Updated Dec 27, 2017
    Emmanuel Appiah-Kubi (2017). Machine Learning on Titanic Data Set [Dataset]. https://www.kaggle.com/datasets/emmazeinab/machine-learning-on-titanic-data-set
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Emmanuel Appiah-Kubi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Emmanuel Appiah-Kubi

    Released under CC0: Public Domain


  13. titanic

    • huggingface.co
    Updated Oct 31, 2024
    Alejandro Aldeguer López (2024). titanic [Dataset]. https://huggingface.co/datasets/alejandro-aldeguer-lopez/titanic
    Authors
    Alejandro Aldeguer López
    Description

    alejandro-aldeguer-lopez/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community

  14. ‘Titanic Dataset Analysis’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Titanic Dataset Analysis’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-titanic-dataset-analysis-c0ba/latest
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Analysis of ‘Titanic Dataset Analysis’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/cities/titanic123 on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    There's a story behind every dataset and here's your opportunity to share yours.

    Content

    What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.

    Acknowledgements

    We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

    --- Original source retains full ownership of the source dataset ---

  15. titanic

    • huggingface.co
    • zenodo.org
    Updated Nov 2, 2023
    Omar Sanseviero (2023). titanic [Dataset]. https://huggingface.co/datasets/osanseviero/titanic
    Authors
    Omar Sanseviero
    Description

    Titanic dataset

  16. titanic

    • kaggle.com
    Updated Jul 9, 2023
    + more versions
    victor ushie (2023). titanic [Dataset]. https://www.kaggle.com/datasets/victorushie/titanic
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    victor ushie
    Description

    Dataset

    This dataset was created by victor ushie


  17. Titanic classification

    • figshare.com
    Updated Sep 19, 2020
    Alvaro Rioboo (2020). Titanic classification [Dataset]. http://doi.org/10.6084/m9.figshare.12979220.v1
    Available download formats: txt
    Dataset provided by
    figshare
    Authors
    Alvaro Rioboo
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Titanic dataset for classification training.

  18. Titanic

    • huggingface.co
    Updated Apr 21, 2025
    M Mashhudur Rahim (2025). Titanic [Dataset]. https://huggingface.co/datasets/XythicK/Titanic
    Authors
    M Mashhudur Rahim
    Description

    DEATH RECORDS OF THE RMS TITANIC INCIDENT

    The Titanic was a British luxury ocean liner that sank on April 15, 1912, after striking an iceberg during its maiden voyage from Southampton, England, to New York City.

    Dataset Description

    Curated by: [XythicK]
    Funded by [optional]: [XythicK/Alchemist]
    Shared by [optional]: [People]
    Language(s) (NLP): [English]
    License: [All rights reserved to Official Titanic Website]

    Dataset Sources [optional]

    Repository:… See the full description on the dataset page: https://huggingface.co/datasets/XythicK/Titanic.

  19. Titanic-Dataset

    • kaggle.com
    Updated Dec 3, 2024
    + more versions
    fehu.zone (2024). Titanic-Dataset [Dataset]. https://www.kaggle.com/datasets/fehu94/titanic-dataset/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    fehu.zone
    Description

    Dataset

    This dataset was created by fehu.zone


  20. titanic-qa-dataset

    • huggingface.co
    Updated Feb 7, 2025
    Soumik Bose (2025). titanic-qa-dataset [Dataset]. https://huggingface.co/datasets/Soumik555/titanic-qa-dataset
    Authors
    Soumik Bose
    Description

    Soumik555/titanic-qa-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

