Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic-Dataset (train.csv)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hesh97/titanicdataset-traincsv on 28 January 2022.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Dataset describing the survival status of individual passengers on the Titanic. Missing values in the original dataset are represented using ?. Float and int missing values are replaced with -1, string missing values are replaced with 'Unknown'.
To use this dataset:
import tensorflow_datasets as tfds

ds = tfds.load('titanic', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
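The description above says numeric missing values are stored as -1 and string missing values as 'Unknown'. The following is a minimal sketch of mapping those sentinels back to None before analysis, assuming each example has already been converted to a plain Python dict. The helper name restore_missing and the sample record are illustrative and are not part of tensorflow_datasets. A blanket rule like this would misfire on any column where -1 is a legitimate value, so it may need to be restricted to specific columns.

```python
def restore_missing(record):
    """Return a copy of record with the -1 / 'Unknown' sentinels replaced by None."""
    cleaned = {}
    for key, value in record.items():
        # bool is a subclass of int, so exclude it explicitly
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if is_number and value == -1:
            cleaned[key] = None
        elif isinstance(value, str) and value == "Unknown":
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


# Hypothetical record for illustration, not taken from the dataset
sample = {"age": -1.0, "fare": 7.25, "cabin": "Unknown", "sex": "male"}
print(restore_missing(sample))
```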
This dataset contains the original Titanic CSV files, train and test. What is special about it is that I added the correct (100%) survival solution to the test data. This is only for better and faster evaluation of your own solution. Please don't upload this solution as a submission to the official competition!
Please be fair to the other Kagglers!
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic: all ones csv file’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/brendan45774/gender-submisson on 13 November 2021.
--- Dataset description provided by original source is as follows ---
The score of the csv file is 0.37799. This is the number to beat, so make sure you don't have a number below this.
This is the titanic csv file, but everyone survives.
I also have another csv file: https://www.kaggle.com/brendan45774/test-file This may help you on your mission to get a perfect score.
--- Original source retains full ownership of the source dataset ---
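Why an "everyone survives" file gets a fixed score can be sketched in a few lines. This assumes the score is plain accuracy (the competition description elsewhere on this page says submissions receive an accuracy score); under that assumption an all-ones file scores exactly the share of passengers who survived, and an all-zeros file would score the complement, about 0.622 here. The labels below are made up for illustration and are not the real test labels.

```python
def accuracy(predictions, truth):
    """Fraction of predictions that match the ground truth."""
    if len(predictions) != len(truth):
        raise ValueError("predictions and truth must have the same length")
    correct = sum(p == t for p, t in zip(predictions, truth))
    return correct / len(truth)


# Made-up ground truth for illustration: 3 survivors out of 8 passengers
truth = [1, 0, 0, 1, 0, 0, 1, 0]
all_ones = [1] * len(truth)
all_zeros = [0] * len(truth)

print(accuracy(all_ones, truth))   # share of survivors in the made-up labels
print(accuracy(all_zeros, truth))  # share of non-survivors
```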
This dataset was created by VVignesh Kumar
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic csv’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/fossouodonald/titaniccsv on 28 January 2022.
--- Dataset description provided by original source is as follows ---
this dataset is the result of titanic csv
--- Original source retains full ownership of the source dataset ---
https://choosealicense.com/licenses/afl-3.0/
victor/titanic dataset hosted on Hugging Face and contributed by the HF Datasets community
https://creativecommons.org/publicdomain/zero/1.0/
The score of the csv file is 0.37799. This is the number to beat, so make sure you don't have a number below this.
This is the titanic csv file, but everyone survives.
I also have another csv file: https://www.kaggle.com/brendan45774/test-file This may help you on your mission to get a perfect score.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Database used as input for the subgroup discovery visualization
Attribution-NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0) https://creativecommons.org/licenses/by-nc-nd/3.0/
License information was derived automatically
General Description of the Dataset
This dataset consists of two CSV files: train.csv (61.19 kB) and test.csv (28.63 kB), which contain data on the passengers aboard the Titanic. The data are used to analyze the socioeconomic and demographic factors that influenced individual survival during the Titanic disaster.
Data Dictionary
Variable Definition Details
survival Survival 0 = No, 1 = Yes
pclass… See the full description on the dataset page: https://huggingface.co/datasets/CarPeAs/dataset_titanic.
Titanic Dataset
Main dataset: titanic.csv. Metadata (structure description): croissant.json. This repository contains the Titanic dataset in CSV format. The croissant.json file contains metadata in ML Croissant format describing the structure and properties of the dataset.
Repository contents
titanic.csv: main dataset (CSV table)
croissant.json: metadata (description of structure, fields, license, etc.)
README.md: description
Usage
To analyze the data, use… See the full description on the dataset page: https://huggingface.co/datasets/RiddarsCorp/TestDOI.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset was created by John Mitchell
Released under CC BY-SA 3.0
Description 👋🛳️ Ahoy, welcome to Kaggle! You’re in the right place. This is the legendary Titanic ML competition – the best, first challenge for you to dive into ML competitions and familiarize yourself with how the Kaggle platform works.
If you want to talk with other users about this competition, come join our Discord! We've got channels for competitions, job postings and career discussions, resources, and socializing with your fellow data scientists. Follow the link here: https://discord.gg/kaggle
The competition is simple: use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.
Read on or watch the video below to explore more details. Once you’re ready to start competing, click on the "Join Competition" button to create an account and gain access to the competition data. Then check out Alexis Cook’s Titanic Tutorial that walks you through step by step how to make your first submission!
The Challenge The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (ie name, age, gender, socio-economic class, etc).
Recommended Tutorial We highly recommend Alexis Cook’s Titanic Tutorial that walks you through making your very first submission step by step and this starter notebook to get started.
How Kaggle’s Competitions Work
Join the Competition: Read about the challenge description, accept the Competition Rules and gain access to the competition dataset.
Get to Work: Download the data, build models on it locally or on Kaggle Notebooks (our no-setup, customizable Jupyter Notebooks environment with free GPUs) and generate a prediction file.
Make a Submission: Upload your prediction as a submission on Kaggle and receive an accuracy score.
Check the Leaderboard: See how your model ranks against other Kagglers on our leaderboard.
Improve Your Score: Check out the discussion forum to find lots of tutorials and insights from other competitors.
Kaggle Lingo Video
You may run into unfamiliar lingo as you dig into the Kaggle discussion forums and public notebooks. Check out Dr. Rachael Tatman’s video on Kaggle Lingo to get up to speed!
What Data Will I Use in This Competition? In this competition, you’ll gain access to two similar datasets that include passenger information like name, age, gender, socio-economic class, etc. One dataset is titled train.csv and the other is titled test.csv.
Train.csv will contain the details of a subset of the passengers on board (891 to be exact) and importantly, will reveal whether they survived or not, also known as the “ground truth”.
The test.csv dataset contains similar information but does not disclose the “ground truth” for each passenger. It’s your job to predict these outcomes.
Using the patterns you find in the train.csv data, predict whether the other 418 passengers on board (found in test.csv) survived.
Check out the “Data” tab to explore the datasets even further. Once you feel you’ve created a competitive model, submit it to Kaggle to see where your model stands on our leaderboard against other Kagglers.
How to Submit your Prediction to Kaggle Once you’re ready to make a submission and get on the leaderboard:
Click on the “Submit Predictions” button
Upload a CSV file in the submission file format. You’re able to submit 10 submissions a day.
Submission File Format: You should submit a csv file with exactly 418 entries plus a header row. Your submission will show an error if you have extra columns (beyond PassengerId and Survived) or rows.
The file should have exactly 2 columns:
PassengerId (sorted in any order)
Survived (contains your binary predictions: 1 for survived, 0 for deceased)
Got it! I’m ready to get started.
Where do I get help if I need it?
For Competition Help: Titanic Discussion Forum
Kaggle doesn’t have a dedicated team to help troubleshoot your code so you’ll typically find that you receive a response more quickly by asking your question in the appropriate forum. The forums are full of useful information on the data, metric, and different approaches. We encourage you to use the forums often. If you share your knowledge, you'll find that others will share a lot in turn!
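The submission file format described above (a header row, exactly 418 entries, and only the PassengerId and Survived columns) can be checked locally before uploading. The following is a minimal sketch, not Kaggle's own validator: the function name check_submission is hypothetical, and the column order (PassengerId first) is an assumption, since the text only says the file has these two columns.

```python
import csv
import io

EXPECTED_COLUMNS = ["PassengerId", "Survived"]  # order is an assumption
EXPECTED_ROWS = 418  # number of entries stated in the format description


def check_submission(csv_text, expected_rows=EXPECTED_ROWS):
    """Return a list of problems found; an empty list means the file looks valid."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header, body = rows[0], rows[1:]
    if header != EXPECTED_COLUMNS:
        problems.append(f"unexpected header: {header}")
    if len(body) != expected_rows:
        problems.append(f"expected {expected_rows} rows, found {len(body)}")
    for line_no, row in enumerate(body, start=2):
        # Each row needs exactly two fields and a binary prediction
        if len(row) != 2 or row[1] not in ("0", "1"):
            problems.append(f"line {line_no}: bad row {row}")
    return problems


# Tiny made-up file; expected_rows is lowered only for this demonstration
demo = "PassengerId,Survived\n892,0\n893,1\n"
print(check_submission(demo, expected_rows=2))
```

Returning a list of problems rather than raising on the first one lets you see every issue in a file at once.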
A Last Word on Kaggle Notebooks As we mentioned before, Kaggle Notebooks is our no-setup, customizable, Jupyter Notebooks environment with free GPUs and a huge repository ...
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
This Titanic Dataset is based on my research to correct a series of database inconsistencies in this well-known dataset.
The only purpose of this research is practical knowledge related to Data Science and the desire to understand some aspects of the Titanic accident and its impacts.
The information present here was based on the following sources:
DATA FILES
The purpose of this project is to identify how the accident affected different countries and how economic factors influenced passenger survival, given the lack of safety structure in the vessel.
To do so, it was necessary to create a new dataset with complete passenger data. To correct information such as age, official name, and country of residence, photocopies of the original passenger list, datasets, and passenger biographies were consulted.
Each resource, and the data collected or consulted from it, is indicated below.
BASIC DATASET
The following Titanic dataset, extracted from the GitHub account of the book "Efficient Amazon Machine Learning", published by Packt, was used as the base dataset for this project. https://github.com/alexisperrier/packt-aml/blob/master/ch4/original_titanic.csv
DATASET EXTENSION
In order to populate the dataset with the correct data values, the following data sources were consulted:
1. UK, RMS Titanic, Outward Passenger List, 1912. The database and the original photocopies of the passenger list were accessed in order to acquire additional information. This collection was accessed through Ancestry services but provided in association with The National Archives. https://search.ancestry.com/search/db.aspx?dbid=2970. Terms and Conditions: http://www.ancestry.com/cs/legal/termsandconditions#Usage.
2. Encyclopedia Titanica. Database with the biographies of victims. https://www.encyclopedia-titanica.org
3. Titanic - Titanic. Dataset with the biographies of victims. http://www.titanic-titanic.com/
To resolve inconsistencies in the names used in the passenger list, the following websites were consulted:
Find a Grave. Database with biographies and grave pictures with names and surnames. https://www.findagrave.com.
Wikipedia. Online encyclopedia. Used to understand how countries changed over the years, for instance the change in political geography after World War I. http://www.wikipedia.org.
I want to know in depth the impact of this terrible accident.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic Solution for Beginner's Guide’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide on 14 February 2022.
--- Dataset description provided by original source is as follows ---
The data has been split into two groups:
training set (train.csv)
test set (test.csv)
The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.
The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.
We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
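The rule behind gender_submission.csv (all and only female passengers survive) can be sketched as follows. This is an illustration, not the script that produced the original file. The column names PassengerId and Sex and the value "female" are assumptions about the CSV header, and the two input rows are made up.

```python
import csv
import io


def gender_baseline(test_csv_text):
    """Predict Survived=1 for female passengers and 0 for everyone else."""
    reader = csv.DictReader(io.StringIO(test_csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    for row in reader:
        survived = 1 if row["Sex"] == "female" else 0
        writer.writerow([row["PassengerId"], survived])
    return out.getvalue()


# Made-up rows for illustration; a real test.csv has more columns
sample_test = "PassengerId,Sex\n892,male\n893,female\n"
print(gender_baseline(sample_test))
```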
Variable Definition Key
survival Survival 0 = No, 1 = Yes
pclass Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
sex Sex
Age Age in years
sibsp # of siblings / spouses aboard the Titanic
parch # of parents / children aboard the Titanic
ticket Ticket number
fare Passenger fare
cabin Cabin number
embarked Port of Embarkation C = Cherbourg, Q = Queenstown, S = Southampton
pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5
sibsp: The dataset defines family relations in this way... Sibling = brother, sister, stepbrother, stepsister Spouse = husband, wife (mistresses and fiancés were ignored)
parch: The dataset defines family relations in this way... Parent = mother, father Child = daughter, son, stepdaughter, stepson Some children travelled only with a nanny, therefore parch=0 for them.
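The variable notes above lend themselves to simple feature engineering. The following is a small sketch under stated assumptions: family size (the passenger plus sibsp plus parch) is a commonly derived feature but is not defined by the source, and the function names are hypothetical. The age check follows the xx.5 convention from the notes and treats ages below 1 as fractional infant ages rather than estimates.

```python
def family_size(sibsp, parch):
    """Passenger plus siblings/spouses plus parents/children aboard."""
    return sibsp + parch + 1


def is_estimated_age(age):
    """True when an age of at least 1 follows the xx.5 convention for estimates."""
    return age >= 1 and age % 1 == 0.5


# Made-up values for illustration
print(family_size(sibsp=1, parch=2))
print(is_estimated_age(28.5))
print(is_estimated_age(0.42))
```

Because some children travelled only with a nanny and have parch=0, a family size of 1 does not always mean the passenger was travelling alone.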
--- Original source retains full ownership of the source dataset ---
This dataset was created by Adam Tadele
The data has been split into two groups:
training set (train.csv)
test set (test.csv)
The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.
The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.
We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
Variable Definition Key
survival Survival 0 = No, 1 = Yes
pclass Ticket class 1 = 1st, 2 = 2nd, 3 = 3rd
sex Sex
Age Age in years
sibsp # of siblings / spouses aboard the Titanic
parch # of parents / children aboard the Titanic
ticket Ticket number
fare Passenger fare
cabin Cabin number
embarked Port of Embarkation C = Cherbourg, Q = Queenstown, S = Southampton
pclass: A proxy for socio-economic status (SES) 1st = Upper 2nd = Middle 3rd = Lower
age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5
sibsp: The dataset defines family relations in this way... Sibling = brother, sister, stepbrother, stepsister Spouse = husband, wife (mistresses and fiancés were ignored)
parch: The dataset defines family relations in this way... Parent = mother, father Child = daughter, son, stepdaughter, stepson Some children travelled only with a nanny, therefore parch=0 for them.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by Rasool Shaik
Released under Apache 2.0
https://creativecommons.org/publicdomain/zero/1.0/
Public "Titanic" dataset for data exploration, preprocessing and benchmarking basic classification/regression models.
Github: https://github.com/mwaskom/seaborn-data/blob/master/titanic.csv
Playground for visualizations, preprocessing, feature engineering, model pipelining, and more.