This dataset was created by Antonio Rivero
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic Solution for Beginner's Guide’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide on 14 February 2022.
--- Dataset description provided by original source is as follows ---
The data has been split into two groups:
training set (train.csv)
test set (test.csv)
The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.
The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.
We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
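The gender-based baseline described above can be sketched as follows. This is a minimal illustration, not the competition's own code: the three sample rows are hypothetical stand-ins for test.csv, and the `Sex` column header is an assumption about how the file labels that variable.

```python
import csv
import io

# Hypothetical three-row stand-in for test.csv (the real file has more rows and columns).
# The "Sex" header spelling is an assumption.
SAMPLE_TEST_CSV = """PassengerId,Pclass,Sex,Age
1,3,male,34.5
2,3,female,47
3,2,male,62
"""

def gender_baseline(test_csv_text):
    """Predict that all and only female passengers survive."""
    reader = csv.DictReader(io.StringIO(test_csv_text))
    return [(row["PassengerId"], 1 if row["Sex"] == "female" else 0) for row in reader]

def to_submission_csv(predictions):
    """Serialize (PassengerId, Survived) pairs with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["PassengerId", "Survived"])
    writer.writerows(predictions)
    return out.getvalue()

predictions = gender_baseline(SAMPLE_TEST_CSV)
print(to_submission_csv(predictions))
```

Any real model would replace `gender_baseline` while keeping the same two-column output shape.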
Variable | Definition | Key
survival | Survival | 0 = No, 1 = Yes
pclass | Ticket class | 1 = 1st, 2 = 2nd, 3 = 3rd
sex | Sex |
Age | Age in years |
sibsp | # of siblings / spouses aboard the Titanic |
parch | # of parents / children aboard the Titanic |
ticket | Ticket number |
fare | Passenger fare |
cabin | Cabin number |
embarked | Port of Embarkation | C = Cherbourg, Q = Queenstown, S = Southampton
pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.
age: Age is fractional if less than 1. If the age is estimated, it is in the form of xx.5.
sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored).
parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.
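The variable notes above lend themselves to simple engineered features. The sketch below is illustrative only: `is_estimated_age` and `family_size` are hypothetical helper names, and a family-size feature is just one possible choice, not something the dataset prescribes.

```python
def is_estimated_age(age):
    """True when the age looks estimated: at least 1 and of the form xx.5.

    Ages below 1 are fractional for infants, so they are not treated as estimates.
    """
    return age is not None and age >= 1 and age % 1 == 0.5

def family_size(sibsp, parch):
    """Hypothetical engineered feature: the passenger plus relatives aboard."""
    return sibsp + parch + 1

print(is_estimated_age(34.5), is_estimated_age(0.75), family_size(1, 2))
```

Because some children travelled only with a nanny and have parch=0, a family size of 1 does not necessarily mean the passenger travelled alone.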
--- Original source retains full ownership of the source dataset ---
This dataset contains the original Titanic CSV files, train and test. What is special about this dataset is that I added the correct (100%) survival solution to the test data. This is only for better and faster evaluation of your own solution. Please don't upload this solution as a submission to the official competition!
Please be fair to the other Kagglers!
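Scoring your own predictions locally against such a solution file can be sketched as below. This is a minimal example under stated assumptions: both inputs are taken to be dictionaries mapping a PassengerId to 0 or 1, and the four sample passengers are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of passengers whose predicted Survived value matches the solution."""
    if not actual:
        raise ValueError("solution is empty")
    if set(predicted) != set(actual):
        raise ValueError("prediction and solution must cover the same PassengerIds")
    correct = sum(1 for pid, label in actual.items() if predicted[pid] == label)
    return correct / len(actual)

# Hypothetical data: PassengerId -> Survived
my_predictions = {1: 0, 2: 1, 3: 0, 4: 1}
solution = {1: 0, 2: 1, 3: 1, 4: 1}
print(accuracy(my_predictions, solution))
```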
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Titanic Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yasserh/titanic-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone on board, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).
This dataset is sourced from Kaggle: https://www.kaggle.com/c/titanic/data.
--- Original source retains full ownership of the source dataset ---
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by SONER KURT
Titanic - Machine Learning from Disaster (Kaggle competition). This dataset contains datafiles for the Notebook Titanic/Kaggle -Full analysis 🕵🏽, by Fernando Meneses. It includes: training and testing datasets, the solution file, Leaderboard statistics and pre-trained results.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description: The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone on board, resulting in the death of 1502 out of 2224 passengers and crew. While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others. In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).
Objective:
Survival Prediction: To build a logistic regression model that accurately predicts the survival of passengers based on features such as age, gender, passenger class, and number of siblings/spouses aboard.
Data Cleaning and Preprocessing: To perform data cleaning by handling missing values, removing unnecessary columns, and encoding categorical variables to prepare the dataset for analysis.
Exploratory Data Analysis (EDA): To conduct a thorough exploratory data analysis to visualize survival rates and identify patterns based on various factors like gender, passenger class, and embarked location.
Feature Importance Analysis: To analyze the correlation between different features and their impact on survival rates, identifying which factors are the most significant predictors of survival.
Model Evaluation: To evaluate the performance of the logistic regression model using accuracy scores and classification reports, ensuring that the model generalizes well to unseen data.
ROC Curve Analysis: To create a ROC curve to assess the trade-off between the true positive rate and false positive rate, providing insights into the model's ability to distinguish between survivors and non-survivors.
Insights and Recommendations: To derive insights from the analysis that could inform future safety measures or policies related to passenger safety in maritime travel.
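The ROC curve objective above can be illustrated without any particular library. The sketch below computes the false positive rate and true positive rate at each score threshold and the area under the resulting curve; the labels and scores are hypothetical, and in practice the scores would be the survival probabilities predicted by the fitted model.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per distinct threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one survivor and one non-survivor")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; a passenger is predicted to survive
    # when their score is at or above the threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical labels (1 = survived) and model scores
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
curve = roc_points(labels, scores)
print(curve)
print(auc(curve))
```

An area close to 1 indicates that survivors tend to receive higher scores than non-survivors, while 0.5 corresponds to random ranking.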
This dataset was created by ramarajud1986
The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).
This is an attempt to learn prediction from the given training and test sets.
This is the sample from the Kaggle learners section.
Thanks to Kaggle for this sample dataset and for allowing us to use it for learning.
Learn to use data to solve problems and provide solutions.
Description 👋🛳️ Ahoy, welcome to Kaggle! You’re in the right place. This is the legendary Titanic ML competition – the best, first challenge for you to dive into ML competitions and familiarize yourself with how the Kaggle platform works.
If you want to talk with other users about this competition, come join our Discord! We've got channels for competitions, job postings and career discussions, resources, and socializing with your fellow data scientists. Follow the link here: https://discord.gg/kaggle
The competition is simple: use machine learning to create a model that predicts which passengers survived the Titanic shipwreck.
Read on or watch the video below to explore more details. Once you’re ready to start competing, click on the "Join Competition" button to create an account and gain access to the competition data. Then check out Alexis Cook’s Titanic Tutorial, which walks you through, step by step, how to make your first submission!
The Challenge: The sinking of the Titanic is one of the most infamous shipwrecks in history.
On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.
While there was some element of luck involved in surviving, it seems some groups of people were more likely to survive than others.
In this challenge, we ask you to build a predictive model that answers the question: “what sorts of people were more likely to survive?” using passenger data (i.e., name, age, gender, socio-economic class, etc.).
Recommended Tutorial: We highly recommend Alexis Cook’s Titanic Tutorial, which walks you through making your very first submission step by step, and this starter notebook to get started.
How Kaggle’s Competitions Work:
Join the Competition: Read about the challenge description, accept the Competition Rules and gain access to the competition dataset.
Get to Work: Download the data, build models on it locally or on Kaggle Notebooks (our no-setup, customizable Jupyter Notebooks environment with free GPUs) and generate a prediction file.
Make a Submission: Upload your prediction as a submission on Kaggle and receive an accuracy score.
Check the Leaderboard: See how your model ranks against other Kagglers on our leaderboard.
Improve Your Score: Check out the discussion forum to find lots of tutorials and insights from other competitors.
Kaggle Lingo Video: You may run into unfamiliar lingo as you dig into the Kaggle discussion forums and public notebooks. Check out Dr. Rachael Tatman’s video on Kaggle Lingo to get up to speed!
What Data Will I Use in This Competition?
In this competition, you’ll gain access to two similar datasets that include passenger information like name, age, gender, socio-economic class, etc. One dataset is titled train.csv and the other is titled test.csv.
Train.csv will contain the details of a subset of the passengers on board (891 to be exact) and importantly, will reveal whether they survived or not, also known as the “ground truth”.
The test.csv dataset contains similar information but does not disclose the “ground truth” for each passenger. It’s your job to predict these outcomes.
Using the patterns you find in the train.csv data, predict whether the other 418 passengers on board (found in test.csv) survived.
Check out the “Data” tab to explore the datasets even further. Once you feel you’ve created a competitive model, submit it to Kaggle to see where your model stands on our leaderboard against other Kagglers.
How to Submit your Prediction to Kaggle:
Once you’re ready to make a submission and get on the leaderboard:
Click on the “Submit Predictions” button
Upload a CSV file in the submission file format. You’re able to submit 10 submissions a day.
Submission File Format: You should submit a csv file with exactly 418 entries plus a header row. Your submission will show an error if you have extra columns (beyond PassengerId and Survived) or rows.
The file should have exactly 2 columns:
PassengerId (sorted in any order)
Survived (contains your binary predictions: 1 for survived, 0 for deceased)
Got it! I’m ready to get started. Where do I get help if I need it?
For Competition Help: Titanic Discussion Forum
Kaggle doesn’t have a dedicated team to help troubleshoot your code so you’ll typically find that you receive a response more quickly by asking your question in the appropriate forum. The forums are full of useful information on the data, metric, and different approaches. We encourage you to use the forums often. If you share your knowledge, you'll find that others will share a lot in turn!
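The submission file format described above (418 entries plus a header row, exactly two columns, binary predictions) can be checked locally before uploading. The following is a rough sketch rather than Kaggle's own validator: `check_submission` is a hypothetical helper, and requiring the header in the order PassengerId, Survived is an assumption.

```python
import csv
import io

EXPECTED_ROWS = 418  # entries required by the submission format, excluding the header

def check_submission(csv_text, expected_rows=EXPECTED_ROWS):
    """Return a list of problems found; an empty list means the file looks valid."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    problems = []
    header, entries = rows[0], rows[1:]
    # Assumption: the header must list the columns in this order.
    if header != ["PassengerId", "Survived"]:
        problems.append(f"unexpected header: {header}")
    if len(entries) != expected_rows:
        problems.append(f"expected {expected_rows} entries, found {len(entries)}")
    for line_number, entry in enumerate(entries, start=2):
        if len(entry) != 2:
            problems.append(f"line {line_number}: expected 2 columns, found {len(entry)}")
        elif entry[1] not in ("0", "1"):
            problems.append(f"line {line_number}: Survived must be 0 or 1, got {entry[1]!r}")
    return problems

# Hypothetical submissions: IDs 1..418 are placeholders, not the real PassengerIds.
good = "PassengerId,Survived\n" + "".join(f"{i},0\n" for i in range(1, 419))
bad = "PassengerId,Survived\n1,2\n"
print(check_submission(good))
print(check_submission(bad))
```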
A Last Word on Kaggle Notebooks As we mentioned before, Kaggle Notebooks is our no-setup, customizable, Jupyter Notebooks environment with free GPUs and a huge repository ...
This chart shows the fifteen French films with the highest number of cinema admissions from 1945 to 2024. As of 22 May 2024, Dany Boon, with Bienvenue chez les Ch’tis (2008), still sat at the top of the ranking with more than 20.4 million admissions, nearly one million more than the runner-up, Intouchables. French audiences appear to turn out in large numbers for releases from major franchises such as Astérix et Obélix and from the big names of French cinema.
French cinema still holds out against the invader
Bienvenue chez les Ch’tis and Intouchables are not only the biggest hits of French cinema in France but also among the biggest hits of any film in France, placing just behind James Cameron's Titanic in number of admissions (21.8 million). In addition to these two films, two other French films appear in the ranking of the 15 highest admissions in France: Astérix et Obélix : mission Cléopâtre (14.4 million) and Les Visiteurs (13.67 million). Since 2013, the market share of French films has gradually caught up with that of American films at the box office in France. In 2020 and 2022, French films even generated more admissions than American films. It remains to be seen whether this upturn for the French film industry can continue in the coming years, particularly in the face of ever-growing competition from streaming services.
Video streaming ever more popular in France
In the second quarter of 2022, Netflix alone had more than 220 million subscribers worldwide. Between Netflix, Disney+, Amazon Prime Video, OCS and MyCanal, there is no shortage of video streaming services in France, all the more so as other platforms such as HBO or Paramount are preparing to enter the French market, and these services supply consumers with a steady flow of films they can watch at home.
Although the successive lockdowns played a major role in this decline, there was a drastic drop in cinema admissions in France in 2022, due in particular to the boom in streaming services. It remains to be seen whether, under pressure from the platforms, the French state will be able to keep the media chronology (chronologie des médias) in place. The media chronology is a regulation governing the pace of a film's release and distribution, notably the delay between its theatrical run and its availability on streaming services. Recently, Disney has been particularly critical of this regulation, which prevents it from releasing a film simultaneously in cinemas and on its platform.