Titanic Survival Exploration: a manual implementation of a decision tree. Open the file "titanic_survival_exploration.ipynb" to see the analysis.
This project was completed as part of Udacity's Machine Learning Engineer Nanodegree. It started from a template developed by Udacity, which I completed with my own code to uncover insights in the data and answer the project questions. I also edited the notebook so that it reads more linearly.
The CSV file contains a sample of the RMS Titanic passenger data, composed of 891 rows and 11 columns:
- Survived: Outcome of survival (0 = No; 1 = Yes)
- Pclass: Socio-economic class (1 = Upper class; 2 = Middle class; 3 = Lower class)
- Name: Name of passenger
- Sex: Sex of the passenger
- Age: Age of the passenger (Some entries contain NaN)
- SibSp: Number of siblings and spouses of the passenger aboard
- Parch: Number of parents and children of the passenger aboard
- Ticket: Ticket number of the passenger
- Fare: Fare paid by the passenger
- Cabin: Cabin number of the passenger (Some entries contain NaN)
- Embarked: Port of embarkation of the passenger (C = Cherbourg; Q = Queenstown; S = Southampton)
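To make the idea of a manual decision tree concrete, here is a minimal sketch that uses the column names listed above. It is not the code from the notebook: the split on Sex followed by Age, the age threshold of 10, and the helper names `parse_age`, `predict`, and `accuracy` are assumptions chosen for illustration, and the sample rows are made up rather than taken from the real data.

```python
import csv
import io

# Made-up rows for illustration only; they are NOT taken from the real data.
# The fourth row leaves Age empty to mimic a missing (NaN) entry.
SAMPLE_CSV = """Survived,Pclass,Sex,Age,SibSp,Parch
1,1,female,30,0,0
0,3,male,40,1,0
1,2,male,5,0,2
0,3,male,,0,0
0,3,female,25,0,1
"""


def parse_age(raw):
    """Return the age as a float, or None when the entry is missing."""
    return None if raw in ("", "NaN", "nan") else float(raw)


def predict(passenger):
    """Hand-written two-level tree: split on Sex first, then on Age.

    The rules and the threshold of 10 are hypothetical, not the notebook's.
    """
    if passenger["Sex"] == "female":
        return 1
    age = parse_age(passenger["Age"])
    if age is not None and age < 10:
        return 1
    return 0


def accuracy(rows):
    """Fraction of rows where the prediction matches the Survived column."""
    correct = sum(predict(row) == int(row["Survived"]) for row in rows)
    return correct / len(rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
predictions = [predict(row) for row in rows]
score = accuracy(rows)
print(predictions, score)
```

Passengers with a missing Age fall through to the default prediction here. That is one simple way to handle the NaN entries, not necessarily the one used in the analysis.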
The visualization code is a Python module made available by Udacity. It provides visualizations based on the columns the user wants to analyze. I made minor changes so that it runs in Python 3.