Machine Learning Using Titanic Data
Why not try out machine learning along with data analysis?
Machine learning is a branch of artificial intelligence where we build models that learn from data in order to make predictions.
That is what I tried out, building on the analysis I did on the Titanic data set: a model that can predict the chance of survival for any passenger based on data about them.
Let us begin creating our model. What would the first step be?
Understanding your data set
The data set I chose is “Titanic: Machine Learning from Disaster” from Kaggle, which contains two separate data files: train and test.
Analyzing the train and test data first is essential; I explain that analysis in detail in my article “Investigating Titanic Data”.
Looking at the two files, the test data is missing the ‘Survived’ column, since that is the value the model has to predict.
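One quick way to confirm this is to compare the column names of the two tables. The snippet below is only a minimal sketch: the two small frames are hand-made stand-ins for the real files, and in practice they would be loaded with something like pd.read_csv("train.csv") and pd.read_csv("test.csv") (those file names are an assumption).

```python
import pandas as pd

# Hand-made stand-ins for the Kaggle files (values are illustrative only).
# In practice: train = pd.read_csv("train.csv"), test = pd.read_csv("test.csv"),
# where the file names are assumed.
train = pd.DataFrame({
    "PassengerId": [1, 2],
    "Survived": [0, 1],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
})
test = pd.DataFrame({
    "PassengerId": [892, 893],
    "Pclass": [3, 3],
    "Sex": ["male", "female"],
})

# Columns that exist in train but not in test
missing_in_test = [col for col in train.columns if col not in test.columns]
print(missing_in_test)  # ['Survived']
```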
To predict survival, we can use the features Sex, Age, Pclass, Fare, Embarked, SibSp, and Parch, which the analysis showed to affect survival rates in the train data.
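As a rough sketch of what this selection could look like with pandas, the snippet below separates the chosen features from the ‘Survived’ target. The helper name split_features_target and the sample rows are hypothetical and only for illustration; the column names are the ones listed above.

```python
import pandas as pd

# Feature and target column names taken from the text above
FEATURES = ["Sex", "Age", "Pclass", "Fare", "Embarked", "SibSp", "Parch"]
TARGET = "Survived"


def split_features_target(df):
    """Return (X, y): the selected feature columns and the target.

    y is None when the frame has no 'Survived' column (the test data).
    Hypothetical helper, not part of any library.
    """
    X = df[FEATURES].copy()
    y = df[TARGET].copy() if TARGET in df.columns else None
    return X, y


# Illustrative rows standing in for the train data (made-up sample values)
sample_train = pd.DataFrame({
    "Name": ["Passenger A", "Passenger B"],
    "Survived": [0, 1],
    "Sex": ["male", "female"],
    "Age": [22.0, 38.0],
    "Pclass": [3, 1],
    "Fare": [7.25, 71.28],
    "Embarked": ["S", "C"],
    "SibSp": [1, 1],
    "Parch": [0, 0],
})

X_train, y_train = split_features_target(sample_train)

# The test data has no target column, so y comes back as None
X_test, y_test = split_features_target(sample_train.drop(columns=[TARGET]))
```

Keeping the feature list in one place means the same columns are picked from both the train and the test data, so the model sees the same inputs when it is trained and when it predicts.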