Machine Learning Using Titanic Data

Abinaya Jayaprakash
Published in The Startup
4 min read · Oct 14, 2020

Why not try out machine learning along with data analysis?

Machine learning is a branch of artificial intelligence in which we build models that learn from data in order to make predictions.

This is what I tried out, building on the analysis I did on the Titanic data set: a model that can predict the chance of survival for any passenger based on data about them.

Let us begin creating our model. What would the first step be?

Understanding your data set

The data set I chose is “Titanic: Machine Learning from Disaster” from Kaggle, which contains separate train and test data files.

Analyzing the train and test data is an essential first step, and I explain it in detail in my article “Investigating Titanic Data”.

As observed in the data, the test data is missing the ‘Survived’ column.
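A quick way to confirm this difference is to compare the column headers of the two files. The snippet below is a minimal sketch that uses only Python's standard library. The rows in it are made up for illustration (they follow the Kaggle column layout but are not real passenger records), and `read_columns` is a hypothetical helper name. With the real files you would read `train.csv` and `test.csv` from disk instead of the inline strings.

```python
import csv
import io

# Made-up sample rows in the Kaggle column layout (not real passenger records).
TRAIN_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
"""
TEST_CSV = """PassengerId,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
3,3,Passenger C,male,34.5,0,0,T3,7.83,,Q
"""


def read_columns(text):
    """Return the header names of a CSV document given as a string."""
    return csv.DictReader(io.StringIO(text)).fieldnames


train_cols = read_columns(TRAIN_CSV)
test_cols = read_columns(TEST_CSV)

# Columns present in the train data but absent from the test data,
# kept in their original order.
missing_in_test = [c for c in train_cols if c not in test_cols]
print(missing_in_test)
```

On this sample, the only column that appears in the train header but not in the test header is the one the model has to predict.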

Since ‘Survived’ is the value we want to predict, we can use the features Sex, Age, Pclass, Fare, Embarked, SibSp, and Parch, which affected survival rates in the train data as observed in the analysis.
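In code, this selection amounts to splitting each training row into the chosen features and the target. Here is a minimal sketch using only the standard library. The sample rows are made up for illustration, and `split_features_and_target` is a hypothetical helper, not something from a specific library.

```python
import csv
import io

# The features named above, plus the column we want to predict.
FEATURES = ["Sex", "Age", "Pclass", "Fare", "Embarked", "SibSp", "Parch"]
TARGET = "Survived"

# Made-up sample rows in the Kaggle column layout (not real passenger records).
SAMPLE_TRAIN_CSV = """PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,Passenger A,male,22,1,0,T1,7.25,,S
2,1,1,Passenger B,female,38,1,0,T2,71.28,C85,C
"""


def split_features_and_target(rows):
    """Split rows into feature dicts (X) and integer survival labels (y)."""
    X = [{name: row[name] for name in FEATURES} for row in rows]
    y = [int(row[TARGET]) for row in rows]
    return X, y


rows = list(csv.DictReader(io.StringIO(SAMPLE_TRAIN_CSV)))
X, y = split_features_and_target(rows)

print(X[0])  # only the seven selected features remain
print(y)     # one survival label per training row
```

The feature values are still strings at this point. Before training a model, text columns such as Sex and Embarked would need to be encoded as numbers, and missing values would need to be handled.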

Abinaya Jayaprakash

Sri Lankan living in Berlin. Graduate Trainee - Technology at Deutsche Bank. Interested in Data Science & Machine Learning.