Simple breakdown of the steps involved in Machine Learning!

Harinath Selvaraj
Published in coding&stuff
3 min read · Sep 19, 2018

By now, most readers have probably heard the term “Machine Learning”. This post covers the overall process involved in a machine learning project.

From detecting skin cancer to sorting cucumbers and spotting escalators in need of repair, machine learning has given computer systems entirely new abilities. But how does it really work under the hood? There are a few steps involved in machine learning, and we’ll go through them one by one.

Step 1 — Gathering the data

This step is very important because the quality and quantity of the data you gather directly determine how good your predictive model can be. For example, if we are trying to apply machine learning to improve traffic in Los Angeles, we need to collect some traffic data, i.e. the amount of traffic or the waiting time, along with the date and time of each recording. This is called the training data.

Step 2 — Data preparation

In this step, we load our data into a suitable place and prepare it for use in training. We first put all the data together and then randomise its order, since we don’t want the order of the data to affect what the model learns; the order is not part of what we are trying to predict in this use case. We also need to split the data into two parts: the first part, which is the majority of the data, is used for training our model, and the second part is used for evaluating the trained model’s performance. Sometimes the data we collect also needs other adjustments, such as removing duplicates, correcting errors and normalisation. All of these processes happen in the data preparation step.
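To make the shuffle-and-split idea concrete, here is a minimal Python sketch using only the standard library. The function name shuffle_and_split, the 80% training fraction, the seed and the toy traffic records are all assumptions made for illustration, not part of any particular library or dataset.

```python
import random

def shuffle_and_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of rows, then split it into training and evaluation parts."""
    shuffled = list(rows)                  # copy so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # a fixed seed keeps the split reproducible
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical traffic records: (hour of day, waiting time in minutes)
records = [(hour, 5 + 2 * hour) for hour in range(10)]
train_rows, eval_rows = shuffle_and_split(records)
print(len(train_rows), len(eval_rows))
```

With ten records and a fraction of 0.8, eight rows go to training and two are held back for evaluation. Cleaning steps such as removing duplicates would normally happen before the split.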

Step 3 — Choosing a model

Researchers and data scientists have created many models over the years. Some are very well suited to image data, others to sequences such as music, some to numerical data, and others to text-based data. We choose the one that suits the kind of data we have.

Step 4 — Training

In this step, we use our data to incrementally improve our model’s ability to predict the correct result. The model is described by a set of values called weights and biases, and it uses those values to predict the output. At first it may produce poor results, but by comparing its predictions with the actual values in the training data, we can adjust the weights and biases so that the results improve. Each iteration of updating the weights and biases is called one training step. This can be repeated until the results look satisfactory.
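As a rough illustration of a training loop, here is a minimal Python sketch. It assumes the simplest possible model, a straight line with one weight and one bias, and adjusts them by gradient descent on the mean squared error. The function name train, the toy data and the learning_rate and steps values are made up for this example; real models have many more weights and usually rely on a library such as TensorFlow.

```python
def train(xs, ys, learning_rate=0.05, steps=1000):
    """Fit y ~ weight * x + bias by gradient descent on mean squared error."""
    weight, bias = 0.0, 0.0           # start from a poor initial guess
    n = len(xs)
    for _ in range(steps):            # each pass through this loop is one training step
        # Compare the current predictions with the actual training values
        errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to weight and bias
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Nudge both values in the direction that reduces the error
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias

# Toy training data generated from y = 2x + 1 (an assumption for this sketch)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
weight, bias = train(xs, ys)
print(round(weight, 2), round(bias, 2))
```

On this toy data the learned weight and bias should end up very close to 2 and 1, the values used to generate it.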

Step 5 — Evaluation

Evaluation allows us to test our model against data that has never been used for training. The resulting metric lets us see how the model may perform on data it has not yet seen, which is meant to be representative of how the model might perform in the real world. The training/evaluation split should be roughly 80/20 or 75/25, though much of this depends on the size of the original dataset.
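One simple way to put a number on this is to measure the average squared error on the held-out rows. The Python sketch below assumes a straight-line model with a single weight and bias; the parameter values and the evaluation rows are hypothetical, and mean squared error is just one of several possible metrics.

```python
def mean_squared_error(weight, bias, rows):
    """Average squared gap between predictions and the true values in rows."""
    return sum((weight * x + bias - y) ** 2 for x, y in rows) / len(rows)

# Hypothetical trained parameters and held-out rows of (input, true output)
weight, bias = 2.0, 1.0
eval_rows = [(5.0, 11.0), (6.0, 12.0)]
eval_error = mean_squared_error(weight, bias, eval_rows)
print(eval_error)  # prints 0.5
```

The first row is predicted exactly and the second is off by 1, so the average squared error is 0.5. A lower value on data the model has never seen suggests it is more likely to hold up in the real world.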

Step 6 — Hyper Parameter Tuning

During training, we may have kept a few parameters constant. These are called hyperparameters, and this is the time to try different values for them and see whether the accuracy improves. One example is the learning rate, which controls how far we adjust the weights and biases at each training step.
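A basic way to tune a value like the learning rate is to train once per candidate and keep the one with the lowest evaluation error. The Python sketch below does this for a straight-line model trained by gradient descent. The candidate rates, the step count, the toy data and the helper name eval_error_for are all assumptions chosen for illustration.

```python
# Toy data generated from y = 2x + 1, split into training and evaluation rows
train_rows = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]
eval_rows = [(5.0, 11.0), (6.0, 13.0)]

def eval_error_for(learning_rate, steps=200):
    """Train with the given learning rate, then return the error on eval_rows."""
    weight, bias = 0.0, 0.0
    n = len(train_rows)
    for _ in range(steps):
        errors = [(weight * x + bias) - y for x, y in train_rows]
        grad_w = 2 * sum(e * x for e, (x, _) in zip(errors, train_rows)) / n
        grad_b = 2 * sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    # Mean squared error on data that was not used for training
    return sum((weight * x + bias - y) ** 2 for x, y in eval_rows) / len(eval_rows)

candidates = [0.0001, 0.001, 0.01, 0.05, 0.2]
results = {rate: eval_error_for(rate) for rate in candidates}
best_rate = min(results, key=results.get)
print(best_rate)
```

In this toy setup the very small rates learn too slowly within 200 steps, while 0.2 overshoots so badly that the error grows instead of shrinking. That trade-off is why the learning rate is worth tuning.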

Step 7 — Prediction

This is the final step, in which we use the trained model to make predictions on new data.

Try working on machine learning exercises at tensorflow.org. Good luck! 😃
