Chinmay S Yalameli
Intel Student Ambassadors
Aug 1, 2019


Machine Learning made Easy — Linear Regression: An Intuition

What are machine learning and linear regression for the layman?

Machine Learning: a fictional story. Once there was a doctor. He would look at a person and predict whether they had low hemoglobin (the oxygen-carrying protein in red blood cells). Almost every time, his prediction would come true when the patient's blood was tested.

How would he predict? Let's assume a pale face and tiredness are symptoms of low Hb (hemoglobin), and that these symptoms are more likely to be caused by low Hb when the person's weight is low. The doctor, "based upon his experience", will "predict" after looking at the person.

What if we could make a computer learn in the same way and make such predictions, assuming we have numeric, quantified variables such as paleness and tiredness?

Mathematical Approach

To build this intuition mathematically, let's start with the graph below: a simple case study of the salary of various employees against their years of experience in a particular field.

Let's assume we have plotted these points on the graph, each showing an employee's salary for a given number of years of experience. We want a single straight line that represents all of these cases. What if we find the line whose total distance from the dots in the graph is minimal? That would be the 'best-fit line', shown in red in the picture above. Our objective is to draw that red line. So why do we need the graph at all? Let's dig deeper into the intuition.
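To make the picture concrete, here is a minimal sketch in Python, using made-up salary figures (in thousands), that plots such points and draws a best-fit line through them with NumPy and Matplotlib:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data: years of experience vs. salary (in K)
years = np.array([1, 2, 3, 4, 5, 6, 7, 8])
salary = np.array([39, 52, 58, 71, 82, 89, 101, 112])

# Fit a degree-1 polynomial (a straight line): returns slope b1 and intercept b0
b1, b0 = np.polyfit(years, salary, 1)

plt.scatter(years, salary, label="employees")
plt.plot(years, b0 + b1 * years, color="red", label="best-fit line")
plt.xlabel("Years of experience")
plt.ylabel("Salary (K)")
plt.legend()
plt.show()
```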

Assume the equation below is used to fit a straight line to our graph. Anyone who has studied statistics will recognize it as the basic straight-line equation:

y = b0 + b1·x

Here b0 is a constant (the intercept), which is fixed, and b1 is simply a coefficient (the slope) that determines how quickly the line rises.

So let's see what the base b0 means in our example. If every fresher in the company gets a starting salary of 30K, we can set 30K as the base value.

Now let's take the salary of an employee with one year of experience, so we can see how the plot changes. We can observe an increase in pay of 10K along the green line. So if we have an experience value for which no salary is specified, say nine years, we can easily predict it.
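As a quick illustration, here is a tiny Python sketch using the numbers from this example (a hypothetical base salary of 30K and an increase of 10K per year) to predict the salary for nine years of experience:

```python
b0 = 30  # intercept: salary (in K) of a fresher with 0 years of experience
b1 = 10  # slope: increase in salary (in K) per extra year of experience

def predict_salary(years_of_experience):
    """Predict salary (in K) with the straight line y = b0 + b1 * x."""
    return b0 + b1 * years_of_experience

# Nine years of experience is not in our data, but the line still gives a value.
print(predict_salary(9))  # 30 + 10 * 9 = 120 (K)
```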

Finally, our objective is to minimize the distance between the actual values and the values predicted by the line, as shown in the graph below. The lower this cost, the better our chances of a good prediction.

Here y_i is the actual value, while ŷ is the value predicted by our line. The equation below is the cost function used to measure the distance between the actual and predicted values:

J = (1/2m) · Σ (h(x_i) − y_i)²

In this formula, h(x_i) is nothing but the value our line predicts for x_i. We take the sum of squared differences between predicted and actual values and divide it by 2m, where m is the total number of rows (training examples) in our data set, to get a meaningful value.
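Here is a small Python sketch of that cost calculation, using hypothetical salary data and the 30K base / 10K-per-year line from earlier; the figures are illustrative only:

```python
import numpy as np

years = np.array([1, 2, 3, 4, 5])
actual_salary = np.array([39, 52, 58, 71, 82])   # y_i: actual values (made up, in K)

def cost(b0, b1, x, y):
    m = len(y)                   # m = number of rows (training examples)
    predicted = b0 + b1 * x      # h(x_i): values predicted by the line
    return np.sum((predicted - y) ** 2) / (2 * m)

print(cost(30, 10, years, actual_salary))
```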

Out of the various cost values obtained for different lines, the minimum is chosen, and the corresponding line is selected as the best-fit line.
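One crude way to see this selection in action is to try several candidate slopes and keep the one with the lowest cost. The sketch below does exactly that with the same made-up data; in practice, gradient descent or a closed-form solution finds the minimum far more efficiently:

```python
import numpy as np

years = np.array([1, 2, 3, 4, 5])
actual_salary = np.array([39, 52, 58, 71, 82])   # hypothetical data, in K

def cost(b0, b1, x, y):
    m = len(y)
    predicted = b0 + b1 * x
    return np.sum((predicted - y) ** 2) / (2 * m)

candidate_slopes = np.arange(5, 15, 0.5)          # candidate values for b1
costs = [cost(30, b1, years, actual_salary) for b1 in candidate_slopes]

best_b1 = candidate_slopes[int(np.argmin(costs))]
print("best-fit slope:", best_b1, "with cost:", min(costs))
```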

For further updates and a practical approach, you can refer to my new article Machine Learning made Easy — Linear Regression: Code Concept (Python).

Credits for the images and content in this article:

1. Machine Learning A-Z course on Udemy.

2. Machine Learning course on Coursera by Andrew Ng.

3. https://towardsdatascience.com/linear-regression-with-example-8daf6205bd49
