Mayank Arora
Dec 18, 2018 · 3 min read

In this series of articles I’ll try to give you an intuition for regression, probably the first thing that you learn while learning Machine Learning. I’ll start from simple linear regression and move on to multivariate, polynomial, ridge, lasso, logistic, etc. These are also essential in building neural nets. I won’t be discussing a lot of math (I wish I could, but that’s a lot of symbols); I’ll just focus on the Python code which you might actually use while working on a project. So let’s get started.

Linear Regression:

It’s a type of regression that models a linear relationship between ‘y’ (our output, the dependent variable) and an independent variable known as X.

Long story short, X here is a collection of data points. Now we have to, putting it indelicately, draw a line that goes through this data, and that ‘line’ will help us predict the value of ‘y’ whenever a new data point comes in. As we can see here, we have only one feature. To draw this line (remember y=mx+c), we need two parameters, ‘m’ and ‘c’. In ML language they are known as model parameters and are denoted by a theta matrix. θ0 and θ1 are ‘c’ and ‘m’ respectively. Theta-nought controls the intercept (where the line cuts the y axis) and theta-one controls the slope of the line.
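To make the roles of theta-nought and theta-one concrete, here is a minimal sketch of how a line turns a new data point into a prediction. The function name `predict` and the parameter values are made up for illustration; they are not fitted from any data.

```python
import numpy as np

def predict(x, theta0, theta1):
    """Return y = theta0 + theta1 * x (intercept plus slope times x)."""
    return theta0 + theta1 * np.asarray(x, dtype=float)

# Hypothetical parameters: intercept c = 2, slope m = 3.
x_new = np.array([0.0, 1.0, 2.0])
y_pred = predict(x_new, theta0=2.0, theta1=3.0)
print(y_pred)  # [2. 5. 8.]
```

Changing `theta0` shifts the whole line up or down, while changing `theta1` tilts it.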

We need to find that red line

Now to find a line that fits the data, we need to find the thetas. This can be done by using gradient descent or by using the normal equation. These two are topics for another article, but in short, there’s a cost function involved. High error = high cost. (Think of the error as the line drawn vertically, NOT at a right angle to the fitted line, down/up from the data point to the line.) For example, if the line is far away from the data points, like the line y=0*X+2, the error is way too high. As we move the line closer to the data by tuning the parameters theta-nought and theta-one, the error reaches a minimum at some point. As we tune past that point, the error increases again. That point of minimum error, which in turn gives us the best values of theta-nought and theta-one, is found by using an algorithm known as gradient descent.
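The idea of a cost that shrinks as the thetas are tuned can be sketched in a few lines of numpy. This is only an illustration under some assumptions: the cost is the halved mean squared vertical error, the toy data lies exactly on the line y = 2 + 3x, and the learning rate and iteration count are hand-picked for this toy data.

```python
import numpy as np

def cost(x, y, theta0, theta1):
    """Halved mean of the squared vertical distances from the line to the data."""
    residuals = (theta0 + theta1 * x) - y
    return np.mean(residuals ** 2) / 2.0

def gradient_descent(x, y, learning_rate=0.05, n_iters=5000):
    """Start from a flat line at zero and step downhill on the cost."""
    theta0, theta1 = 0.0, 0.0
    for _ in range(n_iters):
        residuals = (theta0 + theta1 * x) - y
        grad0 = np.mean(residuals)       # slope of the cost w.r.t. theta0
        grad1 = np.mean(residuals * x)   # slope of the cost w.r.t. theta1
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1

# Toy data (an assumption): points exactly on y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

theta0, theta1 = gradient_descent(x, y)
print("cost of the flat line y = 0*x + 2:", cost(x, y, 2.0, 0.0))
print("cost after gradient descent:", cost(x, y, theta0, theta1))
print("theta0, theta1:", theta0, theta1)
```

On this data the flat line has a large cost, and the fitted thetas should end up very close to 2 and 3. On real data the learning rate usually needs some trial and error.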

Or you can straight away use a math equation to find theta-nought and theta-one. This is the normal equation.

The middle equation is the Normal equation.
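In its usual textbook form the normal equation reads θ = (XᵀX)⁻¹Xᵀy, where X has a column of ones added for the intercept. Below is a small numpy sketch of that formula; the function name `normal_equation` and the toy data are assumptions for illustration. It solves the linear system instead of computing the inverse explicitly, which gives the same answer and is generally better behaved numerically.

```python
import numpy as np

def normal_equation(x, y):
    """Solve (X^T X) theta = X^T y for theta = [theta0, theta1]."""
    # Column of ones for the intercept, then the single feature.
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ X, X.T @ y)

# Toy data (an assumption): points exactly on y = 2 + 3x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x

theta0, theta1 = normal_equation(x, y)
print("theta0 (intercept):", theta0)
print("theta1 (slope):", theta1)
```

No learning rate and no iterations are needed here, which is why this route is handy when there are only a few features.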

Here’s the Python code. You need to know how to use numpy, pandas, and matplotlib to understand how this code works.
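As a rough end-to-end sketch with those three libraries (not necessarily the code originally embedded here), the example below builds a small synthetic dataset, fits a line, and plots it in red over the data. The data, the column names, and the output file name are all made up for illustration, and `np.polyfit` with degree 1 is used as a convenient least-squares fit.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # draw off-screen so the script also runs without a display
import matplotlib.pyplot as plt

# Synthetic data (an assumption): y = 2 + 3x plus some random noise.
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": np.linspace(0.0, 10.0, 50)})
df["y"] = 2.0 + 3.0 * df["x"] + rng.normal(0.0, 1.0, size=len(df))

# A degree-1 polyfit returns [slope, intercept], i.e. [theta1, theta0].
theta1, theta0 = np.polyfit(df["x"].to_numpy(), df["y"].to_numpy(), deg=1)
df["y_pred"] = theta0 + theta1 * df["x"]

# Scatter the data and draw the fitted line in red.
fig, ax = plt.subplots()
ax.scatter(df["x"], df["y"], label="data")
ax.plot(df["x"], df["y_pred"], color="red", label="fitted line")
ax.set_xlabel("X")
ax.set_ylabel("y")
ax.legend()
fig.savefig("linear_regression.png")  # hypothetical output file name

print("theta0 (intercept):", theta0)
print("theta1 (slope):", theta1)
```

With real data you would replace the synthetic `DataFrame` with something like `pd.read_csv(...)` and pick the feature and target columns yourself.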

Hope you got some of the intuition behind Linear Regression. It’s quite hard to understand it without math, so I would recommend the Stanford ML course on Coursera. Andrew Ng is a great teacher.

Thanks a lot :)


The Tech Blog

Written by Mayank Arora: Machine Learning nerd. University level tennis player. Cogito, ergo sum.


