Regressions

Mayank Arora
Mackweb
3 min read · Dec 18, 2018


In this series of articles I'll try to give you an intuition about regressions, probably the first thing you learn when starting out with Machine Learning. I'll start with simple linear regression and work up to multivariate, polynomial, ridge, lasso, logistic, and so on. These are also essential building blocks for neural nets. I won't be discussing a lot of math (I wish I could, but that's a lot of symbols); I'll just focus on the Python code you might actually use while working on a project. So let's get started.

Linear Regression:

It's a type of regression that models a linear relationship between 'y' (our output, the dependent variable) and an input variable known as X (the independent variable, or feature).

Long story short, X here is a collection of data points. Now we have to, well, putting it indelicately, draw a line that goes through this data, and that 'line' helps us predict the value of 'y' whenever a new data point comes in. As we can see here, we have only one feature. To draw this line (remember y = mx + c), we need two parameters, 'm' and 'c'. In ML language they are known as model parameters and are denoted by a theta vector: θ0 and θ1 are 'c' and 'm' respectively. Theta-nought controls the intercept (where the line cuts the y axis) and theta-one controls the slope of the line.

[Figure: a scatter of data points with a red regression line. We need to find that red line.]
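To make the y = mx + c idea concrete, here's a tiny sketch of the prediction step. The function name predict and the example theta values are just my illustrations, nothing standard:

```python
import numpy as np

def predict(x, theta0, theta1):
    # Predicted y for input x: intercept theta0 plus slope theta1 times x
    return theta0 + theta1 * x

# Illustrative parameters: intercept of 2, slope of 0.5
print(predict(np.array([1.0, 2.0, 3.0]), theta0=2.0, theta1=0.5))  # [2.5 3.  3.5]
```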

Now, to find a line that fits the data, we need to find the thetas. This can be done using gradient descent or the normal equation. Those are topics for another article, so here's the short version: there's a cost function involved, and high error means high cost. (Think of the error as the vertical distance from a data point up or down to the line, NOT the distance at a right angle to the line.) For example, if the line is far away from the data points, say the line y = 0*X + 2, the error is way too high. As we tune the parameters theta-nought and theta-one so the line moves closer to the data, the error becomes minimal at some point; as we move further away, the error increases again. That point of minimum error, which in turn gives us the best values of theta-nought and theta-one, is found using an algorithm known as gradient descent.
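Here's a minimal sketch of batch gradient descent for this one-feature case, assuming a mean-squared-error cost; the learning rate and iteration count are illustrative choices you'd normally tune:

```python
import numpy as np

def gradient_descent(X, y, learning_rate=0.1, n_iterations=1000):
    # Fit theta0 (intercept) and theta1 (slope) by repeatedly stepping
    # down the gradient of the mean-squared-error cost.
    m = len(X)
    theta0, theta1 = 0.0, 0.0
    for _ in range(n_iterations):
        error = (theta0 + theta1 * X) - y        # prediction minus truth
        grad0 = (2.0 / m) * error.sum()          # d(cost)/d(theta0)
        grad1 = (2.0 / m) * (error * X).sum()    # d(cost)/d(theta1)
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1

# Made-up noisy data scattered around y = 3x + 4
rng = np.random.default_rng(0)
X = 2 * rng.random(100)
y = 4 + 3 * X + rng.normal(0, 0.5, 100)
print(gradient_descent(X, y))  # roughly (4, 3)
```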

Or you can straight away use a math equation to find theta-nought and theta-one. This is the normal equation.

The Normal equation: θ = (XᵀX)⁻¹ Xᵀ y
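As a sketch, the normal equation for this one-feature case is a couple of lines of numpy; the column of ones is what gives us theta-nought, and the data here is made up for illustration:

```python
import numpy as np

# Made-up noisy data scattered around y = 3x + 4
rng = np.random.default_rng(0)
x = 2 * rng.random(100)
y = 4 + 3 * x + rng.normal(0, 0.5, 100)

X_b = np.c_[np.ones(len(x)), x]                 # prepend a column of ones for the intercept term
theta = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ y  # theta = (X^T X)^-1 X^T y
print(theta)                                    # roughly [4, 3], i.e. theta0 ≈ 4, theta1 ≈ 3
```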

Here's the Python code. You need to know how to use numpy, pandas, and matplotlib to understand how it works.
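What follows is a minimal sketch rather than a full project script: the made-up DataFrame stands in for whatever dataset you would actually load (for example with pd.read_csv), and the fit reuses the normal equation from above:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Made-up data standing in for a real dataset
rng = np.random.default_rng(42)
df = pd.DataFrame({"x": 2 * rng.random(100)})
df["y"] = 4 + 3 * df["x"] + rng.normal(0, 0.5, 100)

# Fit theta-nought and theta-one with the normal equation
X_b = np.c_[np.ones(len(df)), df["x"].to_numpy()]
theta0, theta1 = np.linalg.inv(X_b.T @ X_b) @ X_b.T @ df["y"].to_numpy()

# Plot the data and the fitted red line
plt.scatter(df["x"], df["y"], s=10, label="data")
xs = np.linspace(df["x"].min(), df["x"].max(), 100)
plt.plot(xs, theta0 + theta1 * xs, color="red", label="fitted line")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
plt.show()
```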

Hope you got some of the intuition behind Linear Regression. It's quite hard to understand without the math, so I would recommend the Stanford Machine Learning course on Coursera. Andrew Ng is a great teacher.

Thanks a lot :)
