LINEAR REGRESSION

Mahesh Singh Dasila
4 min read · Jun 4, 2019


Regression analysis

Well, as you can see from the face of the guy above, he is very confused and skeptical about which data points are where, what the best-fit line is, and so on. I know that for some of you this might sound like I am talking to the moon, so don’t worry, we are going to discuss these things in detail.

So before starting linear regression, let us talk a little about regression. In simple terms, the prefix ‘re’ in regression means ‘back’, so to regress means to go back, for example from a higher state to a lower one.

Regression comes under supervised learning, where we are provided with labels. The labels may be discrete or continuous, and that decides which model we choose. In this case we are discussing linear regression, so the label (target) column holds continuous values, whereas logistic regression is used when the labels are discrete.

But in statistics, regression is defined as the relationship between a dependent variable and one or more independent variables. The dependent variable is the output, which depends on the input variables of the dataset, whereas the independent variables are the inputs, which do not depend on the output or target values.

Now, what is Linear Regression?

I have already told you about regression. ‘Linear’ here means that the data points of the dependent and independent variables follow a linear relationship, so the graph looks like a straight line. Let me show you the diagram:

Regression line fits data-points

So here the blue dots are the data points and the red line is the linear regression line. It is called linear because the value on the Y-axis (the dependent variable) changes by a constant amount for each unit increase in the value on the X-axis (the independent variable).

So let’s talk about the applications of linear regression, that is, where it can be used.

1) Weather forecasting: you must have seen weather predictions on news channels and in newspapers. Several techniques are used to make such predictions, and linear regression is one of them.

2) Business: it is widely used here to track the growth of finances and products and then plan accordingly.

OK, let’s dive a little deeper into linear regression and look at the maths behind it.

So here we make use of a linear equation, as you must have seen in the very first picture with that uncle (he he..):

Y = mX + C,

where Y = data points of the dependent variable

X = data points of the independent variable

m = slope of the line

C = y-intercept

By changing the values of m and C we get different lines, and out of these we have to find the best-fit line.
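To make this concrete, here is a minimal Python sketch of how different slope and intercept values give different lines. The function name `predict` and the sample values are just assumptions for illustration, not something from a particular library.

```python
def predict(x, m, c):
    """Return the y value on the line y = m*x + c for a given x."""
    return m * x + c


# Hypothetical x values, chosen only for illustration.
xs = [0, 1, 2, 3]

# Three made-up (m, c) pairs: each pair describes a different line.
for m, c in [(1.0, 0.0), (2.0, 0.0), (1.0, 3.0)]:
    ys = [predict(x, m, c) for x in xs]
    print(f"m={m}, c={c}: {ys}")
```

Changing m tilts the line, while changing c moves the whole line up or down.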

So how do we find the values of m and C that give the best-fit line? We could sit all day trying different values.

Instead, we can use the “Least Squares Method”.

The best-fit line should have all the data points as close to it as possible. The least squares method measures this closeness in the ‘up’ and ‘down’ direction, that is, the vertical distance between each data point and the line.

Error calculation

The least squares method works by minimizing the sum of the squares of the residuals, where a residual is the difference between the observed (actual) value and the fitted (predicted) value. You can see this in the diagram.
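As a rough sketch of that idea, the Python snippet below computes the residuals for a candidate line and adds up their squares. The function name `sum_squared_residuals` and the four data points are made up for illustration.

```python
def sum_squared_residuals(xs, ys, m, c):
    """Sum of squared residuals for the candidate line y = m*x + c."""
    # Residual = observed value minus the value predicted by the line.
    residuals = [y - (m * x + c) for x, y in zip(xs, ys)]
    return sum(r ** 2 for r in residuals)


# Hypothetical data points, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

# Compare two candidate lines: the smaller sum means the closer fit.
sse_a = sum_squared_residuals(xs, ys, m=2.0, c=0.0)
sse_b = sum_squared_residuals(xs, ys, m=1.0, c=1.0)
print(sse_a, sse_b)
```

For these points the first line gives a much smaller sum than the second, so it is the better fit of the two.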

Residuals finding

The formula in statistics for the least squares method is this:

Least square method formula
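The formula itself appears only as an image. For reference, the standard closed-form least squares estimates for the line Y = mX + C, which is presumably what the image shows, can be written as follows (the index i and the count n are notation assumed here):

```latex
m = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
C = \bar{Y} - m\,\bar{X}
```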

Here X bar is the mean of all the X values and Y bar is the mean of all the Y values.
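Here is a minimal Python sketch of how m and C could be computed from those means. It assumes the standard closed form for simple linear regression (slope = sum of products of deviations divided by sum of squared X deviations, intercept = Y bar minus slope times X bar). The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Estimate slope m and intercept c of y = m*x + c by least squares."""
    n = len(xs)
    x_bar = sum(xs) / n  # mean of all the X values
    y_bar = sum(ys) / n  # mean of all the Y values

    # Sum of products of deviations, and sum of squared X deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are equal, so the slope is undefined")

    m = sxy / sxx
    c = y_bar - m * x_bar
    return m, c


# Hypothetical data points, chosen only for illustration.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]
m, c = fit_line(xs, ys)
print(f"m = {m:.2f}, c = {c:.2f}")
```

With these four points the slope comes out at about 1.9 and the intercept at about 0, so there is no need to try values by hand.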

OK, that’s it for now, friends. I hope you found it informative. Do clap if you found it useful.

-Mahesh Singh Dasila
