Regression Concept in Machine Learning:
From my previous blog, we got to know about supervised and unsupervised learning algorithms, and why it is important to know the difference between the two. In our day-to-day life, we deal with many kinds of data, whether structured or unstructured.
Let’s talk about regression.
What is Regression?
As we all know, regression is a statistical technique that determines the strength of the relationship between a dependent variable and an independent variable, so that we can model how one changes with the other.
In machine learning, regression is used to predict the outcome of an event based on the relationships between variables learned from the dataset.
Out of the 15 types of regression, the two below are the most widely used in machine learning. Let’s look at the concepts.
1. Linear regression
2. Logistic regression
Linear regression vs Logistic regression
In simple words, when we want to predict a categorical or discrete outcome, we use logistic regression. When we want to predict a continuous outcome from the input values, we use linear regression.
In linear regression, the outcome can take an infinite number of possible values, but in logistic regression, the outcome has only a limited number of possible values.
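The contrast can be sketched in a few lines of Python. The slope and constant values below are made up for illustration, not fitted to any real data; the logistic model uses the standard sigmoid function to squash the output into (0, 1) before thresholding it into a class.

```python
import math

def linear_predict(x, m=2.0, c=1.0):
    """Linear regression: output is a continuous number (any real value)."""
    return m * x + c

def logistic_predict(x, m=2.0, c=-1.0):
    """Logistic regression: the sigmoid squashes the linear output into
    (0, 1), which is then thresholded into a discrete class label."""
    p = 1.0 / (1.0 + math.exp(-(m * x + c)))
    return 1 if p >= 0.5 else 0

print(linear_predict(3.0))    # continuous value: 7.0
print(logistic_predict(3.0))  # discrete class: 1
```

Note how the linear model can return any real number, while the logistic model only ever returns one of a fixed set of labels.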
Straight-line Equation:
y = mx + c, where y is the dependent variable, x is the independent variable, m is the slope, and c is the constant (intercept).
With several inputs, we write y = f(x1, x2, x3, …),
i.e. y is a function of the inputs, y = f(x).
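The straight-line equation can be evaluated directly; the slope and constant below are made-up illustrative values:

```python
# Minimal sketch of the straight-line equation y = mx + c.
def line(x, m, c):
    """Evaluate y = mx + c for a given input x."""
    return m * x + c

m, c = 2.0, 3.0            # slope and constant (assumed values)
xs = [0.0, 1.0, 2.0]
ys = [line(x, m, c) for x in xs]
print(ys)  # [3.0, 5.0, 7.0]
```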
Suppose a dataset gives the scores of five students, x1, x2, x3, x4, x5, and we want to find the average score.
Average score = (x1 + x2 + x3 + x4 + x5)/5
If we instead want the weighted average with weights W = [w1 w2 w3 w4 w5], then
Weighted average = (w1x1 + w2x2 + w3x3 + w4x4 + w5x5)/(w1 + w2 + w3 + w4 + w5)
So here linear regression takes the same form, y = f(Σi wixi): a function of a weighted sum of the inputs.
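The averages above, and the weighted-sum form of linear regression, can be computed like this; the scores and weights are made-up example values:

```python
# Plain average vs weighted average of five student scores.
scores  = [60.0, 70.0, 80.0, 90.0, 100.0]   # x1..x5 (made-up values)
weights = [1.0, 1.0, 2.0, 2.0, 4.0]          # w1..w5 (made-up values)

average = sum(scores) / len(scores)
weighted_average = (
    sum(w * x for w, x in zip(weights, scores)) / sum(weights)
)
print(average)           # 80.0
print(weighted_average)  # 87.0 -- the higher scores carry more weight

# Linear regression's prediction is the same weighted sum of inputs,
# here with f taken as the identity function.
def predict(xs, ws):
    return sum(w * x for w, x in zip(ws, xs))
```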
We also measure the error as error = (y − y′)², where y is the actual value and y′ is the predicted value.
Now the question is: why do we take the square of the error?
After taking the difference between the actual and predicted values, we square it to get the error value; there are also related error functions, such as the least-squares error.
One reason is that squaring magnifies the bigger errors and suppresses the smaller ones, so we focus on the larger mistakes rather than worrying about small ones. Squaring also makes every error positive, so positive and negative differences cannot cancel each other out. That is the objective behind squaring the error value.
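A quick numeric check makes the magnification obvious: an error five times bigger contributes twenty-five times more after squaring. The values here are arbitrary:

```python
# Squaring magnifies large errors relative to small ones.
def squared_error(y_actual, y_predicted):
    return (y_actual - y_predicted) ** 2

print(squared_error(10.0, 8.0))  # difference 2  -> error 4.0
print(squared_error(10.0, 0.0))  # difference 10 -> error 100.0
```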
Sometimes we also find the total error as the summation of the individual errors over the dataset, i.e.
E = e1 + e2 + e3 + …
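The total error E can be sketched as the sum of squared errors over a small dataset; the actual and predicted values below are made up for illustration:

```python
# Total error E = e1 + e2 + e3 + ... as a sum of squared errors.
actual    = [3.0, 5.0, 7.0, 9.0]   # y  (made-up values)
predicted = [2.5, 5.0, 8.0, 8.0]   # y' (made-up values)

errors = [(y - yp) ** 2 for y, yp in zip(actual, predicted)]
E = sum(errors)
print(errors)  # [0.25, 0.0, 1.0, 1.0]
print(E)       # 2.25
```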