Introduction to Linear Regression — sklearn Diabetes Dataset
We all know the equation of a line that we learnt in high school:
y = mx + c
If you know this, you already know the equation for a simple linear regression. Big words like ‘Regression’ often sound like big things, but they can be as simple as the equation above.
In Linear Regression,
y : The variable to be predicted (aka the dependent variable). It is a continuous numerical value.
m : The coefficient ‘m’ is the slope of the line, i.e. how much ‘y’ changes when ‘x’ increases by one unit.
x : The independent variable, i.e. the input used to predict ‘y’.
c : A constant, aka the y-intercept. It is the value of ‘y’ when ‘x’ is zero, i.e. the point at which the line crosses the vertical axis.
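To make the roles of ‘m’ and ‘c’ concrete, here is a minimal sketch that recovers them from a handful of points. The sample points are hypothetical (they are generated from y = 2x + 1 purely for illustration), and NumPy's `polyfit` is used as one convenient way to fit a straight line.

```python
import numpy as np

# Hypothetical sample points that lie exactly on the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Fit a degree-1 polynomial (a straight line).
# polyfit returns coefficients from the highest degree down: [m, c]
m, c = np.polyfit(x, y, 1)

print(f"slope m = {m:.2f}, intercept c = {c:.2f}")
```

Because the points sit exactly on a line, the fitted slope and intercept match the values used to generate them (up to floating-point rounding). With noisy real-world data, the fit would instead give the line that minimizes the squared error.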
With Multiple Linear Regression, there is more than one x (predictor / feature), and each one gets its own coefficient. The equation then looks like:
y = m1x1 + m2x2 + … + mnxn + c
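The multiple-feature case can be sketched the same way. The example below assumes two hypothetical features and made-up coefficients (3, -2 and an intercept of 5), and uses NumPy's least-squares solver to recover them. It is an illustration of the equation, not a prescribed method.

```python
import numpy as np

# Hypothetical data: 50 samples, 2 features, generated from
# y = 3*x1 - 2*x2 + 5 (no noise, for illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 5.0

# Append a column of ones so the intercept c is estimated
# alongside the coefficients m1 and m2
X_design = np.column_stack([X, np.ones(len(X))])

# Solve the least-squares problem X_design @ [m1, m2, c] ≈ y
coeffs, *_ = np.linalg.lstsq(X_design, y, rcond=None)
m1, m2, c = coeffs

print(f"m1 = {m1:.2f}, m2 = {m2:.2f}, c = {c:.2f}")
```

The column of ones is what lets the intercept be treated as just another coefficient. Libraries such as sklearn handle this step internally.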
Linear Regression is simple and easy to understand, yet it is a very powerful machine learning algorithm. Its basic assumption is that the independent variables / features are linearly related to the response / target variable.
For now, we will focus on how to run a Linear Regression in Python and analyze the results. The dataset we will be using is ‘Diabetes’, a built-in dataset that ships with the sklearn package.
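As a rough sketch of that workflow, assuming scikit-learn is installed: load the dataset, hold out part of it for testing, fit a model, and score it. The 80/20 split, the `random_state` value and the choice of R² as the metric are illustrative assumptions, not requirements.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Load the built-in Diabetes dataset as a feature matrix X and target y
X, y = load_diabetes(return_X_y=True)
print("Features shape:", X.shape)

# Hold out 20% of the rows for testing (split ratio and seed are arbitrary choices)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Fit one coefficient per feature (the m's) plus an intercept (c)
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on data the model has not seen
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)

print("Coefficients (m1..mn):", model.coef_)
print("Intercept (c):", model.intercept_)
print("R^2 on test set:", r2)
```

Here `model.coef_` holds the slopes for each feature and `model.intercept_` holds the constant, mirroring the multiple regression equation above.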