Linear Regression — Simple/Single — Multiple

Shanthababu Pandian
Published in Analytics Vidhya
5 min read · Nov 1, 2020
Regression!

When you start learning Machine Learning, you will almost certainly start with the Linear Regression algorithm. No one escapes it and no one is an exception, because this algorithm is the very first child of the Supervised Learning methodology. In supervised learning the data set is labelled, so the features and the target are identified explicitly, and the algorithm derives predictions from the given data set by finding the best-fit line.

Let’s Understand the components in Linear Regression — Graph

Linear Regression — Components

Let’s first discuss Linear Regression from a mathematical point of view, before looking at how to use it as a Machine Learning model.

Let’s go with a simple analogy: the relationship between Distance, Speed and Time.

Time, Speed and Distance relationship

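The picture below should make it easy to see what a positive relationship and a negative relationship between Speed and Distance look like. In a positive relationship both quantities increase together; in a negative relationship one decreases as the other increases.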

Let’s Solve the problem

Given Data

Find the slope and the intercept. Remember your mathematics from your college days (Engineering Mathematics 3).

What are m (Slope) and c (Intercept) for the given problem?

Formula for Slope (m)
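For reference, the standard least-squares formulas for the slope and the intercept can be written as follows. The notation is an assumption (n is the number of data points and each sum runs over all of them) and may differ from the form shown in the author's original figure.

```latex
m = \frac{n \sum x y - \sum x \sum y}{n \sum x^{2} - \left(\sum x\right)^{2}},
\qquad
c = \frac{\sum y - m \sum x}{n}
```

The intercept formula is the statement that the fitted line passes through the point of means, that is, c equals the mean of y minus m times the mean of x.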

After some lengthy calculations, I get the table below.

Calculation

If you apply your values in the formula for Slope (m), you get m = 0.4. Once we have m, we can substitute the values of m, x and Y to figure out c (Intercept).
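The calculation described above can be sketched in plain Python. The data points here are hypothetical, chosen only because they happen to give a slope of 0.4 like the article's result; they are not taken from the author's table. The helper name `fit_line` is also just an illustration.

```python
# Hypothetical data, NOT the author's table: picked so that the slope is 0.4.
x = [1, 2, 3, 4, 5]
y = [3, 4, 2, 4, 5]


def fit_line(x, y):
    """Return (m, c) for the least-squares line y = m*x + c."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)

    # Slope from the sums, then intercept from the slope.
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    c = (sum_y - m * sum_x) / n
    return m, c


m, c = fit_line(x, y)
print(f"m = {m:.2f}, c = {c:.2f}")  # m = 0.40, c = 2.40
```

For this illustrative data the sums are Σx = 15, Σy = 18, Σxy = 58 and Σx² = 55, so m = (5·58 − 15·18) / (5·55 − 15²) = 20 / 50 = 0.4.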

Now that we know m (Slope) and c (Intercept) from the calculations above, we arrive at the equation below, which solves for Y given any X.

Now we can predict the values (Yp).

Y-Predictor: by applying the derived values

Let’s compare Actuals and Predicted
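As a rough sketch of this comparison, the snippet below applies Y = mx + c to each x and lists the actual value, the predicted value and the difference. The slope m = 0.4 comes from the article; the data points and the intercept c = 2.4 are assumptions made for illustration, not the author's figures.

```python
# m = 0.4 is the article's slope; c and the data are illustrative assumptions.
m, c = 0.4, 2.4
x = [1, 2, 3, 4, 5]
y = [3, 4, 2, 4, 5]


def predict(x_value, m, c):
    """Predicted Y (Yp) for a single x using the fitted line."""
    return m * x_value + c


y_pred = [predict(xi, m, c) for xi in x]

# Residual = actual minus predicted, one per data point.
residuals = [yi - ypi for yi, ypi in zip(y, y_pred)]

for xi, yi, ypi, ri in zip(x, y, y_pred, residuals):
    print(f"x={xi}  actual={yi}  predicted={ypi:.1f}  residual={ri:.1f}")
```

For a least-squares line with an intercept, the residuals sum to zero (up to floating-point rounding), which is a quick sanity check on m and c.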

If you plot the graph, you get the following.

Actuals Vs Predicted

What next? 😊

A way to understand Linear Regression as a Machine Learning model.

Linear regression is a way to explain the relationship between a dependent variable (the observation, or Y) and one or more explanatory variables (the independent, or X, variables) using a straight line.

Attributes of LR

Again, under Linear Regression, we have two types.

1. Single/Simple Linear Regression

2. Multiple Linear Regression

Always remember that Multiple Linear Regression is used far more often than the simple one in the Data Science and Machine Learning space, because predictive analysis almost always depends on multiple factors. We will discuss this further below.

Let’s understand a few more components in Linear Regression: Graph

Let’s Start With Single Linear Regression

1. Single/Simple Linear Regression

a. The most common regression is Simple/Single LR, which models the linear relationship between two variables.

b. These are called the Predictor variable and the Response variable.

c. Going one step further, we could say it describes how the two variables are correlated.

d. It involves two (2) variables, one on each side: one dependent and one independent.

e. LR models are highly valuable and one of the most common ways to predict a dependent variable.

f. The dependent variable, labelled Y, is the one being predicted, and the independent variable is labelled x. This gives the equation below, as we studied in school and college mathematics.

Y = mx + c (single/simple linear equation), where m is the slope and c is the intercept

Code from sample
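A minimal sketch of a simple linear regression in code, assuming scikit-learn is the library in use (the original sample is not reproduced here, so this is not necessarily the author's code). The data points are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data for illustration only.
# scikit-learn expects a 2-D feature array, hence the reshape to one column.
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([3, 4, 2, 4, 5])

model = LinearRegression()
model.fit(X, y)

m = model.coef_[0]      # slope
c = model.intercept_    # intercept
print(f"m = {m:.2f}, c = {c:.2f}")  # m = 0.40, c = 2.40

# Predict Y for a new, unseen x value.
y_new = model.predict(np.array([[6]]))
print(y_new)
```

`coef_` and `intercept_` correspond to m and c in Y = mx + c, so the fitted model can be checked against a hand calculation.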

Without wasting any time, let’s jump into Multiple Linear Regression.

Multiple Linear Regression

In this case the dependent variable, labelled Y, is being predicted from a set of independent variables labelled x1, x2, …

y = β0 + β1x1 + β2x2 + ··· + βkxk + ε, where ε is the error (noise) term

As mentioned earlier, in real-world scenarios the regression model usually involves more than two variables. This is called “Multiple Linear Regression”. It is sometimes loosely called “Multivariate Linear Regression”, although strictly that term refers to models with more than one dependent variable.

The difference from the simple case is that we can also evaluate which of the independent variables (x1, x2, …) has the highest impact on the predicted variable (Y).
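To make the multiple-variable equation concrete, here is a small sketch that estimates β0, β1 and β2 by ordinary least squares with NumPy. The data is synthetic and the "true" coefficients are made up for the demonstration, so nothing here comes from the article's data set.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 50 rows, two independent variables x1 and x2.
X = rng.uniform(0, 10, size=(50, 2))

# Made-up coefficients: beta0 (intercept), beta1, beta2.
true_beta = np.array([1.5, 2.0, -0.5])
noise = rng.normal(0, 0.1, size=50)
y = true_beta[0] + X @ true_beta[1:] + noise

# Prepend a column of ones so the intercept beta0 is estimated too.
X_design = np.column_stack([np.ones(len(X)), X])

# Ordinary least squares: find beta minimising ||X_design @ beta - y||^2.
beta_hat, *_ = np.linalg.lstsq(X_design, y, rcond=None)
print(beta_hat)  # close to true_beta, but not exact because of the noise
```

Comparing coefficient sizes to judge which variable has the highest impact is only meaningful when the variables are on comparable scales, so features are often standardised first.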

Reference: wine quality dataset

winequality.csv

Feature selection from the given data set

Feature Selection
How the dependent and independent variables look in the equation.

Now the model knows which columns of the given data set are Y and which are X.

The equation takes the form Y = mX + c, whether there is a single independent variable or several.

The rest is the train/test split and applying the algorithm to the prepared data set.
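The remaining steps can be sketched as follows, assuming scikit-learn. Synthetic arrays stand in for the features and target that would be selected from winequality.csv, because the file's columns are not shown here; the 80/20 split ratio is also an assumption.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Stand-in data: X plays the role of the selected feature columns,
# y the role of the target column. All values are synthetic.
X = rng.uniform(0, 1, size=(200, 3))
y = 5.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0, 0.05, size=200)

# Hold out 20% of the rows for testing (assumed ratio).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit on the training rows only, then predict the held-out rows.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

print(model.intercept_, model.coef_)
print(y_pred[:5], y_test[:5])
```

Fitting on the training rows and predicting on rows the model has not seen gives an honest picture of how the model behaves on new data.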

Remember one thing: the calculations above are taken care of by Python libraries, so you need not worry about them. COOL! :) But do understand the concepts before applying them; it will be a great help when you need to explain your results or handle any situation.

I will discuss more about dependent and independent variables, and what Root-Mean-Square Error is, shortly!

See you all soon! Thanks for reading.

Shanthababu Pandian
Analytics Vidhya

Director- Data and AI -Data, AIML and Gen AI Architect, National and International Speaker, Author. https://www.linkedin.com/in/shanthababu-pandian-b2a9259/