Linear Regression: Real-life example

Vaishno Kumar
3 min readSep 12, 2021

--

Real-world problem solved with Maths

https://www.123rf.com/photo_43254198_doodle-writing-and-drawing-of-math-formulas-with-a-chalk-on-a-blackboard.html

Definition:

Simple linear regression allows studying the relationship between two variables. One variable(x) is called the independent variable and the other variable (y) is known as the dependent variable which is our target variable.

Formula:

Equation of simple linear regression

x = The value of the independent variable
y= The value of the dependent variable
ß0= constant (shows the value of y-axis when the value of x=0)
ß1=The regression coefficient (shows how much y changes for each unit of change in x.

Example:

let’s take an example of football stats which shows the transfer fee according to goals. It’s just an overview to show just the data frame.

football player stats and fee

Step 1: Estimating the Slope (ß1) -

here is the formula for ß1

Simplify the above formula according to our data set:

where (A= goal-goal_mean) and ( B = fee($million)-fee_mean)

ß1= 296.14/177.71 = 1.66

Step 2: Estimate the Intercept(ß0)-

ß0=fee_mean - ß1 * goal_mean = 60.285 - 1.66 * 29.42 = 11.44

Step 3: Making Predictions equation

Put ß0 and ß1 value in the above equation

y = 11.44 + 1.66 * x

in above equation , implement input value (x) and you got output prediction value (y).

Root mean Square Error (RMSE) :

RMS is the square root of the average of squared errors. The effect of each error on RMS is proportional to the size of the squared error; thus larger errors have a disproportionately large effect on RMS. Consequently, RMS is sensitive to outliers.

the equation for root mean square error

where y = actual value ,y ̂ =predicted value

Suppose that the average squared error value is 78 and sample of size (N) = 7
then RMS =78/7 =11.4

It means that each prediction is on average wrong by about 11.4 units. Keep remembering, RMS is sensitive to outliers.

Next time you find yourself in a situation where you need to estimate a quantity based on several factors that can be described by a straight line — you know you can use a Linear Regression Model.

Thanks for reading!

--

--