Linear Regression in ML

Nikhil Upadhyay
Published in Knowledge Gurukul · Mar 9, 2021

This article is about linear regression in machine learning and how we can implement linear regression in Python.

Regression :

Before understanding linear regression, we have to understand regression. Regression analysis is a statistical method for modelling the relationship between a dependent (target) variable and one or more independent (predictor) variables. Regression predicts continuous/real values such as temperature, age, salary, price, etc. For example, if a company’s sales have increased steadily every month for the past few years, then by conducting a linear analysis of the monthly sales data, the company can forecast sales in future months. There are various features that help increase a company’s sales, such as advertising and digital platforms, and on the basis of these features we can detect how the company can increase its sales. Other examples are predicting the cost of a house or the temperature outside in degrees.

Regression is a supervised learning technique that helps in finding the correlation between variables and enables us to predict a continuous output variable based on one or more predictor variables.

Regression fits a line or curve through the data points on the target-predictor graph in such a way that the vertical distance between the data points and the regression line is at a minimum.

Uses of Linear Regression :

  1. Prediction
  2. Forecasting
  3. Time series modelling
  4. Determining the causal-effect relationship between variables.
  5. By performing regression, we can confidently determine the most important factor, the least important factor, and how each factor affects the outcome.

Types of Regression :

  • Linear Regression
  • Logistic Regression
  • Polynomial Regression
  • Support Vector Regression
  • Decision Tree Regression
  • Random Forest Regression
  • Ridge Regression
  • Lasso Regression

Each type of regression has its own importance in different scenarios, but at the core, all regression methods analyse the effect of the independent variables on the dependent variable.

Linear Regression :

The linear regression algorithm models a linear relationship between a dependent variable (Y) and one or more independent variables (X), hence the name linear regression. If the number of independent variables is more than one, it is considered multiple linear regression. Linear regression shows the linear relationship, which means it finds how the value of the dependent variable changes according to the value of the independent variable.

Mathematical Representation of Linear Regression :

Y = a0 + a1X + ε (equivalently, y = mx + c)

Y = dependent variable (target variable)
X = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (the scale factor applied to each input value)
ε = random error
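
To make this concrete, here is a minimal sketch of fitting Y = a0 + a1X with NumPy; the data points are made up purely for illustration:

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)   # independent variable
Y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])      # dependent variable

# np.polyfit returns the best-fit coefficients, highest degree first:
# a1 (slope), then a0 (intercept).
a1, a0 = np.polyfit(X, Y, deg=1)

print(f"intercept a0 = {a0:.3f}, coefficient a1 = {a1:.3f}")
print("prediction at X = 6:", a0 + a1 * 6)
```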

Relationship in Linear Regression :

  1. Positive linear relationship (the dependent variable on the Y-axis increases as the independent variable on the X-axis increases).
  2. Negative linear relationship (the dependent variable on the Y-axis decreases as the independent variable on the X-axis increases).

Before working with linear regression, there are a few terms we have to know:

  • Regression is for continuous values, i.e. for predicting continuous values.
  • Correlation
  • Statistics basics — mean, median, mode, standard deviation, IQR
  • Dependent variable and independent variable
  • Residuals (the distance between the actual values and the predicted values)
  • Gradient descent
  • EDA (Exploratory Data Analysis)
  • Data visualization with Matplotlib, Seaborn, Plotly, etc.
  • train_test_split
  • Coefficient — coef_
  • Cost function
  • Mean Absolute Error (MAE) — the average of the absolute differences |actual − predicted|
  • Mean Squared Error (MSE) — the average of the squared differences (actual − predicted)²
  • Root Mean Squared Error (RMSE) — the square root of MSE (see the sketch after this list)
  • R squared
  • Normal distribution, and distributions in statistics generally
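
As a quick illustration of MAE, MSE, and RMSE, here is a minimal sketch using NumPy; the actual/predicted values are made up purely for illustration:

```python
import numpy as np

actual = np.array([3.0, 5.0, 7.5, 9.0])
predicted = np.array([2.8, 5.4, 7.0, 9.3])

mae = np.mean(np.abs(actual - predicted))   # Mean Absolute Error
mse = np.mean((actual - predicted) ** 2)    # Mean Squared Error
rmse = np.sqrt(mse)                         # Root Mean Squared Error

print(f"MAE = {mae:.3f}, MSE = {mse:.3f}, RMSE = {rmse:.3f}")
```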

Gradient Descent:

  • Gradient descent is used to minimize the MSE(Mean Square Error) by calculating the gradient of the cost function.
  • A regression model uses gradient descent to update the coefficients of the line by reducing the cost function.
  • It works by randomly selecting initial values for the coefficients and then iteratively updating them to reach the minimum of the cost function, as in the sketch below.
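
Here is a minimal sketch of that loop for simple linear regression, minimizing MSE over the intercept a0 and slope a1; the learning rate, iteration count, and data are arbitrary choices for illustration:

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5], dtype=float)
Y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

a0, a1 = 0.0, 0.0          # starting values for the coefficients
lr = 0.01                  # learning rate
for _ in range(5000):
    pred = a0 + a1 * X
    error = pred - Y
    # Gradients of MSE = mean((a0 + a1*X - Y)^2) with respect to a0 and a1
    grad_a0 = 2 * error.mean()
    grad_a1 = 2 * (error * X).mean()
    a0 -= lr * grad_a0
    a1 -= lr * grad_a1

print(f"a0 = {a0:.3f}, a1 = {a1:.3f}")  # approaches the least-squares fit
```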

R-squared method:

  • R-squared is a statistical method that determines the goodness of fit.
  • It measures the strength of the relationship between the dependent and independent variables on a scale of 0–100% (see the sketch below).
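
A minimal sketch of computing R-squared by hand as 1 − (residual sum of squares / total sum of squares), reusing illustrative actual/predicted values:

```python
import numpy as np

actual = np.array([3.0, 5.0, 7.5, 9.0])
predicted = np.array([2.8, 5.4, 7.0, 9.3])

ss_res = np.sum((actual - predicted) ** 2)        # residual sum of squares
ss_tot = np.sum((actual - actual.mean()) ** 2)    # total sum of squares
r2 = 1 - ss_res / ss_tot

print(f"R-squared = {r2:.3f} ({r2 * 100:.1f}%)")
```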

Process we have to follow when working on data (a minimal end-to-end sketch follows the list) :

  1. Convert Business problem to analytical Problem.
  2. Data Collection
  3. Statistics on Data
  4. Exploratory Data Analysis
  5. Train_test_split
  6. Model Selection
  7. Check Accuracy
  8. Make Prediction
  9. Model Deploy
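
Here is a minimal sketch of steps 5–8 with scikit-learn; the synthetic data and parameter choices (test_size, random_state) are assumptions for illustration only:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: one predictor with a linear target plus noise
rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X.ravel() + 5.0 + rng.normal(0, 1, 100)

# Step 5: train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Step 6: model selection and training
model = LinearRegression()
model.fit(X_train, y_train)

# Step 7: check accuracy on held-out data
print("coef_:", model.coef_, "intercept_:", model.intercept_)
print("R-squared on test set:", r2_score(y_test, model.predict(X_test)))

# Step 8: make a prediction on new data
print("prediction at X = 7:", model.predict([[7.0]]))
```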

To better understand all this terminology, I have already created a few models based on real-life data. You can go through my GitHub profile, where you can see implementations of linear regression on real-world data.

Backward Elimination:

Backward elimination is a feature selection technique used while building a machine learning model. It removes those features that do not have a significant effect on the dependent variable or the prediction of the output. There are various ways to build a model in machine learning, which are:

  1. All-in
  2. Backward Elimination
  3. Forward Selection
  4. Bidirectional Elimination
  5. Score Comparison

Above are the possible methods for building a model in machine learning, but here we will use only the backward elimination process, as it is the fastest method.
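
Here is a minimal sketch of backward elimination using OLS p-values from statsmodels; the 0.05 significance level and the synthetic data are assumptions for illustration, not a prescription from this article:

```python
import numpy as np
import statsmodels.api as sm

# Synthetic data: 4 candidate features, of which only 2 actually matter
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 2.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(0, 0.5, 100)

features = list(range(X.shape[1]))
while features:
    X_sm = sm.add_constant(X[:, features])    # add the intercept column
    model = sm.OLS(y, X_sm).fit()
    pvalues = model.pvalues[1:]               # skip the intercept's p-value
    worst = pvalues.argmax()
    if pvalues[worst] > 0.05:                 # drop the least significant feature
        features.pop(worst)
    else:
        break                                 # all remaining features significant

print("features kept:", features)             # expected: [0, 2]
```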

