Are they cousins? Linear Regression, Logistic Regression, and Support Vector Machines

Bharat Dadwaria
Published in TheCyPhy
Feb 15, 2020 · 4 min read

In this article we will discuss the basic intuition behind Linear Regression, Logistic Regression, and Support Vector Machines. The reason for taking up these three machine learning models together is that there is a close relationship between them.

Before deep-diving into the relationship between Linear Regression, Logistic Regression, and Support Vector Machines, let us shed some light on each of them individually.

Basic overview

Linear Regression

Linear Regression is a supervised machine learning algorithm used on continuous attributes to predict real values. It is a very basic machine learning algorithm that does not require any deep statistical modeling. It assumes a linear relationship between the input and the label (output), which can be visualized through the line equation.

The Linear Regression best-fit line is based on the equation Y = mX + b.

  • Y : Label (Dependent Variable)
  • m : Slope
  • X : Input (Independent Variable)
  • b : Bias.
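As a quick illustration, the slope m and bias b of the best-fit line can be recovered from data by ordinary least squares. This is a minimal NumPy sketch on made-up points that lie exactly on Y = 2X + 1:

```python
import numpy as np

# Made-up data lying exactly on the line Y = 2X + 1
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = 2.0 * X + 1.0

# Least-squares fit of a degree-1 polynomial: returns [m, b]
m, b = np.polyfit(X, Y, deg=1)
print(round(m, 3), round(b, 3))  # slope ≈ 2.0, bias ≈ 1.0
```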

Linear Regression can further be scaled up to Multiple Linear Regression, which has multiple independent variables.

Multiple Linear Regression

The best-fit equation then becomes Y = b0 + b1·x1 + b2·x2 + … + bn·xn for an input X with n features {x1, x2, …, xn}. The basic idea is to configure the parameters {b0, b1, b2, …, bn} so that the equation provides the best-fit line (hyperplane).

Image source: https://towardsdatascience.com/introduction-to-linear-regression-and-polynomial-regression-f8adc96f31cb
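The multi-feature case can be solved the same way: stacking a column of ones onto X lets the intercept b0 be fitted along with the other coefficients. A minimal NumPy sketch on made-up data generated from y = 3 + 1·x1 + 2·x2:

```python
import numpy as np

# Made-up data: 4 samples, 2 features; true relation y = 3 + 1*x1 + 2*x2
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]])
y = 3.0 + 1.0 * X[:, 0] + 2.0 * X[:, 1]

# Prepend a column of ones so b0 (the intercept) is fitted too
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
print(np.round(coeffs, 3))  # [b0 b1 b2] ≈ [3. 1. 2.]
```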

Logistic Regression

Logistic Regression maps the linear relationship of the input (the same one used in Linear Regression) into a discrete quantity by applying a function popularly known as the sigmoid function. Logistic Regression is a supervised, classification-based machine learning algorithm.


Let us say the linear relationship of the variables is represented by z. The sigmoid function σ(z) = 1 / (1 + e^(−z)) squashes z into the range (0, 1), and at z = 0 its output is exactly 0.5, which serves as the decision boundary.

Image source: https://en.wikipedia.org/wiki/Sigmoid_function
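As a quick sanity check, the sigmoid can be written in a few lines; σ(0) = 0.5 is the decision threshold mentioned above:

```python
import math

def sigmoid(z):
    """Map any real z to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))   # 0.5 (the decision boundary)
print(sigmoid(4.0))   # close to 1
print(sigmoid(-4.0))  # close to 0
```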

The resulting linear classifier assigns the inputs to separate classes based on this probability: outputs above 0.5 are mapped to one class and outputs below 0.5 to the other.

Analysis of Logistic Regression:

  • It is a very fast algorithm, preferred by internet companies where low latency is required.
  • Its time and space complexity are low.
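The speed claim is easy to see in code: training reduces to a few vectorized lines of gradient descent on the log loss. This is a minimal NumPy sketch on a made-up 1-D toy set; the learning rate and iteration count are arbitrary illustrative choices:

```python
import numpy as np

# Made-up separable toy set: class 0 near x=0, class 1 near x=4
X = np.array([0.0, 0.5, 1.0, 3.0, 3.5, 4.0])
y = np.array([0, 0, 0, 1, 1, 1])

w, b = 0.0, 0.0
lr = 0.5
for _ in range(2000):                        # plain batch gradient descent
    p = 1.0 / (1.0 + np.exp(-(w * X + b)))   # sigmoid of the linear score
    w -= lr * np.mean((p - y) * X)           # gradient of the log loss w.r.t. w
    b -= lr * np.mean(p - y)                 # gradient w.r.t. b

preds = (1.0 / (1.0 + np.exp(-(w * X + b))) > 0.5).astype(int)
print(preds)  # should match y
```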

Support Vector Machine

Support Vector Machines are very similar to Logistic Regression in that both are supervised classification algorithms, and they follow much the same procedure.

When it comes to solving classification problems, many practitioners prefer Support Vector Machines. The Support Vector Machine uses kernelization (the kernel trick) to deal with non-linear decision boundaries.

Here, rather than merely maximizing the number of correctly classified points, we maximize the margin, that is, the distance between the convex hulls of the two classes.

Image source: https://en.wikipedia.org/wiki/Support-vector_machine
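For the linear case, margin maximization can be sketched as sub-gradient descent on the hinge loss plus an L2 penalty (the soft-margin primal objective). The toy data, learning rate, and regularization strength below are arbitrary illustrative choices:

```python
import numpy as np

# Made-up 1-D data with labels in {-1, +1}, as SVMs conventionally use
X = np.array([0.0, 0.5, 1.0, 3.0, 3.5, 4.0])
y = np.array([-1, -1, -1, 1, 1, 1])

w, b, lam, lr = 0.0, 0.0, 0.01, 0.1
for _ in range(2000):
    margins = y * (w * X + b)
    mask = margins < 1                         # points where the hinge is active
    # Sub-gradient step on  lam/2 * w^2 + mean(hinge loss)
    w -= lr * (lam * w - np.mean(mask * y * X))
    b -= lr * (-np.mean(mask * y))

preds = np.sign(w * X + b)
print(preds)  # should match y
```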

Are they cousins?

The similarity between Linear Regression, Logistic Regression and Support Vector Machine

When we talk about the similarity between them, all three models follow a unified approach of loss minimization plus regularization.

They differ only in their optimization objective, in other words, their loss functions. The general model can be represented as: minimize (1/n) Σ loss(f(xi), yi) + λ·R(w), where the first term is the average loss over the data and R(w) is a regularization penalty on the weights.
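That unified template, average loss over the data plus a regularization penalty, can be written directly as code, with the squared, logistic, and hinge losses plugged into the same objective. An illustrative NumPy sketch; the data and weights here are made up:

```python
import numpy as np

def objective(loss, w, X, y, lam=0.1):
    """Unified template: empirical risk + L2 penalty."""
    scores = X @ w
    return np.mean(loss(scores, y)) + lam * np.sum(w ** 2)

squared = lambda s, y: (s - y) ** 2                  # Linear Regression
logistic = lambda s, y: np.log(1 + np.exp(-y * s))   # Logistic Regression
hinge = lambda s, y: np.maximum(0, 1 - y * s)        # SVM

X = np.array([[1.0, 2.0], [2.0, -1.0]])
y = np.array([1.0, -1.0])
w = np.array([0.5, 0.5])
for name, loss in [("squared", squared), ("logistic", logistic), ("hinge", hinge)]:
    print(name, round(objective(loss, w, X, y), 3))
```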

Plotting these loss functions against the margin shows how they relate. The zero-one loss is the ideal case: it is what each of these algorithms would ultimately like to minimize, but since it is non-convex and hard to optimize directly, each algorithm substitutes its own convex surrogate for it.

The Mean Squared Error loss function is sensitive to outliers; it gives the best results when the data is distributed around the mean. The logistic loss function is logarithmic, which keeps it smooth and monotonic. When we compare the logistic cost function with the Support Vector hinge loss, the key difference is that the hinge loss is exactly zero for points classified with a margin of at least one, while the logistic loss is never exactly zero.
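That difference is easy to verify numerically: writing both losses as functions of the margin m = y·score shows that the hinge loss hits exactly zero once m ≥ 1, while the logistic loss only approaches zero:

```python
import math

def logistic_loss(m):
    """Log loss as a function of the margin m = y * score."""
    return math.log(1 + math.exp(-m))

def hinge_loss(m):
    """Hinge loss: zero once the margin reaches 1."""
    return max(0.0, 1.0 - m)

for m in [-1.0, 0.0, 1.0, 3.0]:
    print(m, round(logistic_loss(m), 4), round(hinge_loss(m), 4))
```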

When it comes to regularization, we bias the parameters towards particular values, for example small values that tend towards zero, by adding a penalty term that encourages those values. Optimizing the data term and the regularization term together balances the loss function.

So from the above comparison, we can conclude that although the three models use different loss functions, they balance them with the same kind of regularization.

References

1: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-867-machine-learning-fall-2006/lecture-notes/lec4.pdf

2: https://machinelearningmastery.com/logistic-regression-for-machine-learning/

3: https://en.wikipedia.org/wiki/Linear_regression

Bharat Dadwaria
I'm a computer vision research engineer, exploring the intersecting fields of computer vision and robotic vision. https://bharatdadwaria.github.io