Support Vector Regression


Hello readers! This blog explains one more Machine Learning algorithm, SVR. We will first understand the principle of its superset algorithm, the Support Vector Machine, and then move on to Support Vector Regression, which works on the same principle.

Support Vector Machine ( SVM )

SVM is a Supervised Machine Learning algorithm that works by analyzing the data, representing it as points in a space, and finding a boundary that cleanly divides the points into two separate categories. The algorithm can be used for either classification or regression analysis. In classification it is called the Support Vector Classifier (SVC), and in regression it is called Support Vector Regression (SVR).

There are a few terms associated with this algorithm, and they make its working easy to explain. Let us see what these terms are:

  1. Hyperplane

A hyperplane is a subspace whose dimension is one less than that of the space in which it lies. For example, in a 2-dimensional space, the hyperplane is a 1-dimensional line.

In machine learning, a hyperplane is a decision boundary that strictly divides two categories from each other.

Let us try to understand it with a simple diagram:

Hyperplane

SVM always creates a hyperplane between two categories in order to separate them strictly from each other.
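To make the idea concrete, here is a minimal NumPy sketch (the weights w and offset b are made up purely for illustration) showing that the sign of w·x + b tells us on which side of the hyperplane a point falls:

```python
import numpy as np

# A hyperplane in 2-D is the line w . x + b = 0 (one dimension less
# than the plane). These weights are hypothetical, for illustration.
w = np.array([1.0, -1.0])   # normal vector of the hyperplane
b = 0.0                     # offset from the origin

points = np.array([[2.0, 1.0],   # below the line y = x
                   [1.0, 3.0]])  # above the line y = x

# The sign of w . x + b says which side of the boundary a point is on.
sides = np.sign(points @ w + b)
print(sides)  # [ 1. -1.]
```

In 2-D this hyperplane is just the line y = x; in higher dimensions the very same sign test still separates the two categories.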

2. Support Vectors

As the name suggests, support vectors are the supporting data points in the space that decide the orientation and position of the hyperplane. They are the most important part of the working of the SVM algorithm, as they help in configuring the hyperplane over the categories.

In this diagram, lines parallel to the hyperplane are drawn through the closest points of the two categories. These closest points are known as support vectors, and the lines parallel to the hyperplane are known as support vector lines.

3. Marginal Distance

Marginal distance is the total perpendicular distance between the two support vector lines on either side of the hyperplane.

In the figure, you can see that the marginal distance is the distance between the support vector lines. In other words, the marginal distance is the perpendicular distance between the closest data points of the two categories, and the algorithm tries to make it as large as possible.

The SVM algorithm works by creating the hyperplane in such a way that the support vector lines have the maximum marginal distance between them. Let us see the image below to understand this concept:

Working of SVM
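As a rough sketch of this, the snippet below fits scikit-learn's SVC with a linear kernel on a toy blob dataset (the data and parameter values are assumptions for illustration, not taken from the figures above) and reads off the support vectors and the marginal distance, which for a linear model equals 2/||w||:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters stand in for linearly separable data.
X, y = make_blobs(n_samples=60, centers=2, random_state=0)

# A linear SVC places the hyperplane so that the margin is as wide
# as possible.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

print("support vectors:\n", clf.support_vectors_)

# For a linear kernel, the marginal distance (the gap between the
# two support vector lines) works out to 2 / ||w||.
w = clf.coef_[0]
print("marginal distance:", 2 / np.linalg.norm(w))
```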

SVM Kernel Trick

All the example images above are based on data that is easily separable, i.e. a straight separating line can be drawn between the categories. This type of data is known as linearly separable data. However, most real-world cases deal with data that cannot be separated this way, known as non-linearly separable data. This means that we cannot create a straight line to separate this kind of data. Let us look at a simple diagram of non-linearly separable data:

This figure gives an example of non-linearly separable data. If we try to draw a straight line through this data, our model will get an accuracy of 50% or less. To solve this kind of problem, the SVM kernel has a trick: it adds one more dimension to the data so that a hyperplane can be created over the data in that higher-dimensional space. Let us see how this happens:

Kernel Trick of SVM

In the above examples, if we consider the green points as positive and the blue points as negative, we get a clear intuition of the working of SVM as follows:

The SVM algorithm tries to create a hyperplane that has the highest marginal distance between the support vector lines, with all the data points lying outside this marginal boundary in such a way that the negative points are on the negative side and the positive points are on the positive side.
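To see the kernel trick pay off in practice, this sketch compares a linear kernel against an RBF kernel on scikit-learn's make_circles dataset (an assumed stand-in for the concentric data in the figure); the linear model should score near 50% while the kernelised one should separate the classes almost perfectly:

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: no straight line can separate the two classes.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# A linear boundary does roughly no better than guessing ...
linear = SVC(kernel="linear").fit(X, y)
print("linear accuracy:", linear.score(X, y))

# ... while the RBF kernel implicitly lifts the data into a higher
# dimension, where a separating hyperplane does exist.
rbf = SVC(kernel="rbf", gamma=2.0).fit(X, y)
print("rbf accuracy:", rbf.score(X, y))
```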

Now, let us jump to the topic of this blog, i.e. SVR:

Support Vector Regression (SVR)

SVR is the sub-algorithm of SVM dedicated to regression. It works on the same principle of creating a hyperplane, just like SVM. However, the hyperplane here is created in such a way that most of the data points lie on the hyperplane and the rest lie inside the support vector lines, viz. the decision boundaries.

Let us understand this concept through a simple diagram of regression analysis:

As you can see in the diagram, a hyperplane is created over the data points in such a way that most of the points lie on it and all the remaining points lie inside the support vector lines. This marginal distance is known as the valid margin, and the support vector lines are known as the decision boundaries.
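A small sketch with scikit-learn's SVR makes the valid margin concrete (the toy data and parameter values here are assumptions for illustration): the epsilon parameter is the half-width of the margin, and most of the residuals should fall inside it:

```python
import numpy as np
from sklearn.svm import SVR

# Noisy samples scattered around a straight line (hypothetical data).
rng = np.random.RandomState(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = 2 * X.ravel() + 1 + rng.normal(scale=0.3, size=80)

# epsilon sets the half-width of the valid margin: points within
# epsilon of the hyperplane contribute no error at all.
reg = SVR(kernel="linear", epsilon=0.5, C=1.0)
reg.fit(X, y)

residuals = np.abs(y - reg.predict(X))
print("fraction inside the margin:", np.mean(residuals <= reg.epsilon))
```

Only the points on or outside this epsilon tube end up as support vectors; everything strictly inside it contributes no error to the fit.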

Mathematical Implementation Of SVR

Let us continue with the above example, where the equation of the hyperplane is:

y = wx + b

Let each of the decision boundaries be at a distance "p" from the hyperplane, one at +p and the other at -p. The equations of the two boundary lines can then be written as:

y = (wx + b) + p and y = (wx + b) - p

Hence, the hyperplane should always be created in such a way that every data point satisfies:

-p ≤ y - (wx + b) ≤ +p

This hyperplane must have the least error rate, i.e. the deviation of the points from the hyperplane should be minimal.
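The condition above is easy to check numerically. The following sketch uses made-up values of w, b, and p to test whether every point's deviation from the hyperplane stays within the margin:

```python
import numpy as np

# Hypothetical hyperplane y = wx + b and margin half-width p.
w, b, p = 2.0, 1.0, 0.5

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 3.4, 4.6, 7.6])   # observed targets

# The SVR condition: every point should satisfy -p <= y - (wx + b) <= p.
deviation = y - (w * x + b)
print(deviation)                        # [ 0.1  0.4 -0.4  0.6]
print(np.all(np.abs(deviation) <= p))   # False: the last point violates it
```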

That is all for this blog. Hope it was an informative one. Thank you for reading…!!!
