In this blog, we’ll cover another interesting Machine Learning algorithm: Support Vector Regression (SVR). But before studying SVR, let’s look at the Support Vector Machine (SVM), since SVR is based on it.
SVM is a supervised learning algorithm that analyses data and recognizes patterns in order to predict values, either by classification or by regression. The variant used for classification is called SVC (Support Vector Classifier) and the variant used for regression is called SVR (Support Vector Regression).
Let’s understand some basic concepts:
- Hyperplane: A hyperplane is a boundary used to divide the data into categories. It always has one dimension less than the space in which the data is plotted or analysed. For example, in Linear Regression with 1 feature and 1 outcome, we plot the relationship in a 2-D plane, and the regression line fitted to it is 1-D. This line is the hyperplane. Similarly, for a 3-D relationship, we get a 2-D hyperplane.
- Support Vectors: Support Vectors are the points that lie closest to the hyperplane and that decide its position and orientation. The lines or planes drawn through them, parallel to the hyperplane, are called Support Vector Lines or Support Vector Planes.
- Margin Width: The perpendicular distance between the 2 support vector lines or planes is called Margin Width.
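To make these three terms concrete, here is a small sketch in plain Python. The weights `w`, the offset `b`, and the sample points are made-up values for illustration. Placing the two support vector lines at `w.x + b = +1` and `w.x + b = -1` is the usual SVM convention, assumed here rather than taken from the definitions above.

```python
import math

# Hypothetical 2-D hyperplane (a line): w[0]*x1 + w[1]*x2 + b = 0
w = (3.0, 4.0)
b = -12.0

def decision_value(point):
    """Value of w.x + b; its sign tells which side of the hyperplane the point is on."""
    return w[0] * point[0] + w[1] * point[1] + b

def distance_to_hyperplane(point):
    """Perpendicular distance from the point to the hyperplane."""
    return abs(decision_value(point)) / math.hypot(w[0], w[1])

# Assumed convention: the support vector lines are w.x + b = +1 and w.x + b = -1,
# so the margin width is the perpendicular distance between them, 2 / ||w||.
margin_width = 2.0 / math.hypot(w[0], w[1])

# Made-up points: the first two sit exactly on the support vector lines,
# the last two are further away from the hyperplane.
points = [(3.0, 1.0), (1.0, 2.0), (5.0, 3.0), (0.0, 0.0)]
for p in points:
    print(p, decision_value(p), distance_to_hyperplane(p))
print("margin width:", margin_width)
```

In this toy setup the points (3, 1) and (1, 2) play the role of support vectors: they are the closest points to the line, one on each side, and the margin width is twice their distance to it.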
In the above diagrams, we saw that our data is linearly separable. But consider the case below.
In this case, a simple linear division is not possible. So, the SVM uses a kernel, which in effect adds one more dimension to the data. After adding another dimension, the data becomes separable using a plane. The following intuition can be drawn.
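Here is a minimal sketch of that idea in plain Python. The 1-D points, the labels, and the choice of `x * x` as the extra dimension are all invented for illustration. A real SVM kernel works with the higher-dimensional space implicitly instead of building the extra column by hand, so treat this only as intuition.

```python
# Hypothetical 1-D data: class 1 sits in the middle, class 0 on both sides,
# so no single threshold on x can separate the two classes.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [0, 0, 1, 1, 1, 0, 0]

def separable_by_threshold(values, labels):
    """True if one threshold puts all class-1 values on one side and all class-0 values on the other."""
    ones = [v for v, y in zip(values, labels) if y == 1]
    zeros = [v for v, y in zip(values, labels) if y == 0]
    return max(ones) < min(zeros) or max(zeros) < min(ones)

print(separable_by_threshold(xs, labels))  # prints False

# Add one more dimension: each point x becomes (x, x*x).
lifted = [(x, x * x) for x in xs]

# Along the new dimension the classes no longer overlap, so a flat line
# (for example x*x = 2.5) separates them in the 2-D space.
new_dimension = [point[1] for point in lifted]
print(separable_by_threshold(new_dimension, labels))  # prints True
```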
The SVM algorithm tries to draw the hyperplane with the largest margin width between the support vector planes, such that every point lies on or beyond those planes, i.e. points on the negative side remain below the negative hyperplane and points on the positive side remain above the positive hyperplane.
Support Vector Regression
Suppose we have a dataset in which we try to predict the salary of an employee from their age. We can create a Linear Regression model to help us with this. In Linear Regression, we try to fit a line that minimizes the error as much as possible. With Support Vector Regression, we instead fit a line (in 2-D) or a hyperplane (in n-D) that tries to keep the error within a certain limit.
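The difference between the two goals can be sketched with two error functions. The line, the ages and salaries, and the `tolerance` value below are made-up numbers, not results from a real dataset. The second function is the ε-insensitive error commonly associated with SVR, with the threshold named `tolerance` here.

```python
def squared_error(y_true, y_pred):
    """Error that Linear Regression minimizes: every deviation counts."""
    return (y_true - y_pred) ** 2

def epsilon_insensitive_error(y_true, y_pred, tolerance):
    """Error in the SVR style: zero inside the tolerance, growing linearly outside it."""
    return max(0.0, abs(y_true - y_pred) - tolerance)

# Hypothetical fitted line: salary (in thousands) = w * age + b
w, b = 2.0, 10.0
tolerance = 5.0

# Made-up (age, salary) pairs
data = [(25, 62.0), (30, 67.0), (40, 98.0)]

total_squared = 0.0
total_insensitive = 0.0
for age, salary in data:
    predicted = w * age + b
    total_squared += squared_error(salary, predicted)
    total_insensitive += epsilon_insensitive_error(salary, predicted, tolerance)

print("sum of squared errors:", total_squared)
print("sum of tolerance-based errors:", total_insensitive)
```

The first two employees are within 5 of the predicted salary, so they add nothing to the tolerance-based error. Only the third, which misses by 8, is penalized, and only for the 3 by which it exceeds the tolerance.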
The main intuition behind SVR is that most of the points lie on or close to the hyperplane, inside the band between the positive and negative hyperplanes. In SVC, by contrast, we try to draw a hyperplane such that negative points lie below the negative hyperplane and positive points lie above the positive hyperplane.
Thus, in SVR we try to find the planes such that the number of points lying outside the positive and negative hyperplanes is as small as possible.
Let ‘a’ be the distance of the support vector lines from the hyperplane, measured along the Y axis.
1 — Let the equation of the hyperplane be:
Y = wx + b
2 — The equation of the positive hyperplane is:
Y = wx + b + a
3 — The equation of the negative hyperplane is:
Y = wx + b - a
4 — Hence every point (x, Y) fitted by our hyperplane should satisfy the following condition:
-a < Y - (wx + b) < +a
So, we should choose ‘a’ such that the support vectors lie inside the band between the positive and negative hyperplanes, while keeping the error as small as possible.
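The effect of this choice can be sketched in a few lines of Python. The line `Y = w*x + b`, the points, and the candidate values of `a` are all hypothetical. The check reads the condition as "the gap between Y and w*x + b is smaller than a".

```python
# Hypothetical fitted line Y = w*x + b and made-up (x, Y) points
w, b = 2.0, 10.0
points = [(25, 62.0), (30, 67.0), (35, 84.0), (40, 98.0), (45, 99.0)]

def inside_band(x, y, a):
    """True if the point lies strictly between the negative and positive hyperplanes."""
    return -a < y - (w * x + b) < a

def count_outside(a):
    """Number of points that fall outside the band of half-width a."""
    return sum(1 for x, y in points if not inside_band(x, y, a))

# Try a few candidate values of a and see how many points are left outside.
for a in (1.0, 3.5, 5.0, 10.0):
    print("a =", a, "points outside:", count_outside(a))
```

A larger ‘a’ leaves fewer points outside the band, but it also means we accept larger errors for the points inside it, so in practice ‘a’ is a trade-off rather than something to make as large as possible.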
Thank you all for reading this. Stay tuned for further blogs. Have a nice day.