SVR: Machine Learning

TC. Lin
4 min read · Jan 4, 2023


This article continues from the previous: Polynomial Regression.

Welcome back! In today’s article, we will talk about another popular regression algorithm: Support Vector Regression, also known as Support Vector Machine applied to regression.

What’s with the name? To be honest, Support Vector Machine (SVM) is a broad topic; more specifically, we will talk about Support Vector Machine applied to regression: Support Vector Regression (SVR).

The photo came from: https://scikit-learn.org/

In the algorithm map provided by Scikit-Learn, we can see that SVC falls under classification and SVR under regression, yet both belong to the SVM family. That is why I said SVM is a broad topic.

To begin, remember the article I wrote on Linear Regression? I mentioned that Linear Regression is all about finding the line of best fit, and to do so, it uses an idea known as Ordinary Least Squares, which is all about minimizing the squared errors.

The image came from: https://www.superdatascience.com/

The image shows that in order to find the line of best fit, we have to minimize the error. To do this, we take the difference between each actual point and the predicted value of y, and square it. Then, we sum up all of those squared differences; the line that minimizes this sum is the line of best fit.
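As a rough sketch of that idea (assuming the actual and predicted values are NumPy arrays; the function name here is just for illustration):

import numpy as np

def sum_of_squared_errors(y_actual, y_predicted):
    # Square each residual (actual - predicted) and sum them up;
    # the line of best fit is the one that minimizes this value.
    return np.sum((y_actual - y_predicted) ** 2)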

Alright, so why did I mention all these? Well, the idea of SVR is kind of similar.

The image came from: https://www.theclickreader.com/support-vector-regression/

In general, for any value that falls inside the tube, we ignore the error; this gives some flexibility to our model.
However, for values that fall outside of the tube, we do care about the error, and it is calculated as the distance between the value itself and the boundary of the tube.
These errors act as an invisible force that decides the shape of the tube; the points outside the tube are the support vectors, and that is where the name SVR comes from.

Please note that the above example is linear SVR, and this is just scratching the surface of SVM, but it is enough to get us going and understand the theory behind it. For non-linear SVR, just imagine that there is a way (a kernel function) for SVM to capture a non-linear relationship between the values and fit a curved tube instead of a straight one.
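To make the tube idea a bit more concrete, here is a minimal sketch of the error described above, assuming a tube of half-width epsilon (the function name and default value are just for illustration):

import numpy as np

def epsilon_insensitive_error(y_actual, y_predicted, epsilon=0.1):
    # Points inside the tube (|residual| <= epsilon) contribute zero error;
    # points outside contribute only their distance to the tube's boundary.
    return np.maximum(np.abs(y_actual - y_predicted) - epsilon, 0.0)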

Let’s implement SVR!

1. Preparing the data

Using the same data provided by https://www.superdatascience.com/.

We first get our feature X by keeping only the Level column, and our target y by keeping only the Salary column; y is the variable that we want to predict.

# df is the dataset loaded into a pandas DataFrame (Position, Level, Salary columns)
X = df.drop(['Salary', 'Position'], axis=1).values  # keep only the Level column
y = df['Salary'].values  # the target we want to predict

2. Feature Scaling

In this example, it is important for us to perform feature scaling on both X and y.
This is because, unlike linear regression, SVR has no explicit equation with coefficients that can absorb the different scales of the features and the target; the relationship between X and y is implicit. Without feature scaling, the larger-scaled values would dominate and we would end up with a poor model.

from sklearn.preprocessing import StandardScaler

sc_X = StandardScaler()
X = sc_X.fit_transform(X)

sc_y = StandardScaler()
y = sc_y.fit_transform(y.reshape(-1, 1))  # StandardScaler expects a 2D array

With the help of the Scikit-Learn library, we apply standardization to X and y separately, so most of the values now fall roughly between -3 and +3.
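As a quick, optional sanity check (using the variables defined above), you can confirm that the standardized values have roughly zero mean and unit variance:

# After standardization each column should have mean ~0 and standard deviation ~1
print(X.mean(axis=0), X.std(axis=0))
print(y.mean(), y.std())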

Read more from my previous article: Data Preprocessing.

3. Training the SVR model

from sklearn.svm import SVR

regressor = SVR(kernel='rbf')  # radial basis function kernel

regressor.fit(X, y.ravel())  # SVR expects y as a flat 1D array

To train our SVR model, we simply import SVR from the sklearn.svm module.

In the argument passed to SVR, note that ‘rbf’ (radial basis function) is just one of many SVM kernel functions used to shape the fitted curve; it happens to be the most common one.

Image from: https://data-flair.training/
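If you want to experiment, the kernel is just an argument. A quick comparison might look like the sketch below (our levels-vs-salary dataset is tiny, so the scores are purely illustrative):

from sklearn.svm import SVR

# Try a few of the built-in kernels; 'rbf' is the default and the most common.
for kernel in ['linear', 'poly', 'rbf', 'sigmoid']:
    model = SVR(kernel=kernel)
    model.fit(X, y.ravel())
    print(kernel, model.score(X, y.ravel()))  # R^2 on the training data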

4. Predictions

def input_pred_X(value):
    # Scale the raw input to the same scale as the training data
    return sc_X.transform([[value]])

# Predict on the scaled input, then reverse the prediction back to the original salary scale
sc_y.inverse_transform(regressor.predict(input_pred_X(6.5)).reshape(-1, 1))

Don’t worry, there is nothing we have not seen before.

In the above, I simply make a small function to transform the input value (6.5 in this case) to the same scale as our training data, since we performed feature scaling.

Then, I inverse-transform the prediction back to the original scale to get the salary.

In this case, we get about 170,370 for a position level of 6.5, which is about right!

In this article, I won’t cover the evaluation part just yet, as I want to focus on the theory behind the algorithm. However, some useful evaluation metrics for regression (sketched briefly after the list) are:

  1. R_squared (Coefficient of Determination)
  2. MAE (Mean Absolute Error)
  3. MSE (Mean Squared Error)
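For reference, a minimal sketch of how those metrics could be computed with Scikit-Learn, assuming we evaluate the predictions back on the original salary scale:

from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error

# Bring both the predictions and the targets back to the original salary scale
y_pred = sc_y.inverse_transform(regressor.predict(X).reshape(-1, 1))
y_true = sc_y.inverse_transform(y)

print(r2_score(y_true, y_pred))
print(mean_absolute_error(y_true, y_pred))
print(mean_squared_error(y_true, y_pred))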

That’s it for SVR; I will talk more about SVM in the classification section.

> Continue reading: Decision Tree Regression

