# Part II — Support Vector Machines: Regression

This post is the second part of a series on Support Vector Machines (SVM), which will give you a general understanding of SVMs and how they work (the first part of this series can be found here). It provides practical examples of how to use SVMs to tackle regression problems. Regression involves approximating a mapping function from input variables to a continuous output variable. The approach of using SVMs to solve regression problems is called Support Vector Regression (SVR).


Now, let's try to solve a regression problem using this approach. For this example, we will be using the Boston House price data set, which has 506 records, 13 features and a single output (more information on this data set can be found here).

**1. Imports**

First, we need to import a few libraries.

```python
import math

import pandas
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, cross_validate
from sklearn.utils import shuffle
```

Let's see what we have imported:

- math — allows us to perform mathematical functions with ease
- pandas — allows us to manipulate data structures more easily
- sklearn — a machine learning library for Python

**2. Load data**

Now let’s load our data set and specify the features and the dependent variable.

```python
dataset = pandas.read_csv('Dataset.csv')

X = dataset.iloc[:, 0:13]  # all 13 feature columns
y = dataset.iloc[:, 13]    # the target column
```

**3. Pre-process data**

When pre-processing our data, we use **MinMax scaling** to normalize the data set.

```python
scaler = MinMaxScaler(feature_range=(0, 1))
X = scaler.fit_transform(X)
```

Before feeding the data to the model, it is shuffled. For more information regarding these techniques, refer to this post.
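Shuffling can be done with the `sklearn.utils.shuffle` helper imported earlier, which keeps the feature rows and target values paired. Here is a minimal sketch using a small dummy array in place of the real, scaled Boston data:

```python
import numpy as np
from sklearn.utils import shuffle

# Dummy stand-ins for the scaled features and the target
X_demo = np.arange(10).reshape(5, 2)
y_demo = np.arange(5)

# Shuffle X and y together so their rows stay aligned
X_shuffled, y_shuffled = shuffle(X_demo, y_demo, random_state=42)
```

Passing both arrays to one `shuffle` call (rather than shuffling them separately) is what guarantees each feature row still lines up with its target value.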

**4. Implement model**

We will show you how to use all three kernels of the SVR model. More information on kernels is included in the first part of this series.

*Linear kernel*

```python
def svr_model(X, y):
    gsc = GridSearchCV(
        estimator=SVR(kernel='linear'),
        param_grid={
            'C': [0.1, 1, 100, 1000],
            'epsilon': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10],
        },
        cv=5, scoring='neg_mean_squared_error', verbose=0, n_jobs=-1)

    grid_result = gsc.fit(X, y)
    best_params = grid_result.best_params_

    # coef0 is omitted here because it only affects the 'poly' and 'sigmoid' kernels
    best_svr = SVR(kernel='linear', C=best_params["C"], epsilon=best_params["epsilon"],
                   shrinking=True, tol=0.001, cache_size=200, verbose=False, max_iter=-1)

    scoring = {
        'abs_error': 'neg_mean_absolute_error',
        'squared_error': 'neg_mean_squared_error'}

    scores = cross_validate(best_svr, X, y, cv=10, scoring=scoring, return_train_score=True)
    return f"MAE: {abs(scores['test_abs_error'].mean())} | RMSE: {math.sqrt(abs(scores['test_squared_error'].mean()))}"

# Run
print(svr_model(X, y))
```

Here’s what we get as the error metric results for our SVR model which uses the linear kernel.

*Polynomial kernel*

```python
def svr_model(X, y):
    gsc = GridSearchCV(
        estimator=SVR(kernel='poly'),
        param_grid={
            'C': [0.1, 1, 100, 1000],
            'epsilon': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10],
            'degree': [2, 3, 4],
            'coef0': [0.1, 0.01, 0.001, 0.0001],
        },
        cv=5, scoring='neg_mean_squared_error', verbose=0, n_jobs=-1)

    grid_result = gsc.fit(X, y)
    best_params = grid_result.best_params_

    best_svr = SVR(kernel='poly', C=best_params["C"], epsilon=best_params["epsilon"],
                   coef0=best_params["coef0"], degree=best_params["degree"],
                   shrinking=True, tol=0.001, cache_size=200, verbose=False, max_iter=-1)

    scoring = {
        'abs_error': 'neg_mean_absolute_error',
        'squared_error': 'neg_mean_squared_error'}

    scores = cross_validate(best_svr, X, y, cv=10, scoring=scoring, return_train_score=True)
    return f"MAE: {abs(scores['test_abs_error'].mean())} | RMSE: {math.sqrt(abs(scores['test_squared_error'].mean()))}"

# Run
print(svr_model(X, y))
```

Here’s what we get as the error metric results for our SVR model which uses the polynomial kernel.

*RBF kernel*

```python
def svr_model(X, y):
    gsc = GridSearchCV(
        estimator=SVR(kernel='rbf'),
        param_grid={
            'C': [0.1, 1, 100, 1000],
            'epsilon': [0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10],
            'gamma': [0.0001, 0.001, 0.005, 0.1, 1, 3, 5],
        },
        cv=5, scoring='neg_mean_squared_error', verbose=0, n_jobs=-1)

    grid_result = gsc.fit(X, y)
    best_params = grid_result.best_params_

    # coef0 is omitted here because it only affects the 'poly' and 'sigmoid' kernels
    best_svr = SVR(kernel='rbf', C=best_params["C"], epsilon=best_params["epsilon"],
                   gamma=best_params["gamma"],
                   shrinking=True, tol=0.001, cache_size=200, verbose=False, max_iter=-1)

    scoring = {
        'abs_error': 'neg_mean_absolute_error',
        'squared_error': 'neg_mean_squared_error'}

    scores = cross_validate(best_svr, X, y, cv=10, scoring=scoring, return_train_score=True)
    return f"MAE: {abs(scores['test_abs_error'].mean())} | RMSE: {math.sqrt(abs(scores['test_squared_error'].mean()))}"

# Run
print(svr_model(X, y))
```

Here’s what we get as the error metric results for our SVR model which uses the RBF kernel.

The following table contains a summary of all the error metric results obtained for the three kernels using the SVR model.

Based on the results, we can say that the SVR-RBF model performs best on the given dataset, whereas the SVR-linear model performs worst. Since the non-linear kernels achieve the lowest errors, we can infer that the dataset follows a non-linear pattern.
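Since the three functions above differ only in their kernel and parameter grid, the whole comparison can also be written as a single loop. The sketch below uses a tiny synthetic non-linear dataset and a deliberately reduced grid (both are illustrative assumptions, not the Boston data or the full grids above) so it runs quickly:

```python
import math
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import GridSearchCV, cross_validate

# Synthetic stand-in for the real data: a noisy non-linear target
rng = np.random.RandomState(0)
X = rng.uniform(0, 1, size=(60, 2))
y = np.sin(2 * np.pi * X[:, 0]) + 0.1 * rng.randn(60)

results = {}
for kernel in ('linear', 'poly', 'rbf'):
    # Tune C and epsilon for each kernel on a small grid
    gsc = GridSearchCV(
        estimator=SVR(kernel=kernel),
        param_grid={'C': [1, 100], 'epsilon': [0.01, 0.1]},
        cv=3, scoring='neg_mean_squared_error', n_jobs=-1)
    best = gsc.fit(X, y).best_estimator_

    # Cross-validate the tuned model and record its RMSE
    scores = cross_validate(best, X, y, cv=5,
                            scoring='neg_mean_squared_error')
    results[kernel] = math.sqrt(abs(scores['test_score'].mean()))

for kernel, rmse in results.items():
    print(f"{kernel}: RMSE = {rmse:.3f}")
```

On this synthetic sine-shaped target the RBF kernel should come out well ahead of the linear one, mirroring the pattern observed on the Boston data.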

This brings us to the end of this post. I hope this article gave you a good understanding of how to use SVMs to tackle regression problems. Until next time, Adios…

More articles related to Machine Learning:

- Part I — Support Vector Machines: An Overview
- A practical guide to getting started with Machine Learning
- A Beginners guide to Random Forest Regression
