Support Vector Machines (SVM) Explanation & Mini-Project

Youness Habach
5 min read · Apr 24, 2020


We’ll talk about Support Vector Machines: an explanation, some use cases, and how to implement a simple SVM model for classification and regression.

Definition

Support vector machines (SVMs), also known as large-margin separators, are a generalization of linear classifiers: a family of supervised learning techniques for solving classification and regression problems, developed in the 1990s from Vladimir Vapnik’s theoretical work on statistical learning theory (Vapnik-Chervonenkis theory).

General principle of SVM

As we said before, SVMs are used for classification and regression problems, and solving either one comes down to constructing a function h that maps an input vector x to an output y (the latter is called the target): y = h(x)

Example: we limit ourselves for the moment to a discrimination problem with only 2 classes (binary discrimination) [Class1, Class2], that is to say y = -1 or y = +1, the input vector x living in a space X equipped with a scalar product.

Advantages & Disadvantages

Strengths: thanks to kernels, the algorithm can model decision boundaries for both linear and non-linear problems. It is also quite robust against overfitting, especially in high-dimensional spaces.

Weaknesses: SVM requires a lot of memory, is more difficult to tune because of the importance of choosing the right kernel, and does not give good results on fairly large datasets.

Brief explanation

Imagine that we have a dataset of 6 points, as follows.

As you can see, they are linearly separable.
But the problem is that there are thousands of lines that can do the trick.

All these lines are valid and classify the points 1000000% correctly. The difference is that these lines, while valid, are not all optimal.

As shown in the figure below, the principle is simple: the goal is to separate the data into classes using a boundary that is as “simple” as possible, so that the distance between the different groups of data and the boundary separating them is maximal. This distance is also called the “margin”, which is why SVMs are qualified as “wide-margin separators”, the “support vectors” being the data points closest to the boundary.
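To make the margin concrete, here is a minimal sketch (my illustration, not the article’s code) that fits a linear SVM on six hand-picked separable points with scikit-learn and prints the support vectors and the margin width:

import numpy as np
from sklearn.svm import SVC

# Six hand-picked, linearly separable points (illustrative values)
X = np.array([[1, 2], [2, 1], [2, 3],    # class -1
              [6, 5], [7, 7], [8, 6]])   # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates a hard-margin linear SVM
clf = SVC(kernel='linear', C=1e6).fit(X, y)

w = clf.coef_[0]  # normal vector of the separating hyperplane w·x + b = 0
print("Support vectors:\n", clf.support_vectors_)
print("Margin width:", 2 / np.linalg.norm(w))  # distance between the two margin lines

Only the points printed as support vectors determine the boundary; moving any other point (without crossing the margin) leaves the separator unchanged.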

Project

Dataset to use

In this part we will work through the two types of problem that the SVM algorithm handles:
1) First, SVM for classification: we use the “Social Network Ads” dataset from Kaggle; here is the link to the dataset: Social_Network_Ads

The dataset is composed of 5 columns, namely [User ID, Gender, Age, Estimated Salary, Purchased], and 400 rows.

Structure of Social_Network_Ads dataset

2) Second, SVM for regression: we use the “Position Salaries” dataset from Kaggle; here is the link to the dataset: Position_Salaries

The dataset is composed of 3 columns, namely [Position, Level, Salary], and 10 rows.

Expected results

Classification
Visualize and identify the points of each class, and draw the dividing line on the test set.

Regression
Visualize the data points, draw the regression curve, and predict the salary of an employee at levels 4.5 and 8.5.

Steps to follow

Classification

  • Import the necessary libraries
  • Import the dataset and identify the data and labels (matrix X and vector Y)
  • Split the data and labels into training and test sets
  • Apply feature scaling if needed
  • Create an SVC object for classification from the SVM module
  • Fit the model on the training set
  • Predict the results on the test set
  • Evaluate the model

Regression

  • Import the necessary libraries
  • Import the dataset
  • Apply feature scaling if needed
  • Create an SVR object for regression from the SVM module
  • Fit the model on the dataset
  • Predict the results

Algorithm Implementation (Classification)

Source Code

This part of the code covers the preprocessing and feature scaling steps, then splits the data into training and test sets, and finally declares our SVC classification model from the SVM module in order to fit and predict, as I mentioned in “Steps to follow”.
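The original snippet was shown as an image, so here is a minimal sketch of those steps (my reconstruction, assuming the file is named Social_Network_Ads.csv, that Age and Estimated Salary in columns 2 and 3 are the features, and that Purchased is the label):

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Load the dataset: Age and Estimated Salary as features, Purchased as label
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values

# Split the data and labels into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Feature scaling: SVMs are sensitive to feature magnitudes
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Declare the SVC model, fit it on the training set, predict on the test set
classifier = SVC(kernel='linear', random_state=0)
classifier.fit(X_train, y_train)
y_pred = classifier.predict(X_test)

# Evaluate the model
print(confusion_matrix(y_test, y_pred))
print('Accuracy:', accuracy_score(y_test, y_pred))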

This part presents the code to display the data points of the two classes together with the separating boundary.
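As a sketch again (reusing classifier, X_test, and y_test from the block above), the usual meshgrid-based plot looks like this:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Color the decision regions, then overlay the test points of each class
X_set, y_set = X_test, y_test
x1, x2 = np.meshgrid(
    np.arange(X_set[:, 0].min() - 1, X_set[:, 0].max() + 1, 0.01),
    np.arange(X_set[:, 1].min() - 1, X_set[:, 1].max() + 1, 0.01))
plt.contourf(x1, x2,
             classifier.predict(np.c_[x1.ravel(), x2.ravel()]).reshape(x1.shape),
             alpha=0.3, cmap=ListedColormap(('red', 'green')))
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color=('red', 'green')[i], label=j)
plt.title('SVC (test set)')
plt.xlabel('Age (scaled)')
plt.ylabel('Estimated Salary (scaled)')
plt.legend()
plt.show()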

Result

We are going to visualize the test set for two kinds of SVC objects, using a linear and a non-linear kernel.

Linear kernel
Non-linear kernel

Algorithm Implementation (Regression)

Source Code

As before, we declare an SVR regression model from the SVM module in order to fit and predict, as I mentioned in “Steps to follow”.
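The original code was an image here as well; this is a minimal sketch under the same assumptions (file named Position_Salaries.csv, Level in column 1 as the feature, Salary in column 2 as the target), starting with the linear kernel:

import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Load the dataset: Level is the feature, Salary the target
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values   # kept 2-D for scikit-learn
y = dataset.iloc[:, 2].values

# Feature scaling: SVR does not scale internally, so scale both X and y
sc_X, sc_y = StandardScaler(), StandardScaler()
X_scaled = sc_X.fit_transform(X)
y_scaled = sc_y.fit_transform(y.reshape(-1, 1)).ravel()

# Declare and fit the SVR model
regressor = SVR(kernel='linear')
regressor.fit(X_scaled, y_scaled)

# Predict the salaries at levels 4.5 and 8.5, back in original units
levels = sc_X.transform(np.array([[4.5], [8.5]]))
preds = sc_y.inverse_transform(regressor.predict(levels).reshape(-1, 1))
print(preds.ravel())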

Result
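The figure below can be reproduced with a few lines (continuing from the sketch above):

import matplotlib.pyplot as plt

# Plot the data points and the fitted regression curve, in original units
X_grid = np.arange(X.min(), X.max() + 0.01, 0.01).reshape(-1, 1)
y_curve = sc_y.inverse_transform(
    regressor.predict(sc_X.transform(X_grid)).reshape(-1, 1))
plt.scatter(X, y, color='red')
plt.plot(X_grid, y_curve, color='blue')
plt.title('SVR: Level vs Salary')
plt.xlabel('Level')
plt.ylabel('Salary')
plt.show()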

As you can see, the distribution is far from linear, so a linear model cannot handle this problem (at the last point the prediction is far from the real value; this is called underfitting). Note that SVM offers several types of kernels (‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’, ‘precomputed’).

The prediction at level 4.5 is 130101.64 and at level 8.5 it is 303706.02.

So we replace regressor = SVR(kernel='linear') with regressor = SVR(kernel='rbf')
and re-run the program.
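In code the change is just the kernel argument, reusing the scaled data from the sketch above (note that ‘rbf’ is in fact scikit-learn’s default kernel for SVR):

# Refit with the RBF kernel and predict again
regressor = SVR(kernel='rbf')
regressor.fit(X_scaled, y_scaled)
print(sc_y.inverse_transform(regressor.predict(levels).reshape(-1, 1)).ravel())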

And the predictions are now 115841.63 for level 4.5 and 403162.82 for level 8.5.

Conclusion

Among the limitations of SVM:

  • The SVM algorithm is not suitable for large data sets.
  • SVM does not work very well when the dataset is noisy.
  • In cases where the number of features per data point exceeds the number of training samples, SVM will perform poorly.
  • Since the support vector classifier works by placing data points above and below the classifying hyperplane, there is no direct probabilistic interpretation of the classification.

Resources

If you have a question, don’t hesitate to write it in the comments below.
Make sure to follow me on Medium. You can find me on LinkedIn or contact me by email.
Thanks for reading.
