What is an SVM?
SVM (Support Vector Machine) is a supervised machine learning algorithm that can be used for classification problems. It classifies two sets of data by drawing a line, a decision boundary that separates the classes; that line is called a hyperplane. It is generally used for binary classification.
In the diagram above, the ‘+’ and ‘-’ symbols are two classes separated by a blue line; that line is the hyperplane.
I am always interested in applying algorithms to real-world problems. So, before moving further, let’s see some use cases of SVM.
Use Cases of SVM
- Classification Problem
- Text and hypertext categorization
- Generalized predictive control (GPC)
- Face detection
- Outlier detection
- Handwriting recognition
How Does It Work?
You must be wondering how it works and how we decide on the best hyperplane for our dataset.
- Identify the right hyperplane: The hyperplane (line) must separate the two classes, meaning that after drawing the line you can say that class A is on one side and class B is on the other.
- Select the best hyperplane: There may be several hyperplanes (lines) that separate the classes. So, how do we select the best one?
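The first check can be sketched in a few lines of Python. This is only a minimal illustration, assuming 2-D points and a candidate line written as w·x + b = 0; the helper names `side` and `separates` and the toy data are made up for this example.

```python
def side(w, b, point):
    """Signed value w.x + b; its sign tells which side of the line the point is on."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b


def separates(w, b, points, labels):
    """True if every +1 point is on the positive side and every -1 point on the negative side."""
    return all(label * side(w, b, p) > 0 for p, label in zip(points, labels))


# Hypothetical toy data: class A (+1) on the right, class B (-1) on the left.
points = [(2, 0), (3, 1), (3, -1), (-2, 0), (-3, 1), (-3, -1)]
labels = [1, 1, 1, -1, -1, -1]

print(separates((1, 0), 0, points, labels))  # vertical line x = 0   -> True
print(separates((0, 1), 0, points, labels))  # horizontal line y = 0 -> False
```

The vertical line keeps each class on its own side, while the horizontal line cuts through both classes, so only the first is a valid candidate.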
Selecting the Best Hyperplane
The best line is the one with the largest margin. The margin is the distance between the line and the data point nearest to it.
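To make the margin idea concrete, here is a small sketch that compares two candidate lines by their margin and keeps the one with the larger value. It assumes 2-D points and lines of the form w·x + b = 0; the function names, the toy data, and the two candidate lines are assumptions for illustration, and a real SVM searches over all possible lines rather than a fixed list.

```python
import math


def distance_to_line(w, b, point):
    """Perpendicular distance from a point to the line w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, point)) + b) / norm


def margin(w, b, points):
    """Margin of a line: the distance to its nearest data point."""
    return min(distance_to_line(w, b, p) for p in points)


# Hypothetical toy data; both candidate lines below separate the two classes.
points = [(2, 0), (3, 1), (3, -1), (-2, 0), (-3, 1), (-3, -1)]

candidates = {
    "x = 0": ((1, 0), 0),   # w = (1, 0), b = 0
    "x = 1": ((1, 0), -1),  # w = (1, 0), b = -1
}

for name, (w, b) in candidates.items():
    print(name, "margin =", margin(w, b, points))
# x = 0 margin = 2.0
# x = 1 margin = 1.0

best = max(candidates, key=lambda name: margin(*candidates[name], points))
print("best line:", best)  # best line: x = 0
```

Both lines separate the classes, but x = 0 stays twice as far from its nearest point, so it is the better choice.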
How is SVM different from other classification techniques?
For classification, we have other methods such as decision trees and logistic regression. So, why use SVM? On many problems, particularly those with a clear margin between the classes, it can give better accuracy than a decision tree or logistic regression.
See this diagram :)
Advantages and Limitations
- It works really well when there is a clear margin of separation.
- It is effective in high dimensional spaces.
- It is effective in cases where the number of dimensions is greater than the number of samples.
- It uses a subset of training points in the decision function (called support vectors), so it is also memory efficient.
- It doesn’t perform well on large datasets because the required training time is high.
- It also doesn’t perform very well when the dataset is noisy, i.e. when the target classes overlap.
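The point about support vectors in the list above can be illustrated with a short sketch. It assumes the maximum-margin line for the toy data is already known (here x = 0, scaled so that the nearest points satisfy y · (w·x + b) = 1, a common convention); the data, the weights, and the helper name `decision_value` are assumptions for this example, not the output of a trained model.

```python
def decision_value(w, b, point):
    """Value of w.x + b for a point; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, point)) + b


# Hypothetical toy data: class +1 on the right, class -1 on the left.
points = [(2, 0), (3, 1), (3, -1), (-2, 0), (-3, 1), (-3, -1)]
labels = [1, 1, 1, -1, -1, -1]

# Assumed maximum-margin line for this data: x = 0, written as w = (0.5, 0), b = 0
# so that the closest points give y * (w.x + b) = 1.
w, b = (0.5, 0.0), 0.0

# Support vectors are the points that sit exactly on the margin.
support_vectors = [
    p for p, y in zip(points, labels)
    if abs(y * decision_value(w, b, p) - 1.0) < 1e-9
]
print(support_vectors)  # [(2, 0), (-2, 0)]
```

Only two of the six points sit on the margin. The other four could move further away from the line without changing it, which is why the model only needs to remember the support vectors.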
Time for some brainstorming. Ready?
Brain-storm 1: Can SVM be used to categorize multiple classes?
Brain-storm 2: Can regression be used for classification? If yes, then why do we need classification techniques?
I will be happy to read your responses and answer any of your queries.
Implementation of SVM from scratch : https://github.com/dipakkr/learn-ml/tree/master/SVM