C and Gamma in SVM

A Man Kumar
4 min read · Dec 17, 2018

--

I assume you already know a little about SVM, but I will start with a short overview anyway.

What is our goal for SVM?

Answer: to find the best point (in 1-D), line (in 2-D), plane (in 3-D), or hyperplane (in more than 3 dimensions) to separate the classes. Have a look at the image below.

That works for a separable dataset, but what do we do when the dataset is not separable?

In the above image, if we choose A we get two errors, but if we choose B we get none. But do you think B is the right choice for a decision boundary? Of course not. A looks like the better decision boundary even though it makes errors. From this image you can see that errors play an important role in SVM.

So in this blog, our objective is to learn how to find good values for C and Gamma.

Before we do that, let me introduce what C and Gamma are.

What is C ?

C is a hyperparameter in SVM that controls how strongly training errors are penalized, and therefore the trade-off between the margin and the errors. What does it mean to control error or margin? Let's understand with a visualization.

You can see that a low C tolerates more training errors in exchange for a wider (softer) margin, while a large C penalizes errors heavily and so allows fewer of them.

In the image, the boundary trained with the large C misclassifies fewer training points than the one trained with the low C.
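To make this concrete, here is a small sketch (my own illustration, assuming scikit-learn; the dataset and parameter values are invented for the demo) that fits the same noisy dataset with a low and a high C and counts the training errors:

```python
import numpy as np
from sklearn.svm import SVC

# Two noisy 2-D blobs; a few labels are flipped so the data is not separable.
rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) - 2, rng.randn(20, 2) + 2])
y = np.array([0] * 20 + [1] * 20)
y[:3] = 1  # mislabel three points on purpose

def training_errors(C):
    """Fit an RBF SVM with the given C and count misclassified training points."""
    clf = SVC(kernel="rbf", gamma=1.0, C=C).fit(X, y)
    return int((clf.predict(X) != y).sum())

print("low  C=0.01 ->", training_errors(0.01), "training errors")
print("high C=100  ->", training_errors(100), "training errors")
```

Because the high-C model pays a heavy price for each slack violation, it bends to fit even the mislabeled points, so it makes no more training errors than the low-C model.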

Let's look at another dataset.

In the above image, the boundary fitted with a high C makes no training errors, while the one with a low C makes two. But the zero-error boundary B is not the best decision boundary; A looks better. So you can observe that fewer training errors does not automatically mean a better decision boundary.

Let's see one more example.

In the above image, a large C gives a low training error, a medium C a medium error, and a low C a high error. And here the medium C gives the better model.

So what we have learned so far: a large C means fewer training errors and a small C means more. Note that a low training error does not mean we have a good model; it depends entirely on the dataset and how noisy it is. There is no rule of thumb that a low, medium, or high C will always work best.

So far we have learned what C is and how it is used. Now let's move on to Gamma.

What is Gamma?

Gamma is used when we use the Gaussian RBF kernel. If you use a linear or polynomial kernel, you do not need gamma; you only need the C hyperparameter. In some places the same idea is expressed with sigma instead. Actually, sigma and gamma are related: the RBF kernel can be written as K(x, x') = exp(-||x - x'||² / (2σ²)) = exp(-γ||x - x'||²), so γ = 1 / (2σ²).

So gamma and sigma describe the same thing: a high gamma corresponds to a small sigma, and vice versa.
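Here is a quick numeric check (my own sketch, with made-up points and a made-up sigma) that the two parameterizations of the RBF kernel agree when γ = 1 / (2σ²):

```python
import numpy as np

x = np.array([1.0, 2.0])
z = np.array([2.0, 0.0])
d2 = float(np.sum((x - z) ** 2))  # squared Euclidean distance = 5.0

sigma = 0.5
gamma = 1.0 / (2.0 * sigma ** 2)  # = 2.0

k_sigma = np.exp(-d2 / (2.0 * sigma ** 2))  # sigma form of the RBF kernel
k_gamma = np.exp(-gamma * d2)               # gamma form of the RBF kernel
print(k_sigma, k_gamma)  # the two forms give the same kernel value
```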

Gamma is a hyperparameter which we have to set before training the model. Gamma decides how much curvature we want in the decision boundary.

Gamma high means more curvature.

Gamma low means less curvature.

As you can see in the image above, a high gamma gives more curvature and a low gamma gives less curvature.
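The same effect can be seen numerically. In this sketch (my own illustration, assuming scikit-learn; dataset and gamma values are invented), a high gamma lets the boundary curve around individual noisy points, so it fits the training set at least as well as a low gamma:

```python
import numpy as np
from sklearn.svm import SVC

# Two noisy 2-D blobs with a few flipped labels, as before.
rng = np.random.RandomState(1)
X = np.vstack([rng.randn(25, 2) - 2, rng.randn(25, 2) + 2])
y = np.array([0] * 25 + [1] * 25)
y[:3] = 1  # noise the boundary must curve around to fit

def training_errors(gamma):
    """Fit an RBF SVM with the given gamma and count misclassified training points."""
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X, y)
    return int((clf.predict(X) != y).sum())

print("low  gamma=0.01 ->", training_errors(0.01), "training errors")
print("high gamma=100  ->", training_errors(100), "training errors")
```

A low training error from a very high gamma is usually overfitting, which is exactly why the boundary with more curvature is not automatically better.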

So the question is: when should we set gamma high and when low?

The answer is that it depends entirely on the dataset.

So far we have learned about C and Gamma. Let's recap.

C is a hyperparameter, set before training the model, that controls how much training error is tolerated; Gamma is also a hyperparameter, set before training, that controls how much curvature the decision boundary can have.

For C we generally try values like 0.001, 0.01, 0.1, 1, 10, 100,

and the same for Gamma: 0.001, 0.01, 0.1, 1, 10, 100.

We try every combination of C and Gamma with a grid search. See the image below.

The image above shows how grid search works: we choose the pair of Gamma and C values that gives the best result.
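As a sketch of that procedure (assuming scikit-learn and a toy dataset of my own choosing), `GridSearchCV` tries every (C, Gamma) pair from the lists above and keeps the best one by cross-validation:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# A small non-linearly-separable toy dataset.
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

values = [0.001, 0.01, 0.1, 1, 10, 100]
param_grid = {"C": values, "gamma": values}

# 5-fold cross-validated grid search over all 36 (C, gamma) pairs.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print("best params:", search.best_params_)
print("best CV accuracy: %.3f" % search.best_score_)
```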

I hope you now understand what C and Gamma are and how they are used to train a model.

If you liked this article, please give it a clap. It motivates me to write more blogs.

Thanks for reading.

Credit: www.appliedaicourse.com and its team.
