Variance in Machine Learning

Hasan Sajedi
4 min read · Dec 22, 2018

Definition of variance and standard deviation in statistics:

What is variance? It measures how spread out the values in a column are.
What is standard deviation? The distance between each data point and the mean is its deviation; the standard deviation, which is the square root of the variance, summarizes the typical size of those distances.

In machine learning, variance helps us determine how spread out the records of a single column are, or, in other words, how far the values of a dataset lie from their mean.

Formula for calculating the variance
Formula for calculating the standard deviation

Sigma is written as σ in Greek; σ denotes the standard deviation and σ² the variance.
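
Since the formula images are not reproduced here, the two formulas can be written out as follows. This is the usual population form, assuming N is the number of records, x_i is the i-th record, and μ is the mean; the symbol names are conventional choices rather than ones confirmed by the original figures.

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad
\sigma = \sqrt{\sigma^2}
```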

Calculating the variance and standard deviation by example
Let’s go on with an example (taken from the reference below).

In this example, several dogs have been brought in and we want to use the variance to measure how much their heights differ.

The heights of the dogs, measured at the shoulder, are 600 mm, 470 mm, 170 mm, 430 mm, and 300 mm, respectively. Our task is to find the mean, the variance, and the standard deviation.

Mean = (600 + 470 + 170 + 430 + 300) / 5 =>

Mean = 1970 / 5 =>

Mean = 394
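
The arithmetic is easy to check in code. Here is a minimal Python sketch; the variable names are illustrative and do not come from the original example.

```python
# Shoulder heights of the five dogs, in millimetres (values from the example).
heights_mm = [600, 470, 170, 430, 300]

total = sum(heights_mm)             # 1970
mean_mm = total / len(heights_mm)   # 1970 / 5 = 394.0

print(f"sum = {total}, mean = {mean_mm}")  # sum = 1970, mean = 394.0
```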

So the mean, drawn on the figure, looks like this:

The mean shown in the figure (the green line is the mean)

Now calculate the difference between each dog's height and the mean:

Difference between each dog's height and the mean
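
As a rough sketch, the same differences can be computed with a list comprehension (the names are again illustrative). The deviations always sum to zero, which is why they are squared before being averaged in the next step.

```python
heights_mm = [600, 470, 170, 430, 300]
mean_mm = sum(heights_mm) / len(heights_mm)

# Signed distance of each dog from the mean: positive = taller, negative = shorter.
deviations = [h - mean_mm for h in heights_mm]

print(deviations)       # [206.0, 76.0, -224.0, 36.0, -94.0]
print(sum(deviations))  # 0.0
```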

Now we calculate the variance, which is the average of the squared differences:

Calculating the variance
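
Written out step by step, treating the five dogs as the whole population (so the sum is divided by N = 5), the calculation is:

```latex
\sigma^2 = \frac{206^2 + 76^2 + (-224)^2 + 36^2 + (-94)^2}{5}
         = \frac{42436 + 5776 + 50176 + 1296 + 8836}{5}
         = \frac{108520}{5}
         = 21704
```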

Now, if we take the square root of the variance, we get the standard deviation:

Calculating the standard deviation
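
Both steps can be reproduced from scratch in a few lines of Python. This is a sketch that assumes the population form (dividing by N); the names are illustrative.

```python
import math

heights_mm = [600, 470, 170, 430, 300]
n = len(heights_mm)
mean_mm = sum(heights_mm) / n

# Population variance: the mean of the squared deviations (divide by N).
variance = sum((h - mean_mm) ** 2 for h in heights_mm) / n

# Standard deviation: the square root of the variance, back in millimetres.
std_dev = math.sqrt(variance)

print(f"variance = {variance}")                   # variance = 21704.0
print(f"standard deviation = {std_dev:.1f} mm")   # standard deviation = 147.3 mm
```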

Now we can update our figure:

The standard deviation shown among the dogs

Now we can easily see which heights count as standard, that is, which lie within one standard deviation of the mean. The leftmost dog is outside our standard range because it is taller than that range, and the middle dog, the shortest one, also falls outside the standard range for this dataset.
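
One way to make this check concrete is to test each height against the band from (mean - standard deviation) to (mean + standard deviation). The sketch below assumes that "standard" means "within one population standard deviation of the mean"; the `classify` helper is hypothetical.

```python
import math

heights_mm = [600, 470, 170, 430, 300]
n = len(heights_mm)
mean_mm = sum(heights_mm) / n
std_dev = math.sqrt(sum((h - mean_mm) ** 2 for h in heights_mm) / n)

# One-standard-deviation band around the mean (roughly 247 mm to 541 mm).
low, high = mean_mm - std_dev, mean_mm + std_dev

def classify(height):
    """Label a height relative to the one-standard-deviation band."""
    if height > high:
        return "above"
    if height < low:
        return "below"
    return "within"

labels = [classify(h) for h in heights_mm]
for h, label in zip(heights_mm, labels):
    print(f"{h} mm: {label}")
```

Only the 600 mm dog (above) and the 170 mm dog (below) fall outside the band.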

Note: When the variance is small and close to zero, the records of a column are very close to the mean and to each other. A high variance tells us that the records are spread far from the mean and from each other.

The concept of variance in machine learning:
This is the simplest definition of variance and standard deviation, but it is only the statistical view. As a data scientist, you also need to know what variance implies for your machine learning model.

So we have two concepts:

Low variance: tells you that changes in the training dataset cause only small changes in the estimate of the target function.
High variance: tells you that even small changes in the training dataset cause large changes in the estimate of the target function.

Examples of low-variance machine learning algorithms include linear regression, linear discriminant analysis, and logistic regression.

Examples of high-variance machine learning algorithms include decision trees, k-nearest neighbors, and support vector machines.
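
To see the difference, one can retrain two models on many freshly drawn training sets and measure how much their prediction at a fixed point moves around. The sketch below uses only the standard library. It makes several assumptions that are not in the original article: synthetic data from a made-up relation y = 2x + noise, a hand-written least-squares line standing in for linear regression, and a 1-nearest-neighbour rule standing in for k-nearest neighbors.

```python
import random
import statistics

def make_dataset(rng, n=30):
    """Draw a noisy training set from the hypothetical relation y = 2x + noise."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + rng.gauss(0, 1) for x in xs]
    return xs, ys

def predict_linear(xs, ys, x0):
    """Fit an ordinary least-squares line to (xs, ys) and evaluate it at x0."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return my + (sxy / sxx) * (x0 - mx)

def predict_1nn(xs, ys, x0):
    """1-nearest-neighbour: return the target of the training point closest to x0."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]

rng = random.Random(0)   # fixed seed so the run is repeatable
x0 = 5.0                 # the point at which both models are asked to predict
linear_preds, knn_preds = [], []
for _ in range(200):     # 200 independent training sets
    xs, ys = make_dataset(rng)
    linear_preds.append(predict_linear(xs, ys, x0))
    knn_preds.append(predict_1nn(xs, ys, x0))

# Variance of each model's prediction across the different training sets.
linear_var = statistics.pvariance(linear_preds)
knn_var = statistics.pvariance(knn_preds)
print(f"linear fit: {linear_var:.3f}")
print(f"1-NN:       {knn_var:.3f}")
```

In this setup the 1-NN predictions should vary far more from one training set to the next than the linear fit does, which is what "high variance" means for a model.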

Finally, when calculating the variance and standard deviation, you may encounter the two concepts of sample and population:

When we talk about sample and population, we are really talking about the N in the formula given earlier. Let’s look at the same example of the dogs. If these five dogs are our entire dataset, we divide the sum of squared differences by the total number of records in that column, which is five here (a total of five dogs). If these dogs are only a sample drawn from a larger dataset and are meant to represent it, we should put N-1 in the formula instead of N. So:

What is a population? We compute over all the records of a column, that is, over the whole dataset.
What is a sample? We compute over only some of the records of a column, a subset drawn from the dataset.
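
Python's standard `statistics` module provides both versions, which makes the N versus N-1 difference easy to see on the dog heights. This is a small sketch; the variable names are illustrative.

```python
import statistics

heights_mm = [600, 470, 170, 430, 300]

# Population versions: divide the sum of squared differences by N.
pop_var = statistics.pvariance(heights_mm)   # 108520 / 5 = 21704
pop_sd = statistics.pstdev(heights_mm)

# Sample versions: divide by N - 1 instead.
samp_var = statistics.variance(heights_mm)   # 108520 / 4 = 27130
samp_sd = statistics.stdev(heights_mm)

print(f"population: variance = {pop_var}, sd = {pop_sd:.1f}")
print(f"sample:     variance = {samp_var}, sd = {samp_sd:.1f}")
```

Dividing by N-1 always gives the larger value, and the gap shrinks as the number of records grows.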

The point of a sample is that, in statistics, we do not always need to compute over the whole of a dataset; a sample alone gives us most of the information. But keep in mind that by using a sample you lose some of the accuracy you would have with the population, and what you gain is time.

Finally, we have:

The two forms of the standard deviation formula (population and sample)
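
In the usual notation the two forms look like this. The symbols are conventional assumptions rather than ones confirmed by the original figure: μ and σ for the population mean and standard deviation, x̄ and s for their sample counterparts.

```latex
\text{Population:}\quad \sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2}
\qquad
\text{Sample:}\quad s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \bar{x})^2}
```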
