Digit recognition


Just as the first program most of us write prints "hello, world!", the first machine learning problem many of us tackle is handwritten digit recognition on the MNIST dataset. In this article I use one of the simplest machine learning algorithms, k-nearest neighbors (KNN), to solve this famous problem.

So, what is KNN?

KNN is a classification algorithm: it assigns a new data point (a test point) to one of a set of categories. To do so, it measures the distance from the test point to every training point, picks the k closest training points, and assigns the test point the class that holds the majority among them.

The right distance metric depends on the problem being solved. The distance between the skills of two programmers (their scores in a competition), for instance, is measured differently from the distance between two kites flying in the sky (positions in 3D space). The distance between two images is measured differently again, as discussed below.

That’s all there is to k-nearest neighbors! The algorithm is pretty simple.

Finding Distance

The distance between two images can be calculated in many different ways; I used the Euclidean distance. For images, it is computed by summing the squared differences between the corresponding pixels of the two images. Taking the square root of that sum gives the true Euclidean distance, although it does not change which neighbors are closest.

Euclidean Distance

Squaring each pixel difference makes every term non-negative, so positive and negative differences cannot cancel each other out. An alternative is to sum the absolute values of the pixel differences instead, which gives the Manhattan (L1) distance.
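As a minimal sketch of the two options described above, assuming each image is stored as a flat list of pixel intensities (the function names `euclidean_distance` and `manhattan_distance` are my own, chosen for illustration):

```python
import math

def euclidean_distance(img_a, img_b):
    """Square root of the summed squared pixel differences."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(img_a, img_b)))

def manhattan_distance(img_a, img_b):
    """Sum of the absolute pixel differences (the alternative to squaring)."""
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(a - b) for a, b in zip(img_a, img_b))

# Two tiny 4-pixel "images" stand in for flattened 28x28 MNIST digits.
img_a = [0, 3, 0, 0]
img_b = [0, 0, 4, 0]

print(euclidean_distance(img_a, img_b))  # 5.0
print(manhattan_distance(img_a, img_b))  # 7
```

Note that the pixel differences here are 3 and -4: without squaring or taking absolute values they would partly cancel out and understate how different the images are.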

Majority Voting

Once the distances have been computed and sorted (the sorting itself comes later), it’s time to find the majority class among the k closest training points. We take the labels of those k training points, find the one that occurs most often, and assign it to the new test point. That’s all there is to how the classification works.

The following code implements the majority vote.

Voting for majority
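A majority vote along these lines could look like the sketch below. It is an illustration rather than the original code: the name `majority_vote` and the tie-breaking rule are my assumptions, since the text does not say how ties are handled.

```python
from collections import Counter

def majority_vote(neighbor_labels):
    """Return the most frequent label among the k nearest neighbors.

    neighbor_labels is assumed to be ordered nearest first. On a tie,
    Counter.most_common keeps the label that was encountered first,
    so the tie goes to the class of the closer neighbor.
    """
    if not neighbor_labels:
        raise ValueError("need at least one neighbor label")
    return Counter(neighbor_labels).most_common(1)[0][0]

# Labels of the 5 closest training images, nearest first.
print(majority_vote([7, 1, 7, 9, 7]))  # 7
```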

Classifying Handwritten Digits

We now have everything we need to classify handwritten digits. All that is left is to do what we actually came here for.

The classification code below is self-explanatory.

Classifying handwritten digits
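Putting the pieces together, one way to classify a single image is sketched below. The function name `classify`, the default `k=3`, and the toy data are assumptions made for illustration; the article does not state which value of k was used, and loading the actual MNIST files is left out.

```python
import math
from collections import Counter

def classify(test_image, train_images, train_labels, k=3):
    """Predict the label of test_image from its k nearest training images."""
    distances = []
    for image, label in zip(train_images, train_labels):
        # Euclidean distance between the test image and one training image.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(test_image, image)))
        distances.append((d, label))

    # Sort by distance so the closest training images come first.
    distances.sort(key=lambda pair: pair[0])

    # Majority vote among the labels of the k closest training images.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data: 4-pixel "images" with two classes instead of ten digits.
train_images = [[0, 0, 0, 0], [0, 1, 0, 0], [9, 9, 9, 9], [9, 8, 9, 9], [8, 9, 9, 9]]
train_labels = [0, 0, 1, 1, 1]

print(classify([1, 0, 0, 0], train_images, train_labels, k=3))  # 0
print(classify([9, 9, 8, 9], train_images, train_labels, k=3))  # 1
```

Every prediction compares the test image against the whole training set, which is why classifying many images with this approach is slow.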

I ran my code on the first 2,500 test images and achieved a classification accuracy of 95.32%, which is not bad at all for such a simple machine learning model. The accuracy on the full test set can be measured by classifying all the test images, but that will take a very long time (maybe even a day).
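For reference, accuracy here means the percentage of test images whose predicted digit matches the true label. A small sketch of that calculation, using a hypothetical `accuracy` helper and made-up predictions rather than the real results:

```python
def accuracy(predictions, true_labels):
    """Percentage of predictions that match the true labels."""
    if len(predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("need at least one labeled example")
    correct = sum(1 for p, t in zip(predictions, true_labels) if p == t)
    return 100.0 * correct / len(true_labels)

# Made-up example: 7 of 8 predictions are correct.
predictions = [7, 2, 1, 0, 4, 1, 4, 9]
true_labels = [7, 2, 1, 0, 4, 1, 4, 8]
print(accuracy(predictions, true_labels))  # 87.5
```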


So, k-nearest neighbors is a simple algorithm for classification tasks. We used it to classify handwritten digits and achieved an accuracy of 95.32% on the first 2,500 test images, which is pretty good for such a simple algorithm.