KNN Classifier Optimization: Best Practices and Tips (PART II)

Madhuri Patil
5 min read · Nov 20, 2023

In the previous article, we implemented the k-nearest neighbors (KNN) algorithm using the scikit-learn library. We also saw how different values of the weights parameter affect the model's predictions.

Continuing from there, in this article we're going to learn how to optimize the classifier using scikit-learn, and observe how different values of the number of neighbors (k) affect the classifier's performance.

Choosing the best k-value

The choice of ‘k’ plays a pivotal role in KNN’s performance. Let’s see how changing the number of neighbors affects the classifier’s decisions by observing the decision boundaries between the classes.

For that, we’re going to plot the decision regions of two classifiers, one with very few neighbors and the other with many, and compare them.

The plots below show the classifier’s decision boundaries for a small value of k, say 1, and a larger value, say 30.

In the above plot, the square and triangle markers represent class labels 0 and 1 respectively, and the hollow black circle represents our query point.

The shaded regions with the respective class colors are called decision regions, and the borders that separate these regions are called decision boundaries.
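If you’d like to reproduce this kind of comparison yourself, here is a minimal sketch using scikit-learn’s DecisionBoundaryDisplay. Note the assumptions: it uses a synthetic two-class dataset from make_moons (with an arbitrary noise level and random seed) rather than the dataset from the previous article, and plain scatter markers instead of the squares, triangles, and query point described above.

```python
# Sketch: compare KNN decision boundaries for a very small and a large k.
# Assumes a synthetic make_moons dataset, not the article's original data.
import matplotlib.pyplot as plt
from sklearn.datasets import make_moons
from sklearn.inspection import DecisionBoundaryDisplay
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=42)

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, k in zip(axes, (1, 30)):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X, y)
    # Shade the decision regions; the border between them is the boundary.
    DecisionBoundaryDisplay.from_estimator(clf, X, alpha=0.4, ax=ax)
    ax.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
    ax.set_title(f"n_neighbors = {k}")
plt.show()
```

You should see that with k=1 the boundary closely traces individual training points (a very flexible, high-variance model), while with k=30 it becomes much smoother.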

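Beyond inspecting decision boundaries visually, a common way to choose k is cross-validation. The snippet below is only an illustrative sketch: the make_moons data, the 1 to 30 search range, and the 5-fold accuracy scoring are assumptions for demonstration, not a prescribed recipe.

```python
# Sketch: pick k by comparing mean cross-validation accuracy over a range.
# The dataset, k range, and 5-fold setup are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_moons(n_samples=200, noise=0.3, random_state=42)

k_values = range(1, 31)
scores = []
for k in k_values:
    clf = KNeighborsClassifier(n_neighbors=k)
    scores.append(cross_val_score(clf, X, y, cv=5).mean())

best_k = k_values[int(np.argmax(scores))]
print(f"Best k: {best_k} (mean CV accuracy: {max(scores):.3f})")
```

scikit-learn’s GridSearchCV can perform the same search with less code, but the explicit loop makes it easy to plot the score against k and see where performance levels off.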