A Quick Introduction to K-Nearest Neighbors Algorithm


Hi everyone! Today I would like to talk about the K-Nearest Neighbors algorithm (or KNN). KNN is one of the simplest classification algorithms, and it is one of the most widely used learning algorithms. So what is the KNN algorithm? I’m glad you asked! KNN is a non-parametric, lazy learning algorithm. Its purpose is to use a database in which the data points are separated into several classes to predict the classification of a new sample point.

Just for reference, KNN sits in the “neighbors” family of algorithms in scikit-learn. BTW, the scikit-learn documentation is amazing! Worth a read if you’re interested in machine learning.

When we say a technique is non-parametric, it means that it does not make any assumptions about the underlying data distribution. In other words, the model structure is determined from the data. If you think about it, this is pretty useful, because in the “real world” most data does not obey the typical theoretical assumptions (as in linear regression models, for example). Therefore, KNN could, and probably should, be one of the first choices for a classification study when there is little or no prior knowledge about the data distribution.

KNN is also a lazy algorithm (as opposed to an eager algorithm). Does that mean that KNN does nothing? Not quite. What it means is that it does not use the training data points to do any generalization. In other words, there is no explicit training phase, or it is very minimal, which also makes the training phase pretty fast. The lack of generalization means that KNN keeps all the training data. To be more exact, all (or most) of the training data is needed during the testing phase.
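To make “lazy” concrete, here is a minimal (hypothetical) sketch of what a KNN `fit` step amounts to — it only stores the data, which is why training is so fast:

```python
# Minimal sketch of lazy learning: "training" is just storing the data.
class LazyKNN:
    def fit(self, X, y):
        # No model is built here -- the training phase is simply keeping
        # a reference to the training set, hence it is very fast.
        self.X, self.y = list(X), list(y)
        return self

model = LazyKNN().fit([[0], [1], [9]], ["a", "a", "b"])
print(len(model.X))  # all 3 training points are retained for prediction time
```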

KNN Algorithm is based on feature similarity: How closely out-of-sample features resemble our training set determines how we classify a given data point:

Example of k-NN classification (figure): the test sample (the dot at the center) should be classified either to the first class of blue squares or to the second class of red triangles. If k = 3 (inner circle) it is assigned to the second class, because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (outer circle) it is assigned to the first class (3 squares vs. 2 triangles inside the outer circle).
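We can reproduce that figure’s vote flip in a few lines of plain Python. The distances and labels below are made up to match the picture: 2 triangles and 1 square among the 3 nearest points, but 3 squares vs. 2 triangles among the 5 nearest:

```python
from collections import Counter

# Hypothetical (distance, label) pairs chosen to mirror the figure.
neighbors = [
    (1.0, "triangle"),
    (1.5, "square"),
    (2.0, "triangle"),
    (2.5, "square"),
    (3.0, "square"),
]

def vote(k):
    # Take the k nearest points and return the majority label among them.
    nearest = sorted(neighbors)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

print(vote(3))  # triangle (2 triangles vs. 1 square)
print(vote(5))  # square   (3 squares vs. 2 triangles)
```

Notice how the prediction depends entirely on the choice of k — a recurring theme when tuning KNN.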

KNN can be used for classification — the output is a class membership (predicts a class — a discrete value). An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common among its k nearest neighbors. It can also be used for regression — output is the value for the object (predicts continuous values). This value is the average (or median) of the values of its k nearest neighbors.
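Both modes are a few lines in scikit-learn (assuming it is installed; the toy data below is made up for illustration):

```python
from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor

X = [[0], [1], [2], [9], [10], [11]]        # one feature per sample
y_class = [0, 0, 0, 1, 1, 1]                # discrete class labels
y_reg = [0.0, 1.0, 2.0, 9.0, 10.0, 11.0]    # continuous target values

clf = KNeighborsClassifier(n_neighbors=3).fit(X, y_class)
reg = KNeighborsRegressor(n_neighbors=3).fit(X, y_reg)

# Classification: majority vote of the 3 nearest neighbors of 1.5
print(clf.predict([[1.5]]))  # class 0
# Regression: mean of the 3 nearest targets (0.0, 1.0, 2.0)
print(reg.predict([[1.5]]))  # 1.0
```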

A few Applications and Examples of KNN

  • Credit ratings — collecting a person’s financial characteristics and comparing them with people who have similar features in an existing database. By the very nature of a credit rating, people who have similar financial details would be given similar credit ratings. Therefore, the existing database can be used to predict a new customer’s credit rating, without having to perform all the calculations.
  • Should the bank give a loan to an individual? Would an individual default on his or her loan? Is that person closer in characteristics to people who defaulted or did not default on their loans?
  • In political science — classifying a potential voter as “will vote” or “will not vote”, or as “vote Democrat” or “vote Republican”.
  • More advanced examples include handwriting detection (as in OCR), image recognition, and even video recognition.

Some pros and cons of KNN

Pros:
  • No assumptions about data — useful, for example, for nonlinear data
  • Simple algorithm — to explain and understand/interpret
  • Relatively high accuracy — though not competitive with the best supervised learning models
  • Versatile — useful for classification or regression

Cons:
  • Computationally expensive — each prediction computes distances to all of the stored training data
  • High memory requirement
  • Stores all (or almost all) of the training data
  • Prediction stage might be slow (with big N)
  • Sensitive to irrelevant features and the scale of the data
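The last point is worth a toy illustration. With made-up numbers, suppose one feature lives in [0, 1] and another in [0, 1000]: without scaling, the large feature dominates the Euclidean distance, and the “nearest” neighbor can flip once the features are put on comparable scales:

```python
import math

query = (0.0, 500.0)
a = (1.0, 500.0)   # maximally different in the small-scale feature
b = (0.0, 520.0)   # only slightly different, relatively, in the large one

def dist(p, q, scales=(1.0, 1.0)):
    # Euclidean distance after dividing each feature by its scale.
    return math.hypot((p[0] - q[0]) / scales[0], (p[1] - q[1]) / scales[1])

# Raw distances: a looks closer (1 vs. 20), purely because of units.
print(dist(query, a), dist(query, b))
# After rescaling feature 2 by its range: b is closer (1 vs. 0.02).
print(dist(query, a, (1.0, 1000.0)), dist(query, b, (1.0, 1000.0)))
```

This is why KNN pipelines usually standardize features before computing distances.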

Quick summary of KNN

The algorithm can be summarized as:

  1. A positive integer k is specified, along with a new sample
  2. We select the k entries in our database which are closest to the new sample
  3. We find the most common classification of these entries
  4. This is the classification we give to the new sample
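The four steps above can be sketched in plain Python (the training points below are made up for illustration):

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Classify `query` by majority vote of its k nearest training points."""
    # Step 2: select the k entries in our "database" closest to the sample.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Steps 3-4: the most common class among them is the prediction.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [((0, 0), "blue"), ((1, 0), "blue"), ((0, 1), "blue"),
         ((5, 5), "red"), ((6, 5), "red"), ((5, 6), "red")]

print(knn_classify(train, (1, 1), k=3))  # "blue"
print(knn_classify(train, (5, 5), k=3))  # "red"
```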

A few other features of KNN:

  • KNN stores the entire training dataset which it uses as its representation.
  • KNN does not learn any model.
  • KNN makes predictions just-in-time by calculating the similarity between an input sample and each training instance.

I hope this helps a little in understanding what the K-Nearest Neighbor algorithm is. As always, I welcome questions, notes, suggestions etc. See you next time!