Understand K-Means Classification Algorithm

Understand the K-Means model by creating one from scratch

Andrew Zhu (Shudong Zhu)
CodeX


K-Means classification

K-Means is an unsupervised machine learning model, typically used to partition observed data into k clusters. You give the model a dataset with defined features and tell it how many clusters you want. The model then groups the data points into that number of clusters.

Because K-Means is unsupervised, you don’t have to label your training dataset; the model groups the input data on its own.

In this article, you will read:

  1. How the K-Means model works.
  2. How to use the K-Means model from the scikit-learn package.
  3. How to build a K-Means classifier from scratch using Python.

How K-Means works

The underlying idea is simple and straightforward, yet the results are impressive. The core steps of K-Means are:

  1. Guess some initial center points.
  2. Repeat until the center points stop changing:
    2.1 Assign each point to its nearest current center;
    2.2 Set each center to the mean of the points assigned to it.
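
To make these steps concrete, here is a minimal sketch in plain Python. It is only an illustration under a few assumptions, not a definitive implementation: the names k_means, squared_distance, and mean_point are made up for this example, the initial guess picks k random data points, distance is Euclidean, and an empty cluster simply keeps its old center.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    """Hypothetical helper: returns (centers, labels) for a list of points."""
    rng = random.Random(seed)
    # Step 1: guess some centers by picking k distinct data points (assumption).
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = []
    for _ in range(max_iter):
        # Step 2.1: assign every point to its nearest current center.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Step 2.2: move each center to the mean of the points assigned to it.
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Assumption: an empty cluster keeps its previous center.
            new_centers.append(mean_point(members) if members else centers[j])
        # Stop once no center has moved.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Two well-separated groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5),
          (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centers, labels = k_means(points, k=2)
print(centers)
print(labels)
```

On this toy dataset the first three points should end up in one cluster and the last three in the other. Which cluster gets index 0 depends on the random initial guess, so only the grouping is meaningful, not the label numbers.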
