Photo by Pierre Bamin on Unsplash

Hands-on Tutorial

Why Do Initial Cluster Centroids in k-means Affect the Final Cluster Generated?

Different initial cluster centroids lead to different results in the k-means clustering algorithm. How do we determine the best ones?

Published in Geek Culture · 10 min read · Oct 4, 2021


The k-means clustering algorithm is one of the easiest unsupervised (clustering) algorithms to implement. Working through it by hand with basic mathematics is also quite simple.

However, despite these advantages, k-means has a limitation in how it chooses the cluster centroids. It is sensitive to the initial cluster centroids: when we choose different initial values, the final clusters can be different. This can lead us to an invalid conclusion about the clusters.
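To make this sensitivity concrete, here is a minimal sketch in plain Python. It assumes a toy data set of four points at the corners of a wide rectangle and a hand-written version of the standard assign-and-update loop (Lloyd's algorithm). The function name `kmeans`, the data, and the two sets of starting centroids are illustrative assumptions, not taken from the rest of this article.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, max_iter=100):
    """Plain k-means (Lloyd's loop) starting from the given initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    # Sum of squared distances from each point to its assigned centroid.
    sse = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, sse


# Toy data (assumed for illustration): corners of a 4-by-1 rectangle.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start A: one centroid on the left side, one on the right side.
labels_a, centroids_a, sse_a = kmeans(points, [(0, 0), (4, 0)])

# Start B: both centroids on the left side.
labels_b, centroids_b, sse_b = kmeans(points, [(0, 0), (0, 1)])

print("start A:", labels_a, centroids_a, sse_a)  # left/right split
print("start B:", labels_b, centroids_b, sse_b)  # top/bottom split
```

Both runs converge, but to different partitions. Start A separates the left pair from the right pair, while start B gets stuck separating the bottom pair from the top pair, with a much larger sum of squared distances. Same data, same k, different answer, purely because of where the centroids started.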

In this short article, you will see how the initial cluster centroids affect the final clusters generated. At the end of the article, I will describe an approach to handle this problem. So keep reading, and I hope you enjoy this topic!

Enjoy the read!

The basic theory of k-means (this is what you must know!)

k-means is an unsupervised learning method that is used to group data with similar…


Audhi Aprilliant

Data Scientist. Tech Writer. Statistics, Data Analytics, and Computer Science Enthusiast. Portfolio & social media links at http://audhiaprilliant.github.io/