Silhouette Method

Sanjjushri Varshini R
Published in featurepreneur
Mar 12, 2022

i) The silhouette method is used to find the optimal number of clusters.

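ii) The silhouette method is often more informative than the elbow method for choosing the number of clusters, because it measures both how tightly points sit within their own cluster and how well separated they are from other clusters.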

iii) The silhouette coefficient ranges from -1 to 1.

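iv) A higher value indicates that the object is well matched to its own cluster and poorly matched to the neighbouring cluster.

v) A value closer to 1 is better. A value near 0 means the point lies between two clusters, and a negative value suggests it may be assigned to the wrong cluster.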

Formula for a single data point i:

s(i) = (b(i) - a(i)) / max(a(i), b(i))

a(i) -> The average distance between that point and the other points in its own cluster.
b(i) -> The smallest average distance between that point and the points of any other cluster, i.e. the average distance to the points in the nearest neighbouring cluster.
s(i) -> The silhouette coefficient of that point.
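
To make these three quantities concrete, here is a minimal pure-Python sketch that computes a(i), b(i) and s(i) for one point, combining them with the standard silhouette definition (b(i) - a(i)) / max(a(i), b(i)). The function name `silhouette`, the 1-D toy data and the use of absolute difference as the distance are assumptions made for illustration, not part of the original article.

```python
def silhouette(i, points, labels):
    """Silhouette coefficient s(i) of the point at index i (1-D data)."""
    own = labels[i]

    # a(i): mean distance to the other points in the same cluster
    same = [abs(points[i] - points[j])
            for j in range(len(points))
            if labels[j] == own and j != i]
    if not same:
        return 0.0  # common convention for a cluster with a single point
    a = sum(same) / len(same)

    # b(i): smallest mean distance to the points of any other cluster
    b = min(
        sum(abs(points[i] - p) for p, l in zip(points, labels) if l == other)
        / labels.count(other)
        for other in set(labels)
        if other != own
    )

    if max(a, b) == 0:
        return 0.0  # all points coincide; avoid dividing by zero
    return (b - a) / max(a, b)


# Toy data (hypothetical): two well-separated groups on a line
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

good_labels = [0, 0, 0, 1, 1, 1]
s_good = silhouette(1, points, good_labels)  # point 2.0: a = 1, b = 9

# Same data, but the point 3.0 is wrongly put in the far cluster
bad_labels = [0, 0, 1, 1, 1, 1]
s_bad = silhouette(2, points, bad_labels)    # point 3.0: a = 8, b = 1.5

print(s_good, s_bad)
```

In this toy case the well-placed point gets s(i) = 8/9, close to 1, while the misassigned point gets s(i) = -0.8125, which matches the interpretation of high and negative values.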

To calculate the average silhouette width for the dataset, we average s(i) over all n data points:

S = (1/n) * (s(1) + s(2) + ... + s(n))
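
As a rough illustration of how the average silhouette width can guide the choice of the number of clusters, the sketch below scores three candidate labellings of the same toy data and keeps the one with the highest average. The function `mean_silhouette`, the 1-D data and the hand-written candidate labellings are assumptions for illustration; in practice the labels would come from a clustering algorithm such as k-means run with different numbers of clusters.

```python
def mean_silhouette(points, labels):
    """Average silhouette width over all points (1-D data)."""
    n = len(points)
    total = 0.0
    for i in range(n):
        own = labels[i]
        # a(i): mean distance to the other points in the same cluster
        same = [abs(points[i] - points[j])
                for j in range(n)
                if labels[j] == own and j != i]
        if not same:
            continue  # single-point cluster contributes s(i) = 0
        a = sum(same) / len(same)
        # b(i): smallest mean distance to the points of any other cluster
        b = min(
            sum(abs(points[i] - p) for p, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels)
            if other != own
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / n


# Toy data (hypothetical): three visible groups on a line
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 20.0, 21.0, 22.0]

# Hand-made candidate labellings, keyed by their number of clusters
candidates = {
    2: [0, 0, 0, 1, 1, 1, 1, 1, 1],
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],
    4: [0, 0, 0, 1, 1, 1, 2, 2, 3],
}

scores = {k: mean_silhouette(points, labels) for k, labels in candidates.items()}
best_k = max(scores, key=scores.get)

for k in sorted(scores):
    print(k, round(scores[k], 3))
print("best number of clusters:", best_k)
```

Here the three-cluster labelling gets the highest average (about 0.85), while merging two groups or splitting one pulls the average down, so 3 would be chosen.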

An implementation of unsupervised ML that computes the silhouette score is here.
