Manual Step-by-Step Single-Link Hierarchical Clustering with Dendrogram

Ganesh Chandrasekaran
Published in Analytics Vidhya
3 min read · Dec 23, 2019

You are here because you already know something about hierarchical clustering and want to see how single-link clustering works and how to draw a dendrogram.

Hierarchical clustering: it's slow, complicated, repeatable, and not suited to big data sets.

Let's take six simple vectors.

Using Euclidean distance, let's compute the distance matrix.
Euclidean distance = sqrt((x2 - x1)**2 + (y2 - y1)**2)

Example: distance between A and B
sqrt((18 - 22)**2 + (0 - 0)**2)
= sqrt(16 + 0)
= sqrt(16) = 4
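
The same arithmetic can be checked in a few lines of Python. This is only a minimal sketch: the function name `euclidean` is my own, and A = (18, 0) and B = (22, 0) are read off the worked example above.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two 2-D points given as (x, y)."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

# Coordinates taken from the A-to-B example in the text.
A = (18, 0)
B = (22, 0)

print(euclidean(A, B))  # 4.0
```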

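Single-link clustering: the distance between two clusters is the minimum of the distances between their members, one taken from each cluster. This tends to produce large, more diverse clusters.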

Distance matrix: the diagonal entries are 0 and the matrix is symmetric.
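
Both properties are easy to confirm in code. The sketch below builds the matrix for four of the vectors only. A and B come from the worked example; the coordinates for C and D are my assumption, chosen because they reproduce the distances quoted later in the walkthrough (A->C = 25, A->D = 24, B->C = 21, B->D = 20, C->D = 1). E and F are left out because the text does not give their coordinates.

```python
import math

# A and B: from the worked example.
# C and D: ASSUMED values that reproduce the distances quoted in the text.
points = {"A": (18, 0), "B": (22, 0), "C": (43, 0), "D": (42, 0)}
labels = list(points)

# matrix[i][j] is the Euclidean distance between labels[i] and labels[j].
matrix = [[math.dist(points[r], points[c]) for c in labels] for r in labels]

for label, row in zip(labels, matrix):
    print(label, row)

n = len(labels)
is_zero_diagonal = all(matrix[i][i] == 0 for i in range(n))
is_symmetric = all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))
print(is_zero_diagonal, is_symmetric)
```

Because the matrix is symmetric, only the upper or lower triangle needs to be filled in when working by hand.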

Step a: The shortest distance in the matrix is 1, between vectors C and D.

So the first cluster is CD.

Distances between the other vectors and CD:

A to CD = min(A->C, A->D) = min(25, 24) = 24
B to CD = min(B->C, B->D) = min(21, 20) = 20

Find the distances from E and F to CD in the same way.
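
The update rule above can be sketched as a small function. It uses only the four pairwise distances quoted in this step; the function name `single_link` and the dictionary layout are my own choices for illustration.

```python
# Pairwise distances quoted in the walkthrough above.
# frozenset keys make the lookup order-independent (A-C is the same as C-A).
dist = {
    frozenset("AC"): 25,
    frozenset("AD"): 24,
    frozenset("BC"): 21,
    frozenset("BD"): 20,
}

def single_link(cluster_x, cluster_y, dist):
    """Smallest distance between any member of cluster_x and any member of cluster_y."""
    return min(dist[frozenset((x, y))] for x in cluster_x for y in cluster_y)

print(single_link(["A"], ["C", "D"], dist))  # 24
print(single_link(["B"], ["C", "D"], dist))  # 20
```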

Step b: Now 2 is the shortest distance, between vectors E and F.
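
Steps a and b are the first two rounds of the same loop: find the two closest clusters, merge them, and repeat until one cluster is left. Here is a rough sketch of that loop in plain Python, under some loud assumptions. A and B come from the worked example, C and D are assumed values that reproduce the quoted distances, and E and F are purely hypothetical placeholders chosen only to be 2 apart. The function names are mine too.

```python
import math
from itertools import combinations

# A, B: from the worked example.
# C, D: ASSUMED values that reproduce the distances quoted in the text.
# E, F: HYPOTHETICAL placeholders; the text only says they are 2 apart.
points = {
    "A": (18, 0), "B": (22, 0), "C": (43, 0),
    "D": (42, 0), "E": (27, 0), "F": (25, 0),
}

def single_link(cx, cy):
    """Minimum point-to-point distance between two clusters (tuples of labels)."""
    return min(math.dist(points[x], points[y]) for x in cx for y in cy)

def agglomerate(labels):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the smallest single-link distance.
        cx, cy = min(combinations(clusters, 2), key=lambda pair: single_link(*pair))
        height = single_link(cx, cy)
        # Replace the two merged clusters with their union.
        clusters = [c for c in clusters if c not in (cx, cy)] + [cx + cy]
        merges.append(("".join(cx), "".join(cy), height))
    return merges

merges = agglomerate(points)
for left, right, height in merges:
    print(f"merge {left} + {right} at distance {height}")
```

Each recorded distance is the height at which the two branches join in the dendrogram, so the merge history is all that is needed to draw it. With these placeholder coordinates the first two merges match steps a and b (C with D at 1, then E with F at 2); the later merges depend entirely on the assumed values.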

Ganesh Chandrasekaran
Analytics Vidhya

Big Data Solution Architect | Adjunct Professor. Thoughts and opinions are my own and don’t represent the companies I work for.