Manual Step-by-Step Complete-Link Hierarchical Clustering with a Dendrogram

Ganesh Chandrasekaran
Published in Analytics Vidhya
3 min read, Dec 23, 2019


How complete link clustering works and how to draw a dendrogram.

Hierarchical clustering: it is slow, complicated, repeatable, and not suited to big data sets.

Let's take 6 simple vectors.

Using Euclidean distance, let's compute the distance matrix.
Euclidean Distance = sqrt( (x2 - x1)**2 + (y2 - y1)**2 )

Example: distance between A and B
sqrt( (18 - 22)**2 + (0 - 0)**2 )
= sqrt(16 + 0)
= sqrt(16) = 4
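
The same calculation can be checked in a few lines of Python. This is a minimal sketch: the function name `euclidean` is chosen here for illustration, and the coordinates (18, 0) and (22, 0) are read off the worked example above and assumed to be A and B.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two 2-D points given as (x, y) tuples."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

# Coordinates taken from the worked example (assumed to be A and B).
A = (18, 0)
B = (22, 0)

print(euclidean(A, B))  # 4.0
```

Because the differences are squared, swapping the two points gives the same result.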

Complete-link clustering: the distance between two clusters is the maximum of all pairwise distances between their members. It tends to lead to many small clusters.

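Distance matrix: the diagonal entries are 0 (each vector is at distance 0 from itself) and the matrix is symmetric (the distance from A to B equals the distance from B to A).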
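
A rough sketch of how such a matrix could be built and checked follows. The points P, Q and R are hypothetical values picked for easy arithmetic, not the six vectors from this article, and `distance_matrix` is a helper name invented for this example.

```python
import math

def distance_matrix(points):
    """Return {(name1, name2): distance} for every ordered pair of points."""
    return {
        (a, b): math.dist(pa, pb)
        for a, pa in points.items()
        for b, pb in points.items()
    }

# Hypothetical points, not the article's data.
points = {"P": (0, 0), "Q": (3, 4), "R": (6, 8)}
dist = distance_matrix(points)

# Print the matrix one row at a time.
names = list(points)
for a in names:
    print(a, [dist[a, b] for b in names])

# Check the two properties: zero diagonal and symmetry.
assert all(dist[a, a] == 0 for a in names)
assert all(dist[a, b] == dist[b, a] for a in names for b in names)
```

Since the matrix is symmetric, only the entries on one side of the diagonal need to be inspected when looking for the shortest distance.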

Step a: The shortest distance in the matrix is 1, between vectors C and D.

So the first cluster is CD.

Next, compute the distance between each remaining vector and CD:

A to CD = max(A->C, A->D) = max(25,24) = 25
B to CD = max(B->C, B->D) = max(21,20) = 21

Similarly, compute E -> CD and F -> CD.
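
The max rule used for A and B can be written as a small helper. This is only a sketch: `complete_link` is a name made up for this example, and the dictionary holds just the four pairwise distances quoted above, so E and F are left out.

```python
def complete_link(cluster_a, cluster_b, dist):
    """Complete-link distance: the largest pairwise distance between members."""
    return max(dist[frozenset((a, b))] for a in cluster_a for b in cluster_b)

# Pairwise distances quoted in the text; frozenset keys make the
# lookup independent of the order of the two names.
dist = {
    frozenset(("A", "C")): 25,
    frozenset(("A", "D")): 24,
    frozenset(("B", "C")): 21,
    frozenset(("B", "D")): 20,
}

print(complete_link({"A"}, {"C", "D"}, dist))  # 25
print(complete_link({"B"}, {"C", "D"}, dist))  # 21
```

The same helper works when both arguments are multi-member clusters, because it takes the maximum over every cross pair.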

Step b: Now 2 is the shortest distance, between vectors E and F.

The second cluster is EF.

A to EF = max(A->E, A->F) = max(9,7) = 9
CD to EF = max(CD->E, CD->F) = max(15,17)…
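
The steps so far (find the closest pair, merge it, recompute distances with max) can be repeated until a single cluster remains. The sketch below shows one way to do that in plain Python. The coordinates are hypothetical: they are an assumption chosen so that the first two merges match steps a and b, not the article's actual table, and `agglomerate` is a name invented here.

```python
import math
from itertools import combinations

# Hypothetical coordinates (an assumption, not the article's table).
points = {
    "A": (18, 0), "B": (22, 0), "C": (43, 0),
    "D": (42, 0), "E": (27, 0), "F": (25, 0),
}

def complete_link(c1, c2):
    """Largest point-to-point distance between two clusters (tuples of names)."""
    return max(math.dist(points[a], points[b]) for a in c1 for b in c2)

def agglomerate(names):
    """Merge the closest pair of clusters until one remains; return the merges."""
    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        # Closest pair of clusters under the complete-link distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: complete_link(*pair))
        height = complete_link(c1, c2)
        # Replace the two clusters with their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
        merges.append((c1, c2, height))
    return merges

merges = agglomerate(sorted(points))
for c1, c2, height in merges:
    print("".join(c1), "+", "".join(c2), "at height", height)
```

With these assumed coordinates the first two lines printed are the C + D merge at height 1.0 and the E + F merge at height 2.0. Each recorded height is the level at which a dendrogram would draw the horizontal bar joining the two clusters.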


Ganesh Chandrasekaran
Analytics Vidhya

Big Data Solution Architect | Adjunct Professor. Thoughts and opinions are my own and don’t represent the companies I work for.