Weighted Union Find: a fascinating and elegant algorithm

Jeremy Gottfried
Jeremy Gottfried’s tech blog
5 min readAug 26, 2018

The Union find algorithm solves a huge problem in computing. In a system with many elements, where elements are connected via a path through other elements, how do we efficiently determine if two elements are connected?

This is commonly referred to as the dynamic connectivity problem. A brute force algorithm works for a system with few elements.

//brute force quick find class QuickFindUF {
// construct an array with N integer elements
constructor (N) {
this.id = []
for (var i = 0; i < N; i++) {
this.id[i] = i;
}
}// Connections between elements are tracked in the ID array.
// To check if two elements are connected, we check for an ID match
connected = (p, q) => {
return this.id[p] === this.id[q];
}// To create a new connection between two elements P and Q, we change the id of element P to the id of element Q.
// We also change all the elements that were previously connected to P to the new id.
union = (p, q) => {
var pid = this.id[p]
var qid = this.id[q]
for (var i = 0; i < id.length; i++) {
if (this.id[i] === pid) { this.id[i] = qid }
} }}

The brute force algorithm works by assigning two connected elements to the same id. You can check if two elements are connected by checking that they have a matching id.

The problem with this implementation is that whenever you create a new connection between two elements, you have to iterate through the entire id array in search of elements that have been previously connected to the element whose id you are changing.

For example, if we connect 2 and 3, we also have to connect 3 to every element previously connected to 2. The only way to accomplish this is to iterate through the whole array.

Iterating through the entire array will have 0(n) time complexity which is inefficient for large systems.

Luckily, there’s an elegant solution to improve the efficiency of the algorithm. Rather than iterate through the whole array to create a new connection, we will maintain a tree of connections with a single root element:

Instead of assigning connected elements the same id, we will assign them a parent element. To find the root of an element, we can follow parents up the tree until we hit the element whose parent is itself. That’s the root. This is what the findRoot algorithm looks like:

findRoot = (currentEl) => {    while (currentEl != id[currentEl]) {
currentEl = id[currentEl]
}
return currentEl;
}

This algorithm creates a while loop. In each iteration of the loop, the current element is reassigned to its parent element. When the current element is equal to its parent, that means we hit the root element, so we return that element.

The beautiful thing about this implementation is that it vastly improves the efficiency of creating a new connection between two elements. Rather than reassign the id of every element we previously connected, we only need to change one element — the root of the smaller tree being connected. Since all the elements already connect to their root, changing the root will connect all those elements to the target tree.

Here is what the new union algorithm looks like:

union = (elementA, elementB) => {    var rootA = this.findRoot(elementA)
var rootB = this.findRoot(elementB)
if (rootA === rootB) {return}

// connect root of smaller tree to root of larger tree
if (size[rootA] < size[rootB]) {
id[rootA] = rootB
size[rootB] += size[rootA]
} else {
id[rootB] = rootA
size[rootA] += size[rootB]
}
this.treeCount--
}

Now we have increased the efficiency of the union algorithm from 0(n) to 0(log2(n)), because the depth of a tree will never exceed the log of n elements. This means that as the size of the data set increases, the time it takes for the union algorithm to complete will not increase by much. We can connect trees trees containing billions of elements in a matter of seconds.

We can improve the algorithm even more by compressing the trees whenever we create a new union. As the algorithm follows the path to reach the root element, we will reassign elements’ parents to their grandparents as we pass through, thereby reducing the depth of the tree. This only requires one extra line of code:

findRoot = (currentEl) => {    while (currentEl != id[currentEl]) {
// as we follow a tree to its root, reassign elements parents to their grandparents
id[currentEl] = id[id[currentEl]];
currentEl = id[currentEl];
} return currentEl
}

Here is a full js implementation, taken partly from Michael Toth’s JS visualisation of percolation through union find. I added compression, which Michael did not implement. I also changed the naming so it’s easier to read.


//N is the initial number of elements in the system, before any connections are made.
function WeightedUnionFind(N) {

// Constructor
var id = [];
var size = [];
for (var i = 0; i < N; i++) {
id[i] = i; // id[i] = parent of i
size[i] = 1; // size[i] = number of objects in subtree with root i
}

// Returns the number of components, which initializes at N
this.treeCount = N;

// Returns the component id for the containing site
this.findRoot = function(currentEl) {
while (currentEl != id[currentEl]) {
// compression
id[currentEl] = id[id[currentEl]]
currentEl = id[currentEl];
}
return currentEl;
}

// Returns true if two elements are part of the same component
this.connected = function(elementA, elementB) {
return this.find(elementA) === this.find(elementB);
}

// Connects the components of two elements
this.union = function(elementA, elementB) {
var rootA = this.find(elementA);
var rootB = this.find(elementB);
if (rootA === rootB) { return; }

// make smaller tree point to larger one
if (size[rootA] < size[rootB]) {
id[rootA] = rootB; size[rootB] += size[rootA];
} else {
id[rootB] = rootA; size[rootA] += size[rootB];
}
this.treeCount--;
}
}

Enjoy, and check out Coursera’s free algorithms course which covers the union-find algorithm in detail in the first section!

--

--