# Dynamic Connectivity Problem

## Pixels in a digital photo, Computers in a network, People on a social network are all connected objects.

Given a set of N objects, connect two objects, or ask if two objects are connected (directly or in-directly).

The problem is to be able to implement two commands; *union and connected*.

`union(2, 5) // connect object 2 with object 5`

connected(1 , 6) // is object 1 connected to object 6?

## Connection Properties

We assume that “is connected to” is an *equivalence relation. *An equivalence relation partitions the objects into *equivalence classes* or *connected components*.

**Reflexive**→p is connected to p.**Symmetric**→If p is connected to q, then q is connected to p.**Transitive**→If p is connected to q, and q is connected to r, then p is connected to r.

## Implementing The Operations

**Find**: In which component is object p ?**Connected**: Are objects p and q in the same component?**Union**: Connect two components containing objects p and q (if not already).

Connected components are subsets, each consists of connected objects.

## Union-Find Data Type

The goal is design an efficient data structure for a data type called *union–find (also known as disjoint-sets) *with two commands; union and connected.

It can be implemented by several different algorithms. We are going to explore two algorithms; *Quick Find* & *Quick Union*.

# Quick Find

It’s an algorithm for solving the dynamic connectivity problem, also called “eager approach”.

## Initialize

Each object in the array has a value, or an id, and the id is it’s index initially. An element p is connected to element q if they have the same id.

`for (int i = 0; i < n; i++)`

id[i] = i; // set id of the object at index i

## Union

When connecting two objects p and q, change the id of all objects that have the id of p to that of q, or vice-versa. So, we will have to loop on all array elements to check the id of each.

if (pID == qID) return; // already connected?for (int i = 0; i < id.length; i++)

if (id[i] == pID)

id[i] = qID;

**Connected**

Two objects are connected if and only if they have the same id.

`return id[p] == id[q];`

## Analyzing

It’s fast in checking whether there is a connection, slow in constructing the array of objects, and connecting objects.

- Initialize & Union →O(N)
- Connected→O(1)

# Quick Union

It’s an algorithm for solving the dynamic connectivity problem, also called “lazy approach”.

## Initialize

The array represent the parent of each object. Initially, the parent of an object is the object itself; every object is a root.

Now, we can imagine the data structure consists of trees, each tree has connected objects, and, trees mightn’t be connected together.

`for (int i = 0; i < n; i++)`

parent[i] = i; // set parent of the object at index i

**Root**

Get the root of an object. This is just a helper method that will be used in union and connected commands.

`while (p != parent[p])`

p = parent[p];

return p;

## Union

When connecting two objects p and q, connect the roots. So, change the parent of the root of p to the root of q, or vice-versa. Now, the root of p became a child of the root of q (as if we are connecting two trees or subsets).

int rootP = root(p);

int rootQ = root(q);if (rootP == rootQ) return; // already connected?

parent[rootP] = rootQ;

## Connected

Two objects are connected if their root is the same. So, we have to check the parents of each object up to the root.

`return root(p) == root(q);`

## Analyzing

It’s slow in checking whether there is a connection, slow in construct the array of objects, and connecting objects. Connecting two objects could take O(N) when the depth of the tree is N.

- Initialize & Union & Connected → O(N).

# Quick Union Improvements

## 1. Weighted Quick Union

Avoid having tall trees, by tracking number of objects in each tree, and so we can maintain balance by linking the root of smaller(or shorter) tree to root of greater(or taller) tree.

We will add an extra array denotes the size(or depth) of the subtree formed by each node.

// union by size (initially size[i] = 1)if (size[rootP] < size[rootQ]) {

parent[rootP] = rootQ;

size[rootQ] += size[rootP];

} else {

parent[rootQ] = rootP;

size[rootP] += size[rootQ];

}// union by depth (initially depth[i] = 0)if (depth[rootP] < depth[rootQ])

parent[rootP] = rootQ;

else if (depth[rootP] > depth[rootQ])

parent[rootQ] = rootP;

else {

parent[rootQ] = rootP;

depth[rootP]++;

}

## Analyzing

Time taken for union and connected commands is at most the depth of the tallest tree, which is O(LogN) at most.

P

roof: When T1 is merged to T2, where T1 ≤T2, the depth of T1 will increase by 1. Now, the depth of T2(after merging) will increase by 1 if merged with T3, where T3 ≥ T2. So, every time, we need a tree X ≥ tree Y, to increase the depth of tree Y by 1. Therefore, the size of T1 can increase at most LogN.

- Union & Connected→O(LogN)

## 2. Path Compression

Flatten the tree; Whenever you want to get to the root from an object, assign the parent of the current object to it’s grandparent (or to the root). This rule applies all the way up to the tree.

This will decrease number of steps to get to the root object from the current object.

// path compression by linking to the rootint root = p;

while (root != parent[root])

root = parent[root];

while (p != root) {

int newp = parent[p];

parent[p] = root;

p = newp;

}// path compression by linking to grandparent(halving the depth)

while (p != parent[p]) {

parent[p] = parent[parent[p]];

p = parent[p];

}

## Analyzing

The amortized cost per operation is known to be logarithmic.

- Union & Connected→O(LogN) (amortized)

# Summary

- Quick Find → O(1) for connected, and O(N) for union.
- Quick Union → O(N).
- Weighted Quick Union →O(LogN).
- Quick Union and Path Compression → O(LogN) (amortized).
- Weighted Quick Union with Path Compression → O(Log*N) (amortized).

Log*N is the iterated logarithm of N, usually read “log star”.

# Other Related Problems (Percolation)

The data structure called union-find we’ve used to solve the problem of dynamic connectivity can also be used to solve other problems, such as percolation.

The percolation problem is given a grid of N x N elements, colored with white if open, blue if connected to any element at top row, and black if closed. Is the system percolates?. In other words, Is the top row is connected to the bottom row through open elements?.

Percolation problem shows up in electricity, fluid flow, social interactions, many other.

With a grid of N x N elements, to check if any element at the first row connected with any element at the bottom. This will require N² calls to *connected *command.

The trick we can use is to connect all elements on the first row to a “virtual” element, and we will do the same thing for all the elements at the bottom row.

Now, to know if the system percolates, we check if the virtual element at the top is connected to that at the bottom. So, it’s a more efficient algorithm since it only requires one call to *connected *command.

## Watch out for this **bug**!

If the elements at the bottom row are connected to a “virtual” bottom element, and an element A at the bottom gets connected with the top virtual node(became blue).

Now, whenever another element B at the bottom row gets opened, it will be also colored with blue, Why? Since it’s also connected to element A(through the virtual bottom element), which in turn connected to the top virtual node.

## Solutions

Either **use two N x N grids**, where the first has only the virtual top element, and it’s used as the main grid(represents the system), and the second has both top and bottom virtual elements, and it’s just used to tell whether system percolates or not.

Or, **use one grid **with only top virtual element, and whenever we open an element at the last row, and it’s connected to any node at the top, mark the system as percolates.