Breadth First Search Algorithm
(This article is written believing that the reader knows what a graph data structure is. If not, they may read these four lines:https://en.wikipedia.org/wiki/Graph_(abstract_data_type))
A term that keeps popping up whenever graphs are mentioned is ‘traversal’. But, what exactly is meant by the ‘traversal’ of a graph? To ‘traverse’ a graph is to visit all its vertices. What do you mean ‘visit’? Alright, look at the graph below:
When we say we are visiting the ‘Node A’, we mean that we are focusing on ‘Node A’ and not on the other nodes. While we are focusing on ‘Node A’, we might obtain information from it, print it, or change/update it. When we are done focusing on the ‘Node A’, we move to some other node in the graph and we say that we have traversed the ‘Node A’.
Similarly, we traverse the other nodes.
Breadth First Search (BFS) is an algorithm for the traversal of graphs. What this means is that when we implement BFS on a graph, we choose a ‘source’ node and starting from this node, we traverse all other nodes in a well defined manner instead of jumping from one node to another randomly.
What is this well defined manner?
Consider the graph shown below:
Let the source node be ‘3’, now once we have traversed the node ‘3’, we move to the node directly connected to it, ‘1’.
We traverse ‘1’.
Now, once ‘1’ is traversed we can move to the node connected directly to ‘1’, i.e. ‘7’.
But that’s not how BFS works. In BFS after traversing the a node, we must traverse all it’s immediate neighbors, i.e. the nodes directly connected to it via single edges. Hence, after traversing ‘1’, we traverse ‘5’ (the other immediate neighbor of ‘3’). Note that we could also have traversed ‘5’ before ‘1’, without violating the algorithm.
The order of traversal then is: 3 → 1 → 5.
Now, we traverse all the immediate neighbors of ‘1’.
The only immediate neighbor of ‘1’ is ‘7’.
3 → 1 → 5 → 7
We next traverse all the immediate neighbors of ‘5’. They are: ‘2’ and ‘4’.
3 →1 →5 →7 →2 →4
Similarly all the remaining nodes are traversed.
In code, BFS is almost always associated with a queue data structure.
Queue follows First In First Out (FIFO). The idea is to have the node whose immediate neighbors have to be traversed at the top of the queue.
Let us revisit our graph, this time, bringing a queue in the picture.
Let ‘3’ be the source node. We enqueue it. ‘3’ will be at the top of the queue.
Queue: 3
We store the value of the top of the queue (‘3’) in a variable ‘x’ and then pop the top of queue.
x = 3
Queue: Empty
Now we enqueue all the immediate neighbors of x (‘3’).
Queue: 1, 5
Now we store the value of the top of the queue in variable x and pop it.
x = 1
Queue: 5
Now we enqueue all the immediate neighbors of x (‘1’).
Queue: 5, 7
We store the value of the top of queue in x and pop it.
x = 5
Queue: 7
We enqueue all the immediate neighbors of x (‘5’).
Queue: 7, 2, 4
We store the value of the top of the queue and pop it.
x=7;
Queue: 2,4
We enqueue all the immediate neighbors of x (‘7’).
Queue: 2,4,2,8
If you notice, ‘2’ is already present in the queue which means it has already been traversed. To avoid traversal of nodes that have already been traversed, we maintain a ‘list of visited nodes’. Before enqueuing any node we check whether it is present in this list or not. If it is, we don’t enqueue it.
Hence, with a maintained ‘list of visited nodes’, the queue becomes:
Queue: 2,4,8
We store the value of the top of the queue and pop it.
x=2
Queue: 4,8
We enqueue all the immediate neighbors of x.
Queue: 4,8,6
Continuing similarly,
x=4
Queue: 8,6
Immediate neighbor of ‘4’ is ‘6’ which is already visited. So we don’t enqueue it.
x=8
Queue: 6
‘8’ has no immediate neighbors.
x=6
Queue: Empty
Immediate neighbor of 6 is 8 and it has already been visited.
We have traversed the entire graph using BFS.
Here is the pseudo code:
Another graph traversal algorithm is Depth First Search (DFS). We shall discuss DFS in the next article.