A Graph models a set of connections, and it is a data structure that you cannot avoid. From your social media connections to cities on a map, you’ll find graphs everywhere, and with a graph comes graph traversal methods. The process of visiting each node in a graph is known as Graph Traversal. Breadth-First Search or BFS is one such algorithm for graph traversal and you have probably been using it in your daily life. One common example is when you want to find out the directions to a destination. You check if there’s a way to reach your destination and often try to find the shortest path as well. This article takes a more technical yet simple approach towards explaining this algorithm. I have made a video explaining the algorithm as well. The example used has been inspired by Aditya Bhargava from his book Grokking Algorithms: An illustrated guide for programmers and other curious people (check it out for a detailed yet fun approach on algorithms).
A graph is made up of a finite set of nodes (or vertices) and edges that connect two nodes.
A graph can be directed or undirected. Edges of Undirected Graphs do not have a direction. Directed graphs have edges that point in a single direction.
How does BFS work?
Breadth-First Search answers two main questions:
Type 1: Is there a path between two nodes?
Type 2: What is the shortest path between two nodes?
Let’s take a simple example to see how BFS answers these questions.
Suppose you own a watermelon farm, and you are looking for a watermelon seller. You turn to Facebook to find a seller easily. You and your Facebook friends form a network that can be represented in form of a graph.
The first step is to create a list of your friends. Now you visit each name on the list one by one. You have a simple yes-no question to answer, “Does your friend sell watermelons?”, if YES, you found your seller. If your answer is NO, then you strike this friend’s name from the list and go to the next name.
Now suppose none of your friends sells watermelons. So, your next step would be to search for your friends’ friends. Each time you remove someone from the list, you add their friends to the list.
Now you‘re not only searching your friends, but you’re also searching their friends. Your aim is to find one watermelon seller in the network. So, if your friend is not a watermelon seller, you add their friends to the list too. You keep on searching till you find a watermelon seller or you have removed all names from the list (in this case there is no watermelon seller in your network). In case two of your friends have a mutual friend (other than you), you will write the mutual friend’s name only once. This algorithm is known as Breadth-First Search.
Coming to the two main questions that BFS answers-
- Does a path exist between two nodes — for our example, this can be reframed as “Does a watermelon seller exist in your network?”.
- What is the shortest path between two nodes — this can be reframed as “Who is the closest watermelon seller?”
We’ve already seen how the first question is answered. Let’s look at the second question.
Your friends are your first-degree connections, and their friends are your second-degree connections. You will always prefer first degree connections to second-degree connections. So, you wouldn’t be searching your second-degree connections before your first-degree connections which would help you in finding the closest watermelon seller. BFS already does this. The search radiates out from the source node (in this case the source node is you). The first-degree connections are added to the list before the second-degree connections and hence they are checked first, as long as you search people in the same order in which they are added. Therefore, if you find a watermelon seller in your network, they will also be the closest seller to you.
Firstly, we have to make a graph. In python, to implement a graph easily, we use a dictionary. A key in this dictionary represents a node and the values are mapped with the help of an array. For simplicity, we are going with a directed graph here.
To maintain the list we can use a queue because we need the FIFO structure (the name which comes first is checked first). Here I’ll be using deque to maintain the list.
Remember the case of mutual friends? Tom is friends with both Jerry and Taffy. So he can be included in the list twice. To avoid this kind of problem we will keep an array that keeps track of all the nodes we have already visited. If a person is a watermelon seller, we are done, if not, we continue searching until we find a seller or visit all nodes. The code for that is:
The last thing we need is a function to determine whether a person is a watermelon seller or not (definition for is_seller()). The condition I have put here for that is that the person’s name begins with ‘T’. So the answer for this case is Taffy. This function can be replaced by any condition to find the destination node. So the final code is:
Since you are searching your entire for a watermelon seller, you’ll be following all the edges (connections). At the very least, the running time is O(number of edges). If the number of edges is E, we get O(E).
You also need to keep a queue for this algorithm. Pushing an element in the queue takes constant (O(1)) time. Doing this for all nodes or vertices will take O(number of nodes) time. If there are N nodes in the graph, it will take O(N) time.
So in total, Breadth-First Search takes O(number of nodes+ number of edges) time or O(N + E).
Applications of BFS
Like I mentioned before, you have already used the Breadth-First Search algorithm in your daily life. It is a really simple algorithm, which is why it makes its way into many real-life scenarios. A few applications are listed below:
- GPS Navigation
- Unweighted Graph Traversals (Shortest path and minimum spanning tree)
- Search Engine Crawlers
- Broadcasting in Network
References and Resources
This article was an attempt at a simple explanation of BFS. There’s a lot more to know about BFS. Check out the following links to know more: