Short circuit: breadth-first search, part II

Andy Elmsley
The Sound of AI
Published in
5 min readFeb 27, 2019

The third post in our AI coding tutorial series.

The calm before the coding. Photo by Jay Wennington on Unsplash

Welcome back, AI searchers. Last post, we explored graph data structure, the breadth-first search (BFS) algorithm, and an implementation of BFS for exploring graphs. This time I’ll show you how to use BFS to find the shortest path between two nodes in a graph. I’ll then do some meta reflection on this algorithm (in my hyperbolic time chamber). If you’ve done whatever warm-up ritual you require, then let’s begin.

For the implementation, we’re going to use the same graph from last time.

Back for round two.

Finding the shortest path with BFS

Finding the shortest path in a graph is a problem with loads of applications. For example, when enemies in shooter games hunt you down, they calculate the shortest path to find and eliminate you in the quickest time frame.

Just like defining the problem and representation from the last few posts, we should define what the algorithm does, meaning its behaviour. We can easily understand this in terms of the inputs and outputs of the algorithm itself. The inputs are what we give the algorithm, and the outputs are what the algorithm spits out afterwards. For this task, we need three inputs:

  1. The graph itself
  2. The initial state (e.g. “G”)
  3. The goal state (e.g. “D”)

If the algorithm is able to connect the initial state and the goal state, it should return the path between the two, meaning we only need one output:

  1. The shortest path

The nice thing about BFS is that it always returns the shortest path, even if there’s more than one path linking two vertices.

There are some major differences between the implementations of BFS for simply exploring a graph, and for finding the shortest path between two nodes. First, we to keep track of the possible paths being explored (which we can implement as a list of nodes). Second, when the algorithm checks for a neighbour node, it needs to check whether the neighbour node is the goal state. If it is, then we already have a solution and there’s no need to keep exploring the graph. Here’s the code implementation:

New skill acquired! Now you know how to implement BFS for exploring graphs and finding the shortest path between two nodes. But that doesn’t mean you can go using it with reckless abandon. As with any search algorithm, there’s various pros and cons to consider.

The benefit and massive cost of BFS

The greatest thing about BFS is it’s complete. That means BFS will always find a solution to a problem, if a solution exists. If a path exists that connects two nodes in a finite graph, BFS will find it like Thanos did the infinity stones (sorry if you didn’t see the movie).

Completeness is a nice-to-have feature for an algorithm, but in the case of BFS, this comes at a high cost. The execution time of BFS is fairly slow, because the complexity of the algorithm is exponential — O(n²). What’s worse are the memory requirements. That’s because BFS has to keep track of all the nodes it explores. When applied to huge graphs, the expensive memory requirements make BFS unfeasible. For example, to solve a Rubik’s cube with BFS we’d need to explore 43 quintillion states. That is 43 followed by 18 zeros. This would take around 10 zettabytes of RAM (1x10²¹ bytes) which, the last time I checked, is not yet available on our computers (maybe in a galaxy, far, far away).

Heed this warning: only use BFS for relatively small problems!

What can you use BFS for?

Even though BFS is not the best option for problems involving large graphs, it can be successfully used for a number of applications. I’ll list a few to give you an idea:

  • Find the shortest path of unweighted graphs (we did this already — score).
  • Discover all nodes reachable from an initial state (we did this too; we’re on a roll)
  • Find neighbour nodes in peer-to-peer networks like BitTorrent.
  • Visit all links on a webpage, and keep doing the same recursively.
  • Find people’s degree of separation from each other online or offline.
  • Identify all neighbour locations in GPS systems.
  • Allow broadcasted packets to reach all nodes of a network.

Conclusion

Breadth-first search is an algorithm used to explore and search a graph. If you’ve carefully followed these last two posts, you can develop a python implementation of BFS for finding the shortest path between two nodes in a graph with BFS.

Basically, what you should remember is:

  • Graphs are the data structure of choice to search for solutions to complex problems.
  • BFS explores nodes one depth level at a time. It starts from a node, then checks the neighbours of the initial node, then the neighbours of the neighbours and so on.
  • BFS is complete, in that it always returns a solution in case there is one.
  • BFS is often impractical for large real-world problems, because of excessive memory and computation requirements.

As always, you’ll find the source code of the BFS implementation on our GitHub.

Here’s a couple of tasks to test yourself:

  1. Can you change the implementation to return all possible paths to the goal (rather than the shortest)?
  2. Can you add weights to the edges of our graph?
  3. Can you find the path with the lowest cumulative weight?

Next time, we’ll move from search to another fascinatingly essential area of AI: stochastic models.

Here are some useful resources to help you:

As usual, you can continue your training to supreme master-level AI coding here.

And give us a follow to receive updates on our latest posts.

--

--

Andy Elmsley
The Sound of AI

Founder & CTO @melodrivemusic. AI video game music platform. Tech leader, programmer, musician, generative artist and speaker.