# Depth-First Search vs. Breadth-First Search in Python

## The simplified explanation of the two traversals algorithm.

When it comes to learning, there are generally two approaches: we can go wide and try to cover as much of the spectrum of a field as possible, or we can go deep and try to get specific with the topic that we are learning. Most good learners know that, to some extent, everything we learn in life — from algorithms to necessary life skills — involves some combination of these two approaches.

In this note, we will see two of the most basic searching algorithms — Depth-First Search and Breadth-First Search, which will build the foundation of our understanding of more complex algorithms.

The structure of this note:

- Tree traversal
- Depth-First Search
- Breadth-First Search
- Depth-First Search vs. Breadth-Frist Search

Let’s begin with tree traversal first.

# What does it even mean to traverse a tree?

Since trees are a type of graph, tree traversal or tree search** **is a type of graph traversal. However, traversing through a tree is a little different from the more broad process of traversing through a graph.

Traversing a tree is usually known as checking (visiting) or updating each node in the tree exactly once, without repeating any node. Because all nodes are connected via edges (links), we always start from the root (head) node. That is, we cannot randomly access a node in a tree. There are three ways which we use to traverse a tree:

- Preorder traversal
- Inorder traversal
- Postorder traversal

## Preorder traversal

In** preorder traversal**, we are reading the data at the node first, then moving on to the left subtree, and then to the right subtree. As such, the nodes that we visit (and as we print out their data), follow that pattern: first we print out the root node’s data, then the data from the left subtree, and then the data from the right subtree.

Algorithm:

`Until all nodes are traversed`

Step 1 − Visit the root node

Step 2 − Recursively traverse left subtree

Step 3 − Recursively traverse the right subtree.

We start from the root node, and following preorder traversal, we first visit node one itself and then move to its left subtree. The left subtree is also a traversed preorder. The process goes on until all the nodes are visited. The output of the preorder traversal of this tree will be `1,2,3,4,5,6,7`

## Inorder traversal

In **inorder traversal**, we are following the path down to the leftmost leaf, and then making our way back to the root node, before following the path down to the rightmost leaf.

Algorithm

`Until all nodes are traversed `

Step 1 − Recursively traverse left subtree

Step 2 − Visit the root node

Step 3 − Recursively traverse the right subtree.

We start from the root node 4, and following inorder traversal, we move to its left subtree. The left subtree is also traversed inorder. The process goes on until all the nodes are visited.

## Postorder traversal

Finally, in postorder** traversal**, we visit the left node reference first, then the right node, and then, if none exists, we read the data of the node we are currently on. We end up reading the root node at the end of the traversal (after visiting all the nodes in the left subtree and the right subtree).

Algorithm

`Until all nodes are traversed `

Step 1 − Recursively traverse left subtree.

Step 2 − Recursively traverse the right subtree.

Step 3 − Visit the root node.

We start from the root node 7, and following postorder traversal, we first visit the left subtree. The left subtree is also traversed postorder. The process goes on until all the nodes are visited.

We have learned that the order of the node in which we visit is essential. Based on the order traversal, we classify the different traversal algorithms. There are two main techniques that we can lean on to traverse and visit each node in the tree only once: we can go wide or go deep.

The more common terms to describe these two options are breadth-first search and depth-first search, and they are probably exactly what we would expect them to be.

# Depth-First Search (DFS)

In a DFS, we always explore the deepest node; that is, we go one path as deep as possible, and if we hit the dead end, we back up and try a different path until we reach the end.

Note: The DFS uses a stack to remember where it should go when it reaches a dead end.

In DFS, we have to traverse a whole branch of the tree and traverse the adjacent nodes. So for keep tracking on the current node, it requires last in first out approach which can be implemented by the stack, after it reaches the depth of a node then all the nodes will be popped out of the stack. Next, it searches for adjacent nodes which are not visited yet.

If it was implemented with the queue, which is first in first out approach, we could not reach the depth before that it would dequeue the current node.

We use a simple binary tree here to illustrate that idea. Starting from the source node A, we keep moving to the adjacent nodes A to B to D, where we reach the farthest level. Then we backtrack to the previous node B and pick an adjacent node. Once again, we probe till the most distant level where we hit the desired node E.

Let’s break down those steps. We first initialize the stack and visited array.

Push node A (root node) to the stack

We mark node A as visited and explore any unvisited adjacent node from A. We have two nodes, and we can pick any of them. For this example, we shall take the node in alphabetical order.

We mark B as visited and explore any unvisited adjacent node from B. Both D and E are adjacent to B, we push them into the stack.

We visit D and mark it as visited. Here D does not have any unvisited adjacent node. So, no node is pushed into the stack.

We check the stack top for return to the previous node — E and check if it has any unvisited nodes.

As E does not have any unvisited adjacent node, we keep popping the stack until we find a node with an unvisited adjacent node. In this case, there’s none, and we keep popping until the stack is empty.

**Advantages:**

- DFS on a binary tree generally requires less memory than breadth-first.
- DFS can be easily implemented with recursion.

**Disadvantages:**

- DFS doesn’t necessarily find the shortest path to a node, while the BFS does.

## DFS in Python

We are representing the tree in code using an adjacency list via Python Dictionary. Each vertex has a list of its adjacent nodes stored.

`graph = {`

'A' : ['B','C'],

'B' : ['D', 'E'],

'C' : [],

'D' : [],

'E' : []

}

Next, we set `visited = set()`

to keep track of visited nodes.

Given the adjacency list and a starting node A, we can find all the nodes in the tree using the following recursive depth-first search function in Python.

`dfs`

function follows the algorithm:

1. We first check if the current node is unvisited — if yes, it is appended in the visited set.

2. Then for each neighbor of the current node, the `dfs`

function is invoked again.

3. The base case is invoked when all the nodes are visited. The function then returns.

`def dfs(visited, graph, node):`

if node not in visited:

print (node)

visited.add(node)

for neighbor in graph[node]:

dfs(visited, graph, neighbor)

# Breadth-First Search

In BFS, we search through all the nodes in the tree by casting a wide net, that is, we traverse through one entire level of children nodes first, before moving on to traverse through the grandchildren nodes. And we traverse through an entire level of grandchildren nodes before going on to traverse through great-grandchildren nodes.

BFS explores the closest nodes first and then moves outwards away from the source. Given this, we want to use a data structure that, when queried, gives us the oldest element, based on the order they were inserted. A queue is what we need in this case since it is first-in-first-out(FIFO).

Let’s see if queues can help us out with our BFS implementation. We use a simple binary tree here to illustrate how the algorithm works. Starting from the source node A, we keep exploring down the branches in an ordered fashion, that is, from A to B to C where level completes. Then we go to the next level and explore D and E.

We first initialize the queue and a visited array.

We start with visiting A (root node).

We mark A as visited and explore unvisited adjacent nodes from A. In this example, we have two nodes, and we can pick any of them. We shall take the node in alphabetical order and enqueue them into the queue.

Next, we mark B as visited and enqueue D and E, which are unvisited adjacent node from B, into the queue.

Now, C is left with no unvisited adjacent nodes.

We mark D as visited and dequeue it. We keep on dequeuing to get all unvisited nodes. When the queue gets emptied, the program is over.

**Advantages**:

- BFS is simple to implement.
- BFS can be applied to any search problem.
- BFS does not suffer from any potential infinite loop problem compared to DFS. The infinite loop problem may cause the computer to crash, whereas DFS goes deep down searching.
- BFS will always find the shortest path if the weight on the links are uniform. So BFS is complete and optimal.

**Disadvantages**:

- As discussed, memory utilization is poor in BFS, so we can say that BFS needs more memory than DFS.
- BFS is a ‘blind’ search; that is, the search space is enormous. The search performance will be weak compared to other heuristic searches.

# BFS in Python

We are representing the tree in code using an adjacency list via Python Dictionary. Each vertex has a list of its adjacent nodes stored.

`graph = {`

'A' : ['B','C'],

'B' : ['D', 'E'],

'C' : [],

'D' : [],

'E' : []

}

Next, we set `visited = []`

to keep track of visited nodes.

we set `queue = []`

to keep track of nodes currently in the queue

Given the adjacency list and a starting node A, we can find all the nodes in the tree using the following recursive breadth-first search function in Python.`bfs`

function follows the algorithm:

1. We first check and append the starting node to the visited list and the queue.

2. Then, while the queue contains elements, it keeps taking out nodes from the queue, appends the neighbors of that node to the queue if they are unvisited, and marks them as visited.

3. We continue until the queue is empty.

`def bfs(visited, graph, node):`

visited.append(node)

queue.append(node)

while queue:

s = queue.pop(0)

print (s, end = " ")

for neighbor in graph[s]:

if neighbor not in visited:

visited.append(neighbor)

queue.append(neighbor)

The code in this note is available on Github.

# BFS vs. DFS

So far, we understand the differences between DFS and BFS. It is interesting to know when it’s more practical to use one over the other? At the early stage of taking an algorithm class, I faced this problem as well. Hopefully, this answer could explain things well.

- If we know a solution is not far from the root of the tree, BFS might be better.
- If the tree is very deep and solutions are rare, DFS might take an extremely long time, but BFS could be faster.
- If the tree is very wide, a BFS might need too much memory to be completely impractical.
- If solutions are frequent but located deep in the tree, BFS could be impractical.

In general, usually, we would want to use:

- BFS — when we want to find the shortest path from a particular source node to a specific destination. (Or more generally, the smallest number of steps to reach the end state from a given initial state.)
- DFS — when we want to exhaust all possibilities and check which one is the best/count the number of all possible ways.
- either BFS or DFS — when we just want to check connectedness between two nodes on a given graph. (Or more generally, whether we could reach a given state to another.)

# That’s it!

In this note, we learned all the theories and understand the two popular search algorithms — DFS, BFS down to the core. We also know how to implement them in Python. It’s time to see the information transfer from the note to the real world; you should start your first coding assignment immediately. It’s way more exciting than my note.

Never stop learning!

# Resource:

The searching algorithm seems to come up quite often in coding interviews, and it can be hard to wrap your head around it at first. Once you learn the fundamentals, you must practice coding skills if you are eager to learn more about how the algorithm works and the different search strategies, you can get started with excellent the links below.

- Comp 160 lecture note, Tufts University
- BFS, DFS lecture, Instructor — Greg Aloupis, Tufts University
- CLRS: Chapter 22, p.594–610. (the book analyzes BFS and DFS in a formal way)
- Visualizations for BFS and DFS by David Galles.
- DFS in Python: Recursive and Non-recursive
- DFS, BFS(graph)
- Applications of DFS, BFS
- Leetcode