Graph Traversal in Python: BFS,DFS,Dijkstra,A-star parallel comparision

Miao Bin
Nerd For Tech
Published in
4 min readApr 1, 2021

--

We have discussed about Breadth First Search (BFS), Depth First Search (DFS), Dijkstra’ Search, A-star (or A*) algorithm. Let’s do a quick review and comparision of all the past 4 articles introducing Graph Traversal algorithm. This will be the last article about graph traversal.

Briefly, BFS and DFS just go through all the nodes with their own preference. One goes broadwise and the other dig deeper. Whereas Diskstra looks smarter and able to pick a smaller step for its preference. A* cheated by getting answer from user.

In a single case, which one is more efficient is all by luck. Since A* got the cheat sheet, it’s generally faster than the others.

BFS and DFS:

For BFS and DFS, the sequence of node scanning are fixed. In the BFS, nodes are pushed to a queue-like FIFO storage and that maintained the sequence of first-come-first-serve scanning. Those shallow nodes join the queue prior to those deeper nodes so they can be scanned first. Whereas for DFS, shallow nodes goes into the stack-like LIFO storage and “stucked ” in the bottom! Those deeper nodes are stacked at the entrance and comes out first, as can be seen in figure below.

The above illustration is achieved by the following code:

#BFS
queue.append(new_node)
queue.pop(0)
#DFS
queue.append(ned_node)
queue.pop()#default pop out the -1, i.e., the last one

They do need another container to record the visited node. But they do not have weights on their edge. The accumulation of distances are not necessary.

Now if you see their codes, they are almost the same except the way of node popping out for the next round scanning. The other difference is that for DFS the scanning sequence isn’t the same as the pushing-in sequence. This will result to a wrong sequence in ‘visited’ list. We don’t return ‘visited’ list, instead, we need to either create another container to store ‘s’ or print ‘s’ directly as the output. To make the BFS and DFS similar, we choose to print ‘s’ directly.

#BFS
def bfs(graph,node):
visited=[]
queue=[]
visited.append(node)
queue.append(node)

while queue:
s=queue.pop(0)

for x in graph[s]:
if x not in visited:
visited.append(x)
queue.append(x)
return visited
#DFS
def dfs(graph,node):
visited=[]
queue=[]

queue.append(node)
visited.append(node)

while queue:
s=queue.pop()
print(s)
for x in graph[s][::-1]:
if x not in visited:
visited.append(x)
queue.append(x)

Dijkstra and A*

Now we have the weights on the edges. That simply means each road has its length. BFS and DFS are not able to handle this situation since they don’t have the “accumulator” to account the distances. We can simply add the “accumulator” to make them smarter. In fact, we can make a BFS and DFS version of Dijkstra or A*!

Let’s see the Dijkstra and A* first and get back to the “Breadth/Depth-Dijkstra” and “Breadth/Depth A*”. Allow me to paste all the previous code here and start to see what are the special places of these algorithm.

#Dijkstra
import heapq
def dijkstra(graph,node):
distances={node:float('inf') for node in graph}
distances[node]=0
came_from={node:None for node in graph}
queue=[(0,node)]

while queue:
current_distance,current_node=heapq.heappop(queue)
# relaxation
for next_node,weight in graph[current_node].items():
distance_temp=current_distance+weight
if distance_temp<distances[next_node]:
distances[next_node]=distance_temp
came_from[next_node]=current_node
heapq.heappush(queue,(distance_temp,next_node))
return distances,came_from
#A*
def astar(graph,start_node,end_node):

f_distance={node:float('inf') for node in graph}
f_distance[start_node]=0

g_distance={node:float('inf') for node in graph}
g_distance[start_node]=0

came_from={node:None for node in graph}
came_from[start_node]=start_node

queue=[(0,start_node)]
while queue:
current_f_distance,current_node=heapq.heappop(queue)

if current_node == end_node:
return f_distance, came_from
for next_node,weights in graph[current_node].items():
temp_g_distance=g_distance[current_node]+weights[0]
if temp_g_distance<g_distance[next_node]:
g_distance[next_node]=temp_g_distance
heuristic=weights[1]
f_distance[next_node]=temp_g_distance+heuristic
came_from[next_node]=current_node

heapq.heappush(queue,(f_distance[next_node],next_node))
return f_distance, came_from

Apparently they looks more complicated but in fact they only have a few more temporary containers to store information.

From BFS/DFS to Dijkstra we have one more container that store all the distances. We have another container store the traversal path. (Do we need this path recorder in BFS/DFS?I think we might need it!)

From Dijkstra to A* we have another cheat sheet container called “heuristic”. The oridinary distance container still here but changed its name to “g_distance ”or “g_score”, the additional of heuristic and g_distance made the “f_distance ”or “f_score”.

Finally we need a filter to select the smallest item. heapq.heappop() method performs such function.

If we generalize the pattern of all these graph algorithm, they have the following structure: The graph algorithms do need containers to store up-coming information, and process the information, and return the processed information. That’s all.

This pattern may not may not represents all the algorithms in this world! We will explore more pattern of other algorithms in the future to validate this hypothesis. Thanks for reading!

The BFS/DFS version of Dijkstra and A*

We can make these algorithm by placing the heapq.heappop() method with queue.pop(0) to get the BFS-Dijkstra/A*, or queue.pop() to get the DFS-Dijkstra/A*. Without picking the smallest value, the queue now pop out node simply in the way of FIFO and LIFO. Now you have a slower Dijkstra/A*!

But again, the speed is all by luck, unless you are “cheating”, which means A* still faster than the others statistically.

--

--