Graph Traversal in Python: A* algorithm

Miao Bin · Published in Nerd For Tech · Mar 29, 2021 · 4 min read

We have gone through Breadth First Search (BFS), Depth First Search (DFS), and Dijkstra’s algorithm in Python previously. In this article we will go through the A* algorithm with a few examples and illustrations, and then compare these algorithms side by side.

Briefly, BFS and DFS simply traverse all the nodes in the entire graph with no particular “destination” or “goal”. Dijkstra does something similar, but its destination is “everywhere” (all nodes).

These are not quite what we want, because intuitively a search algorithm should start from somewhere and end somewhere: point to point. The A* algorithm gets that job done.

We can’t blame BFS, DFS, and Dijkstra for not going anywhere in particular, or for going everywhere, because we never told them otherwise. Here is the thing: we need to “tell” the algorithm that there is a goal by giving it some hints. Such a hint is called a heuristic and is illustrated in the figure below: the ~2, ~5 are the estimated distances from those nodes to the final goal.

Because of this new information, we need new containers to store it and to do the “relaxation” based on the new feature. We still have the plain distance between nodes, which people call the g-score. We also have the heuristic value, which is again a weight or a distance, called the h-score. The sum of the g-score and the h-score is the new feature called the f-score.

The relaxation is now applied on the f-score, and the smallest one is pushed into a temporary queue (a priority queue) for the next round of the loop.
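To make that concrete, here is a minimal sketch of relaxing a single edge on the f-score. The names relax_edge, g_score and h_v are only illustrative; the full implementation follows later in the article.

import heapq

def relax_edge(u, v, edge_length, h_v, g_score, queue):
    # g_score: best known distance from the start to each node
    # edge_length: length of the edge u -> v; h_v: heuristic estimate from v to the goal
    tentative_g = g_score[u] + edge_length
    if tentative_g < g_score[v]:              # found a cheaper path to v
        g_score[v] = tentative_g
        f_score = tentative_g + h_v           # f = g + h
        heapq.heappush(queue, (f_score, v))   # smallest f-score is popped first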

The above graph example is represented as follows. Note that there is one more piece of information for each node: we use a list to store the edge length as well as the heuristic weight (i.e., the estimated distance from that node to the final destination).

graph={
    # each value is now a list: the first entry is the edge length (used for the g-score),
    # the second entry is the h-score, i.e., the heuristic of the neighbouring node
    'A':{'B':[2,2],'C':[3,2]},
    'B':{'D':[3,5],'E':[1,1]},
    'C':{'F':[2,0]},
    'D':{},
    'E':{'F':[1,0]},
    'F':{}
}
# the algorithm will retrieve the graph as follows:
graph['A']
# this returns {'B':[2,2],'C':[3,2]}
graph['A']['B']
# this returns [2,2]
graph['A']['B'][0]  # returns the edge length
graph['A']['B'][1]  # returns the heuristic (estimated distance from B to the destination)
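To see how the algorithm will consume this structure, here is a small illustrative loop that walks over the neighbours of a node and unpacks both values; this is essentially what the main loop does later.

# iterate over the neighbours of 'A' and unpack edge length and heuristic
for next_node, (edge_length, heuristic) in graph['A'].items():
    print(next_node, edge_length, heuristic)
# prints:
# B 2 2
# C 3 2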

Let’s put the entire algorithm here and explain it afterwards.

import heapq

def astar(graph,start_node,end_node):
    # astar: F=G+H, we name F as f_distance, G as g_distance,
    # H as heuristic
    f_distance={node:float('inf') for node in graph}
    f_distance[start_node]=0

    g_distance={node:float('inf') for node in graph}
    g_distance[start_node]=0

    came_from={node:None for node in graph}
    came_from[start_node]=start_node

    queue=[(0,start_node)]
    while queue:
        current_f_distance,current_node=heapq.heappop(queue)

        if current_node == end_node:
            print('found the end_node')
            return f_distance, came_from
        for next_node,weights in graph[current_node].items():
            temp_g_distance=g_distance[current_node]+weights[0]
            if temp_g_distance<g_distance[next_node]:
                g_distance[next_node]=temp_g_distance
                heuristic=weights[1]
                f_distance[next_node]=temp_g_distance+heuristic
                came_from[next_node]=current_node

                heapq.heappush(queue,(f_distance[next_node],next_node))
    return f_distance, came_from

astar(graph,'A','F')
# returns {'A': 0, 'B': 4, 'C': 5, 'D': 10, 'E': 4, 'F': 4} and
# {'A': 'A', 'B': 'A', 'C': 'A', 'D': 'B', 'E': 'B', 'F': 'E'}

Indeed, there are a few more containers and auxiliary lines. However, it feels exactly like Dijkstra’s algorithm with only one line different. It is the new way of constructing the feature that makes the difference: the accumulated edge length (g_distance) now has the heuristic added on top.

f_distance[next_node]=temp_g_distance+heuristic
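For comparison, a Dijkstra-style version of the same step would rank nodes by the accumulated distance alone, so under the same variable names the difference is roughly:

# Dijkstra: rank nodes in the priority queue by accumulated distance only
heapq.heappush(queue,(temp_g_distance,next_node))

# A*: rank nodes by accumulated distance plus the heuristic (f = g + h)
heapq.heappush(queue,(temp_g_distance+heuristic,next_node))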

An illustrated walk-through is shown below: by using our “hint”, the algorithm bypasses some of the nodes and goes directly to the destination. Note that there are arrows on the edges.

One thing to take note of is that we need a g-score container to store the distance from the start to each node. This explains why the weight of node D is calculated as 2+3+5, in which 2 is the distance A->B, 3 is the distance B->D, and 5 is the heuristic of node D itself. In the code, how do we accumulate 2+3? We need the following container, initialized with infinity.

g_distance={node:float('inf') for node in graph}
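As a hand trace of that accumulation for node D (using the first graph defined above):

g_A=0                              # start node
g_B=g_A+graph['A']['B'][0]         # 0+2=2, edge A->B
g_D=g_B+graph['B']['D'][0]         # 2+3=5, edge B->D
f_D=g_D+graph['B']['D'][1]         # 5+5=10, add the heuristic of D
print(g_D,f_D)                     # 5 10, matching f_distance['D'] above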

Just like in Dijkstra’s algorithm, the heapq.heappop() method ensures that the entry with the minimum f-score is popped out. Although we receive current_f_distance from the pop, we don’t actually use it, since we only need the g_distance of that node.

current_f_distance,current_node=heapq.heappop(queue)

The same as in Dijkstra, we have the came_from container to store the predecessor node that forms the cheapest path found so far. But take note that we don’t have to fill this container completely; some entries remain None until the end.

came_from={node:None for node in graph}
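The code above stops at the predecessor map, but as an illustration, a small helper can walk came_from backwards to recover the actual path (reconstruct_path is a hypothetical name, not part of the code above):

def reconstruct_path(came_from,start_node,end_node):
    # walk backwards from the goal to the start, then reverse the result
    path=[end_node]
    while path[-1]!=start_node:
        path.append(came_from[path[-1]])
    return path[::-1]

f_distance,came_from=astar(graph,'A','F')
print(reconstruct_path(came_from,'A','F'))   # ['A', 'B', 'E', 'F']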

Here is another example that is slightly more complex. It is taken from the YouTube video linked in the references; the lecturer there explains the A* concept very well, and it is highly recommended to go through his video in detail.

complex_graph={
    'S':{'A':[5,7],'B':[9,3],'D':[6,6]},
    'A':{'B':[3,3],'G1':[9,0]},
    'B':{'A':[2,7],'C':[1,4]},
    'C':{'S':[6,4],'G2':[5,0],'F':[7,6]},
    'D':{'S':[1,5],'C':[2,4],'E':[2,5]},
    'E':{'G3':[7,0]},
    'F':{'G3':[8,0]},
    'G1':{},
    'G2':{},
    'G3':{}
}

The illustration of the graph is shown below; note that there are arrows pointing upward or backward between some nodes. Please try out the code yourself. You will notice that the algorithm bypasses some of the nodes and goes directly to its goal. The algorithm may also roam around a bit just to make sure it doesn’t miss a potentially better option (this roaming comes from the heappop() call: it always picks the entry with the smallest f-score and checks it, even when a human can see that the goal is only one step away!).
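For instance, one possible run, picking G2 as the goal (the video works with several goal nodes, so this choice is only for illustration) and reusing the reconstruct_path sketch from above:

f_distance,came_from=astar(complex_graph,'S','G2')
print(reconstruct_path(came_from,'S','G2'))   # ['S', 'D', 'C', 'G2'] on this run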
