An overview of the blind, informed, and optimal searching algorithms for artificial intelligence.
Searching algorithms are defined as a way to identify and find a specific value within a given data structure. Not only do they allow you to find what value you are looking for, but searching algorithms are also a key element of artificial intelligence; they teach computers to “act rationally” by achieving a certain goal with a certain input value. Essentially, artificial intelligence can find solutions to given problems through use of searching algorithms.
We will be discussing three different kinds of searching algorithms: blind, informed, and optimal. Whichever algorithm you choose to utilize depends entirely on how you want to traverse through your data structure, as well as for what you want to optimize (time versus speed, for example).
A Programmer's Guide to Creating an Eclectic Bookshelf | Data Driven Investor
Every developer should have a bookshelf. The possible set of texts in his cabinet are myriad, but not every collection…
Searching algorithms which are blind do not consider any sort of quantitative element of a path when searching. In other words, they completely disregard all potentially indicative values such as heuristics and edge weights and, thus, search blindly.
British Museum is my favorite searching algorithm of them all, solely because of its absolute chaos. Imagine you are in the literal British Museum, and you have no maps or guidance as to where certain paintings are located. How, then, could you find Leonardo DaVinci’s Study of The Madonna And Child With A Cat? You choose to release a number of monkeys into the museum and have them randomly run around until one of them finds this painting. If you’d like, your monkeys could remember the paths they’ve already taken through the building so as to not repeat them. This is exactly how the algorithm works: as if monkeys were let loose in the British Museum.
Breadth-First Search traverses through a graph with more order than the British Museum; by moving sideways before then moving downwards. Since it begins by moving sideways, BFS uses a queue to store its visited nodes. This is because after visiting some node
A, BFS is going to look at all immediate extensions of
A before moving back to
A, and back along the graph’s width. Thus, a FIFO data structure such as a queue is essential because it allows BFS to quickly access the oldest node it looked at.
Depth First Search is opposite in traversal: it moves all the way down a path before moving sideways. Thus, unlike BFS, depth-first search requires a LIFO data structure such as a stack, because the latest node it visited is the one it wants to visit again.
Unlike blind algorithms, informed algorithms do consider quantitative values such as heuristics while searching, and thus make more informed decisions regarding down which paths to traverse.
A node’s heuristic value is a measure of its available information and a guide on which branch to follow. For example, heuristic values for a node
N is often an estimate of how costly it will be to get from node
N to our destination goal
D. Heuristics are never overestimates of the true cost, and are a way in which search algorithms sacrifice precision and accuracy for the sake of finding something which is faster and “good enough”.
Hill Climbing is an informed searching algorithm which considers a node’s heuristic while searching. It is basically depth-first search except if DFS considered heuristic values. Thus, like DFS, hill climbing uses a stack to store its recently-visited nodes and traverses down a tree downwards before moving sideways. At each node before moving downwards, HC picks which node to traverse down by whichever is “least costly”; whichever has the lowest heuristic value.
Beam is essentially breadth-first search if BFS considered heuristic values, and limited the number of nodes we could observe at each level. Say we had the following nodes with the following heuristic values and a width cap of
w = 2 :
We would begin at node A, and move down to observe each of its children. However, with
w = 2 , we can only observe a maximum of
2 children at each level. In beam, we prune the node with the highest (or most costly) heuristic value. In this case, we prune off node
B , and only take a look at nodes
D . From these nodes, we can extend downwards and repeat this same process for the new levels of nodes.
What’s interesting about Beam in contrast to other searching algorithms is that the given value for
w significantly influences our result. What if our destination node was extended off of
B , and we mistakenly pruned that node due to its misrepresented heuristic value? This is why, out of all the searching algorithms, a beam is the only one which is not guaranteed to find the path you’re looking for if it exists.
Best First Search is very similar to DFS, except that it sorts its entire agenda based off two factors: first, based on the least costly heuristics and, in the case of a tie, in lexicographic order. What does this mEaN?!? Let’s say we have the following tree:
We begin at node
A , and move down to look at node
C since it is the child of
A with the lowest heuristic value, or which is less costly. However, once we have arrived at
C , we have a tie in heuristic values for which node to observe next: we can either go to node
D which extends off of
C , or move back up to node
D which extends off of
A . As mentioned, Best First breaks ties by sorting paths in lexicographic order. So, if we extended downwards from
C we’d end up with path
ACD , and if we extended sideways we’d end up with path
AD . We know we must then extend downwards from
C , since
ACD comes before
AD in the alphabet. Problem solved!
While other algorithms guarantee to find a path from start node
A to destination node
Z , optimal search algorithms are the only ones which guarantee to find the shortest path from
Z . Thus, optimal search algorithms all consider edge weights in a tree while traversing.
Branch & Bound describes an algorithm which essentially operates much like DFS and BFS, except it only explores additional paths if they are better than the current solution. This means that, unlike blind algorithms, only in the worst-case scenario will B&B traverse explore all possible permutations. Regular Branch & Bound considers the cost of paths based solely on the edge weights, not heuristics.
+Heuristic. When you add a heuristic to B&B, you are now considering the cost of each path based off of the
[edge weight] + [heuristic value] . The only difference in execution is your attempting to find a more precise way to traverse the tree. However, both regular B&B and B&B+Heuristic are guaranteed to find the shortest path from your start to your destination.
A* is an optimal searching algorithm which also considers both the edge weights & heuristic values when determining which paths are most costly. Often considered an extension of Dijskra’s algorithm, A* is considered incredibly efficient and often outperforms other algorithms in graph traversal.
Implementing the above algorithms in python is no more complicated than tree traversal, storing visited nodes in data frames, and considering specific values throughout traversal. Searching algorithms are a crucial part of most applications, and being able to determine which to use is monumental in ensuring your program is as fast and efficient for what you’re trying to achieve.
If you’re training artificial intelligence programs to arrive at certain conclusions, or if you’re simply trying to complete a problem set, I hope this post helped you find what you’re looking for… ;)
This post was written on a fine Tuesday evening in preparation for a quiz on Wednesday morning for MIT’s introductory artificial intelligence class, 6.034. Yes, writing this blog post was how I’ve chosen to study. Wish me luck!
If you’re working on searching algorithms’ implementation in python, I’d love to hear from you!