Search Algorithms Part 2: Uninformed Search Algorithms — 1

Published in

Kredo.ai Engineering

8 min readJul 15, 2018

Previously, we have discussed on the definition and formulation of the problem, and introduced the concept of general search strategies. In this post, we will discuss on some of the popular classical search algorithms, mainly focusing on the uninformed search algorithms.

Uninformed and Heuristic Search Strategies

Based on the information about the problem available for the search strategies, we can classify the search algorithms into uninformed and informed (or heuristic) search algorithms. For the uninformed search algorithms, the strategies have no additional information about the states beyond that provided by the problem definition. All they can do is generate successors and differentiate between goal and non-goal states. Strategies that know whether one non-goal state is more promising than the other is called informed search strategies or heuristic search strategies.

(There is a slight difference between a strategy and an algorithm. Strategy refers to a rough idea, while an algorithm is a step-by-step implementation of a strategy. But since this blog, and the subsequent ones, are more focused on ideas, the words strategy and algorithm could be interchanged at few places. Sorry for the inconvenience caused.)

Uninformed Search Strategies

Breadth-first search (BFS)

BFS is a search strategy where the root node is expanded first, then all the successors of the root node are expanded, then their successors, and so on, until the goal node is found. All the nodes at a given depth in the search tree is expanded before a node in the next depth is expanded.

Figure 1: The breadth-first search on a search tree. The marker shows the node which is to be expanded next.

Breadth-first search always expands the shallowest unexpanded node. To achieve this, we will take the help of a First-in First-out (FIFO) queue for the frontier. The newly generated nodes always go to the back of the queue, while the older nodes get expanded first.

Figure 2: Pseudo-code of the Breadth-first search algorithm.

Let us check if the BFS algorithm satisfies the 4 criteria:

BFS is complete — if the shallowest goal node is at depth d, it will eventually find it after expanding all the nodes shallower than d.
As far as optimality of the solution is concerned, the BFS algorithm stops at the shallowest goal found. We must remember that the shallowest goal node need not necessarily be the optimal goal node. BFS is optimal if the path cost is a non-decreasing function of d. Usually, BFS is applied when all the actions have the same cost.
To find the time complexity, let us assume all the non-goal nodes have b successors (b is the branching factor of the tree), and generating each successor from the parent node takes constant time. Then, the root of the search tree generates b nodes, and so it consumes b time. Then each of the b nodes in depth 1 further generates b nodes, consuming b² time, and so on, until the goal is reached in depth d. Hence, the time complexity is:

4. As mentioned in the previous blog, all the states must be remembered to avoid redundant and loopy paths. Hence, until the goal node is reached in depth d, all the nodes until d-1 must be stored in the memory. Hence, the space complexity is also:

Figure 3: The time and memory required for BFS. We assume *b = 10, the processing speed is 1 million nodes per second, and the space required is 1kB per node (a rather realistic assumptions)*

From the table in Figure 3, we can infer that memory requirements are a bigger problem for BFS than time complexity. We can afford to wait for 3 hours to obtain a solution at d = 10 and b = 10, but 10 terabytes of space is too much to ask. Also, time is still a major factor. For d = 16, we have to wait for 350 years for BFS to generate a solution, a seemingly unrealistic situation.

Uniform Cost Search

BFS is optimal if all the step costs are the same. For any step-cost function, uniform cost search expands the node with least path cost. To implement this, the frontier will be stored in a priority queue.

Other than the ordering of queue there is one more important difference between BFS and uniform-cost search. In breadth-first search, the goal test on the node is performed when it is first generated. But in uniform-cost search, goal test on the node is performed when it is selected for expansion. This is because the first node generated could be a sub-optimal path.

Figure 5: A section of Romania, showing two paths from Sibiu to Bucharest

For example, let us consider a smaller region of Romania, where our goal is to reach Bucharest from Sibiu with least path cost. The successors of Sibiu are Riminculus Vilcea (with cost of 80) and Fagaras (cost = 99). The least-cost node is Riminculus Vilcea, which is expanded next to get Pitesti whose path cost from Sibiu is now 80 + 97 =177. The least-cost node is now Fagaras, which is then expanded to get Bucharest with path cost 99 + 211 = 310. Now, we have generated the goal node, but the search still continues. What if the path through Pitesti reaches Bucharest with a lesser cost? Turns out this is the case in this situation. The Pitesti is expanded next, to give Bucharest with path cost 177 + 101 = 278. Bucharest, now with path cost 278, is chosen for expansion, and the solution is returned.

Does the uniform-cost search satisfies the metrics for performance measure?

Uniform-cost search doesn’t care about the number of steps a path has, but only the total path cost. It will get stuck in an infinite loop if there’s a path with infinite sequence of zero-cost actions. Completeness is guaranteed only if the cost of every step is some positive number.
Uniform-cost search is optimal. This is because, at every step the path with the least cost is chosen, and paths never gets shorter as nodes are added, ensuring that the search expands nodes in the order of their optimal path cost.
To measure the time complexity, we need the help of path cost instead of the depth d. If C* is the optimal path cost of the solution, and each step costs at least e, then the time complexity is: O(b^[1+(C*/ e)]), which can be much greater than that of BFS. When all the step costs are the same, then the optimal-cost search is same as BFS except that we go one more step deeper (Remember that we perform goal test while we expand the node, not when it is generated?).
In the first step, all the successors of the root node are stored. Then, the least-cost node is removed, and its successors are added. Since we need the least cost of all the nodes explored, we need to store all the nodes explored until the goal node is found. Hence, the space complexity is also O(b^[1+(C*/e)]).

In terms of time and space complexity, we can see that uniform-cost search is worse than BFS. Hence, we must apply uniform-cost search only if the state costs are different, and path costs are not a non-decreasing function of depth. In other cases, we could use BFS.

Depth first Search (DFS)

DFS always expands the deepest node in the frontier of the search tree. The search proceeds to the deepest level of the search tree, which does not have any children yet (otherwise it is not the deepest node). The node is expanded, and then the successor (or one of the successors) is expanded, and this process is continued until the goal node is reached, or the node has no more successors. If the latter has occurred, the search backs-up to the previous node and explore its other successor, if any of them is still unexplored.

Figure 6: Depth-first search on a search tree

Coming to the properties of DFS:

DFS is not complete. In the problem of traveling from Arad to Bucharest, the search agent might get stuck in Arad — Sibiu — Arad loop forever. The only way to avoid loopy path in DFS is to remember all the nodes present in the path from the root to the current node. This could solve the problem of loopy paths, but still cannot avoid redundant paths.
DFS is non-optimal in nature. In the Figure 6, DFS will explore all the left nodes first, even if node C is the optimal solution. Suppose node K is also a solution (but a non-optimal one), DFS returns K as the optimal solution.
Time taken by DFS depends on the depth of the entire search tree (which could be infinite, if loopy paths are not eliminated). Even in the case of a search tree with finite depth, the time complexity will be O(b^m), where m is the maximum depth of the search tree, which could be much larger than d (the depth of the shallowest goal).
The main reason for the use of DFS over BFS in many areas of AI (like constraint satisfaction and logic programming) is because of its very less space consumption. In DFS, we need to store only the nodes which are present in the path from the root to the current node and their unexplored successors. For state space with branching factor b and maximum depth m, DFS has space complexity of O(bm), a much better improvement over that of BFS.

For example, using same assumptions from Figure 3, DFS would require only about 160kB instead of 10EB for d = 16, about 7 trillion times lesser space.

The pseudo-code for DFS is not listed here. This is because we will be discussing the variants of DFS in the next blog along with the pseudo-code.

Summary

In this blog, we have discussed the three popular uninformed search algorithms — BFS, uniform-cost search and DFS.

BFS, by exploring all the nodes at same depth before going to next depth, is complete, but faced the problem of non-optimality, and also has huge time and space requirements. The non-optimality of BFS s solved by the uniform-cost search where the least cost path was chosen at every step, and the huge space consumption is solved by DFS, where the deepest node is expanded first, keeping the nodes in the frontier to a minimum.

In the next blog, we will continue our discussion on uninformed search strategies, including depth-limited search, Iterative deepening DFS, and bidirectional search.

References

Stuart Russel and Peter Norvig, Artificial Intelligence: A Modern Approach, 4th edition