Optimal Binary Search Tree

Devipriya Kudumala
6 min readApr 16, 2024

--

Introduction:

In computer science, an optimal binary search tree (Optimal BST), sometimes called a weight-balanced binary tree, is a binary search tree which provides the smallest possible search time (or expected search time) for a given sequence of accesses (or access probabilities).

Basic concepts of Binary Search Tree:

What is Binary Search Tree?

In Binary Search Tree for each and every node in binary tree on the left side of that node every element is lesser than that node and right side of that node is greater than that node.

Time Complexity for searching an element in BST is O(log n) (for Balanced BST)

The Worst case time complexity for searching an element in BST is O(n).

External Nodes : External Nodes are those nodes where your search of a particular element stops when it is an unsuccessful search.

Extended Binary Search Tree : Every Binary Search Tree that has an external nodes.

Extended Binary Search Tree

No. of External Nodes = No. of Internal Nodes + 1

Total Cost of the BST : It is the time complexity or amount of cost that we actually take to search for the different elements in the BST.

Problem Statement:

Let’s consider an example :

P1,P2,P3 are the probabilities of success and q0,q1,q2,q3 are the probabilities of failure.

a1,a2,a3 are the Internal Nodes and e0,e1,e2,e3 are the External Nodes.

Keys : 30,60,90

Pi: 0.2 ,0.1 ,0.1

qi:0.3,0.1,0.1,0.1

Total cost = summation of i=1 to n(Pi x (level(ai))) + summation of i=1 to n(qi x (level(ei))- 1)

Total cost = 0.2 x2 +0.1 x 1 + 0.1 x 2+ 0.3 x 2 +0.1 x 2 + 0.1 x 2 +0.1 x 2 = 1.9

While solving the Optimal Binary Search tree there is an issue of overlapping subproblems and optimal substructure that is why the method of dynamic programming is the best way to solve the Optimal BST.

Dynamic Programming Approach:
Introduction to the dynamic programming approach for constructing optimal BST.

The Structure of the optimal BST with the given nodes and given associated probabilities of the internal and external node use dynamic programming in order to reduce the time complexity.

Cost[i, j] = min{ summation of i=1 to n(Pi x (level(ai))) + summation of i=1 to n(qi x (level(ei))- 1)} or

Cost[i, j] = min of i<k ≤j {c[i,k-1] + c[k, j]}+w[i, j]

where w[i, j] is the weight of the sub tree ranging from i to j.

c[i, j] is the total cost of the sub tree/tree ranging from i to j.

w[i, j] =summation k=i+1 to j( (Pk +qk ) + qi)

w[i, j] = w[i, j-1]+Pj +qj

w[i, j] => qi when i = j

Algorithm:

function optimal_bst(keys, freq):
n = length(keys)
Create a 2D array cost[1..n][1..n] and initialize all values to 0
Create a 2D array root[1..n][1..n] and initialize all values to 0

for i = 0 to n-1:
cost[i][i] = freq[i]
root[i][i] = i

for l = 2 to n:
for i = 1 to n-l+1:
j = i + l — 1
cost[i][j] = INFINITY
for r = i to j:
c = sum(freq[i:j]) + cost[i][r-1] + cost[r+1][j]
if c < cost[i][j]:
cost[i][j] = c
root[i][j] = r

return cost[1][n]

Time Complexity Analysis :
1.To initialize the cost table with 0 and it has n² elements to fill .so it takes o(n²).
2. We fill the table using dynamic programming. For each subproblem size “n” we iterate through all possible subproblems of size “n”. For each subproblem we calculate the cost of constructing an optimal BST at each key in the given range. This calculation takes O(n) time. Since there are O(n²) subproblems and each subproblem takes O(n) time to calculate .The total time complexity is O(n³).
3.After both the steps the algorithm takes maximum time complexity that is o(n³).
Finally the complexity of Optimal Binary Search Tree is o(n³) where n is the number of keys. This is one of the best method to find the key in optimal way.

Example Illustration:

Let’s consider an example :

Keys : 20,40,60,80

Pi: 3,3,1,1

qi:2,3,1,1,1

c[0,1]=min 0<k≤ 1 {c[0,0] + c[1,1] } +w[0,1] (k=1) = min{0+0} +8 = 8

w[0,1] = q0 + (P1 +q1) = 2 + 3 + 3 = 8 and

the root is taken as min k value.

( By calculating the remaining in the same method using the formula we get the below table)

c[0,4] = 32/16 = 2 => Optimal BST total cost

The Structure of the Optimal BST :

Applications and Use Cases :

An Optimal Binary Search Tree (BST) is a data structure used for organizing keys in a sorted manner to enable efficient search, insertions, deletions, and other operations. The key characteristic of an optimal BST is that it minimizes the average cost of searching for a key. Here are some applications and use cases of optimal BSTs:

  1. Database Systems: Optimal BSTs are widely used in database systems to store and retrieve data efficiently. In databases, indexes are often implemented using BSTs to speed up search operations
  2. Compiler Design: Compilers often use symbol tables to store identifiers and their associated information such as data types and memory addresses. Optimal BSTs can be used to implement symbol tables, enabling fast lookup and updating of identifiers.
  3. Text Editors: Text editors often use data structures like BSTs to implement features such as autocomplete or spell check. Optimal BSTs can be employed to efficiently search through dictionaries or lists of words.
  4. Dynamic Programming: Optimal BSTs are also used in dynamic programming algorithms to solve various optimization problems, such as the matrix chain multiplication problem or the optimal binary search tree problem itself.
  5. Compression Algorithms: Optimal BSTs can be used in compression algorithms to efficiently represent and search through dictionaries or symbol tables, reducing the size of compressed data.

These are just a few examples of the many applications and use cases of optimal BSTs across various domains. Their efficiency in searching and organizing data makes them a valuable tool in algorithm design and optimization.

Conclusion:

An optimal BST is a binary search tree that minimizes the average search time for keys by arranging keys in a specific order. The primary objective is to minimize the expected search cost, which is the weighted sum of search costs for each key, where the weights correspond to the probability of accessing each key. The construction of an optimal BST is typically achieved using dynamic programming techniques. This involves breaking down the problem into smaller subproblems and solving them in a bottom-up manner.

Further Reading and Resources:

BOOK: ** Introduction to Algorithms** By, Thomas H. Cormen, Charles E. Leiserson , Ronald L. Rivest and Clifford Stein, PHI.

Github Links: https://github.com/Devipriya622/Optimal-Binary-Search-Tree.git

--

--