# Binary Search Trees

## Sounds familiar? but we’re going to take a closer look.

A binary search tree (BST) is a binary tree in a symmetric order, where each node has a key (and an associated value).

A binary tree means it consists of nodes, and each node has at most two children(left and right child).

While being in a symmetric order is that every node is larger than all the nodes in the left subtree, and smaller than the keys in the right subtree.

It’s used to implement the symbol tables (associative arrays), and it’s the key that would be used to sort the nodes accordingly.

# Binary Search Trees

We are going to implement the symbol tables using binary search tree. So, The main operations of a binary search tree would be the same; get, put, delete, contains & isEmpty.

We’ll start by the binary search tree (BST) class. It has a reference variable called “root” points to the root node.

The Key and Value type is any generic type. And, for keys, they are Comparable. So, we can use compareTo() method, to compare whether one key is less than the other, or greater than, or equal.

`public class BST<Key extends Comparable<Key>, Value> {    private Node root = null;             // root of BST}`

Then, we have an inner class called “Node” to represent each node within the tree. Each node has a key, a value, a reference to the left and right child node, and an integer “count” the number of nodes in the subtree rooted at the node. The number of nodes in the subtree rooted at every node — algs4.cs.princeton.edu
`private class Node {    private Key key;           // sorted by key    private Value val;         // associated data    private Node left, right;  // left and right subtrees    private int count;         // number of nodes in subtree    public Node(Key key, Value val, int count) {        this.key = key;        this.val = val;        this.count = count;    }}`

It searches for a node using it’s key recursively, and returns it’s value.

It starts from the root node, if it’s less than, then go left, if it’s greater than, go right, and if it’s equal, we are done; search hit. On the other hand, If the tree is empty, or we reached null link, we have a search miss.

`// searches for a node with a given key, and returns associated val.public Value get(Key key) {    return get(root, key);}// A helper method to search for a node given a starting Nodeprivate Value get(Node current, Key key) {    if (current == null) return null;                // search miss        int cmp = key.compareTo(current.key);    if      (cmp < 0) return get(current.left, key);   // go left    else if (cmp > 0) return get(current.right, key);  // go right    else              return current.val;           // search hit} `

From now on, during our implementation for symbol table using binary search tree, It might be the case when key is null, at this point, you may throw an exception.

Insert a node of a key, and a value in the tree. It searches the tree recursively until it finds a null link, and all that we need to do is replace that link with a new node.

If the key already exists, it updates the value (avoid duplicates). It removes the node from the tree if the value is null.

`public void put(Key key, Value val) {    if (val == null) {     // remove key from table if value is null        delete(key);        return;    }    root = put(root, key, val);}// A helper method to insert a node given a starting Nodeprivate Node put(Node current, Key key, Value val) {    if (current == null) return new Node(key, val, 1);        // if key exists, overrides the old value (avoid duplicates)    int cmp = key.compareTo(current.key);    if      (cmp < 0) current.left  = put(current.left,  key, val);    else if (cmp > 0) current.right = put(current.right, key, val);    else              current.val   = val;         // update number of nodes rooted at Node current    current.count = 1 + size(current.left) + size(current.right);    return current;}`

We’ll skip it for now, but, we’ll come back to it later (see below).

`// Is the given key exists?public boolean contains(Key key) { return get(key) != null; }`

When the tree is empty, the root node will be null, and hence number of nodes rooted at the root node is 0; no nodes in the tree.

`// Is the table empty?public boolean isEmpty() { return size(root) == 0; }// A helper method to get number of nodes rooted at Node currentprivate int size(Node current) {    if (current == null) return 0;    else                 return current.count;}`

The cost is the number of compares, which equals to the depth of a node in the tree + 1. The best case for searching is O(1); the root maybe the answer.

The running times of get and put operations equals to the number of compares, which equals to to the depth of a node in the tree + 1.

The depth of the tree depends on the shapes of the trees, which, in turn, depends on the order in which keys are inserted.

• The worst case when the keys are inserted in descending order; max tree depth is O(N).
• The best case when the tree is perfectly balanced; max tree depth is O(LogN).

If keys are inserted in randomly order, the number of compares in search or insert operation is ~ (2 LogN) on average case, while the worst case is about ~ (4.311 LogN) (proposed by Reed, 2003). There is a small chance for the height to be ~ (N) in the worst case if the keys are inserted in random order.

Unlike Binary Heaps, where all levels are full (almost except the last one).

Balanced binary tree means for every (non leaf) node, the left and right subtrees’ heights differ by at most one, while perfectly balanced means all leaf nodes have the same depth or same level; every path from the root to the null link has the same length.

# Ordered Operations

An important reason for why binary search trees are widely used is that they allow us to keep the keys in order.

So, using the binary search tree implementation for symbol tables, where keys are Comparable, we can support extra operations like: min, max, floor, ceiling, rank, select, …etc.

We keep moving to the left from the root to find the minimum, and to the right for the maximum.

`// get the smallest keypublic Key min() {    if (isEmpty()) throw new NoSuchElementException("...");    return min(root).key;} // A helper method to get the smallest node starting from Node curreprivate Node min(Node current) {     if (current.left == null) return current;     else                      return min(current.left); }// get the largest keypublic Key max() {    if (isEmpty()) throw new NoSuchElementException("...");    return max(root).key;} // A helper method to get the largest node starting from Node currenprivate Node max(Node current) {    if (current.right == null) return current;     else                       return max(current.right); }`

For finding the floor, If current node is greater than key k, keep going to left. If current node is smaller than k, go right. Whenever we go right, the current node is the floor, until we go right again.

`public Key floor(Key key) {    if (isEmpty()) throw new NoSuchElementException("...");    Node x = floor(root, key);    if (x == null) return null;   // if empty, return null    else           return x.key;} // A helper method to get the floor of a key starting from Node currprivate Node floor(Node current, Key key) {    if (current == null) return null;        int cmp = key.compareTo(current.key);    if (cmp == 0) return current;    if (cmp <  0) return floor(current.left, key);        Node t = floor(current.right, key);     if (t != null) return t;    else           return current;}`

Finding the ceiling is similar, just interchange right and left.

`public Key ceiling(Key key) {    if (isEmpty()) throw new NoSuchElementException("...");    Node x = ceiling(root, key);    if (x == null) return null;   // if empty, return null    else return x.key;}// A helper method to get the ceiling of a key starting from Node cuprivate Node ceiling(Node current, Key key) {    if (current == null) return null;        int cmp = key.compareTo(current.key);    if (cmp == 0) return current;    if (cmp >  0) return ceiling(current.right, key);        Node t = ceiling(current.left, key);     if (t != null) return t;    else           return current;}`

For rank, suppose that we want the rank of key k …

If current node is greater than k, keep going to left; It means the key we are searching for is before(less than) the current key.

If current node is smaller than k, go right; It means the key we are searching for is after(greater than) the current key, and so, we need to count the current node, and all of the nodes on it’s left subtree.

Remember, rank of key k the number of keys that are less than k. In other words, it returns the index of k in a sorted array, or an ordered tree.

`// get the number of keys in the subtree less than given key.public int rank(Key key) {    return rank(root, key);} // A helper method to get the rank of a key starting from Node curreprivate int rank(Node current, Key key) {    if (current == null) return 0;        int cmp = key.compareTo(current.key);     if      (cmp < 0) return rank(current.left, key);     else if (cmp > 0)           return 1 + size(current.left) + rank(current.right, key);     else              return size(current.left); }`

For select, suppose that we want the key at rank k(at index k in a BST)

If the number of keys t in the left subtree is larger than k, we go left, and if t is smaller than k, we go right, and update k = k-t-1(subtract the current node, and all of the nodes on it’s left subtree).

`// return the key at a given rank k; return kth smallest keypublic Key select(int k) {    if (k < 0 || k >= size(root)) {  // same as when the key is null        throw new IllegalArgumentException("...");    }    Node x = select(root, k);    return x.key;}// A helper method to return key of rank k starting from Node currenprivate Node select(Node current, int k) {    if (current == null) return null;        int t = size(current.left);     if      (t > k) return select(current.left,  k);     else if (t < k) return select(current.right, k-t-1);     else            return current; }`

Finding how many keys fall within a given range [lo, hi] can be done by subtracting rank of lower key lo from rank of higher one hi (adding 1 if the higher key exists in the tree).

`public int size(Key lo, Key hi) {    if (lo.compareTo(hi) > 0) return 0;    if (contains(hi))         return rank(hi) - rank(lo) + 1;    else                      return rank(hi) - rank(lo);}`

While finding which keys fall in a given range can be done by traversing the tree in order; left subtree, current node, then the right subtree recursively.

`public Iterable<Key> keys(Key lo, Key hi) {    // create an Iterable object    Queue<Key> queue = new Queue<Key>();        // fill the Iterable object(queue) with keys in range[lo, hi]    keys(root, queue, lo, hi);        // return the Iterable object so we can iterate over the keys    return queue; }// A helper method to fill the Iterable object(queue) with the keysprivate void keys(Node current, Queue<Key> queue, Key lo, Key hi) {    if(current == null) return;        int cmplo = lo.compareTo(current.key);     int cmphi = hi.compareTo(current.key);    // if lo is less than current node, go left    if (cmplo < 0) keys(current.left, queue, lo, hi);    // if current node within the range[lo, hi], add to the queue    if (cmplo <= 0 && cmphi >= 0) queue.enqueue(current.key);     // if hi is greater than current node, go right    if (cmphi > 0) keys(current.right, queue, lo, hi);}// if no given range is specified, then iterate over all the keyspublic Iterable<Key> keys() { return keys(min(), max()); }`

All the operations for the Binary Search Tree takes h(except keys() method), where h is the height(depth) of the tree. The height of the tree is proportional to (LogN), if the keys are inserted in random order.

The keys() method always take O(N) in-spite of the height.

# Deletion in BSTs

We skipped the delete method in our implementation (see above). This is because the delete operation needs a careful look.

To remove a node with a given key, set it’s value to null, and leave the key. This node becomes useless, it’s just there to guide the search through the tree using it’s key.

This approach will cost ~ (2 * LogN) per insert, search and delete operation if keys are inserted in random order (where N is the number of nodes including deleted nodes in the BST), but it’s inconvenient because you’ll have large number of useless nodes (memory overload).

For delete the minimum, we go left until finding a node that that has a null left link and then replace the link to that node by its right link.

For delete the maximum, it works similar, just interchange right and left.

The deleted node is then becomes available for the garbage collector.

`// delete the smallest nodepublic void deleteMin() {    if (isEmpty()) return;    root = deleteMin(root);}// A helper method to delete the min node starting from Node currentprivate Node deleteMin(Node current) {    // go to left until finding a node where left is null    // once, we found it, return it's right node    if (current.left == null) return current.right;        // assign the left link of current node    current.left = deleteMin(current.left);        // update the size of each node as you go up    current.count = size(current.left) + size(current.right) + 1;    return current;}`

We can proceed in a similar manner to delete any node that has one child (or no children), but what can we do to delete a node that has two children? The solution is by replacing the node with its successor.

This replacement preserves order in the tree because there are no keys between the node’s key we want to delete and it’s successor’s key.

The successor node of node x is the leftmost node (the node with the smallest key) in the right subtree of x.

`// delete a node given it's keypublic void delete(Key key) {    root = delete(root, key);}// A helper method to delete a node starting from Node currentprivate Node delete(Node current, Key key) {    if (current == null) return null;        // search for the node you want to delete    int cmp = key.compareTo(current.key);    if      (cmp < 0) current.left  = delete(current.left,  key);    else if (cmp > 0) current.right = delete(current.right, key);    else {          // once you found it        // If no right child, return the left child        if (current.right == null) return current.left;        // If no left(but right child), return the right child.        if (current.left  == null) return current.right;                // Otherwise; current node has two children        // 1. Save a link to the node to be deleted in t        Node t = current;        // 2. Set current to point to its successor min(t.right).        current = min(t.right);        // 3. Set the right link of current to deleteMin(t.right);        // the link to the tree containing all the keys that are        // larger than current.key after the deletion.        current.right = deleteMin(t.right);           // 4. Set the left link of current(which was null) to t.left        // (all the keys that are less than both the deleted key and        // its successor).        current.left = t.left;    }     // update the size of each node as you go up    current.count = size(current.left) + size(current.right) + 1;    return current;}`

The expected height that results from a sequence of random insertion and Hibbard deletions results in un-balanced tree. The height of the tree becomes sqrt(N) instead of (LogN).

# Summary: Symbol Table Implementations

In BST, If the keys are in random order, It ensures a probabilistic guarantee for search and insert operations of (2 LogN) in average case, and (4.311 LogN) in the worst case.

In BST, The delete operations takes O(N) in the worst case, and O(sqrt(N)) in average. If delete operations is allowed, then average case for search and insert is also O(sqrt(N)).

— Now, the question is, Can we guarantee (LogN) performance for all the operations in the worst case? Yes, Red-black BSTs can do that.

Written by

## OmarElGabry's Blog

#### This is my freedom area. Don't underestimate it. The devil is in the detail.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just \$5/month. Upgrade