Red Black Trees

Borzoo Esmailloo
5 min readMar 23, 2020

--

Red black trees are a variant of self-balancing binary search trees that use colored nodes to keep the tree balanced after modifications.

The problem with binary search trees

Regular binary search trees have O(n) times for dictionary operations (insert, delete, search) in worst case. The worst case happens when nodes are inserted in increasing or decreasing order. In this case the tree will be similar to a linked list.

Although the dictionary operations has expected time complexity of O(lg n) on a randomly built binary search trees, we might still end up with skewed tree structures that degrade performance.

Self-balancing binary search trees

One way to mitigate this problem is to balance the tree after every modification. The binary search trees that have this property are called self-balancing binary search trees.

There are many variants of such trees such as:

  • AVL trees preserve their balance by not allowing left and right children of nodes to have heights that differ by more than 1.
  • Treaps mix binary search trees and binary heaps. Each node has a random priority. The balance is enforced by requiring that each child has a higher priority than it’s parent.

Red black trees

Red black trees are a variant of self-balancing binary search trees that use colored nodes to keep the tree balanced. One characteristic of red black trees is that the height of left and right sub-trees of any node will differ by at most a factor of 2.

The rules of balancing are:

  1. Every node is either red or black
  2. Root is black
  3. If a node is red, both of it’s children are black
  4. Every path from a node to a descendant has the same number of black nodes, this is called the Black Height of the node
  5. Every leaf (null) is black

Some of these rules might be broken after performing modifying operations on the tree, therefore insert and delete operations should restore these rules after they’re done.

The main tool in restoring the rules are left and right rotations. You can think of rotations as swapping the levels of a child node with it’s parent without violating the binary search tree properties.

One difference between regular binary search trees and red black trees is that the value NULL is represented by a special black colored node, called nil. Every empty pointer points to nil instead of being NULL.

Insert Operation

The insert operation has the following steps:

  1. Find the appropriate place to insert the new node
  2. Insert the node and color it red
  3. Fix the broken rules in bottom-up fashion

To fix the broken rules, a procedure is called as the last step of insert. This procedure will perform at most 2 rotations.

Alternatively, instead of the bottom-up approach outlined here we can use a top-down approach. The red black tree recoloring and rotations are performed while traversing down the tree so when the new red node is inserted, no rules are violated. Although more complex, the top-down approach is more efficient and can used more fine grained locking.

Delete Operation

Deleting a node is a little more complicated. To make it simpler to understand, let’s distinguish between two types of cases when deleting a node x:

  1. x has two children
  2. x has less than two children

The first case will cause two replacements, first x will be replaced with the minimum of it’s right sub-tree, y. Then y will be replaced with it’s right child, z, which might be nil. If y is black, some of the rules might be violated.

In the second case, x will be replaced by it’s left or right child, z, which again might be nil. If x is black, some of the rules might be violated.

Because a black node was removed from the sub-tree that is now rooted at x, the black heights of the descendants of x are decreased by one. To mitigate this problem, x is assigned an extra black. Hence if x was red, it’s now red and black and if it was black, it’s now double black.

Every node should be either black or red, we’re violating that rule now. To fix this problem, we have to move the extra black to another node in the tree without violating any other rules. To fix the broken rules, just like insert operation, a procedure will be called. After changing the colors of some nodes and perhaps some rotations, the tree is re-balanced. This procedure will perform at most 3 rotations.

As with insert operation, instead of the bottom-up approach outlined here we can use a top-down approach which has the same advantages over the bottom-up approach.

Time complexity of dictionary operations

The height of the tree will be O(lg n).

  1. Search: In the worst case, a path is traversed from root to a leaf. This will take O(lg n)
  2. Delete: In the worst case, the height of the tree is traversed twice, one for deleting a node and one for restoring the rules. Since each rotation will take O(1) and we perform at most 3 rotations, this operation will take O(lg n)
  3. Insert: In the worst case, the height of the tree is traversed twice, one for inserting a node and one for restoring the rules. At most 2 rotations are performed thus insert will take O(lg n)

Usage in real world

Red black trees are widely used because of the efficiency of their dictionary operations. Here are a few examples:

  • In Java, TreeMap and TreeSet are implemented using Red Black trees. They are also used in HashMap to store the elements with colliding hash keys instead of a linked list.
  • In most implementations of std::set and std::map in c++, Red Black trees are used.
  • In .Net framework, SortedDictionary and SortedSet use Red Black Trees.
  • Red Black trees are used extensively in Linux Kernel, a few examples include Completely Fair Scheduler, tracking of virtual memory areas (VMA) and more

Red black tree implementation

You can find an implementation of Red black tree in Scala programming language here. I’m a beginner Scala programmer so you might see some weird stuff in there!

--

--