How to analyze the unbalancing scenarios during red-black tree insertions?

Cifer
Programming Notes
Published in
5 min readJun 28, 2020

How red-black trees guarantee balance?

Firstly, nodes in a red-black tree can either be red color or black color. Besides that, two more constraints help guarantee the balancing of a red-black tree.

  1. every path from the root to NULL contains the same number of black nodes.
  2. if a node is red, its children must be black

The first constraint makes an attempt for balancing but it’s surely not enough, for a tree with one black node on the left and one black node and a bunch of red nodes on the right meets this constraint well, but certainly breaks the balancing. So there comes the second constraint which doesn’t allow the consecutive occurring of red nodes, this ensures that a subtree won’t be 2 times deeper than the other.

How do we analyze the unbalancing scenarios and fix them?

It’s obviously insertions and deletions of nodes may cause unbalancing. When we are thinking about how insertions and deletions would cause unbalances we should keep in mind that the current tree must be already balanced, i.e. meets the above two constraints. This is the precondition.

Now let’s analyze insertion and deletion respectively.

Insertion

Based on the precondition, we can easily infer that when inserting a new node, the node must be red, for inserting a new black node will certainly break the 1st constraint.

Known the inserting node must be a red node, and since the 2nd constraint doesn’t allow two consecutive red nodes, we will naturally think about the color of the parent of the inserting node. If it is black, the insertion can happily end up here still meeting the constraints. While if the parent is also red, the new inserting red node will certainly break the 2nd constraint, let’s dive into it and see how to fix it.

Now the scenario is that we want to insert a new red node but its parent is already red, continue inserting it will certainly break the 2nd constraint so we need to figure out how to avoid breaking it. A straight forward way occurs to us may be flipping the parent’s color to black, but this will probably break the 1st constraint as it would make the parent’s subtree got one more black nodes than the uncle’s sibling subtree if the uncle is red, then this can be solved by flipping the color of the uncle as well, otherwise more operations need to be taken to fix it. Anyway now we have awared that to fix the unbalancing the uncle needs to be considered as well, there are various cases that can be foreseen so we’d better list and analyze them one by one.

Thinking: How to fix the various unbalancing cases?

We should have known that in AVL tree the way to fix the unbalanced tree is rotation, in red-black tree, we have another method: color flip. But how do we decide whether color flip or rotation is used?

The most important thing is that after the adjustings, the black nodes from root to NULL in the subtree should be the same with before, otherwise it will certainly break the 2nd constraint. Based on that we can choose either color flip or rotation but I’d prefer color flipping first since it’s low effort, if it doesn’t fix, take rotation then.

Let’s start with the first simplest case.

Case 1, Uncle does not exist at all

I have drawn two subcases called “Zig-Zig” and “Zig-Zag” respectively, actually, there are another two symmetric subcases and the fixing approach is the same.

This is a case that we can’t just flip the color of the parent node(noted as P) to fix because if we do so the new node’s grandparent(G) will get different black nodes on it’s left and right subtree. So we need to consider rotation as well. It’s not hard to figure out the following procedures for these two subcases respectively:

Case 2, Uncle is red

Similarly, there are “Zig-Zig” and “Zig-Zag” cases and another two symmetric cases(not being drawn).

It looks we don’t need to do a rotation for this case at first, just flips the color for G, P and S. The trouble comes when G's parent is also red, in that case, we got two consecutive red nodes, we need to percolate the resolving process until the whole tree meets the constraints again, during this process, there are further subcases emerging out, let's take the "Zig-Zig" case for example and do it step by step.

The first step is to do a color flip on G, P and U.

After that, we will get two further subcases shown below, the GS is the G's sibling subtree, it's doesn't matter how this subtree looks like but it's easy and important to know that this subtree must contain one black node, and only contains one black node.

We have known that if G's parent is black, everything will be ok, so we can just skip it and see how to handle it if G's parent is also red.

The “Zig-Zag” case can also be inferred in the same way.

That’s all for insertion!

You may be surprised that only two cases in the insertion scenario, but it’s true. I have also read books and articles that list more than two cases to analyze the insertion scenario. But I’m thinking in a different way to get more compact cases.

You may be wondering if I left out a “Case 3, Uncle is black”, I didn’t. If the new inserting node’s uncle is black, then the initial tree would be an invalid red-black tree, checking the following graph. If the new inserting node has a black uncle: U, then P must also have two black children, S is one child, the other one surely doesn't exist or we don't need to insert it. So this case doesn't exist.

--

--