A Visual Introduction to Treap Data Structure (Part I: The Basics)

Published in

Carpanese's Blog

10 min readFeb 1, 2020

A treap is a binary tree that maintains simultaneously the property of binary search tree (BST) and heap.

This is the first article of a series that aims to explain what a treap is, how to implement it and when to use it. It’s an advanced data structure with some really beautiful properties.

The treap data structure

Formally, a treap (tree + heap) is a binary tree whose nodes contain two values, a key and a priority, such that the key keeps the BST property and the priority is a random value that keeps the heap property (it doesn’t matter if it’s a max-heap or min-heap). See figure 1.

**Figure 1:** A example of a treap. The rose numbers are the keys (BST values) and the blue numbers are priorities given randomly (heap values).

In the next article, we’ll see an efficient implementation and the true power of treaps. For teaching purposes, however, let’s start with the most basic way of thinking about them, which is just another self-balancing binary search tree.

Code 1: A simple implementation of treap.

Random Binary Search Trees

Let’s talk about one of the most beautiful interpretations of treaps.

Suppose we have, a priori, all the keys that will be inserted in a BST (a simple one, not a treap). We know that if the keys are sorted, our BST will degenerate into a list. What should we do to minimize the height of the BST if we could change the order of the insertions?

**Figure 2a:** It’s easy to see that the order of insertions in the BST will affect the height of the tree. In the left figure (worst case scenario), we inserted elements from 1 to 10 sequentially. In the right figure (best case scenario) we inserted the elements 4, 8, 2, 3, 6, 9, 5, 7, 1, 10 in that order.

In fact, it can be prove that a simple shuffle in the list of keys is sufficient to made the tree balanced (see CLRS, 3rd edition, theorem 12.4). The intuition is that the more sorted the input, the higher will be tree. The shuffle breaks the order, making the height of the tree almost minimum. In practice, we can assume that the h = lg(n).

**Figure 2b:** After a random shuffle, the new input list was 7, 6, 8, 2, 1, 9, 4, 10, 5, 3. Note that that’s not the smaller possible height, but it’s way better than the worst case scenario.

That’s an interesting fact, but it’s also almost useless because we have to read all insertions first. The treap, however, is a data structure that can help us shuffle the keys in a smart way after each insertion.

The idea is that we can insert all the keys in the worst possible way (which means in a sorted way), but the random priority will make sure to shuffle it “in real time”.

**Figure 3:** A treap with keys from 1 to 10. Try to figure out which key was inserted first and which key was inserted last. Was 8 inserted before 9? Do you have any clue about the order of the insertions? If you ignore the priorities, will you be able to say which key was inserted first?

We don’t know the order of the insertions in that treap (we don’t even know how to insert elements in a treap yet), but we can say for sure that the treap simulates one specific order.

Take a look at the priorities of each node. Let’s sort the vertices based on their priorities.

Figure 4: The nodes of the treap sorted based on their priority values (in blue).

The insertion order a treap simulates is given by the priority of its nodes. Note that, in our case, we don’t know the “real” insertion order, but the treap made a real time shuffle to 7, 6, 8, 2, 1, 9, 4, 10, 5 and 3.

In fact, the only difference between figures 2b and 3 is that the first one we shuffle a already known input, and the last one we shuffle the input in real time using random priorities.

Insertion

We already know that there could be more than one BST for a given set of nodes (see figure 2a). In fact, there’s two simple operations that allows us to modify the tree and keep the BST property, the right and left rotations.

**Figure 5:** The right and left rotations explained without words. It’s easy to see that the new tree isn’t just an isomorphic variant (the node now is directly connected to a green node).

Both operations are common in self-balancing BST, like AVL and red black trees. Each data structure uses different strategies to identify the best moment and the best place to rotate the tree. We’ll, on the other hand, use rotations in a different way.

To insert a node on a treap, we’ll, in a first moment, ignore the priority and insert it like a simple BST.

**Figure 6:** The first step to insert a node in a treap is to ignore the priorities and insert it like a BST. In this example, we inserted the node (7, 12).

Even though BST property is being preserved, the heap property is not. We should move the new node (7, 12) up, but if we do that the BST property will break. How can we fix that?

We’ll move it up using the rotations! The steps are described in figure 7.

**Figure 7a:** The node (9, 4) is the father of (7, 12), but it doesn’t have a higher priority. To fix that we must make a right rotation.

**Figure 7b:** The node (10, 5) is the father of (7, 12), but it doesn’t have a higher priority. To fix that we must make another right rotation.

**Figure 7c:** The node (5, 9) is the father of (7, 12), but it doesn’t have a higher priority. To fix that we must make another rotation, but a left one this time.

**Figure 7d:** The treap after all rotations.

In the first step, we go down from the root to the leaf. In the second step, we go up from a leaf to the root at maximum using O(1) rotations. It means that the insertion time complexity is O(2h) = O(log(n)).

We can understand a treap insertion as a BST insertion (considering only the key) followed by a heap insertion (start as a leaf and go up to the tree until you fix the heap property), which is another way to thing about the O(log(n)) time complexity.

Let’s think a little more about the second step, when we start at a leaf and go up to the tree. The only way to go from the leaf to the root is if the new priority has the highest value. In other words, if you are generating random numbers using an uniform distribution, the new value must be the highest from a set of |T| (number of nodes of the tree), witch will happen with a probability of 1/|T| (try to prove that). It means that the rotations occur often near the leaves.

Code 2: How to insert a key into a treap. You might want to see figure 5 to understand the rotation functions.

Offline Building

It’s easy to see that is always possible to insert a node in a treap. It means that to build a treap given a set of n nodes we just have to make n insertions.

If we have all the keys sorted, we can build the treap in O(n) time complexity instead. The idea is to split the array into two parts, from 0 to m-1 and from m+1 from n-1, where n is the size of the array and m is n/2.

**Figure 8:** The idea is to select the median element of the array to be the root node and then apply the same rule to the left and right partition.

Now we apply the same rule recursively for the left child and the subarray [0, m-1] and the right child and the subarray [m+1, n-1].

**Figure 9:** The final treap. Now we just have to assign any random values to each element/node and them heapify the priorities only.

Note that we choose m = n/2 to minimize the height. It means that we don’t have to worry about the priorities. We can assign any priority to any node and just heapify them in linear time.

Code 3: Implementation of offline build.

If it’s the first time you see this construction, it’s not easy to understand why the heapify function is linear. You can check the formal proof here.

Deletion

Given a key, we can’t just find its node and remove it. Pulling out internal nodes splits the tree and we don’t want it. The solution is to move the node down to the tree and remove it only when it becomes a leaf.

The question now is what path do we choose to go down? That node may have two children, which one should we select?

To maintain the heap property, it’s easy to see that we have to rotate the tree from the children with higher priority (see image 10).

**Figure 10a:** We want to lower the node (7, 12), but if we rotate it with the children with lowest priority, we’ll break the heap property of the treap because the node (10, 5) will be above the node (5, 9).

**Figure 10b:** If we choose to rotate with the children with highest priority, on the other hand, we’ll maintain the heap property. Note that there is no problem in nothe (5, 9) being above the node (10, 5).

The idea is simple: After the rotation, one of the children will be the father of the other one, so we have to “promote” the children with highest priority, otherwise we’ll break the heap property.

Code 4: Function to delete a given key from a treap.

Split

The power of treaps comes from their unique operations, which are impossible to implement in a efficient way using others BSTs. This article will cover one of those operations, the split.

We want to split the original treap into two parts L and R such that both L and R are treaps as well, all nodes of L are smaller or equal to a given value X and all nodes of R are greater than X.

**Figure 11a:** A treap to be splitted at 5.

**Figure 11b:** The L and R treaps after the split operation at 5.

It’s real simple to implement it. Just insert a new node with key X and infinity priority. The heap property will make that artificial node become the root of the treap, and the BST property will guarantee that its left and right child satisfy the conditions we want.

**Figure 12:** In fact, the split operation is just a trick if we already know how to insert.

Note that if X is already a key in the treap, by the definition of BST, the original node will be the left child of the new artificial node.

Code 5: Split function with the changes necessary.

Search

Nothing really special here. It’s exactly the same algorithm used in any other BST.

Code 6: Search function.

Treaps and cartesian trees

If you search for “treap data structures” on Google, you’ll probably find some results related to cartesian trees. What are they? Is the treap a cartesian tree? First things first, let’s define it.

A cartesian tree can be define recursivelly as follows: Each one of its nodes can be written in the form (x, y) for x, y ∊ ℝ such that x_l < x < x_r and y_l, y_r < y for the left child (x_l, y_l) and right child (x_r, y_r). Geometrically, it means that for each node, if we translate the origin to (x, y) the right child will be in the first quadrant and the left child will be in the second quadrant (see figure 13).

**Figure 13:** An example of a cartesian tree.

Cartesian trees are really useful in some contexts, for example, to prove the equivalence between the range minimum query and the lowest common ancestor (I will write about that someday). It appears in that proof in a slightly different way. Without going into details, we’ll select the minimum value of the array to become the root of the tree, let’s say it is found at position i. The left child will be the minimum value in the interval [0, i-1] and the right child will be the minimum value in the interval [i+1, n-1]. We’ll continue this process recursively (see figure 14).

**Figure 14:** A cartesian tree built from an array. The y value of that cartesian tree is its respective position relative to the array. For example, the root node may be written as (-5, 5) and its children as (-1, 0) and (2, 6).

Now that we understood what a cartesian tree is, what’s its relation to treaps?

Note that the x value of cartesian tree’s node maintains the BST property, while the y value maintains the heap property. We just said it with different words, but it’s the same definition. If fact, figure 10 and figure 7d are different visualizations of the same tree (see figure 15)!

**Figure 15:** Figures 7d and 14 side by side.

It’s not totally right to say that a cartesian tree is a treap because treaps have random values for y. On the other hand, it’s not totally wrong to say that both data structures are the same thing because randomization is not essential and is only used to guarantee a logarithmic height. I like to think about them as different, but equivalent data structures. Something similar to the difference between points and vectors.

In the next episode

This article introduced a first view of treaps. We saw how to use them as simple binary search trees. This way of using it is far from explore the full power of this data structure.

In the next part will focus on split and merge operations and how to use them to write a simple and efficient implementation of treaps. We’ll also see more interesting problems, like how to reverse an subarray in logarithmic time complexity, and how to add/remove values in any position of the array, also in logarithmic time.

Exercises

1. If we remove a key from treap and reinsert it immediately afterwards, the treap structure will not change. Prove it or give a counterexample.

2. Given a set of nodes of the form (key, priority), there’s only one possible treap associated with those nodes (excluding all isomorphic variations). Prove it or give a counterexample.

3. Write a function to merge two treaps L, R such that all keys of the left tree are smaller than all keys of the right tree. You can understand the merge operation as the inverse of a split.

Bibliography

This article would not be possible without the effort made by previous contributions.

Randomized Search Trees

The original article written in 1989 by Cecilia R. Aragon and Raimund G. Seidel. You can read it here.

Open Data Structures

The article wrote by Open Data Structures gives good insights about the treap properties and some formal explanations as well. Its implementation is not the cleanest one, however.

Techie Delight

The implementation I used in this article is strongly based on the Techie Delight’s one. It’s really simple and easy to understand. The page has some really cool problems, very similar to the Geeks for Geeks of a few years ago.

E-Maxx (CP Algorithms)

E-Maxx is a wide-known site about competitive programming and it has an rich page about treaps. My next article is strongly based on that material, but I used some of their concepts here as well, like the off-line building in linear time.