On Heaps

Caleb Winston
Index Zero
Published in
10 min readOct 3, 2018
A heap of data

Almost every data structure ever invented was designed for doing specific things really well. For example, lists are really good for storing stuff that’s ordered. Sets are really good for storing stuff where we don’t care about order. Similarly, heaps are really good at storing stuff where the only thing we care about is the one with the highest priority.

For example, let’s say we want to store data for a bunch of tasks on a to-do list.

Each task has a priority attached to it.A to-do list

And we have a priority attached to each task…

A to-do list with priorities

And we really want to be able to quickly get the task with highest priority at any moment… What data structure could we use?

As it turns out, there’s one data structure that is perfect for doing this — heaps.

What heaps do

Like sets, heaps don’t store data in a strict order. The only order they maintain is one element with the highest priority and all other elements with other priorities beneath it.

How heaps store data

And heaps really only have three main operations.

  • Add an element to the heap with a certain priority
  • Get the element with highest priority
  • Remove the element with highest priority

As you can see, none of these operations do anything with any of the elements with less than the highest priority.

So you might be wondering — what happens when we remove the highest element? and what happens to the current highest element when we add a new element with higher priority?

Well, removing the top element simply brings the next highest element to the top of the heap.

Removing element with highest priority from a heap

And adding a new element with highest priority will bury the previous element with highest priority.

Adding new element with highest priority to a heap

How to implement heaps (with trees)

So how can we structure our heaps? Well, a tree-like structure turns out to work pretty well.

A tree we can use to store a heaps elements

The above tree has a a couple properties that make it perfect for implementing a heap.

  • Every element node has a priority greater than each of its children nodes
  • Every row of the heap must be filled as much as possible from left to right

As long as we can maintain a tree with these two properties satisfied, we can implement an efficient heap that can have elements added to and removed from.

Let’s take a look at implementing such a tree by first looking at how we would implement addition to the heap.

How to implement addition to heaps (with trees)

The simplest way of doing this is to add an element all the way at the end of the bottom-most row of the tree.

Adding a new element with priority 9 to the end of the bottom-most row

As you can see, it certainly satisfies the property that our tree must be filled completely on each row from left to right. If we put our new element anywhere else, this property would no longer be true.

However, notice that the other property is not satisfied — the property that each element node must have a greater priority than all of its children nodes. The element node with priority 7 is not less than its new child with priority 9.

The simple solution to this problem is that whenever we add a new element at the end of the tree, we repeatedly swap the new element with its parent element until its parent element does indeed have a greater priority than the new element.

So in this case, we first check if the new element with priority 9 is less than its parent node with priority 7. It’s not, so we swap.

Swapping the elements with priorities 9 and 7

Next, we check if the new element with priority 9 is less than its parent node with priority 10. It is, so we’re done — we have added a new element and satisfied both properties of our tree.

So adding new elements to our heap requires us to do two things in our tree —

  1. Add a new node with the new element at the end of the bottom-most row
  2. Repeatedly swap the new element with its parent node as long as it has a greater priority than its parent

How to implement removal from heaps (with trees)

Heaps have another really important operation they need to be able to do — removing the element with highest priority.

There happens to be a really simple way of removing the top-most elements from trees while preserving both of our heap properties.

  1. Swap the top-most node with the node at the end of the bottom-most row
  2. Remove the node at the end of the bottom-most row
  3. Repeatedly swap the new top-most node with the child with the greatest priority as long as that child has a greater priority than the node we’re swapping.

Let’s look at an example.

A tree we want to remove the top-most element from

Let’s say we want to remove the top-most node with priority of 10. We can first swap it with the last node on the bottom-most level, the one with priority 7.

The same tree with the top-most element swapped with the bottom-most element

Our element with highest priority is now at the end at the bottom, so we can easily remove the node without losing the property that every row is filled as completely as possible.

The same tree with the element with highest priority removed

Now we have the element with highest priority removed. But as you can probably see, we have lost the other property that each node must have a higher priority than all of its children. The new node at the top of the tree no longer has a greater priority than all of its children.

We can solve this problem by repeatedly swapping the top-most element with its child with greatest priority as long as that child has a greater priority than the element we’re swapping.

So first we look at the children of the element with priority 7.

The children of the new top-most element

Well, the child with largest priority is the one with priority 9 and that is greater than the priority of the top-most element (7). So we swap.

The top-most element swapped with its child with greater priority

Now we have swapped the top-most element with one of its children, we take a look at its new set of children at this new position in the tree.

The children of the element at its new position in the tree

If we look at the children of the element now, we can see that none of them have a greater priority than the element. So the element is now in its correct place in the tree and both properties are satisfied.

Now lets see how we can implement this 3-step process for removal and the 2-step process for addition we looked at earlier with an array.

How to implement heaps (with arrays)

We’ve looked at how we could implement heaps fairly easily with trees. It so happens that we can implement these “heap trees” fairly easily with a simple array.

We can have an array that stores each element left to right and top to bottom in that order.

For example, let’s take a look at the following tree representation of a heap —

A tree representation of a heap

We can store the data for this tree in the following array —

An array to store the data of our tree representation

As you can see, we simply order the elements by how they appeared in our tree from left to right, top to bottom.

Let’s see how we can realistically implement addition and removal from heaps with arrays.

How to implement addition to heaps (with arrays)

If we want to add a new element to a heap, the first thing we do is add the new element to the end.

A new element added at the end with priority 8

But the end might not be the correct spot for this new element with priority 8 to be. To get the element to its correct spot, we should first look at its parents.

If we know the index we added the new element at (in this case it’s 6), we can quite easily find its parent element.

Finding the parent element of the new element we just added

If we take the index of the new element, subtract 1, divide by the number of children of each node (3 in this case), and truncate the answer (remove everything after the decimal point), we get the index of the parent.

All we have to do then is compare the index of the new element we just added and its parent.

Comparing the priority of the new element we just added with the priority of its parent element

Since our new element has a priority of 8 while its parent has a priority of only 7, we swap them.

The new element swapped with its parent with smaller priority

We now look at the new element’s new parent.

The parent of the new element at its new position

But now we can see that its parent actually does have a greater priority as it should. So we don’t need to swap and we leave our new element as it is.

As you can see adding new element to our “heap array” is really just the same two-step process we had discussed earlier —

  1. Add a new node with the new element at the end of the bottom-most row (now that’s just the end of the array)
  2. Repeatedly swap the new element with its parent node (the element at position (n-1)/3 where n is the current index of new element) as long as it has a greater priority than its parent.

How to implement removal from heaps (with arrays)

Removing the top element from our heap is also pretty straightforward with arrays. The first thing we do is swap it with the last element.

The top-most element swapped with the last element

Now we can safely remove the element from the end of the array.

The (former) top-most element removed

But now we have a bit of a problem. We’ve safely removed the element with priority 9, but remember the element we swapped it with? the one with priority 7? That’s now in the wrong place.

This can be fixed by repeatedly swapping the element with its greatest priority child as long as the child has a greater priority than the element.

We first find its children using the index of the top-most element (initially it’s 0). We simply multiply the current index by the number of children of each node (in our case, that’s 3), then add 1, 2, or 3 to get the indices of the element’s children.

The children of the top-most element

If we look at the children, we can find the one with highest priority.

The priorities of the children of the top-most element

The one with priority 8 is the largest and its priority is also larger than the current top-most element (7), so we need to swap.

The top-most element swapped with its largest child

We can now look at the children of the element in its new position.

The children of the top-most element in its new position

Since none of its children in its new position have a larger priority than it, we can stop.

Once again, we can see that the process of removing an element from a heap is not really any different from the process we had discussed earlier —

  1. Swap the top-most node with the node at the end of the bottom-most row
  2. Remove the node at the end of the bottom-most row
  3. Repeatedly swap the new top-most node with the child with the greatest priority as long as that child has a greater priority than the node we’re swapping.

So there you have it! 👏 Heaps are a simple data structure that are really good at storing data where the only thing you care about is the element with highest priority. Now you know what they are and now you know how to implement them! 🙂

If you enjoyed reading this article, you can find more articles by me and projects I’m working on (such as this Markov chain compiler) at calebwin.github.io.

--

--

Caleb Winston
Index Zero

Building efficient programming systems for cloud and edge computing (calebwin.github.io)