Heap Data Structure Unveiled: Mastering its Concepts and Implementations in Python

Abhilashtellabiyyam
Jun 15, 2023


Welcome to our blog, where we explore the heap data structure in Python. Heaps are essential data structures used in various algorithmic problems, providing efficient prioritization and organization of elements. In this article, we unravel the mysteries of heaps, discussing their inner workings and demonstrating their implementation in Python code. By the end, you’ll have a solid understanding of heaps and the ability to leverage their power to optimize your algorithms and tackle real-world challenges. Let’s dive into the fascinating world of heaps together!

A heap is a special type of tree-like structure that follows a specific rule. Think of it as a collection of values arranged in a particular way. In a heap, each value has a relationship with its parent and child nodes.

For example, in a max heap, every parent value is greater than or equal to its children. In a min heap, it’s the opposite, where each parent value is less than or equal to its children.

This arrangement allows us to easily find the highest or lowest value in the heap, which is located at the top.

To work properly, a heap needs to meet certain conditions.

In a max heap, every parent value must be larger than or equal to its children.

In a min heap, it’s the opposite, where every parent value must be smaller than or equal to its children.

Another important thing is that the heap should be a complete binary tree, meaning it’s filled from the left side, and there are no “holes” except, possibly, on the last level.

Following these conditions ensures that we can add elements, remove the top value, or update the heap efficiently.

As we venture further, let’s take a closer look at the two distinct representations of heaps: the linked list representation and the array representation. Exploring these representations will provide valuable insights into the benefits and considerations associated with each method.

Linked List Representation:

Each node in the heap is represented using a class with a value and references to its left and right child nodes. It offers flexibility for dynamic resizing but may require more memory due to node references.

Array Representation:

The heap elements are stored in a one-dimensional array, with parent-child relationships maintained using array indices. With 1-based indexing, a node at index ‘i’ has its left child at index ‘2i’, its right child at index ‘2i+1’, and its parent at index ‘i/2’ (rounded down). With 0-based indexing, as in Python lists and the code in this article, the left child is at index ‘2i+1’, the right child at index ‘2i+2’, and the parent at index ‘(i-1)//2’. This representation provides memory efficiency and efficient random access. However, resizing may be necessary when the heap grows beyond the initial capacity.

In Python, the array representation of a heap does not have the limitation of a fixed size. Python’s dynamic arrays automatically resize as needed, eliminating the need for manual array resizing. This allows for seamless growth of the heap without worrying about exceeding the initial capacity.
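
As a minimal sketch of the index arithmetic for a 0-based Python list, the helpers below compute parent and child positions and check the max heap property. The names `parent`, `left_child`, `right_child` and `is_max_heap` are illustrative and do not appear in the rest of this article.

```python
def parent(i):
    # index of the parent of node i in a 0-based list (not meaningful for the root)
    return (i - 1) // 2

def left_child(i):
    return 2 * i + 1

def right_child(i):
    return 2 * i + 2

def is_max_heap(arr):
    # every non-root element must be less than or equal to its parent
    return all(arr[parent(i)] >= arr[i] for i in range(1, len(arr)))

heap = [10, 8, 9, 5, 7, 6]
print(left_child(0), right_child(0))  # 1 2
print(parent(4))                      # 1
print(is_max_heap(heap))              # True
print(is_max_heap([5, 6]))            # False
```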

[Figure: heap representation in array and linked list]

The choice between the two representations depends on specific requirements, such as dynamic resizing needs or efficient memory usage.

Building a heap can be accomplished using two different approaches, each with its own charm.

The first method involves taking each node, one after another, and gradually transforming it into a heap. Starting from the first node, we compare its value with its parent and swap them if necessary, ensuring that the parent is always greater (for a max heap) or smaller (for a min heap) than its children. We repeat this process for each subsequent node until the entire heap satisfies the heap property. This method is intuitive and allows us to construct a heap from any arbitrary collection of elements.
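
The compare-with-parent step described above is often called "sift up". A minimal sketch for a max heap stored in a 0-based list might look like the following; `sift_up_insert` is a hypothetical helper name, not a function used elsewhere in this article.

```python
def sift_up_insert(heap, value):
    # hypothetical helper: append the value, then swap it upward
    # while it is larger than its parent (max heap, 0-based list)
    heap.append(value)
    i = len(heap) - 1
    while i > 0:
        p = (i - 1) // 2
        if heap[p] >= heap[i]:
            break  # the parent is already large enough
        heap[p], heap[i] = heap[i], heap[p]
        i = p

heap = []
for value in [5, 6, 7, 8, 9, 10]:
    sift_up_insert(heap, value)
print(heap)  # [10, 8, 9, 5, 7, 6]
```

Each insertion only walks one path from the new leaf towards the root, so it needs at most as many swaps as the tree has levels.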

The second approach revolves around a complete binary tree, where all levels are filled, except possibly the last level, which is filled from left to right. In this case, we apply the heapify algorithm to the internal nodes, starting from the last internal node and moving up to the root. Heapify ensures that each internal node maintains the heap property by comparing its value with its children and swapping if necessary. This method is efficient and works well when we already have a complete binary tree structure.

Both approaches have their own merits, and the choice depends on the context and available data structure. Whether you prefer the gradual transformation of individual nodes or the systematic adjustment of internal nodes, both methods contribute to constructing a well-organized heap, ready to unleash its power in prioritization and sorting tasks.

Let’s get started with the implementation of the first method:

The code below inserts elements into a max heap one at a time and restores the heap property after every insertion.

# This code covers the MAX HEAP only.
# For a MIN HEAP, pick the smallest of the parent and children in heapify instead.

def insert_node(arr, new_data):
    arr.append(new_data)
    n = len(arr)

    # a one-node tree is always a heap
    if n > 1:
        # traverse the internal nodes in reverse, from the last one up to the root
        for i in range(n // 2 - 1, -1, -1):
            heapify(arr, i, n)
    print(f"Max heap after inserting {new_data} :", arr)

def heapify(arr, ind, n):
    largest = ind

    # child indices in a 0-based array; they exist only if they are < n
    left = 2 * ind + 1
    right = 2 * ind + 2

    # find the largest among the parent and its existing children
    if left < n and arr[largest] < arr[left]:
        largest = left
    if right < n and arr[largest] < arr[right]:
        largest = right

    # swap if a child is larger than the parent, then fix the affected subtree
    if ind != largest:
        arr[ind], arr[largest] = arr[largest], arr[ind]
        heapify(arr, largest, n)

arr = []

# inserting the nodes into the tree, which is in array representation

insert_node(arr, 5)
insert_node(arr, 6)
insert_node(arr, 7)
insert_node(arr, 8)
insert_node(arr, 9)
insert_node(arr, 10)

The code implements the insertion of elements into a max heap and maintains the heap property, which ensures that the maximum element is always at the root of the heap.

- The `insert_node` function takes an array `arr` and a new element `new_data`.
- The new element is appended to the array.
- If the array has more than one element, the heapification process is performed.
- Starting from the last internal node, each node is checked and swapped with its larger child if necessary.
- The process is repeated recursively until the entire heap satisfies the max heap property.
- After every insertion, the updated max heap is printed.

The program prints the heap after each insertion:
Max heap after inserting 5 : [5]
Max heap after inserting 6 : [6, 5]
Max heap after inserting 7 : [7, 5, 6]
Max heap after inserting 8 : [8, 7, 6, 5]
Max heap after inserting 9 : [9, 8, 6, 5, 7]
Max heap after inserting 10 : [10, 8, 9, 5, 7, 6]
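
Since the maximum always sits at the root, removing the top value (mentioned earlier as one of the core heap operations) can be sketched as follows: move the last element to the root, then push it down until the heap property holds again. This is a minimal sketch under the same 0-based array assumptions; `sift_down` and `extract_max` are hypothetical helper names that do not appear in the code above.

```python
def sift_down(heap, i, n):
    # move heap[i] down until neither child is larger than it (max heap)
    while True:
        largest = i
        left, right = 2 * i + 1, 2 * i + 2
        if left < n and heap[left] > heap[largest]:
            largest = left
        if right < n and heap[right] > heap[largest]:
            largest = right
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest

def extract_max(heap):
    # hypothetical helper: remove and return the root of a non-empty max heap
    top = heap[0]
    last = heap.pop()
    if heap:
        heap[0] = last
        sift_down(heap, 0, len(heap))
    return top

heap = [10, 8, 9, 5, 7, 6]
print(extract_max(heap))  # 10
print(heap)               # [9, 8, 6, 5, 7]
```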

Now, let’s dive into the second method that efficiently performs heapification to maintain the max heap property in a complete binary tree.

In the second method, we begin with a complete binary tree, and then focus on the internal nodes where we apply the heapification process to ensure that the max heap property is preserved throughout the tree.

To build the tree, we initially start with the linked list representation followed by array representation, which allows us to create a tree structure easily. To get a clear understanding of the process, I have written a blog on creating a tree in Python.

Tree Representation Blog : Click Here

Take a quick look at it, and it will provide you with a simple and concise explanation.

# class for a node
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None

# we can start with an empty tree and insert new nodes into it;
# this object can also be used as the head pointer of the tree
class BinaryTree:
    def __init__(self):
        self.root = None

    def insert(self, data):
        if self.root is None:
            self.root = Node(data)
            return

        # walk the tree level by level and attach the new node at the
        # first free position, which keeps the tree complete
        queue = [self.root]
        while queue:
            current = queue.pop(0)

            if current.left is None:
                current.left = Node(data)
                return
            else:
                queue.append(current.left)

            if current.right is None:
                current.right = Node(data)
                return
            else:
                queue.append(current.right)

Next, we insert the nodes into the tree:

tree = BinaryTree()

tree.insert(10)
tree.insert(14)
tree.insert(19)
tree.insert(26)
tree.insert(27)
tree.insert(31)
tree.insert(33)
tree.insert(35)
tree.insert(42)
tree.insert(44)

Let’s now print the nodes in level order (BFS):

class BinaryTree:
    ...
    ...
    def print_nodes(self):
        start = self.root
        if not start:
            return
        que = [start]
        while que:
            node = que.pop(0)
            # print the values on one line, separated by spaces
            print(node.data, end=" ")
            if node.left:
                que.append(node.left)
            if node.right:
                que.append(node.right)
        print()

Add the print_nodes() method to the BinaryTree class as shown above.

# calling the print_nodes function
tree.print_nodes()


# the output will be like

10 14 19 26 27 31 33 35 42 44

Now that our tree is ready, let’s understand the logic for turning it into a max heap.

  1. The leaf nodes of the tree already satisfy the heap property, so we only need to focus on the internal nodes.
  2. Heapification should proceed from the second-to-last level up to the root (bottom-up heapification), not from the root down. Heapify assumes that the subtrees below a node are already heaps, so fixing the lowest internal nodes first lets us move up the tree, repairing violations at each level until we reach the root.
  • To do so, we first find the internal nodes in the tree.
  • Then we loop over them in reverse order and heapify each node.
  • We write a heapify function that compares the current node with its left and right children and, if necessary, swaps values so that the largest value moves towards the top. After a swap, it recurses into the affected child. By calling heapify on each internal node, starting from the bottom of the tree, we transform the binary tree into a valid heap, either a max-heap or a min-heap, depending on the comparison used.
# this function finds the internal nodes of the tree and
# returns them in a list, in level order

def inter_nodes(root, i_nodes):
    que = [root] if root else []
    while que:
        node = que.pop(0)
        if node.left or node.right:
            i_nodes.append(node)
        if node.left:
            que.append(node.left)
        if node.right:
            que.append(node.right)
    return i_nodes

# this function is called from the build_heap function.
# it compares a node with its children and, if a child is larger,
# swaps the values and continues down the affected subtree
def heapify(start_node):
    largest = start_node
    left = start_node.left
    right = start_node.right

    # pick the largest among the node and whichever children exist
    if left and left.data > largest.data:
        largest = left
    if right and right.data > largest.data:
        largest = right

    # swap the values and fix the subtree that received the smaller value
    if largest is not start_node:
        start_node.data, largest.data = largest.data, start_node.data
        heapify(largest)

# this function loops through the internal nodes in reverse order
def build_heap(i_nodes):
    for node in reversed(i_nodes):
        heapify(node)

Calling these functions produces the following output:

tree.print_nodes()
i_nodes = inter_nodes(tree.root,[])
build_heap(i_nodes)
print("after heaping")
tree.print_nodes()

# output
10 14 19 26 27 31 33 35 42 44
after heaping
44 42 33 35 27 31 19 10 26 14

Additionally, these concepts extend to the array representation of a binary heap, which has been omitted from this blog for length.

Please try it yourself. If you get stuck, look for it in my GitHub repository : Click Here
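
For reference, one possible sketch of the array version is shown below. It is not taken from the repository mentioned above; `heapify_at` and `build_max_heap` are hypothetical names, and the list is assumed to be 0-based with the same level-order values as the tree example.

```python
def heapify_at(arr, i, n):
    # push arr[i] down until it is at least as large as both children
    largest = i
    left, right = 2 * i + 1, 2 * i + 2
    if left < n and arr[left] > arr[largest]:
        largest = left
    if right < n and arr[right] > arr[largest]:
        largest = right
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify_at(arr, largest, n)

def build_max_heap(arr):
    # internal nodes occupy indices 0 .. n//2 - 1; visit them bottom-up
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):
        heapify_at(arr, i, n)
    return arr

values = [10, 14, 19, 26, 27, 31, 33, 35, 42, 44]
print(build_max_heap(values))
# [44, 42, 33, 35, 27, 31, 19, 10, 26, 14]
```

In the array form there is no need to collect the internal nodes first, because their indices follow directly from the length of the list.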

In conclusion, we have explored the concept of building a complete binary tree and performing heapification. We have seen how these operations are crucial for maintaining the heap property and creating a valid heap structure.

Thank you for taking the time to read this blog. I hope it has provided you with a clear understanding of building complete binary trees and performing heapification. If you have any further doubts or questions, please feel free to reach out. Stay curious and keep exploring!
