Getting Started With Trees 🌲

Published in

The Startup

8 min readJun 14, 2020

A Tree is an important element, not just in the real world but also when it comes to Programming, Data Science, etc. In Computer Science, a Tree is an important Data Structure which is useful for storing hierarchically ordered data.

We say Trees are the beauty of nature. They need not get ugly when the term — data-structure is appended to them.

If you have read about data-structures, you are probably aware and well acquainted with linear data structures like Arrays, Linked-list, Stacks, Queues, etc. Trees are non-linear data structures which most of us think as something very complex and end our journey into data structure at Trees itself .

Well, this article will be a dive-in into the concepts of Trees, Basic Terminology, their Types, Working and Applications in real life and explain it in an easily understandable beginner-friendly approach.

TREE DATA STRUCTURE

In simplest of terms, a Tree is a data structure consisting of collection of nodes connected by edges (which don’t form a closed loop). They are a hierarchical non-linear structure because they don’t store the data in a linear fashion as in case of arrays or lists.

Each Tree will have root node and the nodes related to it will be connected to the root node as its children. Let’s look into some important terms used in Trees and what these terminology exactly mean.

BASIC TERMINOLOGY 🧐

Root → The topmost node in the hierarchy of the tree data-structure is known as the Root node.
Child → A node which has a parent is called as the Child node of the parent.
Parent → A node that is just previous to the current node is called as the Parent node of current node.
Edge → A link between any two nodes is called as edge or link of those two nodes.

Leaf → A node that doesn’t have any child nodes is called as Leaf node of the Tree.
Siblings → Two nodes which have the same parent are known as Siblings of each other.
Internal Nodes → A node which has at least one child is called a Internal of the Tree.
Height → The height of a node is number of edges from given node to the Root node.
Depth → The depth of a node is length of the path from Root of the Tree to the given node.
Degree of a Node → The maximum number of children that a node has is known as the degree of the node.

There are few other lesser used terminologies like Ancestors, Descendants, Tree degree, etc which are pretty self-explanatory and not commonly used. Now that we have understood the basic terminologies of a tree, let’s look into the different kinds of Trees that we can have.

🌳 TYPES OF TREES 🌴

1. BINARY TREE —

Binary Tree Representation— only two children allowed, left and right.

A binary tree is a tree in which each node consists of at most two children. They can further be classified into :

Full Binary Tree — A binary tree where every node has exactly 2 children (except the leaf nodes, ofcourse !) .
Complete Binary Tree — A binary tree which is completely filled with a possible exception made only at the bottom-most level.
Perfect Binary Tree — A binary tree where each interior node has exactly two child nodes and each leaf node is at the same level.

The following illustration will make the difference clear.

Binary Trees are the most important and commonly used trees. We have just looked at the terminology of Trees and the types of binary trees. But once we have constructed our tree, we do it with the intent to reuse the data stored in the Tree later on for some purpose. Let’s see how we can access the the data stored in a binary Tree.

Binary Tree Traversals

Pre-Order Traversal : In this type of traversal, we visit the root node first, then the left sub-tree and lastly the right sub-tree. This happens recursively for all nodes in the Tree.

preorder(node)
    if (node == null)
        return
    visit(node)             // visit the root node at first
    preorder(node.left)     // then traverse left subtree first
    preorder(node.right)    // lastly traverse the right subtree

Here’s an visualization to make it clear on how the traversal functions —

In-Order Traversal : Here, we traverse the left sub-tree, then visit the root node and lastly move on to the right sub-tree.

inorder(node)
    if (node == null)
        return
    inorder(node.left)    // traverse left subtree first
    visit(node)           // visit the root node
    inorder(node.right)   // traverse the right subtree

Here’s the visualization —

Post-Order Traversal : Here, we start from the left sub-tree and move on traversing the right sub-tree and at last visit the root node.

postorder(node)
    if (node == null)
        return
    postorder(node.left)     // first traverse left subtree first
    postorder(node.right)    // then traverse the right subtree
    visit(node)              // lastly visit the root node at first

Visualization for post-order traversal —

There are other traversal techniques as well like the level-order traversal ( also known as BFS ) in which we visit each node according to their level that they appear in the Tree. There’s also an iterative version of these traversal which is slightly more difficult to implement. The recursion makes everything easier in Trees. Trees are themselves also known as recursive data structure since they involve sub-tree which recursively make up the tree as a whole

2. BINARY SEARCH TREE —

A Binary Search Tree (BST) is binary tree in which —

All nodes in the left sub-tree of root node are less than or equal to the root node
All nodes in the right sub-tree of root node are greater than the root node.

Sometimes, the right sub-tree might hold the value which is equal to the root node but that’s implementation dependent and both can be considered correct.

A BST is used mainly for the purpose of optimizing the search of a node in the Tree. The time complexity of search in BST is O(logN) as compared to O(N) in a normal Tree, where N is the number of nodes.

We can easily eliminate one sub-tree of BST while searching for a value by just comparing the given value with root node value. If the value to be searched is greater than root value, we go to right sub-tree, else we go into the left sub-tree. For this, we must also maintain the BST property while insertion as well as deletion from the Tree.

Insertion Visualization —

Search Visualization —

Lets look at other types of trees in short…

3. AVL TREES —

AVL Trees are just Binary Search Trees which are well balanced. A simple BST can get unbalanced and in the worst case, we may have a tree which has all its nodes on only the left or right side. To ensure that this case does not occur, we use AVL Trees which use different rules to keep a BST balanced.

4. RED-BLACK TREE —

Red-Black Tree are another type of self-balancing tree. It is given the name Red-Black because each node is either painted Red or Black according to properties of Red-Black Tree. Whenever new nodes are added, the nodes are rotated and painted again to maintain the property.

5. TRIE —

Trie is an N-ary search Tree which is based on prefix of a string. They are used to store associative data structure and for efficient retrieval of data and hence the name Trie. They are also called digital tree or a prefix tree.

The most common ones are Binary Tree and Binary Search Tree and constitute the majority of use cases. There are other types of Trees as well like Splay Trees, Segment Trees, Suffix Trees, B-Tree, B+ Tree etc. which are advanced data-structures used for various applications.

🍂APPLICATIONS OF TREES

Trees are used to store information that naturally forms a hierarchy. Different tree data structures allow quicker and easier access to the data in such cases. One real-life example might be corporate structure of employees at different levels. Other similar applications include folder/file structure on our computer, XML/HTML data,etc

2. Binary Search Tree are used to quickly retrieve an element quickly by storing them in some sorted manner. Using BST, AVL Trees or Red-Black Trees this reduces time complexity of retrieval of data to O(Logn) . They support all the operations you can do on sorted array with added benefits of efficient insertion and deletions.

3. B- and B+ Trees are used to implement indexing in many of the popular databases.

4. Trees are also used as syntax tree in compiler construction where the syntax of every program is checked by the compiler using this tree. They are also called as Abstract Syntax Trees.

5. Tries are a type of tree which are used in many applications like autocomplete search, auto-correct, string matching, storing routing information in modern routers,etc which are really useful in our daily life. How many times do we type the full search in google without using the auto-complete feature ?

6. Huffman Encoding and decoding uses Trees. Huffman encoding is an compression algorithmic technique used in files like jpeg and mp3 formats to compress the files and reduce the storage space needed.

7. Decision Trees are used in many Machine Learning algorithms use for decision analysis, used in statistics, data mining, etc.

8. Spanning Trees are used to in routers and bridges in computer networks, cluster analysis, Traveling Salesman Problem, etc.

9. Heap is a specialized tree data structure used in heap-sort, implementing Priority Queue, Dijkstra’s algorithm, etc.

10. Suffix Tree are used for pattern matching and searching in a fixed given text using algorithms like KMP algorithm, Rabin Karp Algorithm, etc.

This article gave you a brief overview of Tree data structures, their working and their usage and applications. It might be overwhelming to understand all the different kinds of Trees at first, but the basics are the same in every case🙂. I hope this article will help you in understanding the basics and getting started with using Trees !