Application of Tree Data Structure

Johnny Prencipe
Feb 7 · 3 min read

Let’s talk about the tree data structure (programmers don’t have their trees upside-down, nature does).

The first thing to talk about is, well… what is it, exactly?

I’m sure everybody is familiar with the following visual representation of a tree:

Visual representation of a tree data structure (courtesy: Wikipedia)
Visual representation of a tree data structure (Courtesy: Wikipedia)

How does Wikipedia define it?

In computer science, a tree is a widely used abstract data type that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.

Not very helpful.

The long and short of it is: A tree is a way to structure your data such that each node can reference any number of children, but any child can only be referenced by one node, while the root (first) node cannot be referenced by any children.

Okay, great. But why should I care?

Tree data structures are uniquely useful in that almost any tree traversal algorithm is going to be resolved in O(n) time. If you have a dataset that has a natural form of hierarchy, for example a folder structure, the ability to write your traversal algorithms such that they return data in a reasonably consistent timeframe is a very powerful ability. This time is going to be brought down to O(log(n)) if you use almost any form of ordered tree, such as a binary search tree, wherein each node on the tree is given an index number based on their position in the tree.

Tree Traversal: Depth First

The different methods of tree traversal will be advantageous in different circumstances. The two main approaches to traversing a tree are breadth-first and depth-first.

Depth-first traversal (Courtesy: Wikipedia)

A depth-first algorithm is an algorithm which prioritizes getting as deep into the tree as possible, before going up one or more levels and branching moving to a sibling branch. The pattern for recursively travelling depth-first down a binary tree is as follows:

  • Check for a child node on the left,
  • If there is one, take it,
  • Otherwise, go right.
  • If you can’t go right, keep going up until you can go right.
  • After you go right, go left again, and
  • Repeat recursively until all nodes are exhausted.

This ensures that the algorithm will exhaust all the nodes on one leg before moving on to the next, and repeating the cycle.

Tree Traversal: Breadth First

The parallel to the depth-first traversal method is the breadth-first method.

Breadth-first traversal (Courtesy: Wikipedia)

A breadth-first approach does the opposite of depth-first, and indeed does what the name implies. A breadth-first searching algorithm looks something like this:

  • If I’m on the root node, go down one.
  • If I have a sibling, check out the sibling. Do this until I have checked all siblings.
  • If I have no unchecked siblings, go down one.
  • Repeat recursively until all nodes are exhausted.

Both of these tree traversal methods will achieve the same result at the end and have similar speeds assuming a binary data tree, and the usage of one over the other will depend on the specific situation.

To Summarize

The application of a tree data structure over other data structures, or different types of traversal methods over others, depends largely on the situation and context of the problem you’re trying to solve. This article covers only the two most basic and common traversal methods and only for binary trees, but it must be stated that it is not at all exhaustive, and does not touch on ordered trees or more advanced traversal methods, such as inorder or postorder traversal.

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data…