Developing an Algorithm and Data Structures collection in Elixir Part II — Binary Search Tree

Sasha Fonseca
4 min readFeb 20, 2016

--

In the first part I presented my project Exads (Elixir Algorithms and Data Structures) with my goal of building the largest and most complete algorithms and abstract data structures collection for Elixir. In the last post I talked about my implementation of the Priority Queue. Like in the previous part, I’ll gladly take contributions to my project in any form (issues, pull requests to implement another algorithms or ADS, test suites, ideas).

Now in this second part I’ll describe my experience developing the Binary Search Tree abstract data structure.

Specification

A Binary Search Tree is a data structure made of nodes and where each one these nodes has a value and is linked to another two nodes named left and right (and linked by its parent node, of course). Unlike in the regular Binary Tree, this tree has defined rules to about its nodes way of being stored. Given a node x and its successor node y, then y will be on the left side if y < x or on the right side otherwise. In Introduction to Algorithms (3rd Edition 2010), by T. Cormen et al. (aka the Algorithms Bible) has the following quote about Binary Search Tree (BST):

The search tree data structure supports many dynamic-set operations, including
SEARCH, MINIMUM, MAXIMUM, PREDECESSOR, SUCCESSOR, INSERT, and
DELETE.

I’ll skip their description since their name is quite self explanatory.

Implementation

The first decision to make when implementing these kind of ADS is what other data structure will be used to hold the data and to represent the ADS itself. In this particular case, my first instinct was to either use Tuple or Map. Although both would serve my needs quite well, maps in Elixir are much powerful than Tuple, having many more operations available. However, Elixir has the Map and the Struct, which can be seen as a named module (due to a special __using__ macro that receives the module’s name and uses it as the struct’s). Initially, I implemented every operation using the struct, which seemed a good idea at the time, but I soon started running into a infuriating problem. As I stated, structs take their module’s name as their own. This means that they can end up with pretty long names and chain of maps, which although Elixir offers a short-cut to reference the module’s name (typing __MODULE__, also a special macro) when developing, it does not present any kind of abbreviation when outputting results, which will lead to some serious headache and confusion when trying to read the results and debug the code. For this reason, the following implementation is presented with maps.

Pattern matching and expressiveness come into play onde again, when declaring the maps used as having the value of :leaf when they are “empty”, i.e. they are null. Inserting elements can be done in a quite straightforward manner, not different from other functional languages:

Combined with the new/1 function (to build a new BST), insert/1 takes a three keyed map, one for the node’s value and another two for the left and right nodes, which will themselves be another three keyed maps (and therefore also BST’s) and takes a value. This value is compared against the current node’s value and its insertion into the left or right sub-tree is done recursively, according to the BST’s sorting rule.

Deletion in BST’s is not exactly trivial. I had to refresh my memory on BST’s for this one.

I’ll leave this link for anyone who wants to fully understand the requirements to properly delete a node in a BST, but the underlying problem revolves around not wanting to leave “gaps” or “holes” in the tree. Therefore, these gaps must be covered linking the remaining sub-tree to the deleted node’s parent. Although this may be a larger snippet of code, Elixir is quite expressive and lets it be readable and succinct.

Elixir also has many different constructs for control flow and conditions. Besides the more common if and cond do (similar to a switch construct), the language also provides unless (has the same function as if but offers a different interpretation), case do, which is great for pattern matching and since version 1.2, the with special form is also present.

For the SEARCH operation I implemented the two more common forms of BST searching algorithms, Depth-First Search (all three order variants) and Breadth-First Search. Elixir provided me with a fun implementation of these:

I think Sublime Text is messing up my indentation when uploading to GitHub

This operation takes the tree to search and an atom named order to input the desired order. Using one of the conditional constructs, I implemented all the three variants (pre-order, in-order and post-order) inside the same helper function. However, another way to achieve the same result would be pattern matching different function based on the received order. Elixir does not take any kind of performance hit with either of these approaches, since internally, the functions are “unrolled” and compiled into similar compositions. Using lists, the results from each branch are concatenated into a single list, upon each end reaching its leaf.

Breadth-First Search didn't have anything special to it, in comparison to other functional programming languages:

PREDECESSOR and SUCCESSOR were achieved by slight modification of INSERT, by returning the whole sub-tree when the particular value was found. My BST implementation also has other operations, like node_depth/2 and tree_height/1, as well as exists?/2 and how_many/2 (self explanatory). The MINIMUM and MAXIMUM operations can be achieved through sorting the list output of one of the search functions.

Conclusion

Once again I invite anyone to help me on this project and collaborate, either by providing improvements, new algorithms or data structures or even test suites for the existing ones. The README has a roadmap with other stuff I’m thinking of implementing, but you don’t have to necessarily implement those, be free to explore other things. I’ll also take any constructive criticism!

For the next chapter I’m thinking about taking a break from ADS’s and implementing the Binary Search algorithm. Until the next one!

--

--