Application of Tries and Ternary Search trees

Akshat Anand
DataX Journal
Published in
5 min readDec 19, 2020

Learn how you can effectively implement tries and ternary search trees in real-world applications.

Motivation

So the motivation behind learning Tries and Ternary Search trees was my inability to answer one of the questions which were asked in one of my internship interviews. The question which was asked as “How you can implement or design a data structure so that you can search a word in a dictionary?”, my answer was “We can implement using Hash Tables and discussed the structure and process of finding a particular word by searching the key of the word in the dictionary”, so after this interviewer asked, “How you can avoid collision handling, also the one word with a different meaning?”, I was like speechless because I didn’t know about Tries and Ternary Search trees.

Ps: If you haven’t read about tree data structure, please refer to this link

Update: Got an Offer from the same company as the FTE role, so be consistent and never give up :) .

Tries

Tries are a tree that stores strings. The maximum number of children of a node is equal to the size of the alphabet. Trie supports search, insert and delete operations in O(L) time where L is the length of a key.

It is an efficient information reTrieval data structure. Using Trie, search complexities can be brought to the optimal limit (key length). If we store keys in the binary search tree, a well-balanced BST will need time proportional to M * log N, where M is maximum string length and N is a number of keys in the tree. Using Trie, we can search the key in O(M) time. However, the penalty is on Trie storage requirements

A trie is, like other tree-based data structures, made up of a set of nodes connected by pointers. These pointers indicate a parent-child relationship between the nodes. Parents are above their children in the tree. Typically the root node of the tree is an “empty” node so that it can point to all members of the alphabet the trie is using to store.

Unlike other tree-based data structure like the AVL tree, the trie is not necessarily balanced. It is totally dependent on the contents of the tree.

Nodes in the trie can hold either single members of the alphabet or the entire word so far. For example, the tree could look like the graphic above, or it could look like this:

So why try Trie?

  1. With Trie, we can insert and find strings in O(L) time where L represent the length of a single word. This is obviously faster than BST. This is also faster than Hashing because of the ways it is implemented. We do not need to compute any hash function. No collision handling is required (like we do in open addressing and separate chaining)
  2. Another advantage of Trie is, we can easily print all words in alphabetical order which is not easily possible with hashing.
  3. We can efficiently do a prefix search (or auto-complete) with Trie.

Disadvantages with Trie:-
The major disadvantage of tries is that they need a lot of memory for storing the strings. For each node we have too many node pointers(equal to a number of characters of the alphabet), if space is concerned, then Ternary Search Tree can be preferred for dictionary implementations. In Ternary Search Tree, the time complexity of search operation is O(h) where h is the height of the tree. Ternary Search Trees also supports other operations supported by Trie like prefix search, alphabetical order printing, and nearest neighbour search.

Ternary Search Tree

A ternary search tree is a special trie data structure where the child nodes of a standard trie are ordered as a binary search tree

Representation of ternary search trees:
Unlike the trie data structure where each node contains 26 pointers for its children, each node in a ternary search tree contains only 3 pointers:
1. The left pointer points to the node whose value is less than the value in the current node.
2. The equal pointer points to the node whose value is equal to the value in the current node.
3. The right pointer points to the node whose value is greater than the value in the current node.

Apart from the above three-pointers, each node has a field to indicate data(a character in case of dictionary) and another field to mark the end of a string.
So, more or less it is similar to BST which stores data based on some order. However, data in a ternary search tree is distributed over the nodes. e.g. It needs 3 nodes to store the word “INK”.

The advantage of using ternary search trees over tries is that ternary search trees are more space-efficient (involve only three-pointers per node as compared to 26 in standard tries). Further, ternary search trees can be used any time a hashtable would be used to store strings.

Tries are suitable when there is a proper distribution of words over the alphabets so that spaces are utilized most efficiently. Otherwise, ternary search trees are better. Ternary search trees are efficient to use(in terms of space) when the strings to be stored share a common prefix.

Applications of ternary search trees:
1. Ternary search trees are efficient for queries like “Given a word, find the next word in the dictionary” or “Find all phone numbers starting with 8928” or “typing few starting characters in a web browser displays all website names with this prefix”(Autocomplete feature)”.

2. Used in spell checks: Ternary search trees can be used as a dictionary to store all the words. Once the word is typed in an editor, the word can be parallelly searched in the ternary search tree to check for the correct spelling.

If you want to start practising tree, here is some set of questions:-

Tree basic question — link

Tree Intermediate question — link

I hope my shared experience will help someone in the preparation for their interviews. I would be grateful if you do clap and share among your friends and colleagues.

You can follow me here -

Linkedin — link

Github — link

Peace!

--

--