Tries

Petter Daae
4 min readNov 23, 2021

--

A trie is a search tree that lets you insert and lookup keys. In this post we assume that the key is a string consiting of letters from a to z, but it could potentially consist of other letters or parts as well. Tries behave a lot like dictonaries or hashmaps, but they give other runtime guarantees and have some additional interesting properties.

Algorithm

When building a trie, we start with an empty root node.

Now, let’s insert the word “foo”. We start by checking if the root node has a child for the letter “f”. It doesn’t, so we create it.

We proceed with the letter “o”. Does the “f” node have a child for the “o” letter? It doesn’t, so we create it. Similarly, we check if the new “o” node has a child for the “o” letter, and you can see the complete result of inserting “foo” below.

You can also see that we marked the last letter of “foo” so that we know that we have not inserted words like “f” or “fo”.

Now, let’s insert another word “bar” just to make our trie a little bit more interesting.

But we can make it even more interesting by inserting words that have letters in common with the words that are already in the trie. This is where the trie really shines. We insert the word “football”. And as you might have suspected, we get the below result.

We see that words that have common prefixes share nodes. This is a good thing since it bounds the size of the tree to something that is less than the total number of letters in all keys. Additionally, we know that the depth of the search tree is never more than the length of the longest key.

Now, when looking up a key, we do exactly the same as when inserting a key. We iterate over the letters in the key and travel down the tree selecting nodes that match the current letter in the key. If we arrive at a blue node at the last letter, we know that the key has been inserted previously. Otherwise, it has not. We might, for example, end up at node that does not have a child for the current letter.

Note that we could also store values at the blue nodes, emulating the functionality of a dictionary or hashmap.

Code

Below, you can see a simple python implementation of a trie.

Why not use a hashmap instead?

Well, hashmaps have constant lookup right? So, why do we need tries? Actually, the constant lookup time for hashmap is just an average. This means that it is not guaranteed constant lookup, we might be unlucky and get linear lookup. So, the trie might give a better guarantee if we have a bound on the size of the keys (because the trie will never be deeper than it’s longest key). In other words, it depends on the problem.

Other interesting properties

Tries have some features and properties that are not immediately obvious. We discuss two of them here.

If we sort the children of each node lexiographically, we can sort all the strings in the trie by traversing it pre-order with depth first search.

If you think about it, this is maybe one of the most natural ways to sort strings manually. You first sort by the first letter and then you look at next letters in all the words that start with the same letter.

This actually also leads into the next property. Tries are really good for autocomplete. If we know the prefix of a word, the trie makes it quite simple to list all possible words. Simply look at the sub tree of the prefix. Because of the sorting property, we can also, very effortlessly, list them in lexiographical order.

Problems

Below are some online (competetive) programming problems you can try the trie on.

References

--

--