Learning new things always brings the opportunity to have your mind completely blown. Mind you, this always doesn’t happen — at least when it comes to the realm of computer science. However, every once in awhile, you learn something that truly awes you, something that is actually awesome in the actual dictionary definition sense of the word; something that really feels like a lightbulb going off, making you feel very grateful to have set about trying to understand it in the first place.
I’m speaking from personal experience, of course, because this was exactly what happened to me when I learned about the wonders of AVL trees. It turns out that computer science has elements of mathematics under the surface, some of which we have already seen in the course of this series. But mathematics also has another hidden layer to it, as well: a connection to the natural world around us.
Oftentimes, when we learn about super theoretical stuff (like a lot of core computer science concepts) it’s hard to connect the dots between where these concepts were created, why we use them, and why we should care. Understanding the history of why something was invented is just as important as comprehending its applications and usage. It’s a bit rare for us to be able to see how something in computer science directly connects to other fields that have nothing to do with computing!
In today’s post, however, mathematics, computer science, and nature all come together in the most magical of ways. So, let’s get to it!
Looking for tree patterns
Last week, we learned about a unique data structure called an AVL tree, which is a type of self-balancing binary search tree. Hopefully you remember that an AVL tree is a height-balanced tree, which means that it follows a golden rule: no single leaf in the tree should have a significantly longer path from the root node than any other leaf on the tree.
The important thing to keep in mind about this golden rule is that as an AVL tree grows in height, it must have a minimum number of nodes at each level before another one can be added.
The illustration shown here exemplifies this. In this drawing, we can see five different AVL trees, each with a varying heights.
In the AVL tree with a height of 0, there are no nodes, so n is equal to
0. In the AVL tree with a height of 1, there is exactly
1 node, while in an AVL tree with a height of 2, there are exactly
2 nodes. Remember, we’re only considering the minimium number of nodes. Technically, we could have added a third node to our AVL tree with a height of 2, and that would still be fine; but, we need at least two nodes in order to create a height-balanced AVL tree with a height of 2.
What happens when we need to create an AVL tree with a height of 3? Well, we need
4 nodes at minimum to do that, since we need to add a right subtree before we can go about adding another level. It’s a similar situation for the tree with a height of 4: we need
7 nodes in order to create a height-balanced right subtree before adding another node to the left subtree.
Let’s take it one step further: what is the minimum number of nodes that we need in order to create an AVL tree with a height of 5?
Well, based on what we learned last week about AVL trees, we know that the difference between the left and the right subtrees cannot be more than one level in height. This means that we’ll need to add to our right subtree before we can grow our left subtree.
The example shown here illustrates what our minimum-node number for an AVL tree with a height of 5 would be: we need, at a minimum, 12 nodes in order to create an AVL tree with a height of 5. If we break this down further, we can see that we need 7 of those nodes in the left subtree, and 4 nodes for the right subtree. And, of course, we need a node for the root.
If we look at the left subtree and the right subtree for long enough, we’ll notice that they seem familiar. That’s because we just saw them a minute ago when we were looking at our AVL trees with a height of 4 and 3! Those two trees just happen to be the two structures that came before the AVL tree with a minimum node count and a height of 5.
If you feel like there’s an interesting pattern here, you’re be right.
The height of an AVL tree with the minimum number of nodes is a combination of the two minimum-height AVL trees preceding it.
This is not unique to the AVL tree with a height of 5; in fact, if we look back at the earlier illustration and close closely at the AVL tree with a height of 4, we’ll notice the exact same pattern. The left subtree of the an AVL tree with a height of 4 is the AVL tree with a height of 3, while the right subtree is the the AVL tree with a height of 2.
We could even continue to draw larger height-balanced trees and see this pattern continue, even as the height of a tree gets immensely large.
So, how can transform this pattern out into something more abstract? We derive this pattern to the following formula: “the height of a tree
n is equivalent to the sum of the heights of trees
Another way to think about this is that the minimum number of nodes to create a tree of height n is by combining the two trees that come before it, and adding another node for the root node. In other words, in order to determine the minimum number of nodes necessary to create a tree with a height of 10, you’d need the number of nodes from the tree with a height of 9 and with a height of 8, plus one additional node for the root node.
If we sum the two AVL trees before the one we’re looking for, we can programmatically figure out the minimum number of nodes we’ll need to create any given height-balanced AVL tree.
Voilà! We’ve discovered the Fibonacci sequence right under our noses!
Finding Fibonacci and the golden ratio
The pattern of “summing two numbers to get the third number” is known as the Fibonacci sequence, named after Leonardo Fibonacci. What young Leonardo stumbled upon could be summarized as a mathematical sequence that follows a single rule, which is exactly what we know to be the Fibonacci sequence:
every number after the first two numbers is the sum of the two numbers that preceded it.
Once we have the first two numbers of the Fibonacci sequence, we have enough to build the entire thing (but it goes on forever and forever, so we’ll just skip that for now)! And it just so happens that the first two numbers are pretty easy to remember: they’re 0 and 1.
We already know that there’s some kind of correlation between the Fibonacci sequence and the left and right subtree of a height-balanced AVL tree. Namely, we know that the minimum number of nodes for a tree with a height of H will be the two minimum node trees that come before it, plus one more node for the root node.
Since we derived a formula in the previous section to determine the height of the tree based on the number of nodes, can we use the Fibonacci sequence to determine the reverse? Can we abstract out a formula to determine the minimum number of nodes for a tree based on it’s height?
Of course we can!
Let’s break down the math that’s working behind the scenes here and see if we can figure out what’s actually going on.
In the illustration shown here, we’re going to do the reverse of the formula we deduced earlier, which means that we’ll continue working with a tree with a height H of
5. If we’re looking for n, the minimum number of nodes for a tree of height H, we can find the Fibonacci number located at the index of
Fibonacci[H+2], and subtract one from it in order to determine the minimum number of nodes to create a tree of height H.
So, in the case of a tree with a height H of
5, we’ll need to find the Fibonacci number at the index of
5 + 2, or
7. Recall that we indexed our Fibonacci sequence beginning with
0 as the first index, so the element at
Fibonacci will give us
13 - 1 is
12, we know that we’ll need a minimum of
12 nodes to create a tree with a height of 5.
When we drew out a balanced tree with a height of 5 in the previous section, that’s exactly how many nodes we had!
Okay, what about a tree with a height H of
6 + 2 is
8, and the element located at the index of
Fibonacci would be the number
21. Remember, even if we don’t know the element at the index of
Fibonacci, we know that
Fibonacci = 8 and
Fibonacci = 13, which means we can sum these two in order to get the next element in the sequence. Since
Fibonacci = 21, we can subtract one from it and know that we need, at a minimum,
20 nodes in order to create a height-balanced AVL tree with a height of 6.
Rad! We just reversed our first formula, which used the Fibonacci pattern, into its opposite…which also uses the Fibonacci pattern!
Okay, that’s pretty cool, but it’s about to get even cooler.
The sequence that has long been attributed to (and named for!) Fibonacci was actually discovered hundreds of years earlier by ancient Indian mathematicians, way back in the 6th century. It was only with Fibonacci’s discovery of it 600 years later that the western world came to know of this sequence, and were thus able to study it in even greater depth.
As more mathematicians, scientists, and artists started studying this sequence, they began to understand its deep connections to the world around us. One of the things that they discovered was that the Fibonacci sequence was very closely linked to another special property in geometry and mathematics: the golden spiral and the golden ratio.
It’s unclear when the golden ratio was discovered; although it was Greek mathematician Euclid who first mentioned it in his famous, well-studied and well-cited text, “Elements”, it’s also very possible that the ancient Egyptians used it when they built the pyramids, thousands of years ago. The golden ratio is often referred to as “phi” (φ or ϕ), and is also known as DaVinci’s “golden proportion”. In mathematical terms, two quantities have a golden ratio if the ratio between the two quantities is the same as the ratio of the two elements summed when compared to the larger of the two quantities.
The golden ratio is usually illustrated by using the example of a rectangle made up of a square and a rectangle, respectively.
In the example shown above, we can see the golden ratio at work. When we divide a line into two parts such that the whole length divided by the larger part is equal to the larger part divided by the smaller part, we end up with the golden ratio. In this particular illustration, the ratio of the longer side of the rectangle, the sum of a + b, to the side a will be the same as the ratio of the length of a to the side b.
But what exactly is the golden ratio? We’re comparing a’s and b’s, but what does that mean, really? Well, let’s look at the golden ratio using actual numbers, and see if we can get to a more concrete answer.
We know that the golden ratio is often described as follows:
a + b is to a, as a is to b
Let’s replace a and b here with integers:
a = 5 and
b = 3.
If a + b is to a, as a is to b, then we should be able to prove that
5 + 3 to
5 has the same ratio as
5 does to
Well, it’s time to do the math and find out if that’s really the case!
5 + 3 = 8, we determine the ratio here by dividing
8/3, which is
1.6. If we divide
5/3, we get 1.6, repeating.
In both of these instances, we’re actually coming very close to the value of phi (φ), which is an irrational number that is the sum of 1 and the square root of 5, divided by 2. This irrational number roughly approximates to
1.6180339887, often just referred to as
1.618, for short.
So how does the golden ratio tie into the golden spiral? Well, the golden spiral is a geometric, logarithmic spiral whose growth factor actually happens to be — surprise, surprise! — the golden ratio. And guess what? The Fibonacci sequence comes into play here again, too!
If we construct squares with the width of each consecutive element in the Fibonacci sequence, we can start to build up rectangles. We’ll notice that each one of these rectangles has the golden ratio within it! And, as these rectangles grow, a spiral shape begins to emerge.
We’ll notice that each square’s width is made up of two parts: another square and a rectangle, each of whose widths are the sums of the two squares (the two Fibonacci numbers) that precede it. For example, the width of the square 8, drawn in pink, is the sum of the widths of the two squares that came before it: 5 and 3. Similarly, the width of the red square 13 is the sum of the two squares that came before it: 8 and 5.
The Fibonacci sequence appears once again! The ratio between each of these squares that we used to build up our golden spiral is tied directly to the golden ratio. If we divide each Fibonacci number by the number that preceded it, this becomes more obvious.
Notice that we get closer and closer to the actual value of phi (φ) as our Fibonacci numbers grow in size. We won’t ever hit the golden ratio exactly, since it’s an irrational number; but the larger our Fibonacci number, the closer we’ll get to it as we compare one ratio to another.
The Fibonacci sequence and the Golden Spiral exist all around us — in the arms of a starfish, the patterns of a seashell, the branching of a tree — and even within us — inside the human ear, within the proportions of our fingers, and even the ratio of the human arm! My personal favorite example of this sequence in nature comes from my favorite flower: the sunflower.
According to research from The American Association for the Advancement of Science,
The telltale sign is the number of different seed spirals on the sunflower’s face. Count the clockwise and counterclockwise spirals that reach the outer edge, and you’ll usually find a pair of numbers from the sequence: 34 and 55, or 55 and 89, or — with very large sunflowers — 89 and 144. Although the math may be beautiful, plant biologists have not worked out a mechanistic model that fully explains how the sunflower seed patterns arise.
And guess what? There’s yet another way that this pattern is connected to AVL trees, which is perhaps what makes them so damn golden.
Time to find out!
The golden (tree height) ratio
What is the final petal in this metaphorical flower of Fibonacci?
In order to bring it all home, we’re going to need to go back to where we started: determining the relationship between the minimum number of nodes and the height of an AVL tree. We’ve looked at two different formulas: first, we figured out how we could determine the height based on the minimum number of nodes we needed; then, we determined at the number of nodes we’d need based on the height that knew a tree had.
Well, it turns out that the Fibonacci sequence and the golden ratio were under the surface of both of these formulas the entire time!
When we used the minimum number of nodes in a tree in order to determine its height, what we were really doing was taking the log, with a base of the golden ratio, of the minimum number of nodes, which gave us an approximation of the height of the tree. In fact, this is exactly where the logarithmic complexity of an AVL tree comes from! We just generally don’t refer to the base, so most of the time it’s never explicitly stated that we’re taking log base 1.618.
For example, when we determined that a tree with a minimum number of 54 nodes would give us a height of 8, what we were really doing was taking the log base 1.618 of 54, which approximately yields us 8.29, which rounds to an integer value of 8.
If we were to graph the relationship between the number of nodes and the height of the tree, we would effectively be graphing log base golden ratio of n, where n is the number of nodes in a tree.
Another way of thinking about this is that the relationship between the number of nodes and the height of a tree is logarithmic.
As the minimum number of nodes in an AVL tree grows linearly, the height of the tree grows logarithmically.
If you remember learning about logarithms earlier in this series, then you might recall that a logarithmic function is the inverse of an exponential function. So, what about the second formula we looked at, where we determined the minimum number of nodes based on the height of the tree?
The relationship is going to be the inverse of a logarithmic relationship: it’s going to be exponential. We can see this play out in the illustration below.
As the height of an AVL tree grows linearly, the minimum number of nodes it has will grow exponentially (approximately, of course).
We can see that each time that we took the height of a tree as the input and used it to derive the minimum number of nodes for an AVL tree of our inputted height, we were effectively raising the golden ratio to the power of our height, h.
For example, when we wanted to calculate the minimum number of nodes for an AVL tree with a height of 8, we raised φ to the power of 8, which gave us approximately 46.98. Remember, since we’re dealing with a rounded version of an otherwise irrational number, as our numbers get larger, our approximations have a much higher margin of error. This explains why our calculation for a tree with a height of 2, 3, 4, or 5 are much more accurate than our calculation for a tree with a height of 8. If we were to graph these results, we would actually be graphing the golden ratio raised to the power of h, the height of our tree.
If you’re feeling amazed, in awe, and completely in wonder of what you just learned, don’t worry — I felt the exact same way, too. I don’t think I’ll ever think of a sunflower or an AVL tree in the same way, ever again.
(And, if I’ve done my job well, neither will you.)
I had trouble containing all my excitement about Fibonacci, golden ratios, and AVL trees, and did a whole lot of research on all three of these topics. If you’re as fascinated and amazed by them as I am, you’ll enjoy the links below.