Published in

The Startup

As an artist transitioning into computer science, nothing struck fear into my heart like seeing mathematical notation.

When it came to understanding the runtime of algorithms and how it scales given an input, (time complexity) I was overwhelmed with the mathematical notation, coefficients, graphs, proofs, and concepts that seemed a lot more complex than they actually are. I thought I’d need to solve equations to be able to effectively find the runtime of my functions. Thankfully, I discovered that for the most part the runtime of algorithms can be figured out intuitively by following a small set of rules.

First, a small crash course on time complexity. You may hear it called many names so it can get confusing, especially when you realize that there are multiple notations for time complexity. Generally when looking for the time complexity of an algorithm, we are looking for the upper bounds (or the worst case) of the growth rate of an algorithm runtime, so for our intents and purposes — and in most cases — big O and time complexity can be used interchangeably.

So we want to find the growth rate of an algorithm — what does that mean? Because we are interested in the rate of growth of the runtime, rather than a precise time estimate of the runtime, time complexity is machine and environment independent. It describes the pattern of growth, which will be the same no matter the machine or environment for a given algorithm. Even if benchmarks of the same algorithm on two machines are different, their rate of growth will be the same.

In the graph above we can see the most common growth rates:

O(1) — Constant

O(n) — Linear

O(n²) — Polynomial (In this case quadratic)

O(log n) — Logarithmic

Now that we understand what finding the growth rate means, let’s tackle what the growth rate depends on — a given input. If our input is the size of an array, we want to see how our algorithm will perform given an array of size n. How will the runtime of our algorithm grow if there are 5 elements in our array, or 5 million? As you can imagine, this is extremely valuable knowledge when it comes to optimizing resources, especially when working with large data sets.

We’ve broken down what it means to find the growth rate of an algorithm given a certain input — so now let’s apply the growth rates listed above and think about what that rate of growth means for an input of size n.

The first, and perhaps easiest to digest is O(1), or constant time complexity. This is exactly what it sounds like, no matter the size of the input— one element or 1 million elements — our algorithm will take a constant amount of time, so its growth rate will be constant. An example of a constant time operation is a hash table look up; no matter how many key value pairs there are in our hash table, looking up the value of a key takes a constant amount of time.

Linear time can also be conceptualized easily. Let’s imagine that our array is a line of people, and we need to find the oldest person in line. Our line is not sorted, and we can’t simply glance at the line and guess, we have to be precise. In order to find out we would need to ask every person in line, since the oldest person could be at the very end of the line. If it takes one minute to ask every person and there are 5 people in line, it will take 5 minutes to find out who the oldest is, but if there are 600 people, it will take 600 minutes. Linear time means that the runtime has a direct relationship with the size of n, and will increase the same amount for every element in an array. As you can imagine, linear time algorithms are not ideal for repetitive actions on large data sets. The most common example of a linear time function is a for loop that iterates through an array, touching every element once. Because it is purely dependent on the size of the input, we express this as O(n).

Going back to our line of people example, let’s imagine that it’s a new day and this time we have to find if any two people have the same age. We are still feeling masochistic, and refuse to sort the line, so this time the process will be even more time consuming. Enter quadratic time.

This time we have a friend helping us, let’s name them Jay. Our plan is that for every person in line, I will ask them their age, and Jay will run through the entire line to see if anyone else’s age matches the person that I just asked. So if asking a person still takes one minute, and there are 60 people in line, I will have to wait 60 minutes until Jay is finished asking everyone in line before I can move on to the next person. Does this sound familiar? It’s a nested for loop! Because Jay asks every person in line per person in line, the time complexity of this approach is O(n²). Needless to say, this approach would be impossible for humans to accomplish on a line of 60 people, but our computers can handle these operations much better than we can. With data sets in the millions or hundreds of millions however, we try as hard as we can to prepare our data in ways that will spare us a quadratic runtime fate.

If you’re like me, logarithmic time complexity is where not having a strong math background may begin to affect you. If you’ve studied logs this may seem intuitive, but I will break it down for those who, like me, were not drawn to math in past lives.

A logarithmic runtime can be thought of as the opposite of a polynomial runtime, so it’s very desirable, but it doesn’t happen as effortlessly. For our logarithmic example, let’s think of a binary tree. Every node has a value, and luckily for us, this is a binary search tree.

If you’re not familiar with a binary search tree, it is a tree where every node has two children, and the value of the left node will always be less than or equal to the value of the parent node while the value of the right node will always be greater. If we wanted to search for the largest value in the tree above, even though there are 12 nodes in our tree, we would only have to access 4 nodes. Because we are looking for the largest value, we simply always go to the right. Every time we do so, we are almost halving the pool of nodes we have to look through. You can think of n as halving with every step taken by our function, which gives this algorithm a logarithmic time complexity that we express as O(log n).

Now that we’re equipped with some basic knowledge about big O, it’s easier to conceptualize what would affect the runtime of an algorithm. Sorting is a key factor in most cases. For example, if we have an unsorted array, finding the max value will be linear if we aren’t creative and use a simple for loop to iterate and check every element. However, if we sort our array, it wouldn’t matter if our array had 5 elements or 5 million elements, we still only need to check the last element if we've sorted in ascending order, so suddenly getting the max value can be done in constant time! Of course sorting algorithms themselves have their own time complexity, but we can apply the same understanding of runtimes to find the appropriate sorting algorithm for a given use case.

Although I have made very simple and clean examples here, I don’t mean to imply that this is the extent of time complexity. It can in fact get more complex, and I've only addressed the most common growth rates (let’s not forget space complexity either.) The time complexity rabbit hole runs deep, but thinking about these common cases in simple and intuitive terms has helped me understand more complex runtime calculations, and I hope they help you do so as well!

--

--

--

More from The Startup

Get smarter at building your thing. Follow to join The Startup’s +8 million monthly readers & +756K followers.

Patricia Arnedo

Curious and passionate software engineer that uses way too many gifs.