Time Complexity & Big O Notation Explained, Part 1

Sebastien Dornel
5 min read · Aug 10, 2020

As developers, a common question we should ask when writing code is what that code will ‘cost’. By ‘cost’, I mean the time it takes for a particular algorithm to execute.

For example, if I wanted to check whether an array contains a particular number, I could write the following:
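A minimal sketch of such a function, assuming JavaScript (the name includesNumber and the sample values here are illustrative):

```javascript
// Brute force search: check each index in turn until the target is found.
function includesNumber(array, target) {
  for (let i = 0; i < array.length; i++) {
    if (array[i] === target) return true;
  }
  return false;
}

includesNumber([4, 8, 15, 7], 7); // => true, found on the fourth check
```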

For those of you who don’t know, this is called the brute force approach. In the code above, we look at each index of the array to see if it contains a specific number. If the number exists in the array, the function returns true; otherwise, it returns false.

In order to estimate how long a procedure takes, a quick method is to count how many lines of code are involved. The code above (excluding closing brackets) is four lines long. However, there is more at play here: in this particular example, the loop has to iterate four times before it finds the number we are looking for, so the comparison line runs four times rather than once. Counting executed lines instead of written ones, the code in this particular example is effectively seven lines long.

Because the array here is very small, the brute force approach works just fine. When working with large sets of data, however, you’ll find that an algorithm like this one can be very inefficient. If the array had one thousand numbers and I was looking for a number that was not in the array, the loop would run one thousand times just to determine that the number does not exist inside of the array.
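To make that concrete with the same illustrative function:

```javascript
// An array of 1,000 even numbers; 7 is never found, so all 1,000 are checked.
const bigArray = Array.from({ length: 1000 }, (_, i) => i * 2);
includesNumber(bigArray, 7); // => false, after 1,000 comparisons
```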

This running time, and how it grows with the input, is what we call the time complexity of a function.

Time complexity is defined as the amount of time an algorithm takes to run, expressed as a function of the size of its input.

For our search function, that time is very strongly influenced by whether or not there is a match, as well as by where the match occurs.

If I were to input this data into the function:
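Say, an array where the match appears near the front (the values again being illustrative):

```javascript
// The target, 7, is the second element, so the loop stops after two checks.
includesNumber([3, 7, 12, 9], 7); // => true
```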

I would no longer require my code to examine every number in the array to find a match. Therefore, the algorithm would take far less time to run.

When evaluating the performance of an algorithm, it is standard practice to consider the worst case scenario that could occur when using that algorithm. This means we would evaluate the performance of the algorithm above by finding the longest time it could possibly take to finish running.

In other words, instead of evaluating the algorithm like this:
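Using the illustrative includesNumber sketch from above:

```javascript
// Best case: the match is the very first element, so one comparison suffices.
includesNumber([7, 3, 12, 9], 7);
```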

Best case scenario

we would instead do this:
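```javascript
// Worst case: the number is absent, so every element must be checked.
includesNumber([4, 8, 15, 16], 7);
```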

Worst case scenario

The reason we don’t use the best case scenario is that almost any algorithm can perform extremely well when handed its most favorable input. One could draw a number of parallels to this in the real world.

One parallel is that of somebody in the gym trying to see how good they are at doing pushups. In order to assess their ability to do pushups, it wouldn’t make sense to merely do the bare minimum and avoid taxing the body. It makes far more sense for this person to assess pushup ability by performing as many pushups as possible.

We also don’t typically use the average case scenario. The reason for this is that it is often difficult or even impossible to determine the average time it would take for an algorithm to run. For the algorithm we wrote earlier, the array parameter that we pass in could have fifty numbers or it could have fifty million numbers.

In this particular case, calculating a true average is simply not possible: there is no limit to how long the array can be or how large its numbers can get. It’s true that we could estimate which numbers are more likely to be included in the array, as well as how long the arrays tend to be, in order to approximate an average case scenario, but even then it would be quite the challenge. Accordingly, developers tend to stay away from using the average case scenario.

The worst case scenario makes the most sense to use. Even if an application is generally very fast, people will be less likely to continue using it if every once in a while it becomes very slow.

The worst case scenario can be summed up like this:

Given a certain algorithm, find an input that somebody who wants the algorithm to be as slow as possible would choose.

Earlier, the algorithm we used to figure out if an array included a certain number was this:
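```javascript
// The brute force search from earlier (the same illustrative sketch).
function includesNumber(array, target) {
  for (let i = 0; i < array.length; i++) {
    if (array[i] === target) return true;
  }
  return false;
}
```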

As you can see, this is an extremely inefficient algorithm, and if we were to put two loops together to iterate through nested data, the worst case scenario would be far worse (see the sketch below). Next week I’ll talk about how to refactor this code into something more efficient as well as discuss Big O notation.
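To get a sense of how bad nested loops can get, here is a hypothetical example of nested iteration: a function that checks an array for duplicate values by comparing every pair of elements.

```javascript
// Nested iteration: for 1,000 elements with no duplicates, the inner
// comparison runs roughly 500,000 times before the function returns false.
function hasDuplicates(array) {
  for (let i = 0; i < array.length; i++) {
    for (let j = i + 1; j < array.length; j++) {
      if (array[i] === array[j]) return true;
    }
  }
  return false;
}
```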
