Big-O Demystified

Algorithms for Everyone

When discussing algorithms, computer scientists typically use Big-O notation. It’s the most common way to express algorithm performance, but it’s also the quickest way to confuse new programmers or developers without a computer science background.

If you’re trying to understand what this mystical O(…) abstraction means, the first thing you might try is a quick Google search:

Apparently, there’s a Japanese animated series called “The Big O”, which has nothing to do with algorithms. If you’re like me, you probably clicked on that link instead and just finished watching the entire animated series.

Now that you’re back, let’s take a look at the definition of Big-O on Wikipedia:

Formal definition of Big-O Notation (Wikipedia)

NOPE.

If you have a math degree, that probably makes complete sense and you can stop reading now because this article is not for you.

If your eyes glazed over, keep reading. This article is for everyone else.

What is Big-O?

When you read articles or books about algorithms, there’s no doubt that you’ll see the performance of the algorithm expressed as one of the following:

  • O(n²)
  • O(n)
  • O(log(n))
  • O(1)

Let’s gloss over these expressions for a moment and get to the heart of what Big-O notation tries to express:

Big-O is a way to describe how something grows based on the size of the input.

That something is usually time or space. Computer scientists care about time and space because these two factors are tightly coupled to computer hardware.

Time Complexity

Time complexity is how long it takes the algorithm to run. Programs are run by a computer’s processor (or CPU), so one of the ways we can reduce the amount of time it takes to run an algorithm is by using a faster processor.

However, it’s not always an option to buy a faster processor.

For example, if you create a website using JavaScript, that code is sent to the user’s web browser, where it executes on their computer, not yours. As a developer, you don’t have control over what kind of computer they use.

If they complain that your website is too slow, you could buy them a new computer… or you could improve the code’s performance so that it takes less time to run. However, it’s probably cheaper to improve the algorithm rather than buy new computers for all of your users.

Space Complexity

Space complexity is how much memory the algorithm uses while it’s running. In a computer program, variables correlate to the computer’s memory (RAM). When you buy a computer, memory is usually listed as one of the tech specs:

2017 MacBooks come with 8GB of RAM

Although modern computers have a lot of RAM, if your program loads the entire input into memory, you’ll eventually run out of memory.

Let’s look at an example. If our program loads all of our users into an array to process them, this would work fine for hundreds or even thousands of users, but if we have too many users, our algorithm won’t have enough memory to run! As of right now, Facebook has over 2 billion users. If you try to load 2 billion users into an array, it would take a lot of memory!

One solution is to buy more memory, but that’s not always an option. As we saw in the time complexity section, it’s not realistic to buy new computers for all of our users.

The better solution is to improve our algorithms so that they use less memory.

Unpacking the Notation

When you see Big-O notation, the important part is what is inside the parentheses.

Let’s take a look at an example: O(n).

The expression inside the parentheses is a math function:

“a function is a relation between a set of inputs and a set of … outputs”

If you took algebra in high school, you might remember this thing:

TI-83 Graphing Calculator

It’s a TI-83 graphing calculator. Aside from being able to play games in math class, there’s also this little blue “graph” button that lets you graph functions.

Math functions are usually graphed on the X and Y axis, so you’ll normally see them expressed in the form of y=x.

Here is the function y=x on a graph:

The relationship that we see on this graph is: when the input is x, the output is x.

In the context of algorithm performance, Big-O notation expresses a relationship between the size of the input and how much time or space the algorithm needs to process the input.

Since n represents the size of the input, what we see between the parentheses describes how the algorithm’s performance is affected as the size of the input grows:

  • O(n) is shorthand for saying that the relationship when the size of the input is n, the time or space required to process the input is n.
  • O(n²) is shorthand for saying that the relationship when the size of the input is n, the time or space required to process the input is .

Upper Bound

Big-O notation describes the worst-case scenario, which gives us an upper bound on the performance as the input grows.

Let’s imagine that we’re searching for a specific user in a list of names.

The brute-force algorithm iterates through the entire list and if the user is in the list, it returns. This algorithm is O(n) because n is the size of the list and we iterate through the entire list to find the name.

Performance is not an issue with a small number of users, but what happens when we have 1 million users?

Let’s assume that it takes 1 millisecond to check each name in the list. For O(n), the algorithm would take n milliseconds to find the name in the worst case.

  • For 5 users, it takes 5 milliseconds in the worst case
  • For 1,000 users, it takes 1,000 milliseconds in the worst case
  • For 1,000,000 users, it takes over 16 minutes in the worst case!

What happens if the name that we’re looking for is the first name in the list? We could stop on the first try, but that doesn’t change the performance of the algorithm. If the name that we’re looking for is the last name in the list, then it will still take 16 minutes.

When we use Big-O notation, we’re really just stating an upper bound on how much time or space is required.

Drop the Constants

You might have noticed that math functions can include constants, but Big-O notation doesn’t.

One example of a math function is y=2x + 1, but you’ll never see O(2n+1).

The reason for this is because Big-O is an estimation that focuses on the upper bound. We don’t really care about the performance when n is small. We care more about what happens when n becomes arbitrarily large so that we can get sense of the performance as the input grows.

When n becomes very large — millions, billions, or trillions — multiplying by 2 or adding 1 doesn’t really have a significant effect, so we leave them out to keep things simple.

Orders of Magnitude

You can think of the “O” in Big-O as standing for “Orders of Magnitude”.

Orders of magnitude provide a sense of scale for how big something is. Most countries use the decimal system (base 10), so one order of magnitude larger is 10x the previous order of magnitude.

Here are some examples:

  • Ten (10)
  • Hundred (100)
  • Thousand (1,000)

Let’s say that we’re planning to throw a party for 10 people. However, we find out at the last minute that 100 people are coming. We’d need to acquire a lot more food and we might even need to find a larger venue! A party for 100 people is much bigger in scale — an order of magnitude bigger.

Now let’s imagine that instead of 10 people, you find out that a few guests are bringing a friend. There are more people, but it’s the same order of magnitude. We might need some extra food, but it’s not a big deal.

You can also categorize algorithms by their orders of magnitude.

Comparison of different orders of magnitude

Most algorithms fall into a few common categories:

  • O(1) is said to have constant complexity (because it’s always the same)
  • O(log(n)) is said to have logarithmic complexity (because it’s the shape of a logarithmic curve)
  • O(n) is said to have linear complexity (because it’s a line)
  • O(n²) is said to have quadratic complexity (because it’s the shape of a quadratic equation)

You can think of O(1) as the holy grail for algorithms because it always takes the same amount of time no matter how big your input is.

On the other end of the spectrum, it doesn’t take long for an O(n²) algorithm to run out of space (memory) or time. Remember from earlier that if it takes 1 millisecond to process a single item in a list, it will take an O(n²) algorithm over 277 hours just to process 1 million items!

Summary

Big-O notation gives programmers a concise and common way to express the performance characteristics of an algorithm.

Big-O notation:

  • describes time complexity or space complexity
  • is shorthand for a math function
  • is concerned with upper bounds (the worst-case scenario)
  • doesn’t care about constants
  • describes complexity in orders of magnitude

Hopefully, this article clears up any confusion that you might have. If you still have questions, please let me know!


This is part of a series that I’m creating to teach the fundamentals of computer science and algorithms in a way that’s approachable to people without a computer science background.