Orders of growth in algorithms

A brief introduction to complexity, in the context of sorting

Harrison Hughes
Dec 29, 2019 · 7 min read

Using algorithms to process data is key to many aspects of programming. If you want answers, and you want them fast, you may well turn to a computer. Working at rates of thousands, if not tens or hundreds of thousands, of operations every second, computers put our human minds to shame at repetitive tasks, plowing through data like the Terminator after 7 espressos and a Red Bull.

However. Things can get messy pretty fast. Or, more to the point, things can get pretty slow, pretty fast… Say we’ve put together a little sorting algorithm of our own, one that accepts an array of unordered integers and gives us back a new one with the numbers in ascending order. We’ve checked it on a couple of small arrays already and we’re happy it works perfectly well.

You show it to your boss. She loves it. So much in fact, that you’re instantly promoted. Fast forward 30 minutes and you’re sipping champagne with the CEO.

“So”, they begin, “what do you say we feed this masterpiece of yours some reaaaally big arrays and see what happens, eh?”

Your blood runs cold.

We feed our algorithm 10 numbers, and it replies before we even know we’ve hit enter.

Feed it 100, the same again.

But, we feed it 1,000… hmmm, your boss’s brow furrows; there was definitely some hesitation there. Nothing too bad, I’m sure. But what if…

10,000? All of a sudden, we’re sat in front of a blank screen. An eternity passes, as the oceans rise and fall, and mountains turn to dust — it looks like someone forgot to check their algorithm’s complexity.

The growth of a function

Let’s get technical, just for a moment. The order of a function (or an algorithm) can be defined as follows:

Let f, g : N → R be real-valued functions on N. We say that f is of order g, written O(g), if there exist a constant c > 0 and a threshold n₀ ∈ N such that f(n) ≤ c g(n) for every n ≥ n₀.

In simpler terms? The order of an algorithm is essentially a measure of its efficiency, specifically in the worst-case scenario. An algorithm may take many inputs (in the case of a sorting algorithm, we only need one — an array), and its order will depend on both the nature and the size of these inputs.

The order of the algorithm is calculated, in most basic circumstances, by taking the number of steps needed to compute the solution as a polynomial function, f(n), of the input size n, and then reducing it to its highest polynomial degree:

(1) f(n) = 2n³ + 10n
This is of order n³, i.e. O(n³)
(2) f(n) = 17n⁴ + 200n³ + 10n + log(n) + 30
This is of order n⁴, i.e. O(n⁴)

The idea is that, as n gets very large (or, in our sorting example, our array of numbers gets very long), the highest-degree term will be significant enough to render the effects of the other components essentially negligible.
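To see this concretely, here’s a quick numerical sketch of example (1) above: as n grows, the n³ term accounts for essentially all of f(n).

```python
# Example (1) from above: f(n) = 2n^3 + 10n.
def f(n):
    return 2 * n**3 + 10 * n

for n in (10, 100, 1000):
    dominant_term = 2 * n**3
    # The share tends towards 1 as n grows: the n^3 term does almost all the work.
    print(f"n = {n:>4}: f(n) = {f(n):>13,}, share from the n^3 term = {dominant_term / f(n):.6f}")
```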

So what?

Well, let’s assume that your computer can perform 10,000 operations (e.g., data structure manipulations, database inserts, etc.) per second. Not bad, right?

Well, given algorithms of various orders (that is, requiring roughly that many operations to perform a given task on n items), here’s how long it would take to process 10, 50, 100 and 1,000 items (from here):

The units are a little funky, but you get the gist of it: n! = very bad
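The original table was an image, but the gist of the numbers is easy to sketch. Assuming the 10,000 operations-per-second figure from above, and treating each order as a literal operation count for an input of size n (a simplification), something like this recovers the general picture:

```python
import math

OPS_PER_SECOND = 10_000  # the assumption from the text above

# Treat each order as a literal count of operations for an input of size n.
orders = {
    "n": lambda n: n,
    "n log n": lambda n: n * math.log2(n),
    "n^2": lambda n: n**2,
    "2^n": lambda n: 2**n,
    "n!": lambda n: math.factorial(n),
}

for n in (10, 50, 100, 1000):
    for name, operation_count in orders.items():
        try:
            seconds = operation_count(n) / OPS_PER_SECOND
            print(f"n = {n:>4}, O({name:>7}): ~{seconds:.3g} seconds")
        except OverflowError:
            # n! for n = 1000 is too large even to express as a float.
            print(f"n = {n:>4}, O({name:>7}): longer than the universe will last")
```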

Yep, that’s right. I wasn’t completely exaggerating earlier with the whole ‘mountains turn to dust’ thing — if your sorting algorithm had been of order O(n⁴) or above, you’d be waiting literally years for your computer to put your list of numbers into ascending order.

The growth of some algorithms, those with orders O(2^n) and O(n!) in particular, is known to be explosive, and it’s not hard to see why. Furthermore, algorithms with orders of the form O(x^n) can be particularly prickly, due to the delay in their explosion: looking at the O(2^n) row for instance, if you had only tested such an algorithm with an array of length 10 or less, you’d likely not notice anything alarming at all.

Sorting algorithms

For some context on this whole subject, we can look at two different, yet widely used, sorting algorithms and see how they compare.

The bubble sort

The bubble sort is conceptually one of the simplest sorting algorithms that exist (in my opinion at least). In the ascending case, it works by moving through the array from left to right, comparing adjacent numbers and swapping them if they are not ordered correctly: i.e. if the number on the right is smaller. The sort will terminate when the algorithm can sweep through the entire array without making any more changes.
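A minimal Python sketch of that description (purely illustrative) might look something like this:

```python
def bubble_sort(numbers):
    """Return a new list with the numbers in ascending order, using the
    bubble sort described above: repeated left-to-right sweeps, swapping
    adjacent out-of-order pairs, stopping once a sweep makes no swaps."""
    result = list(numbers)  # work on a copy so the original is untouched
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(result) - 1):
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
    return result

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
```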

Here’s a visualisation, provided by Timo Bingmann:

There are, I think, two things that this video illustrates particularly well:

  1. How the name of the algorithm is derived — from the manner in which the largest values rise, or ‘bubble’, to the top of the array
  2. This method is TEDIOUS

Yep, it’s a pretty long video all things considered, and no wonder — the bubble sort is of order O(n²)! That’s because in the worst-case scenario, where the numbers are already in descending order, it will have to compare every single element with every single other element!

Let’s look at another, more streamlined algorithm for comparison, shall we?

The merge sort

The merge sort is known as a divide and conquer algorithm, because it works by repeatedly splitting our ‘problem’ (or in this case, our array) into smaller and smaller sub-problems, recursively solving these, and then combining (or merging) the answers.

Specifically, the merge sort splits the array recursively, then rebuilds it in the correct order: (image from here)

p, q, and r represent the start, acting midpoint, and end of each array respectively

Here is a visualisation of this process in real time, again provided by Timo Bingmann:
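In code, a sketch of the same split-and-merge idea (again, purely illustrative) might look something like this:

```python
def merge_sort(numbers):
    """Return a new, ascending-order list via the divide-and-conquer
    approach described above."""
    if len(numbers) <= 1:              # a list of one element is already sorted
        return list(numbers)

    mid = len(numbers) // 2            # roughly the 'q' midpoint from the diagram
    left = merge_sort(numbers[:mid])   # recursively sort each half...
    right = merge_sort(numbers[mid:])

    # ...then merge the two sorted halves back into one sorted list.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```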

Comparing the algorithms

Now, the merge sort didn’t necessarily seem more efficient than the bubble sort at first glance, but let’s have a look at the algorithm’s order. As briefly explained in The Algorithm Design Manual, by Steven S. Skiena:

“Mergesort works by dividing nodes in half at each level until the number of nodes becomes 1 hence [the] total number of times we would need to do this would be ⌈log₂ n⌉. [Then] for each level we perform O(n) work hence [the] total time taken will be O(n log n)”.

N.B. in this context, the expression log₂ n refers to the logarithm of n in base 2, not the natural logarithm of 2n. A slightly more expressive derivation of the order can be found here; however, for the purposes of this comparison, the main takeaway should be that the merge sort is of order O(n log n).
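A quick back-of-the-envelope check of that reasoning, counting ⌈log₂ n⌉ levels of splitting with roughly n units of merging work per level:

```python
import math

for n in (10, 100, 1000, 10_000):
    levels = math.ceil(math.log2(n))  # how many times we can halve n before reaching 1
    approx_operations = n * levels    # ~n merging work at each level
    print(f"n = {n:>6}: ~{levels} levels, ~{approx_operations:,} operations")
```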

Interestingly, the idea of the order representing the worst-case scenario doesn’t really apply to the merge sort. Why? Because regardless of what order the original array is in, it will always take essentially the same amount of work to reach the solution — a characteristic of its divide and conquer nature. With this information we can quickly introduce a new concept — the omega of an algorithm, written Ω(g), which essentially represents the best-case scenario equivalent of its order.

For the bubble sort, if the array is already ordered correctly, the algorithm simply checks through it once to confirm this, and then terminates. Therefore, if the array is of length n, the bubble sort algorithm can be easily seen to have an omega of Ω(n). Meanwhile, as we’ve just seen, the merge sort will have an omega of Ω(n log n) — exactly equal to its order!
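If you’d like to see those best and worst cases for yourself, an instrumented version of the earlier bubble sort sketch makes the gap obvious:

```python
def bubble_sort_comparisons(numbers):
    """Count the comparisons made by the earlier bubble sort sketch."""
    result = list(numbers)
    comparisons = 0
    swapped = True
    while swapped:
        swapped = False
        for i in range(len(result) - 1):
            comparisons += 1
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
                swapped = True
    return comparisons

print(bubble_sort_comparisons(list(range(100))))         # 99: one confirming pass, the best case
print(bubble_sort_comparisons(list(range(100, 0, -1))))  # 9900: the O(n^2) worst case
```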

So, which one’s better?

Well, in the best-case scenario, i.e. when the array is already ordered correctly, the bubble sort is always going to be faster than the merge sort while the length of the array is greater than 1. In fact, depending on the size of the array, there may be quite a few scenarios where the bubble sort will return a solution in fewer steps than the merge sort — if one had the time, energy, and inclination, I’m sure some function could be created that would return some critical value, at which the bubble sort ceases to be the more efficient option.

Nevertheless. There is a reason that software with built-in sorting functions will often implement a merge sort, JavaScript being a prime example. When you release your coded algorithm upon the world, it is almost impossible to predict every instance in which it will be used. So, perhaps, planning for the worst-case scenario isn’t such a bad idea after all… considering that whole ‘mountains turn to dust’ thing.

What next?

Well, perhaps you’re wondering what the quickest sort is on average? Look no further than the aptly named quick sort, with some info here. Or maybe you’re interested in something that might have an even smaller order than our merge sort? Well, under certain conditions, the radix sort might just be your cup of tea, with some info here. Hybrid sorting algorithms? Check out Python’s very own Timsort, designed by, you guessed it, a guy called Tim!

Show me something truly awful

If you insist. Now, to top off our foray into efficiency, let’s take a peek behind the curtain at the algorithmic equivalent of trying to order a stack of cards by throwing them against a wall repeatedly!

So, to finish, here we have everyone’s favourite sorting method, the Bogo sort, as presented one last time by Timo Bingmann:

Please, do yourself a favour, and listen to this one with the volume on.
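For the curious, bogo sort really is as simple (and as hopeless) as it sounds: shuffle at random, check whether the result happens to be sorted, and repeat. A sketch, strictly for entertainment purposes:

```python
import random

def bogo_sort(numbers):
    """Shuffle at random until the list happens to come out sorted."""
    result = list(numbers)
    while any(result[i] > result[i + 1] for i in range(len(result) - 1)):
        random.shuffle(result)
    return result

print(bogo_sort([3, 1, 2]))  # fine for three numbers; expect on the order of n! shuffles in general
```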
