Slow Sorting and Big-O Notation

What’s Big-O Notation?

Ross Mawdsley
The Startup
9 min read · Jan 15, 2020


‘Big-O’ (like ‘oh’, not ‘zero’) notation, aka Big-O analysis, is a technique that computer scientists use to approximate how an algorithm’s performance changes as the size of its input increases. Or, to put it another way, how long your code takes to run (its ‘runtime’) as you give it larger and larger amounts of data to process. Ultimately, Big-O is used as a tool to help programmers decide which solutions are likely to take the least time and/or memory to process, without having to actually code, execute and compare each solution!

So, this approximation is represented by a ‘notation’, i.e. a single mathematical expression. There are more complex examples, but the simplest Big-O expressions below are a good start:

‘Constant Time’: O(1)

This is the expression representing the behaviour of an algorithm that will always have the same runtime regardless of the input you give it (e.g. whether it’s an array of 10 elements, or a million). This would have to be something really simple, like returning the first item in a list: array[0]. {N.B: The O just tells us we’re looking at Big-O notation, and the ‘1’ is simply a conventional way of representing a constant runtime}.
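In JS, that could be as trivial as this sketch:

    // O(1): the runtime doesn't depend on the array's length
    function firstItem(array) {
      return array[0]; // one operation, no matter the input size
    }

    firstItem([10, 20, 30]); // 10: same runtime for 3 elements or 3 million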

‘Linear Time’: O(n)

Here, your processing time will go up at a steady rate with the number of items in your input (that number is now represented by n). A good example of this in JavaScript (JS) would be mapping through an array and performing one action to each item (e.g. multiplying each number by two). So, a ‘linear-time’ algorithm should take twice the time to process twice the number of inputs.
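For example (a minimal sketch):

    // O(n): one multiplication per element, so runtime grows in step with length
    const numbers = [4, 8, 15, 16];
    const doubled = numbers.map(n => n * 2);
    console.log(doubled); // [8, 16, 30, 32]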

‘Quadratic Time’: O(n²)

With a ‘quadratic time’ algorithm, the processing time will go up with the square of the number of inputs: so 10 inputs would take 100 units of time to process, 100 inputs would take 10,000 time units, and 1,000 inputs would take 1 million! That means that (in theory) computing a thousand inputs would take ten thousand times longer than computing 10! This increasing pattern, as you might remember from your GCSEs, is neatly represented by a curve known as a parabola (the graph of a quadratic function).

Left: An iterator within an iterator: an example of an algorithm that Big-O analysis would say has “quadratic time”. Middle: the resulting console log.
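The code was along these lines (a sketch of the same pattern, not the exact code from that screenshot):

    // O(n²): for every element, iterate over the whole array again
    const array = [1, 2, 3, 4, 5];
    const arrayOfArrays = array.map(a =>
      array.map(b => a * b) // 5 inner operations for each of the 5 outer elements
    );
    console.log(arrayOfArrays); // 5 arrays of 5 numbers: 25 operations in total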

In the example above, an iterator is used inside an iterator to output an array of arrays (odd example, I know!). As a result, every element has as many operations performed on it as there are elements in the array. So, the first element has 5 operations performed on it, as does the second, and so on, for a total of 25. And so we have a ‘quadratic-time’, aka O(n²), algorithm.

It may not be obvious, but an O(n²) algorithm’s runtime can get wildly out of hand as its inputs increase in size! And so you should be careful when using one. To get more of a feel for this, let’s look at some sorting algorithms, which are a good way to illustrate what Big-O notation can tell us, and what it cannot.

Bubble Sort

The infamous Bubble-Sort! (Credit)

Bubble sort is the most inefficient type of sort I’ll look at here. As you can see above, the algorithm works its way along the array of values, comparing two values at a time, and if the larger value is in the wrong position (i.e. on the left), it swaps them (otherwise it just moves on). Once it has reached the end of the array, it returns to the beginning and starts a 2nd pass. The algorithm continues to move through the array like this until an entire pass is made without any swaps occurring.

Why is this inefficient? Well, every element has to be checked on every pass, so for a 100-element array we could end up with 100 operations performed 100 times.
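A typical implementation looks something like this (a sketch, not necessarily the exact function I borrowed below):

    // Bubble sort: sweep the array swapping out-of-order neighbours,
    // and keep sweeping until a full pass makes no swaps
    function bubble_Sort(array) {
      let swapped;
      do {
        swapped = false;
        for (let i = 0; i < array.length - 1; i++) {
          if (array[i] > array[i + 1]) {
            [array[i], array[i + 1]] = [array[i + 1], array[i]]; // swap the pair
            swapped = true;
          }
        }
      } while (swapped);
      return array;
    }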

To illustrate, let’s run a little experiment. I’ve borrowed a bubble sort function from here; let’s see what happens to the runtime as the size of the input array increases! To measure runtime I sandwiched my bubble_Sort function call between the console methods .time() and .timeEnd(). I’m unsure how accurate or consistent this method is, but if you know of a better way, please comment below!

How I measured the runtime of a function (here bubble_Sort)
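In sketch form, the measurement goes like this (makeRandomArray is a hypothetical helper for building the test input, not part of the original code):

    // Hypothetical helper: an array of `size` random numbers
    const makeRandomArray = size =>
      Array.from({ length: size }, () => Math.floor(Math.random() * size));

    const input = makeRandomArray(100000);

    console.time('bubble_Sort');    // start the timer
    bubble_Sort(input);
    console.timeEnd('bubble_Sort'); // logs e.g. "bubble_Sort: 16700ms"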
The runtimes for sorting random arrays of numbers. (Left to right) 100 numbers, then 10,000, then 100,000.

So, immediately above are the console logs from using this bubble sort function to sort arrays of random numbers. Sorting 100 numbers took 9.27ms, or just under a hundredth of a second. Sorting a hundred times more data than this (10,000 array elements) only took 141.97ms, or about 15 times longer. Obviously this doesn’t fit our “quadratic-time” model; even if the time went up linearly we would expect it to take 927 milliseconds! I would need to ask an expert, but I can only surmise that some minimum runtime required for any operation is masking the parabolic increase here (running the same bubble sort with smaller inputs of 10 and 5 elements took 8.647ms and 7.777ms respectively, so there is clearly some fixed processing ‘overhead’). However, things become clear if you just add one zero to the input: 100,000 inputs takes 16 seconds! That’s 10 times the number of inputs taking over 113 times as long!

Insertion Sort

A slow, but informative GIF. The element that’s currently focused on turns red. Once an element has been “sorted” (i.e. shifted down until its left neighbour’s value is smaller), it is highlighted with a black box.

In short, Insertion sort focuses on a particular element (starting with the 2nd element) and keeps on shifting it left until it reaches a neighbour with a value lower than it.

So, to go into more detail, we start by focusing on the number in the second array position; if it’s smaller than the number in the 1st position, they’re swapped. So far, so much like Bubble sort! But trust me, this won’t last long. The number in the 3rd array position is then focused on: we compare it to the number in position 2, swap if appropriate, but (here’s where we diverge from Bubble) we stick with focusing on the same number! In the GIF above, if you wait for it to cycle back to the beginning, you’ll see the number in the 3rd position (‘3’) swaps with the 2nd element (‘5’) and then swaps with the 1st (‘6’). Now that there are no lower values to the left of 3, the algorithm can move on to focus on the number currently occupying the next position in the array (the 4th). This pattern continues until every element has been focused on, and then the array will be sorted.
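In code, it looks something like this (a sketch; this version shifts elements into the gap rather than swapping pair by pair, but the effect is the same):

    // Insertion sort: focus on each element in turn and move it left
    // until its left neighbour is smaller (or it reaches the front)
    function insertionSort(array) {
      for (let i = 1; i < array.length; i++) {
        const focused = array[i];
        let j = i - 1;
        while (j >= 0 && array[j] > focused) {
          array[j + 1] = array[j]; // shift the larger neighbour right
          j--;
        }
        array[j + 1] = focused; // drop the focused element into place
      }
      return array;
    }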

Wait, this is also O(n²)?

Alas, for both Bubble Sort and Insertion Sort, if you double the number of inputs you’ll quadruple the runtime (2² = 4). But, and this is a crucial distinction, just because two algorithms both scale in ‘quadratic time’, it does not mean they will perform exactly the same! Why? Well, with Insertion sort, that runtime is usually shorter to begin with.

So, let’s try our experiment again! Giving this Insertion Sort algorithm the same input sizes as we did for Bubble (10; 100; 10,000; 100,000) we get runtimes of 7.915ms, 8.432ms, 34.91ms and 2.432 seconds respectively. The final time really shows off Insertion Sort’s advantage over Bubble Sort, taking 2.432s to Bubble sort’s 16.7s.

However, as with Bubble sort, you can still see the O(n²) nature of the runtime increase, as processing 100,000 inputs takes 70 times that of 10,000 (compared to Bubble’s factor of 113). And (may my poor computer forgive me) I dared to add another zero to the inputs, which took over 4 minutes and 11 seconds to process (that’s 103 times as long for 10 times the input):

At one million inputs, the limits of insertionSort really start to show.

Quick Sort

Finally, let’s look at what should be the fastest, and most complex, sorting algorithm I’ll explore, ‘quick sort’:

Solid Black: the current “pivot” number. Red: numbers currently being compared to the pivot. Black outline: numbers that have already been the pivot, and so are already in their final position!

So, take a deep breath, here is how the above algorithm works:

  • Pick a number at random to be a “pivot”.
  • Go through the rest of the list and put the smaller numbers to the pivot’s left and the larger to its right. Note that your pivot number is now in its final position!
  • Now recursively apply these steps to the elements in the array with smaller index values than the pivot, and then to those with larger index values.

Clear as mud? I’ll try to help by working through the numbers above myself. N.B: The * to the right of a number below indicates that it is the current pivot, and the [] indicates which set of numbers is currently being sorted; I’ll call it the ‘focus’.

1) [653*18724]: 3* is picked randomly as the ‘pivot’

2) [213*65874]: smaller numbers go anywhere to 3’s left, larger to its right

3) [21*]365874: now focus left of original pivot & select a new random pivot

4) [1*2]365874: as above, the numbers are moved relative to this new pivot

5) 1*[2]365874:

  • As there are no lower indexes remaining to the left of the pivot we switch sides of our pivot to higher indexes within the previous focus (i.e. [1*2]).
  • Now, with only a single number in focus [2], and having already checked the other side of the pivot, we know we’ve sorted the previous focus [1*2] and so can move to the other side of the pivot that defined that (now sorted) focus {i.e. 3* from step 2}. This is us moving back up one level of recursion.

6) 123[6587*4]: the right-hand side of the (original) pivot is in focus, and a new pivot is selected

7) 123[6547*8]: again, smaller numbers go anywhere to the pivot’s left, larger to its right

8) 123[654]7*8: focus just on the lower indexes within step 7)’s focus

9) 123[45*6]78: select new focus/pivot & move other numbers accordingly

10) 123[4]5*678:

  • focusing left of the pivot, there’s only one element in focus: so we move on to the higher index(es) from the previous focus [45*6]

11) 12345*[6]78: switching to the right of the pivot, again only one index

12) 1234567[8]: now looking at higher index(es) from step 7), we’re done!

Sorry if that was poorly explained; recursion can be a bit of a head-scratcher! If it helps, the recursive pattern again goes:

  • select pivot
  • move numbers around it
  • repeat these 2 steps for the indexes lower than the pivot
  • then repeat these 2 steps for the indexes higher than the pivot

And remember that if you check both sides of a pivot, and both just have single numbers, you know that you have sorted the focus where that pivot was defined, and so can move on to the indexes higher than the pivot that defined that (now sorted) focus. E.g. after steps 5) and 11) above.
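Putting that pattern into code gives something like this (a sketch using a random pivot, as described above; the implementation I link below may differ in its details):

    // Quick sort: partition around a random pivot, then recurse on each side
    function quickSort(array, low = 0, high = array.length - 1) {
      if (low >= high) return array; // 0 or 1 elements in focus: already sorted

      // pick a random pivot and park it at the end while we partition
      const pivotIndex = low + Math.floor(Math.random() * (high - low + 1));
      [array[pivotIndex], array[high]] = [array[high], array[pivotIndex]];
      const pivot = array[high];

      // move everything smaller than the pivot to its left
      let boundary = low;
      for (let i = low; i < high; i++) {
        if (array[i] < pivot) {
          [array[i], array[boundary]] = [array[boundary], array[i]];
          boundary++;
        }
      }

      // the pivot lands in its final position
      [array[boundary], array[high]] = [array[high], array[boundary]];

      quickSort(array, low, boundary - 1);  // recurse on the lower indexes...
      quickSort(array, boundary + 1, high); // ...then the higher ones
      return array;
    }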

Phew - how does it perform?

Well, using this algorithm for quicksort, I got the following results:

  • 10 inputs took 8.645ms
  • 100: 8.743ms
  • 10,000: 14.062ms
  • 100,000: 63.209ms
  • 500,000: 1.065s!

Unfortunately, node would not allow me to sort a million inputs, giving me a “RangeError: Maximum call stack size exceeded” (which is an interesting error I’ll have to explore later!).

Nonetheless, at 100,000 inputs, Quick Sort is over 38 times faster than Insertion Sort! So JavaScript’s .sort() method must use Quick Sort, right? Well, according to this thread (2009):

“In Chrome’s v8 source code, as of today, it uses QuickSort and InsertionSort, for smaller arrays”

I’m guessing this is because Insertion Sort has very little overhead per operation, so on small arrays it can actually beat Quick Sort despite its worse scaling; interesting stuff.

As for its standing with Big-O notation, Quick Sort is also O(n²), but only in the worst-case scenario. In fact, its average processing time scales with O(n log n), which, without going into the math, is much better!
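To get a feel for the gap: at n = 100,000, n² is 10,000,000,000, while n log₂ n is only about 1.7 million, nearly 6,000 times smaller.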

So in conclusion, telling someone that a sorting algorithm is O(n²) doesn’t really tell them much! The discussion would also have to include the difference between average and worst case runtimes, as well as comparisons to other sort algorithms. Anyway, I hope this was helpful!
