A Post Of Sorts

Christopher Pitt
Apr 23, 2015


Recently I’ve been watching an excellent video series on Algorithms and Data Structures. I tried implementing a few of the algorithms in JavaScript, and discovered something interesting…

Bubble Sort

First, let’s look at a simple (and terrible) algorithm. It’s called Bubble Sort. Bubble Sort is often taught as a “first algorithm”; it’s not very efficient, but it’s easy to implement.

Bubble Sort performs multiple passes over a list of items, and compares each item with the next. If the item on the left is bigger, the two items are swapped. This can be seen in a number of steps:

var items = [1, 3, 2, 5, 4];

// Pass 1
//
// Step 1: compare 1 and 3, no swap
// Step 2: compare 3 and 2, swap
// Step 3: compare 3 and 5, no swap
// Step 4: compare 5 and 4, swap
//
// Pass 2
//
// Step 1: compare 1 and 2, no swap
// Step 2: compare 2 and 3, no swap
// Step 3: compare 3 and 4, no swap
//
// Pass 3
//
// Step 1: compare 1 and 2, no swap
// Step 2: compare 2 and 3, no swap
//
// Pass 4
//
// Step 1: compare 1 and 2, no swap

To work out the worst case (the most jumbled list), we can count how many comparisons Bubble Sort has to make. For a list of n items, the number of comparisons required is:

(n - 1) + (n - 2) + (n - 3) + ... + 1 + 0

This equates [roughly] to O(n²) complexity.
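For the five-item list above, for example, the worst case is 4 + 3 + 2 + 1 + 0 = 10 comparisons. In general the sum works out to n(n − 1) / 2, and as n grows it’s the n² term that dominates.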

If you’re unfamiliar with this notation (as I was) then watch a few of the videos in that playlist.

This means we can expect the following growth of work against input:

1 unit of input → about 1 unit of work + set-up
2 units of input → about 4 units of work + set-up
3 units of input → about 9 units of work + set-up
4 units of input → about 16 units of work + set-up

It’s important to notice that we’re estimating units of work. O(n²) doesn’t take set-up work into account. We’re ignoring constants because, at high volumes of input (high n), the constants don’t contribute meaningfully to the amount of work that needs to be done.
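If you want to see that growth concretely, here is a small sketch (the countComparisons helper is just for illustration, and isn’t part of the implementations below) that counts the comparisons Bubble Sort makes on a reverse-sorted list, the worst case:

function countComparisons(n) {
  // Build a reverse-sorted list: the worst case for Bubble Sort.
  var data = [];

  for (var i = n; i > 0; i--) {
    data.push(i);
  }

  var comparisons = 0;
  var length = data.length;

  while (length) {
    for (var j = 1; j < length; j++) {
      comparisons++;

      if (data[j - 1] > data[j]) {
        var swap = data[j - 1];
        data[j - 1] = data[j];
        data[j] = swap;
      }
    }

    length--;
  }

  return comparisons;
}

console.log(countComparisons(10)); // 45, roughly 10² / 2
console.log(countComparisons(100)); // 4950, roughly 100² / 2
console.log(countComparisons(1000)); // 499500, roughly 1000² / 2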

Merge Sort

Then, let’s look at a complex (and beautiful) algorithm. It’s called Merge Sort. Merge Sort recursively splits a list of items until the sub-lists it has to sort are 1 or 2 items long. We end up with a sub-list of 1 item when we’re recursively halving a list with an odd length.

After recursively splitting the input list, we merge the sorted halves back together until we’ve combined them all into one sorted list. We can see this in a number of steps:

var items = [1, 3, 2, 5, 4];

// Step 1: split [1, 3, 2, 5, 4] → [1, 3, 2] and [5, 4]
// Step 2: split [1, 3, 2] → [1, 3] and [2]
// Step 3: sort [1, 3] and [2] → [1, 3] and [2]
// Step 4: sort [5, 4] → [4, 5]
// Step 5: join [1, 3] and [2] → [1, 2, 3]
// Step 6: join [1, 2, 3] and [4, 5] → [1, 2, 3, 4, 5]

Merge Sort avoids sorting lists longer than 2 items directly: it reduces every list to a length of 1 or 2, sorts those, and combines the results. It’s more efficient than Bubble Sort, reducing the theoretical complexity from O(n²) to O(n log n). The growth of input against work looks like this:

1 unit of input → set-up
2 units of input → about 0.6 units of work + set-up
3 units of input → about 1.4 units of work + set-up
4 units of input → about 2.4 units of work + set-up

Again, just estimates. We’re not fussed about exact measurement because what matters is growth of work vs. growth of input.

I’m only interested in the worst case of these algorithms. For Bubble Sort that means O(n²) and for Merge Sort that means O(n log n).
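The estimates in the table above look like n × log(n) with a base-10 logarithm (that base is my assumption; O() ignores the base anyway, since changing it only multiplies the result by a constant). You can reproduce them with:

function workEstimate(n) {
  // n × log10(n); the log base only changes the constant factor.
  return n * (Math.log(n) / Math.LN10);
}

console.log(workEstimate(1)); // 0
console.log(workEstimate(2)); // about 0.6
console.log(workEstimate(3)); // about 1.4
console.log(workEstimate(4)); // about 2.4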

Implementations

So I implemented Bubble Sort and Merge Sort in JavaScript:

function bubble(data) {
  var length = data.length;

  while (length) {
    // Each pass bubbles the largest remaining item to the end,
    // so the next pass can stop one position earlier.
    for (var i = 1; i < length; i++) {
      var prev = data[i - 1];
      var next = data[i];

      // Swap adjacent items that are out of order.
      if (prev > next) {
        data[i] = prev;
        data[i - 1] = next;
      }
    }

    length--;
  }

  return data;
}

function merge(data) {
  var length = data.length;

  // Lists of 1 or 2 items can be sorted directly.
  if (length === 1) {
    return [data[0]];
  }

  if (length === 2) {
    if (data[0] > data[1]) {
      return [data[1], data[0]];
    }

    return [data[0], data[1]];
  }

  // Split the list in two and sort each half recursively.
  var first = merge(data.slice(0, Math.ceil(length / 2)));
  var second = merge(data.slice(first.length));

  // Join the two sorted halves by repeatedly taking the
  // smaller of the two front items.
  var result = [];

  while (first.length && second.length) {
    if (first[0] > second[0]) {
      result.push(second.shift());
    } else {
      result.push(first.shift());
    }
  }

  // One of the halves may still have items left over.
  return result.concat(first, second);
}
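As a quick sanity check, both implementations should agree on the example list from earlier:

var items = [1, 3, 2, 5, 4];

// slice() copies the list first, because bubble() sorts in place.
console.log(bubble(items.slice())); // [1, 2, 3, 4, 5]
console.log(merge(items.slice())); // [1, 2, 3, 4, 5]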

Having described both algorithms, you should have a pretty good understanding of how these implementations work. Sure, there are some interesting array-based mechanics in merge(), but nothing fundamentally different from how Merge Sort works.

We can run these through benchmarks to see how they perform against each other. In fact, that’s what I did, and I discovered something strange:

[Benchmark charts: 10 items, 100 items, 1000 items, and 10000 items]
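I won’t reproduce the exact benchmark setup here, but a rough sketch of the kind of harness you could use to compare them looks like this (the randomList helper, list sizes, and run counts are illustrative assumptions, not the original setup):

function randomList(n) {
  var data = [];

  for (var i = 0; i < n; i++) {
    data.push(Math.floor(Math.random() * n));
  }

  return data;
}

function time(label, sort, size, runs) {
  console.time(label);

  for (var i = 0; i < runs; i++) {
    sort(randomList(size));
  }

  console.timeEnd(label);
}

time('bubble: 100 items', bubble, 100, 1000);
time('merge: 100 items', merge, 100, 1000);
time('native: 100 items', function (data) {
  return data.sort();
}, 100, 1000);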

My implementations are definitely not as efficient as they could be. What’s interesting, though, is that native sorting is much faster on Firefox 37.* than on Chrome 42.*. I had a feeling that this could be the result of function call overhead in recursive implementations of the sorting algorithms these browsers use, so I ran another benchmark:

[Function call overhead benchmarks: 1000 calls and 10000 calls]

It seems that the cost of calling functions recursively, inside my Merge Sort implementation (and Chrome’s native sorting algorithm), is significant enough that Bubble Sort is the most efficient for smaller sets. The same could be said for Safari 8.*, if the overhead graph is anything to go by.
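To get a feel for how much a bare function call costs, here is a small sketch (again, not the original overhead benchmark) that does the same work inline and through a function call:

function add(a, b) {
  return a + b;
}

console.time('inline');

var inlineTotal = 0;

for (var i = 0; i < 10000; i++) {
  inlineTotal = inlineTotal + i;
}

console.timeEnd('inline');

console.time('function call');

var callTotal = 0;

for (var j = 0; j < 10000; j++) {
  callTotal = add(callTotal, j);
}

console.timeEnd('function call');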

Conclusion

A common theme of algorithm design is to ignore constants which don’t meaningfully contribute to the amount of work as the size of input grows. This is an example of constants (function call overhead) impacting the work so significantly that reasonable amounts of work (sorting fewer than 100 items) are handled more efficiently by the O(n²) algorithm, as long as the implementation avoids recursive function calls.

And perhaps…

// (Pretend `chrome` is some kind of browser detection.)
if (chrome) {
  var native = Array.prototype.sort;

  Array.prototype.sort = function () {
    // sort() is called on the array itself, so use `this`.
    if (this.length > 100) {
      return native.call(this);
    }

    return bubble(this);
  };
}

I’m by no means an algorist. Please feel free to point out my errors (in a constructive manner) and I’ll make the necessary corrections. Leave a comment here, or find me on Twitter.

Edit

Mbrubeck points out that calling Array.prototype.sort without a custom comparison function will cause the elements to be converted to strings. I went back to the previous benchmarks and added in a comparison function.
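You can see the string conversion with a tiny example, and the fix is a one-line comparison function:

// The default sort compares elements as strings...
console.log([1, 10, 9].sort()); // [1, 10, 9], because "10" < "9"

// ...so a numeric comparison function restores the expected order.
console.log([1, 10, 9].sort(function (a, b) {
  return a - b;
})); // [1, 9, 10]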

[Updated benchmark charts: 10 items and 100 items]

This greatly improves the speed at which Chrome sorts these lists. Still nowhere near the level of Firefox, but enough to beat Bubble Sort. I guess this means I should be adding custom comparison functions all the time.

Should Chrome be converting elements to strings, as the default behaviour?
