Sorting algorithms in Ruby and default #sort behavior

Sorting in Ruby can seem magical. Just throw in a series of unsorted integers in an array and in tenths of a second a newly sorted array will be returned.

arr = [178544, 3, 10074, 212, 5]
arr.sort
# => [3, 5, 212, 10074, 178544]

So, how how does Ruby’s native #sort work?

When you use #sort without passing it a block, the default behavior will just use a combined comparison operator to sort in ascending order.

arr.sort 
arr.sort { |a, b| a <=> b }

What happens here is Ruby compares one element with another and will

return -1 if a < b,

return 0 if a = b

return 1 if a > b

It turns out however that those elements in the array are not being compared in adjacent pairs until the entire array is sorted. Although that behavior might seem intuitive — perhaps since it is based on the simplest of sorting algorithms — it’s not actually what’s happening under the hood.


Bubble sort vs. Quick sort

Bubble sort is an algorithm that says:

  1. Consider adjacent pairs of elements, A and B, in an array
  2. If the element that is lower in value between A and B is not already on the left, swap the two elements.
  3. If the the element that is lower in value between A and B is already on the left, leave the two elements in place and move on to the next pair.
  4. If, in one run through all of the pairs in the array, each pair has had a comparison, and there have been no swaps at all, the array is sorted. Otherwise repeat steps 1–3 until it is.

Bubble sort works well for smaller arrays and especially well for almost fully sorted small arrays i.e.

arr = [1,2,3,5,4]
Bubble sort visualization (toptal.com)

Worst case run-time? O(N^2)

In Big O notation, O(N^2) represents an algorithm whose performance is directly proportional to the square of the size of the input data set.

Quick sort is an algorithm that says:

  1. Choose a random element in an array as a “pivot”
  2. Partition the element into three parts. First part includes elements less than the pivot. Second is the pivot itself. Third part includes elements greater than or equal to the pivot.
  3. Then apply steps 1–2 to the first and third parts (recursively) until all “parts” only contain a single element.
Quick sort visualization (toptal.com)

Worst case run-time? O(N^2)

But wait? They are the same?!

Well not quite.

This is because in real world performance, it is more common that there is a random set of values in a large list than it is that there is an almost fully sorted large list. In other words, the worst case scenario isn’t all that likely.

In a set of average cases Quick sort will perform closer to O(N log(N)).

In a case where there are 10,000 elements in an array there would be 10,000 times fewer operations with Quick sort if it performed at O(N log(N)) levels.

Example Bubble sort vs Quick sort with array of length 10,000

So as you might predict, the #sort behavior in Ruby takes these performance differences into account.

For the purposes of Ruby — and since in the creation of the language, use cases have to be considered heavily — Quick sort will almost always be used for lengthy arrays.

And with that I leave you with a video to help sort things out for you even further.