Beginner Rubyist’s Guide to Quicksort
Why do we care?
Ruby is a special programming language. It is elegant, intuitive, forgiving, and comes pre-packaged with tons of useful methods built in. When I started learning ruby, I was obsessed with understanding why everything works. It felt like cheating to tell the computer to do something when I didn’t know how it was doing it. I wanted to really get under the hood and see what’s happening in there. So if you are like me and want to have some real fun learning about algorithms, then get ready for a wild ride.
We’re going to look at the array method #sort
and its underlying algorithm, which is called “quicksort”. I’ll show you how to implement quicksort in ruby (in case you want to give it a go yourself) and talk about how it works and why it’s a good choice for a built-in sorting method. I’ll also diss another sorting algorithm a little along the way. But don’t worry this post is all about fun and I will only be mean to things that are not people :)
A quick note: My implementation will be using some other built-in methods like #delete_at
and #partition
. I think it is better to learn by taking away the abstraction in layers rather than all-at-once. Okay that is all. On with the fun.
How quicksort works
Let’s say we want to sort an array of integers (our method will work to alphabetize arrays of strings as well, but we’ll stick to integers here for clarity’s sake). We have this array object containing the numbers 1–9, but all jumbled up:
messy_arr = [5, 3, 7, 9, 2, 4, 1, 6, 8]
What a mess.
Now to sort this, we could go through from left to right, comparing each number to the next one, switching them when necessary, and repeating this process ad nauseam until it is all sorted (this is called Bubble Sort and it is not a good algorithm… sorry Bubble Sort).
Quicksort’s genius is that it grabs one array element (which we will call pivot
), and compares every other element in the array to it, creating one array of every element that is less than pivot
, and another array with every element that is greater than or equal topivot
. Quicksort then uses recursion to do the same process on the two new arrays until everything is sorted.
Time to see this in action. I’ll be demonstrating quicksort as an instance method inside the Array class to make it a little easier to read.
First, we need to tell our method to grab a random element to use as our pivot
variable:
class Array
def quicksort
pivot = self.delete_at(rand(self.size))
end
end
Note: the #delete_at
method removes an item from an array and RETURNS that item. So we can use #delete_at
to store an element in our pivot
variable after removing it from our array.
Let’s say that our #delete_at
method chose the number 7 from our array, so now pivot = 7
. Now we need to partition our array into two new arrays, one containing all the numbers smaller than pivot
and one with all the numbers larger than or equal to pivot
… wonder what array method we can use…
class Array
def quicksort
pivot = self.delete_at(rand(self.size))
left, right = self.partition { |num| num < pivot }
end
end
Note: quicksort is a “comparison sorting algorithm” because it is comparing every other value in the array to our pivot
value. This becomes important when you talk about time complexity.
The #partition
method is going to return the following variable values:
left = [5, 3, 2, 4, 1, 6]
pivot = 7
right = [9, 8]
“But wait,” you say. “This is still a hot mess. Now we just have more arrays to sort through! This sucks. I’m going back to bubble sort.”
NO! BUBBLE SORT IS BAD NEWS. Don’t worry. We’re getting there. See those new arrays? If only we had some method to sort them… oh wait that is what we are doing…
RECURSION TIME!
class Array
def quicksort
pivot = self.delete_at(rand(self.size))
left, right = self.partition { |num| num < pivot } return [*left.quicksort, pivot, *right.quicksort]
end
end
With our example numbers, that return
statement would look like this:
return [[5, 3, 2, 4, 1, 6].quicksort, 7, [9, 8].quicksort]
So now our method is going to recursively call #quicksort
on each of our new arrays, splitting them into even more arrays and calling #quicksort
on those to split them into even more arrays and… you get it. Recursion!
Of course, I know you are thinking, “Wait. What happens when it can’t partition the arrays into smaller arrays anymore? What if an array only has one element?”
You’re right, I left out our base case. My bad. If you are unfamiliar with recursion, a base case is a line of code that tells our method when to stop calling itself and start returning all those values back to itself. In this case, it will be when our array is empty. So let’s add that to the top of our method definition:
class Array
def quicksort
return [] if self.empty?
pivot = self.delete_at(rand(self.size))
left, right = self.partition { |num| num < pivot } return [*left.quicksort, pivot, *right.quicksort]
end
end
A note on those splat (*) operators:
#Quicksort
returns an array. So if we put the return value of #quicksort
into another array, we end up with nested arrays, which is a mess. The splat operator will flatten all of those messy nested arrays into one nice, neat, one-dimensional array. For more info on the splat operator in ruby, check out this blog post and look under the section called “Constructing Arrays.” If you don’t like it, you could also write your return statement like this:
return [left.quicksort, pivot, right.quicksort].flatten
So now let’s plug our mess of numbers in and see what happens:
class Array
def quicksort
return [] if self.empty?
pivot = self.delete_at(rand(self.size))
left, right = self.partition { |num| num < pivot }
return [*left.quicksort, pivot, *right.quicksort]
end
endmessy_arr = [5, 3, 7, 9, 2, 4, 1, 6, 8]messy_arr.quicksort
#=> [1, 2, 3, 4, 5, 6, 7, 8, 9]
And there you have it! Try loading the code up in irb and sort to your heart’s content. If you want to take a deeper dive on quicksort, I recommend coding along with this video from Edutechional showing this same implementation with some slightly more advanced syntax, but great audio explanations of everything happening.
Why was it chosen for ruby?
Quicksort has an average time complexity of T(n) = O(n log n), also known as quasilinear time. For some real rad computer science-y reasons, this is the fastest time possible for comparison sorting algorithms. It also has a space complexity of O(log n), known as logarithmic space, which means it is very memory efficient as well. I would like to write a post about Big O notation at some point in the future, but in the meantime, this post on Geeks for Geeks is a good place to start understanding. Big O Cheat Sheet is a great reference page if you want to see how quicksort compares to other sorting algorithms (like bubblesort… boo).
Basically, it’s a solid general-use algorithm. There are other sorting algorithms that outperform it in certain cases, but quicksort is a great choice for the built-in #sort
method in ruby.
Now that the math stuff is over…
Thanks for reading and I hope you enjoyed this wild ride with me. Now here is a video of a bunch of sorting algorithms in action. Quicksort (yay) is zooming along in the middle with old bubblesort (boooooooo) bringing up the rear in the bottom left. Turn the sound up and sort to your heart’s content :)