Bubbling Up With Bubble Sorts
There seems to be an ongoing joke in the programming community that transcends language, library, or framework — everyone seems to know that bubble sort is a Bad Idea™. I remember hearing someone joking about this for the first time years ago; they were ragging on bubble sort, laughing about how it was the worst implementation of a sort algorithm, and how they couldn’t understand why anyone would ever use it.
I’ve heard this joke made again and again in the years since, and for awhile, I just accepted it at face value. Sometimes, I’d even laugh right along with everyone else when they made a bubble sort joke, not knowing why people thought it was so terrible. I usually think that it’s better to make up your own mind about something, rather than just listen to someone else’s opinions on it and and accept them as gospel. I did this for a long time with bubble sort. But I don’t actually think that this was a good practice.
It was only when I started this series that I decided I would put all of that aside. Maybe bubble sort really is a terrible algorithm. Or perhaps it’s just misunderstood, or poorly used. And maybe it can even be made better, and optimized. How would I ever know these things unless I learned about them myself?
So, today we’re going to do exactly that: we’re going to think for ourselves. It’s time to put an end to all the rumors floating around about bubble sort.
Before we can really make any fair judgements on the bubble sort algorithm, we need to understand what exactly it does, and how it works. A bubble sort algorithm iterates through the list or array that it is given, and compares each pair of adjacent elements in the list by size. If they elements are in the incorrect order, it swaps them, and then moves on to the next pair of elements.
Definitions are a great starting point, but for me, things only really get cemented when I see them in practice. So let’s take a look at what this definition actually means from a pragmatic standpoint. In the example here, we have a collection of unordered numbers that need to be sorted:
9, 7, 4, 1, 2. How would bubble sort handle this?
Well, we know that bubble sort will compare two pairs at a time. Naturally, it will start off comparing the first two elements in our list — the first pair. The algorithms looks at the first pair (in this case,
7), and determines whether the first element is in the correct place. Effectively, it’s just using a
< operator to do this, depending on how the sort is implemented.
9 is greater than
7, the algorithm knows that it should come after
7. Since these two numbers are in the incorrect order relative to one another, it will swap them, which will change the order of just those two elements in the list. Keep in mind that it has no idea if the
9 is the largest number in the list — it only knows about two numbers at any given point, since an algorithm can’t scan a list quickly with its eyes like we can.
Okay, so that’s how the bubble sort algorithm functions when comparing two elements at a time. But how does it actually sort through the entire list? Let’s look at what the algorithm would do next, using the exact same set of numbers in our example:
We start off comparing the first two elements —
7—and, since they’re out of order, we swap them.
Next, we compare the second and third elements:
4. The number
9 is definitely bigger than
4, so it should come after. This means we have to swap these two elements as well.
The next two elements are
1. Again, the
9 should come after the
1, and not before, which means we need to swap again. Finally, we’re on the last two elements in this iteration:
2. The number
2 should definitely come before
9, so we’ll swap these two elements so that they’re in the correct order.
Phew! That was just one single iteration of bubble sort. And our list isn’t even sorted yet. We’d need to keep repeating this set of actions again and again until the entire collection of elements was sorted. If this was just a single iteration, there’s one big question on my mind now: how many times would we have to iterate in order to sort the entire collection? Imagine if we had a list of 10 or 20, or 50 unsorted elements — I really don’t want to iterate through each set in order to know how much work it’s going to be!
Instead, let’s try and see if we can find a pattern, and make some abstractions about how many iterations we’d have to make given an array with n elements.
We can start off with an easy example. With an unsorted list of just 2 numbers, we need to iterate only once, since in a single pass, we compare the one pair that makes up the list.
For an array of three numbers, we need to iterate twice in order to sort completely — the first iteration, we’d move one number to it’s correct place, and the second iteration would sort the entire list.
I haven’t drawn it here, but for an array of four numbers, we’d need to iterate thrice in order to sort it completely. Hopefully these few small examples is helping you see a pattern that’s emerging here!
In general, given a collection of n unsorted elements, it takes (n-1) iterations through the list in order to sort it using the bubble sort algorithm.
This generalization can be super helpful to us when we’re given large arrays, and we want to know how many times we’ll need to iterate through it if we plan on using bubble sort as our sorting algorithm.
Now that we’ve seen one pattern emerge in bubble sort, it should be a little easier to catch a couple of others, too. There’s one characteristic of bubble sort that’s really interesting — and it’s actually the reason why bubble sort got it’s name!
Let’s look at an example, starting off with an unsorted array:
In this example, each iteration is responsible for moving the largest unsorted element to its correct place in the array. For example, the first iteration effectively moves the largest number,
12, to the end of the list. The second iteration moves the second largest number (or, the largest unsorted number),
9, to its correct place in the list.
This is what gives bubble sort its name: the largest numbers begin to “bubble up” to the end of the list, where they belong.
Of course, depending on how bubble sort is being implemented, this could also be reversed, so that the smallest numbers are being “bubbled up” to the front of the list. Regardless, in both cases, the bubbling of numbers comes from the way that bubble sort compares and swaps each pair of elements as it iterates through the collection.
We can also see another pattern here, too! Notice how we didn’t need to compare the last two elements,
12, in the second iteration; they were effectively already sorted from our first pass through the array.
Let’s try to generalize this pattern again, and try to find a rule that we follow.
We saw that, after two iterations through our array, checking the last two elements was unnecessary, since they were already sorted.
If we wrote out a third iteration, we’d see that we’d end up with
[3, 1, 8, 9, 12] on the third pass, and the last three elements sorted. This means that we wouldn’t need to check the last three elements.
You can probably predict what would happen next: on the fourth iteration, the last four elements would be sorted on the second pass. The pattern that we’re seeing here could be summarized into the following rule:
After x iterations, checking the last x elements in a collection is redundant.
This is a good thing to know, because it’s one way that we could optimize bubble sort! If we know that the last x elements don’t need to be compared, we can break out of an iteration and save ourselves both some time and some memory!
Now that we’ve looked at bubble sort very closely, we can being making some larger generalizations about this algorithm.
A handy thing to remember about bubble sort are that a single iteration puts one element (usually the largest unsorted element) in it’s correct place in the array. It’s also good to keep in mind that it takes (n-1) passes through a collection, where n is the total number of elements, in order to sort the entire thing.
How many bubbles is too many bubbles?
Okay, it’s time for us to talk about the elephant (blowing bubbles) in the room: bubble sort’s inefficiency. I won’t lie to you — it’s definitely slow and inefficient. But, I don’t encourage you to just take my word for it. Instead, let’s figure out why it’s slow and inefficient, together!
I’ve added some
[9, 7, 4, 1, 2].
var myArray = [9, 7, 4, 1, 2];
When we call our
bubbleSort function, here’s what shows up in the console:
Wow, that was a lot. Let’s take a look at what’s going on here. We can see that the algorithm is doing exactly what we were doing when we drew out each iteration — it’s just doing it way faster than us! We can see it comparing two elements at a time. If we look for the instances of
**one full pass through array**, we can see what the array looks like at the end of a single iteration. Given the fact that this array only has five elements in it that need to be sorted, there are currently 16 comparisons being made here. That seems…not great.
This implementation also hasn’t been optimized at all: you’ll notice that, even after the first iteration, we continue to see this printed out, again and again:
comparing 9 and 7. This is a little silly, and that’s part of what makes bubble sort a slow algorithm; it makes a lot of comparisons, but it doesn’t necessarily make them in an intelligent way.
There’s another issue, too: what if our list was already sorted? A naive implementation of bubble sort would iterate through the entire list, even if it was sorted, and use a lot of time and memory to do so.
However, there is one easy thing that we can do to avoid this crazy repetition of unnecessary work. We can check and see if we’re making any swaps in our first iteration; if we’re not, we know that the list must be sorted, and we can stop iterating.
isSorted variable is acting as a flag that we’re setting when we start iterating.
var isSorted = false;
isSorted = true;
If we never end up swapping an element in our first iteration, we know that this array is already sorted. The
isSorted flag, which was set to
true initially, will never be turned off — thus, we know that the array is sorted in the very first pass, and we can break out of the loop without doing a bunch of unnecessary iterations.
But evidently, even though we’ve added this optimization in our code, it’s still pretty slow and seemingly repetitive.
If bubble sort is bad, we should probably figure out just how bad it is. We know that we must make n number of iterations through an array of n total elements in order to sort it. We also know that, within each iteration, we must check all n elements in the array.
Multiplication would tell us that if we’re iterating through all n elements, and within each iteration, checking all n elements, we’re basically multiplying n x n, which is n².
In the context of time complexity, we could say that the Big O notation of a bubble sort algorithm is O(n²).
Based on what we learned in last week’s post on selection sort, we also know if we have a loop nested within another loop in an algorithm, that’s a good indicator that the Big O notation of the algorithm will be quadratic. That is to say, as our array doubles in size, the amount of time it would take for us to sort through it would quadruple.
However, similar to selection sort, bubble sort has a quadratic time complexity, but a constant (or, O(1)) space complexity.
Let’s take a look at some of the other ways that bubble sort stacks up to the other algorithms we’ve looked at already, using the classifications that we’ve already learned about.
We know that bubble sort’s time complexity is quadratic, or O(n²), in Big O notation. Bubble sort doesn’t require that much additional memory when it runs — it only needs a few pointers at a time in order to keep reference to the pairs that it is looking at, and maybe swapping (for example, in our code, the
temporaryReference variable). Since it only requires O(1) constant space, we can say that it is an in-place algorithm, which operates directly on the inputted data.
Bubble sort is also a stable algorithm, meaning that it preserves the relative order of the elements. If we think about it, this makes sense: imagine an array with two instances of a number:
[4, 2, 3, 3]. When comparing the two instances of
3, the algorithm won’t swap them if the one on the left isn’t larger than the one on the right. Thus, their relative order would remain the same.
This algorithm is also an internal sort, which means that all the data is stored within the main memory of the computer. This is vital to how bubble sort functions because as the algorithm processes data, it needs all of it to exist in one chunk; if this algorithm were external, it would result in even worse performance than it already has, as it would have to reference chunks of memory that could be potentially stored all over the place.
Finally, we are already sure that bubble sort is both non-recursive (and instead, iterative), and a comparison sort, since by definition it iterates through an array and compares two elements at a time.
Based on all of these qualifications, it’s a little easier to see why bubble sort gets a bad rap. It’s pretty slow, makes a lot of comparisons, and takes a long time. But it is fairly easy algorithm to understand, and it might be useful if you don’t care about how much time an algorithm will take, or if you have a very small set of data to sort. However, most of the time, that isn’t the case, which means that most of the time, you’ll want to avoid bubble sort if you’re thinking about using it.
Everyone seems to know that bubble sort is generally bad news — even Barack Obama knew that back when he was a senator in 2008:
But guess what? Now you know why it’s a bad idea, how to optimize it, and how to talk someone else out of using it. Hopefully, you won’t ever have to do that, though!
Because bubble sort is such an infamous algorithm, there’s a lot of reading that you can do on it. However, I’ve found videos to be particularly helpful for this algorithm, since they really help illustrate the “bubbling” that happens. I have included a few good ones in the links below. Happy bubbling!