Sorting Algorithms: The Difference Between Bubble Sort and Merge Sort
The concept of sorting comes up a lot in server-side development and is fundamental to computer science.
In my journey to becoming a software developer, I’ve found sorting algorithms to be very fascinating and would like to help others on the same journey understand two of the more well-known algorithms: Bubble Sort and Merge Sort.
Bubble Sort: A Comparison Algorithm
Bubble Sort takes an iterative approach — looping through elements in a matrix-like fashion — to sorting, and is a great place to start with implementing your first sorting algorithm.
Here’s how it works: given an unsorted array, for the full length of that array, we pass over each element; comparing it with the element next to it. If the first element is larger than the second, we swap the two elements.
This creates a “bubbling” effect, where the smallest elements (in our case numbers) migrate their way to the front of the list with every pass.
As I mentioned earlier, using helper functions to implement bubble sort makes the code more readable, so I’ll start with implementing those:
A Pair-wise comparator function
First, we’ll define a pure helper function — a function that that takes in input and gives us output without changing anything — called inAscendingOrder. This function will take two elements in a given array, compare them, and return a boolean based on the result.
A swapping function
Next, we’ll define a function that swaps two elements in a list. We can’t call this a pure function. Why? Because it will actually have an effect on the external scope (our bubble sort implementation) when we use it later on.
The actual bubble sort implementation
And finally, we want to define the actual bubble sorting algorithm.
Before I bring in the code, I want to mention a couple of things that might be helpful to understand:
(1) I’m going to be using the concept of “state”, which basically means my function will keep metadata on itself, and let us know when it finishes sorting the input array;
(2) I’m going to pass through the array backwards. What does this mean? Well, we use nested loops to implement bubble sort. The outer loop handles the direction and length of our passes, so I’ll start my loop from the last element of the array and work my way to index 0. Why are we looping backwards? This makes it so that the inner for loop, the loop that will be handling the swapping, requires less work to do its job; avoiding the added task of passing over elements it has already sorted. Hopefully, you’ll see what I mean below:
Merge Sort: A Recursive Sorting Algorithm
Merge Sort, on the other hand, takes a divide-and-conquer approach to sorting; recursively breaking the input array down until we have sorted tuple-sized subarrays that we can then merge back together at the end.
I’ll also use helper functions in implementing merge sort (again, to keep the code declarative):
A split helper function
A merge helper function
And finally, here’s our recursive merge sort solution that utilizes both helper functions…
Comparing Bubble Sort and Merge Sort: Time-Complexity Analysis
So why choose one over the other?
Both have their pros and cons, but ultimately bubble sort quickly becomes less efficient when it comes to sorting larger data sets (or ‘big data’). Where as, Merge Sort becomes more efficient as data sets grow.
This makes more sense once you familiarize yourself with Big-O Notation and the concept of time complexity. What’s time complexity? Basically, we use time complexity to analyze the performance of an algorithm, or how long it takes to solve the problem for a given input. Here’s a cheat sheet to help you dig deeper into this.
At best, with smaller data sets, bubble sort has O(n), and worst case scenario, it has O(n²) time complexity (which is pretty bad).
On the other hand, merge sort performs pretty consistently, with a time complexity of O(n log(n)). The time complexity of our helper functions for merge sort make this possible.
There are many more sorting algorithms to explore, and I hope this helps others venturing into software engineering, machine learning, and other disciplines get a better understanding of the two most popular ones.