Merge Sort: How To Understand an Algorithm by Creating Music
How to push your learning process further by challenging your creativity
Learning an algorithm can sometimes feel like learning a magic recipe. We understand the concept behind it, but understanding exactly how each step is done can be very confusing.
Live coding allows me to dive deep into the study of an algorithm by having fun with it and challenging my understanding and creativity. It gives me a playground where I can right away put into application what I’m learning by creating something interesting.
So today, we’ll dive deep into understanding how merge sort operates by building a Ruby program that uses this algorithm to create music.
Setting Up Our Environment
For this tutorial, we’ll use a software called Sonic Pi.
Sonic Pi is open source, lightweight, and easy to install. Just download the appropriate version according to your operating system.
Sonic Pi provides us with a sound synthesis environment (it comes bundled up with its own SuperCollider instance) and a full IDE. That’s where we’ll be writing our code and generating music. It also provides us with a library of functions and methods we can use specifically to manipulate sounds.
Sonic Pi comes packaged with very clear and complete documentation, so if there are any terms you don’t understand at any point in this tutorial, I encourage you to look it up in Sonic Pi’s own documentation.
Besides that, the only thing you’ll need to follow along is a good pair of headphones.
Understanding Merge Sort
Merge sort has a running time of O(n log n). It falls into the divide-and-conquer type of algorithms, which makes it much more efficient than algorithms such as selection and insertion sort (and of course bubble sort, which is the slowest of them all).
The idea behind merge sort is this: take a list, divide that list into two halves, and keep dividing each half until we end up with subarrays of length 1 representing the left side of the list and the right side of the list.
We then send these subarrays to another function that will merge them and return the result. Since we are always merging two subarrays that are already sorted, merging these together can be done by comparing elements in each subarray at a specific index and choosing the smallest value between the two.
So to implement merge sort, we need to write two different functions. One I’ll call merge_sort, which will divide our list into a right subarray and left subarray recursively. Another I’ll call merge, which will take two sorted subarrays and merge them together.
There are multiple ways to do this, but it’s important to note that the merge function can be done in a recursive or non-recursive manner.
Here’s how this looks in Ruby:
First, let’s look at the merge_sort function:
- It takes an array as argument.
- If the length of the array is 1 or 0, return it since it’s already sorted. This is our base case scenario that will end the recursive calls to merge_sort.
- Find the middle point of the array by dividing its length by two. Since we are dividing by an integer and not a decimal number, the answer will automatically round down to the nearest value.
- Recursively call merge_sort on both the left subarray and the right subarray.
- We end by calling merge on the resulting left and right subarrays.
Now, the merge function: Both the recursive and non-recursive implementation accomplish the same thing.
Let’s take a look at the non-recursive version first:
- It takes two arguments, an array representing the left side of the list and another array representing the right side of the list.
- We define an empty array called sorted.
- While none of the subarrays are empty, compare their first elements.
- If the first element in the left array is smaller, take it out of the left array using shift and push it into the sorted array. If the opposite is true, we take out the first element of the right subarray and push it into the sorted array.
- Once we exit the loop, we take our sorted array and concatenate it to the remaining elements in either the left or right array.
To understand clearly how the concatenation works and why we need it, remember that we exited the loop because one of the arrays was empty. That means that at least one element remains in one of the subarrays.
By concatenating, we take whatever elements are left and add them to the end of our sorted array. The empty array just gets ignored in the concatenation process.
That’s why it doesn’t matter whether you choose to concatenate the right side first or the left side first. One of these sides is empty, so it will just be ignored.
The merge_recurse function is slightly different:
- It takes the same arguments as before, a left subarray and a right subarray.
- We establish our base cases: If the right subarray is empty, we return the left one. If the left one is empty, we return the right one.
- If the first element in the left array is smaller then the first element of the right array, we return the first element on the left (inside brackets so that we have an array of a single element) and we combine that with a recursive call to merge_recurse. We pass merge_recurse the left (now without its first element) and right subarrays.
- If the first element on the right side is smaller than the first element on the left, we do the same thing—this time taking out the first element of the right array using shift.
Now, let’s use our understanding to create a music composition!
Mapping Information From the Algorithm to Sounds
Choosing the information we want to track
For the rest of this tutorial, I’m going to stick with the non-recursive merge function because I find it a bit more straightforward to translate into music.
First, we need to identify the pieces of information we want to hear from the algorithm's process. Here’s what I chose to track. I’d like to know:
- If I’m inside the merge function or the merge_sort function
- If the values are in the right subarray or the left one
- If I’m returning from a function
- If a subarray is empty and which subarray is it (the left or right one)
And of course, I want to hear each sound in the list being sorted and hear the progression of the sorting process.
Here’s how I chose to address each of these elements:
- I’ll use samples that have a very distinctive and percussive sound. Both functions will have their own sample that will be triggered first thing when the function gets executed.
- Since we are manipulating our initial list and dividing it as left and right side, I decided to play around with the direction of the sound. By using the pan parameter, I can direct the sound from the center to the left or right side of the headphones.
- Very similar to Number 1, I’ll choose a sample with a distinct sound, different from the ones already used, that will be heard right before a return statement.
- Again, similar to the previous point, except that in that case, I’ll pan the sound on the left or the right depending on which subarray is empty.
Creating a function to play a list of sounds
Sonic Pi has two key functions that make it possible to manipulate audio: play and sleep. The default speed for playing the sounds is 60 beats per minute, which means that if we write sleep 1, the computer will wait one second before executing whatever comes next.
The first thing we can do is create a simple function that allows us to play each value in a given list one by one.
This function will take three arguments:
- an array of numbers
- a number for the volume (defined by the amp parameter, amp stands for amplitude)
- a value between -1 and 1 that we’ll use for panning the sounds left or right
The last argument has a default value set to 0, in case we want to hear the list centered (like we normally would without changing the panning). I’m also adding a check for the length of the list because if the function receives an empty list, I’d like to hear a specific sample.
I chose the sample ambi_swoosh because I like its airy texture but feel free to experiment. You’ll see that Sonic Pi automatically shows the list of samples available once you type the keyword sample followed by a space.
In the else branch, we use a simple each method that calls play on each element of the array, followed by 0.25 seconds of silence. I added the release, sustain and decay parameters to make each pitch shorter but feel free to change these numbers and see for yourself what they do!
Modifying our merge_sort function
Because I’m interested to know if the subarray I’m hearing is the left or right one, I’m going to add an extra argument to merge_sort. I’ll give it the name side and give it a default value of nil. This will allow me to play each recursive call with the appropriate value for pan.
Here are the modifications we need to do:
- First things first, we need to decide on an unsorted list of values. I’ll come back a bit later on how and why I chose this specific array.
- Play a distinctive sample when we enter the function. I chose the sample glitch_perc1.
- We need to decide on a midi synth that will play the values being sorted. Sonic Pi comes with a variety of built-in midi synthesizers. For this function, I chose to use the synth square, which plays simple square waves and has a bright sound.
- We need to find the right value for the pan parameter. This will depend on the argument passed to merge_sort when the function is called.
- We need to use our function play_list, which we created in the previous step, to play each value in the list.
- If the length of the array is smaller or equal to 1, we want to play another distinctive sample that indicates we are returning from the function. For this purpose, I chose the sample elec_chime.
- We need to add our extra argument to each of the recursive calls to merge_sort.
- Finally, we need to call our function so that we can hear the results.
Here’s the code for merge_sort after applying all these modifications:
Now, if we press play, we are starting to have some sounds! Not yet the great musical result we are hoping for, but we’ll get there.
Modifying the merge function
Modifying the merge function is very similar to the modifications we applied to merge_sort, apart from one new concept: threads.
I’m using the built-in function in_thread because I want the left and right subarrays to sometimes play simultaneously. In a nutshell, in_thread allows you to execute blocks of code seemingly at the same time.
From Sonic Pi’s documentation:
Execute a given block (between do … end) in a new thread. Use for playing multiple ‘parts’ at once. Each new thread created inherits all the use/with defaults of the parent thread such as the time, current synth, bpm, default synth args, etc. […]
Besides that, the process is similar:
- Play a distinctive sample when we enter the function. In this case, I chose the sample perc_bell2.
- Choose a synthesizer that will play our values: in this case, I chose to use the synth beep, which has a slightly more muffled sound and is not as bright sounding as the square synth.
- Play the left and right subarrays simultaneously with the appropriate value for pan.
- Do the same as the previous step for each iteration of the while loop so that we can clearly hear the progression of the sorting.
- After exiting the loop, play the left subarray, then the right one and then the final sorted list.
Here’s the code:
Now if we press play and listen to the result, we should hear something a bit more interesting!
First, we should clearly hear sounds moving from center to left and right depending on the subarray we are sorting.
Then, the samples and the difference in the sounds of the synths indicate to us which function is executing.
There are also some nice layers with the sounds being played simultaneously in the merge function. The ambi_swoosh sample that gets triggered when a list is empty gives a nice texture to the overall result.
Do you hear that there seems to be an interesting “groove” in the pacing in which the percussive samples get triggered?
Turning our algorithmic experiment into a musical piece
In this last step, the code we’ll add is purely for the purpose of making our merge sort algorithm sound more musical. For that, we need to get familiar with another important built-in function within Sonic Pi: live loops.
Live loops are loops that can be modified in real-time without having to stop the program. It allows us to write code, run the program, apply modifications, and press play again without having to interrupt the music.
Here’s the definition of live loops in Sonic Pi’s documentation:
live_loop name (symbol)
Loop the do/end block forever. However, unlike a basic loop, a live_loop has two special properties. Firstly it runs in a thread — so you can have any number of live loops running at the same time (concurrently). Secondly, you can change the behaviour of a live loop whilst it is still running without needing to stop it. Live loops are therefore the secret to live coding with Sonic Pi. […]
Yes, it’s true: live loops are what make live coding with Sonic Pi possible and easy to understand.
The first thing we need to do is wrap our function call to merge_sort inside a live loop. Each live loop needs to have a symbol as a name so I’ll name mine merge_sort.
Then, we’ll create another live loop that will contain a very simple beat that will give a steady pulse to our piece as well as a long low-pitch sound that never changes.
Having a steady beat will highlight the groove created by the algorithm. The long low-pitch note (called a drone in music language) will give some ground to all the different values being played.
Personally, I like to define this process as putting sounds into perspective.
Finally, we wrap both live loops inside a reverb effect to give our piece a bit of ambiance. We do that by using the built-in with_fx block. Sonic Pi comes with a variety of different effects. Feel free to try them out!
While we’re at it, let’s make the whole piece play a bit faster with the built-in function use_bpm and changing the default value of 60 to 70.
Here’s the complete code!
Go ahead and press play, see what you think about our composition!
Little Practical Tips to Help You on Your Musical Journey With Code
Now, it’s very possible that you still have some unanswered questions. For example:
- “Why did you choose this specific array?”
- “Why is the unsorted array of length 12?”
- “How did you know which value to use for the drone?”
- “How did you know how to create the beat?”
OK. To tell you the truth, I always do my best to write tutorials that require no musical knowledge. I want to make it as easy as possible for programmers with no prior musical experience to use music as a playground for their coding skills and their creativity.
But instead of letting you experiment in the dark, I’m going to give you a few tips, just in case it makes things easier for you on the music side of things. If it seems too complicated or if you’re comfortable with experimenting on your own, feel free to skip this section.
First, the unsorted array. I knew that the values played would act as a sort of melody, so for that, they needed to be in a comfortable range for the ear. Personally, for this kind of thing, I like to use values between 55 and 80.
This is the function I use to create my unsorted list:
Let’s take a look at line 3 because that’s the one doing pretty much all the magic.
There are two functions you can use to create lists of values that fit well together: chord and scale. The main difference between a chord and a scale is that scales are a series of consecutive values, and chords have spaces between each value.
Because I was planning to hear sounds together (in the merge function when we use in_thread), I decided to use a chord because that eliminates a lot of the sound clusters you can hear when playing consecutive values of a scale together.
Both the chord and scale functions take a symbol as their first argument. This symbol is the name of a note. Notes have letters for names, from a to g. You can pick any of these letters. The next part of the symbol name is a number. If you want very high frequencies, choose a higher number. In this case, 3 is equivalent to a middle range sound.
The next argument is num_octaves. An octave is a series of twelve consecutive sounds that bring you back to the sound you started with—only twelve steps higher or twelve steps lower depending on the direction you’re going. So for example, e4 is one octave higher than e3.
I usually give myself two octaves if I’m dealing with scales and three octaves if I’m working with chords because there are more notes in scales than in chords.
Next, I chose to create a list of twelve values because 12 is divisible by 4. I usually use 16 values but any number divisible by 4 will work. That’s in a context where you want to create something very easy sounding for the ear. Since rhythms based on 4 are the most common in Western music, I choose to base my tutorials on that. But again, that’s me and it’s completely up to you to decide and experiment!
Now, to find the base note for the drone, take your lowest value in your array and subtract 12. Depending on how high your lowest value is and which synthesizer you’re using, you might even want to subtract 24. It’s a matter of trying and seeing what works best.
I hope this demystifies on a basic level the musical side of things. If you have any questions or if anything is unclear, feel free to ask me in the comments section. I’m very happy to explain anything more in-depth.
I hope you found this algorithmic experiment interesting and fun! Here are a few ideas for you if you’d like to go a bit further:
- What would the algorithm sound like if you use the merge_recurse function instead?
- Instead of playing the final sorted list, try replacing it with a custom function that will be called once the list is sorted. This is a very cool way of creating multi-part compositions.
- Try to use a scale instead of a chord to generate your unsorted array and see how it sounds.
- Modify your code inside the live loops while it’s playing; that’s the first step to becoming a true live coder!
I’m leaving you with this YouTube video where I play the merge sort composition we created here while slightly expanding on it.
I hope you enjoy it! And if you experiment and create amazing compositions inspired by this tutorial, please consider making a gist and posting the link in the comment section, I’d love to hear what you come up with.