Merge Sort in Python

3 min readDec 12, 2018

Today, I will attempt to share and explain my implementation of the merge sort algorithm in Python.

The whole idea behind this efficient and popular algorithm is that you divide the problem into smaller steps and conquer each part before combining it all together to form the solution. It relies heavily on a helper function (merge) in which given two sorted lists, will merge them with a time complexity of O(n).

Well, what’s the point of merge sort if we already have sorted lists?

The idea is that we break each list into smaller lists until they are single-length lists. At that point, they are sorted because they have no other value to compare to. For example:

[3,7,5,1,6,2][3,7,5,] [1,6,2][3] [7,5] [1] [6,2][3] [7] [5] [1] [6] [2]

As we unpack and merge, it will start to look like this:

[3] [7] [5] [1] [6] [2][3] [5,7] [1] [2,6][3,5,7] [1,2,6][1,2,3,5,6,7]

So let’s hop right in and take a look at the merge function:

The first thing you will notice is that we will have two pointers for each lists and they start at 0.
We then designate an empty output list, in this case named res
We loop as long as both pointers have not reached the end of their lists
We compare the values corresponding at each pointer, starting off with the zero index
If the value of the left index is smaller, we add it to our output. Conversely, we do the same for the other side. We increase the pointer of the smaller side also by one each time
We repeat this process until one lists finishes
We then add the remaining values of the unfinished list into the end result

Next, comes the magic!

Now, that we have our helper function, we will implement it in our merge_sort function:

The whole idea behind this function is that we divide the list in half recursively unto the list is broken up into a bunch of a single value-lists. Our base case is as follows:

if len(arr) < 2:
    return arr

Once we reach the bottom, we will merge it with all the “technically” sorted lists until we reach the top.

This part can be hard to grasp conceptually; I know I struggled with it for a while before understanding it. Sometimes it helps to add breakpoints in our code and look at what call stacks are being performed. For example, our call stack will look something like this:

[8,4,9,5,6,7,1,3,2][8,4,9,5][8,4][8][4][4,8][9,5][9][5][5,9][4,5,8,9] // 50% done...the other sideresult: [1, 2, 3, 4, 5, 6, 7, 8, 9]

I added a link to see this animated here.

And that’s it! The Big O for this algorithm is roughly O(nlogn) which is fairly fast and it also works on negative numbers.

If you have any suggestions to improve my code or if I made any mistakes, please let me know in the comments below. Thanks for reading!

Merge Sort in Python

Written by Tu Vo