Merge Sort Algorithm

7 min readNov 15, 2022

Merge sort is a sorting algorithm that divides a list into two halves, recursively sorts each half, continuously divides those sublists band then merges the sorted halves to produce a sorted list. The recursive partitioning continues until a list of 1 element is reached, as a list of 1 element is already sorted.

In real life, we tend to break things up along useful lines. If we’re sorting change, we first divide the coins up by denominations, then total up each denomination before adding them together. When we put together a puzzle, we divide out the edge pieces first, put them together, then build the rest of the puzzle on that. We see this in real life more often than blind divisions because we, as humans, know we can divide along useful lines.

Here’s a visual demonstration of the merge sort algorithm at work:

Let’s go through a conceptual overview of how merge sort works.

Conceptual Overview

In our example, we have a list containing 6 numbers, and we are going to divide it into 2 sublists. The way that we would divide it in code is we would take the first index [0], we’d add it to the last index [5], divide the sum by 2, and round down our result. The middle of the array is at index 2. As demonstrated in the above graphic, we now have 2 sublists [3, 11, 8] and [18, 4, 25]. Then, we are going to recursively apply the merge sort algorithm on these sub-arrays.

Let’s start with the left sub-array [3, 11, 8]. We divide it in half, and we get [3, 11] and [2]. Because we are using a recursive algorithm, we are going to dive all the way to the left, get to the base case and then move to the right sub-array. Now, we continue on to sub-array [3, 11] and we divide it further into two sub-arrays of the length of one. Once we’ve reached sub-arrays of length one, this is where the second part of the merge sort happens.

Once we get to the base case where we’ve got sub-arrays of length 1, we start to recreate new sorted sub-arrays until we reach the array length of our input array. See the right side of the graphic where the elements are colored in blue.

We start comparing 2 numbers (2 sub-arrays of length one) and whichever one is smaller goes first, and whichever one is bigger goes second. Now, we have an array [3,11] in blue that is sorted. The second blue sub-array [8] is already sorted.

Now we have 2 subarrays that we know are sorted. And now we create a new array where we recombine them but we make sure that this 4-element sub-array is sorted. How do we do that? We compare the first elements of each sub-array 3 and 8.

3 is smaller than 8, so when we create a new sub-array, we add 3 first, we move our pointer to 11, and compare 11 and 8. 8 is smaller than 11, so we add 8 to the recombined array next. Finally, we add 11 and we are done — we have a recombined sub-array of 3 elements that is sorted [3, 8, 11].

Now we go all the way back up to our right sub-array [18, 4, 25] and we perform the same steps on this subarray. We will first divide the right sub-array the same way we divided the left sub-array, we’ll divide it down to subarrays consisting of one element. [18], [4] and [25]. Then we sort and recombine them into a sorted array [4, 18, 25] just like we did with the left sub-array.

We are almost there! Looking at our graphic again, now that we have two sorted sub-arrays of length 3 each [3, 8, 11] and [4, 18, 25], we can combine them into one final sorted array. We’ll have 2 pointers set on the first element in each sub-array. We are building an array that’s going to be the length of our input array.

We compare 3 to 4, 3 is smaller than 4, so 3 gets added first.

We move the left sub-array pointer to the right. 8 is bigger than 4. We know that every number in the left sub-array that comes after 8 is bigger than 8 because it’s sorted, and the same logic applies to the right sub-array — any number that comes after 4 is bigger than 4. So 4 is our next smallest number that gets added to the final sorted array.

We move the right sub-array pointer to the right. 18 is bigger than 8, so we add 8.

We move the pointer to 11. 11 is smaller than 18, and we add 11 next.

We move the pointer to the right and now we are done with the left sub-array. We know that any remaining numbers in the right sub-array are going to be added in their order because the right sub-array is sorted, so we add 18 and 25. Now we have our final sorted array.

As you can see we’ve created a brand-new array. We didn’t do sorting in place as we did with some other popular search algorithms. To recap one more time, we recursively divided the array into two halves. Each sub-array was divided until an array of one element is found (our base case). An array of one element is already sorted. At each level, the sorted sub-arrays were merged together while maintaining the sorted order.

Let’s now do a complexity analysis.

Time Complexity

On the first call on the input array of length n, we divide it in two. Creating a copy is O(n) operation and merging the sub-arrays together is also an O(n) operation. So we have O(2n), and we drop a constant and end up with O(n).

Recursive calls to two sub-arrays is O(n), dividing subarrays is O(n/2), merging is O(n/2), and we drop constants, so O(n).

On the next call we have four sub-arrays O(n/4) + O(n/4) + O(n/4) + O(n/4), we drop constants, so O(n).

As we can see, every level is going to take O(n) time. This is where O(log n) comes into play. Because we divide each array in half, and we further divide them in half, and further divide them in half, we get the time complexity of O(n log n). To further explain, the merge sort algorithm divides the input in half until a list of one element is reached, which requires log n partitioning levels. At each level, the algorithm does n comparisons selecting and copying elements from the left and right partitions, yielding n * log n comparisons.

Space Complexity

Merge sort requires O(n) additional memory elements for the temporary array of merged elements. For the final merge operation, the temporary list has the same number of elements as the input. Some sorting algorithms sort the list elements in place and require no additional memory.

Let’s now implement the merge sort algorithm in Python.

def msort(arr, start, end):
    # Leaf node, condition to end recursion
    if start >= end:
        return
    
    # Internal node worker
    mid = (start + end) // 2 #we find a mid point to split the array into two halves
    msort(arr, start, mid)   #sorting the 1st haf recursively
    msort(arr, mid + 1, end) #sorting the 2nd haf recursively
    
    
    # Merge the two sorted halves
    i = start   # index to iterate over the firstr half
    j = mid + 1 # index to iterate over the second half
    mlist = [] #temporary array to put sorted elements
    
    """
    we are comparing values of index i and j in the sub arrays
    if the value at index i is smaller or equal to the value at index j
    we append this the smaller value to the temp array mlist
    otherwise, we add the value at index j to the temp array mlist
    then we keep moving the idex of i and j until the end of each sub array
    
    """
    
    while i <= mid and j <= end:
        if arr[i] <= arr[j]:
            mlist.append(arr[i])
            i += 1
        elif arr[j] < arr[i]:
            mlist.append(arr[j])
            j += 1
    
    # Gather phase        
    while i <= mid:
        mlist.append(arr[i])
        i += 1
    while j <= end:
        mlist.append(arr[j])
        j += 1
    
    # Store in original array
    arr[start:end + 1] = mlist

def merge_sort(arr):
    msort(arr, 0, len(arr) - 1)
    return arr

Conclusion

A merge sort is an example of a divide-and-conquer algorithm. It recursively breaks a problem into two or more related subproblems until they are simple enough to solve easily. We broke down a list into sublists until each sublist had only one element, making it simple to solve because a list with one item is sorted by definition. A merge sort’s runtime is O(n * log n) because while splitting the initial list into sublists is logarithmic, the algorithm requires linear time to handle each item in the sublists as it merges them. With log-linear time complexity, a merge sort is one of the most efficient sorting algorithms and is widely used by computer scientists.

Explore other common sorting algorithms:

Merge Sort Algorithm

Conceptual Overview

Time Complexity

Space Complexity

Conclusion

Written by Natasha Ferguson