Compile-time Merge Sort [C++]

Vanand Gasparyan
7 min readJun 18, 2020

--

In C++20 std::sort, along with other standard algorithms, becomes constexpr. That makes the following code possible.

The sorted_arr on line 14 is initialized to { 1, 2, 3, 4, 5} at compile-time, and the main returns 1. Here’s the Godbolt of this code to play around with. As you can see, there are no traces of the sorting algorithm and logic in the machine code. Moreover, if you turn the optimizations on (-O3), this whole code becomes a single instruction, proving that all the sorting happens at compile-time.

Let’s see how to accomplish this with C++17. Before starting with the implementation, we’ll list down what we need for the merge sort. If you don’t know how merge sort works, you probably don’t want to see the code that’s coming.

We need:

  • a sequential container
  • a way to split it on the middle
  • a way to merge two sorted arrays to get another sorted array

Here is the implementation of a compile-time sequential container.

The array is as simple as it gets. It’s a type with an arbitrary number of integer template arguments. It comes with get functionality to enable indexed access. get is implemented recursively. On lines 7–8 is the definition of the signature, that is, get takes two parameters, an array (type) and an index (integer). On lines 10–14 is the recurrence relation: the INDEX-th value of an array is the (INDEX - 1)-th value of that array without the first element. On lines 16–20 is the base case of the recursion: the 0-th value is the first element. If it still looks gibberish, think of the struct line as a function signature, the things inside the angle brackets (<>) as the parameters, and the “static const int value” line as the return statement. Example:

get_v defined on lines 22–23 is just syntactic sugar to avoid writing ::value all the time.

With this ready let’s start implementing the sort with a top-down approach. We want to achieve the following:

And here’s how:

On lines 4–5 we define the prototype: it takes a single array as an input. Lines 7–11 is one of the base cases of the recursion: sorting an empty array results in an empty array. Lines 13–17 is the other base case: sorting an array with a single element results in the same array. And finally, the meat and potatoes, on lines 19–29, the general case. Here we need to split the input array of arbitrary size into two, sort each and finally merge them. On line 23 we “precalculate” the length of the array to make the whole code more readable. On line 26 we take the first half of the array and sort it, on line 27 we take the second half and, again, sort it. On line 25 we merge those two. So now, we have to implement the merge_t, take_t and drop_t. And again, lines 31–32 is just syntactic sugar to enable writing sort_t<arr> instead of typename sort<arr>::type.

Let’s start with the drop as it’s the simplest (or is it?).

It takes an array and a number N and returns what’s left after removing the first N elements of that array. E.g.

The implementation is quite trivial. Lines 7–11 is the recurrence relation, where we drop the first element and decrement the N, and lines 13–17 is the base case, the case of dropping 0 elements.

This is very similar to the get up there, so it should work fine, right? WRONG! It doesn’t. The problem is the ambiguity in the template specialization. In get’s case everything was fine because we had the following two specializations.

The first specialization (line 5) takes two parameters. The first one, array<I0, Is...>, is an array of 1 or more elements, exactly “1 or more” because of the I0. The second parameter is an integer. In the second template specialization (line 11) the first parameter is exactly the same, but the second parameter is 0, thus this second specialization is more specialized. In summary, the first parameter must be an array of 1 or more elements, and the second one must be a number. If it’s 0, the second specialization (lines 10–14) will be instantiated, otherwise the first specialization (lines 4–8).

In the implementation of drop we had both parameters change. The first specialization (line 5) is for arrays with 1 or more elements and any number N. The second one (line 11) is for arrays with 0 or more elements and the number 0. 0 is more specialized than N, however, array<Is...> is less specialized than array<I0, Is...> (because the latter excludes the case of the empty array). Therefore the ambiguity. The compiler is unable to choose between these specializations when it sees, for example, drop_t<array<1, 2, 3>, 0>. Here’s how we fix this ambiguity:

Here we have 3 template specializations. The first one is the same as it was, the second one (line 11) is a more specialized version of the first one, and the third one (line 17) covers the case of the empty array (which the first two don’t). We don’t specialize the case of the empty array with a non-zero value, as it’s an erroneous case and we do want it to fail compilation.

We implement the take quite similarly. Again, we have 3 template specializations for the same reason. The base case of N=0 is quite trivial. There’s only one new thing in the general case (line 10), the prepend_t. It takes an integer, an array, and obviously prepends one to the other.

The implementation of prepend is quite trivial.

And finally, the last piece of the puzzle, the merge_t. It’s supposed to take two sorted arrays and merge them into a single sorted array.

As the signature suggests on lines 4–5, merge takes two arrays as input parameters. There are three base cases specialized. Lines 19–23 and 25–29 are the base cases merging an empty array with a non-empty one. The base case on 31–35 is defined only out of solidarity, just to cover the case of merging two empty arrays. The other two specializations don’t cover it as they're equally appropriate for that case, thus causing ambiguity. Nevertheless, our sort implementation will never ask for it. The general case logic is recursive again (lines 7–17). We basically find the smallest of the two first elements of those two arrays, separate it from its array, merge the rest, and finally prepend this separated element to that result. Here’s a step by step demonstration.

Let’s take a closer look at the general case of the merge.

As we can see (line 4), in the general case (the case of two non-empty arrays) merge resolves to a prepend. The two parameters of this prepend depend on the first elements of the two arrays (I0 and J0). The first parameter is the least of them (line 5). The second parameter (lines 6–8) is the result of a merge with the red arrays if the green condition is true, and with the blue arrays otherwise. std::conditional_t works a lot like the ternary operator (?:).

Check out the Godbolt of this code to play around with.

--

--