Templates and Compile Time Execution

Published in

The Zerone

5 min readJan 8, 2023

One template to rule them all, one template to find them, One template to bring them all, and in compile time bind them; In the Land of C++ where the shadows lie.

Readers are expected to be familiar with basic C++ and template programming.

Have you thought about the possibility of precomputing the values in compile time so that you can improve your computation speed? What if you could do calculations while the program is compiling? Compile-time programming refers to performing computations during the compilation to decrease the load during the program's execution. In C, it is implemented using macros.

#ifdef _DEBUG
// if the code is compiled in debug mode, this code is generated
#define LogDebug(x) x;
#else
// if the code is compiled in release mode, then the macro is replaced by empty code
#define LogDebug(x)
#endif
int main() {
#define var 10
#if var % 3 == 0
 LogDebug(printf("Divisble by 3")) // this line is used when variable var % 3 == 0 
#else
 LogDebug(printf("Not Divisble by 3")) // this line is used when variable var % 3 != 0
#endif
}

In the above example, the two different programs are generated during the compilation of the program for each case. But, **macros** are not highly flexible, and much more cannot be accomplished by macros apart from writing platform-specific code or compiling mode-specific code. So, how can we perform compile-time programming in a modern way? It can be accomplished by one of the most infamous and hideous programming techniques from the C++ lands. The template metaprogramming language is a powerful and mysterious tongue, known only to a select few who have devoted their lives to the study of it. The dark language of Template is even banned from many codebases as it can take ages to decode (compile time in this case).

A polymorphic function is a function that can be used for multiple data types. Monomorphic function means it applies to a single data type. Templates were introduced to the general folks as a way of writing functions and struct for parameterized types. Monomorphization is the process of converting a polymorphic function to many specialized functions for each data type. Templates use the process of Monomorphization for compile time polymorphism.


// here add function works for every type 
template <typename T>
T add(T a, T b) {
 return a + b;
}

int main() {
 add(1, 2); // calls add<int>(1, 2)
 add(2.0f, 3.0f); //calls add<float> (2.0f, 3.0f)
}

In the above program, the add function is generated for both the int and double types during the compilation, so it works for every type that supports the binary + operator. The generated assembly can also provide us with insights into how the template works.

As we can see in the generated assembly, two functions are generated int add<int>(int, int) and float add<float>(float, float).

Templates can also be used to create polymorphic struct. The same concept of Monomorphization is used for structures too as multiple signatures for each type are generated during compile time.


// Now, the classs works for every data type
// the compiler generates the appropriate code for every data type needed
template <typename T>
struct Wrapper {
  T data; // it can store data of any type
  Wrapper (T _data): data(_data) {}
  
  T get_data() {
   return data; 
  }
};


int main() {
   Wrapper <int> int_type(2); 
   Wrapper int_type(3); //valid from C++ 17, as it can infer types 
   Wrapper <float> float_type(3.0f);
}

With each passing age, the template gathered more powers and became the dark lord (or final boss) for the C++ realm. The nightmare of C++ developers contains the template error coming out of STL (Standard Template Library) or the sight of code from STL when they accidentally step into the function from STL. An immense amount of riches have been spent on monitors that can show complete template errors, but none have been successful.

Despite the reputation of *Templates*, all supervillains have a logical purpose. C++ templates also serve a lot to the realm, and one of the principal tasks it accomplishes is compile time computation. Compile time computation occurs during the compilation of the program rather than during execution. However, compile-time computation is only possible if the value of arguments is at compile-time. One of the most common examples of compile-time computations is the calculation of factorial.


#include <iostream>
#include <inttypes.h>
// general case for factorial
template <uint64_t n>
struct Factorial {
 static uint64_t const value = Factorial<n - 1>::value * n;
};
// base case for recursive definition
template <>
struct Factorial<0> {
 static uint64_t const value = 1;
};
int main() {
 int x = Factorial<6>::value; 
}

The above program is similar to the recursive function for the computation of factorial. But all the calculation occurs during the compilation phase. The call to Factorial<6>::value causes the instantiation of the primary template, and the recursion will end when the specialized template for Factorial<0> is reached. Each step of this calculation is executed during compile time, so the compiler can directly insert value as seen in the generated assembly below.

One of the more significant examples is that of Fibonacci numbers.


#include <iostream>
#include <inttypes.h>

template <uint64_t n>
struct Fibonacci {
 static uint64_t const value = Fibonacci<n - 1>::value + Fibonacci<n - 2>::value;
};

template <>
struct Fibonacci<0> {
 static uint64_t const value = 0;
};

template <>
struct Fibonacci<1> {
 static uint64_t const value = 1;
};

int main() {
 int x = Fibonacci<40>::value;
}

The recursive definition of Fibonacci took about 1 second to execute for n = 40, while it was much faster for compile-time execution. Similar to factorial, the value of Fibonacci<40>::value is already known at compile time.

For compile time computation to work, the arguments to function should also be known during compile time while limiting its use case vastly. Moreover, today’s compilers can produce optimized assembly code such that it can completely precompute the values known in compile time.


#include <iostream>
#include <inttypes.h>

uint64_t factorial(uint64_t n) {
if (n == 1) return 1;
 return n * factorial(n - 1);
}

int main() {
 int x = factorial(10);
 std::cout << x;
}

As we can see from the above-generated assembly with the -O3 flag, the factorial of the number is calculated without any call to the function. Also, readers familiar with assembly code can see that the factorial sub-routine is iterative rather than recursive, as we defined in source C++ code.

In conclusion, we can say that a template is a handy tool for compile time polymorphism or compile time execution of programs. However, caution must be applied while using such extreme force of programming.

Templates and Compile Time Execution

Written by Sanskaramgain