A gentle introduction to C++ Modules

Guilherme Ferreira
Mar 9, 2024

For decades, C++ developers have organized their code the same way C does: using header and source code files. This approach has serious drawbacks such as unnecessary recompilations, weak visibility control, symbol redefinitions, and inclusion order dependencies. Precompiled headers, pImpl, and include guards are common techniques for working around these problems. The core issue with the header file mechanism is that it depends on two separate tools: the preprocessor and the linker.

C++20 introduced Modules as a new way to organize source code. Modules are a language feature, which means that compilers are (mostly) responsible for managing inter-module dependencies. Source code organization thus becomes part of the language instead of being delegated to other phases of the build.

Header files

The most common way to organize C++ source code is still to declare the public interface in a header file, for example header.hpp.

#ifndef MY_HEADER_HPP
#define MY_HEADER_HPP

namespace my_namespace {

struct PublicType {
    void print();
};

void PublicFunction(double n);

} // namespace my_namespace

#endif // MY_HEADER_HPP

The implementation is then defined in a source code file such as source.cpp.

// Header inclusion is managed by the preprocessor. Including a
// non-existing header is detected by the preprocessor. For example
// g++ -E source.cpp -o source.ii

#include "header.hpp"

#include <iostream>

namespace my_namespace {

// Public method
void PublicType::print() {
    std::cout << "PublicType::print()" << std::endl;
}

// Public function but with incorrect signature
void PublicFunction(int n) {
    std::cout << "PublicFunction(" << n << ")" << std::endl;
}

// Private function. Should not be visible outside this translation unit
static void PrivateFunction() {
    std::cout << "PrivateFunction()" << std::endl;
}

} // namespace my_namespace

Whoever wants to use that code must include the header, as the main.cpp file does.

#include "header.hpp"

// Forward declarations uncover hidden symbols, postponing error
// checking to the linker
namespace my_namespace {
void PrivateFunction();
}

int main() {
    // OK: accessing a public symbol
    my_namespace::PublicType pubt;
    pubt.print();

    // ERROR: using a function that doesn't exist (has an incorrect
    // signature)
    my_namespace::PublicFunction(1.2);

    // ERROR: accessing local (i.e. static or private) symbol
    my_namespace::PrivateFunction();

    return 0;
}

The header file works as a contract between source.cpp and main.cpp. The incorrect use of a symbol is detected only during linking:

$ g++ -c source.cpp -o source.o
$ g++ -c main.cpp -o main.o
$ g++ main.o source.o -o program
/usr/bin/ld: main.o: in function `main':
main.cpp:(.text+0x21): undefined reference to `my_namespace::PublicFunction(double)'
main.cpp:(.text+0x26): undefined reference to `my_namespace::PrivateFunction()'
collect2: error: ld returned 1 exit status

Header file inclusion is handled by the preprocessor, while resolving the corresponding implementation requires the linker. Dependency management therefore depends on two tools in disjoint build phases: some dependency errors surface during preprocessing, others during linking. No single tool has a complete view of the dependencies, which limits the quality of error diagnostics and delays error reporting.
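To see why the compiler stays silent about the signature mismatch, it helps to remember that a declaration taking a double and a definition taking an int are simply two overloads. The following single-file sketch condenses header.hpp and source.cpp into one translation unit. As an assumption for illustration, the functions return a std::string instead of printing, and CallWithInt is a hypothetical helper that is not part of the original example.

```cpp
#include <string>

namespace my_namespace {

// What header.hpp promises to callers (declaration only).
std::string PublicFunction(double n);

// What source.cpp actually defines. Because the parameter type differs,
// this is a new overload, not the definition of the declaration above,
// so the compiler accepts both without complaint.
std::string PublicFunction(int n) {
    return "PublicFunction(int) called with " + std::to_string(n);
}

} // namespace my_namespace

// Hypothetical helper: an int argument selects the defined overload and
// links fine. A double argument (e.g. 1.2) would select
// PublicFunction(double), which has no definition anywhere, and the
// mistake would surface only at link time.
std::string CallWithInt() {
    return my_namespace::PublicFunction(1);
}
```

Nothing in this translation unit is ill-formed, which is why only the linker can report the missing PublicFunction(double).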

Modules

In contrast to the header approach, Modules are a language feature. This means that dependency management is done during compilation, instead of being split between preprocessing and linking.

As an example, the file module.cpp declares a module unit.

// Global Module Fragment is used to include headers when 
// importing the headers is not possible.
module;

// Inclusion of header files between the "Global Module Fragment"
// and the "Module Declaration".
#include <iostream>

// Module Declaration creates a module unit
export module my_module;

// Namespaces and modules are orthogonal: one module can have multiple
// namespaces, and the same namespace might exist across multiple modules.
namespace my_namespace {

// The "export" keyword makes symbols visible outside the module.
export struct PublicType {
    void print() {
        std::cout << "PublicType::print()" << std::endl;
    }
};

// Declaration and definition in the same statement, reducing the chances
// for signature mismatch
export void PublicFunction(int n) {
    std::cout << "PublicFunction(" << n << ")" << std::endl;
}

// The default visibility is private
void PrivateFunction() {
    std::cout << "PrivateFunction()" << std::endl;
}

} // namespace my_namespace

To use this module, a translation unit (e.g. main.cpp) imports the module by its name.

// The "import" declaration makes available in this translation unit
// all declarations and definitions exported by the "my_module" module
// interface unit.
import my_module;

// Forward declarations are no longer able to circumvent the visibility
// control
namespace my_namespace {
void PrivateFunction();
}

int main() {
    // OK: accessing an exported symbol
    my_namespace::PublicType pubt;
    pubt.print();

    // ERROR: using the incorrect function signature
    my_namespace::PublicFunction(1.2);

    // ERROR: accessing non-exported symbol
    my_namespace::PrivateFunction();

    return 0;
}

Misuse of modules is detected during the compilation.

$ g++ -std=c++20 -c module.cpp -fmodules-ts -o module.o
$ g++ -std=c++20 -c main.cpp -fmodules-ts -o main.o
main.cpp: In function 'int main()':
main.cpp:14:19: error: 'PublicFunction' is not a member of 'my_namespace'
   14 |     my_namespace::PublicFunction(1.2);
      |                   ^~~~~~~~~~~~~~
main.cpp:17:19: error: 'PrivateType' is not a member of 'my_namespace'
   17 |     my_namespace::PrivateType();
      |                   ^~~~~~~~~~~

By detecting dependency errors at an earlier stage of the build, Modules save development time. Because a module unit is compiled once rather than re-parsed by every translation unit that uses it, they can also shorten the overall build, especially for large-scale applications.

Modules and Templates

Code that uses a template needs access to the template definition in order to instantiate it, i.e. to generate a concrete implementation for the given template arguments. This is why the inclusion model, a header file approach, is the most common way to use templates. Each translation unit instantiates the specializations it needs, and the One Definition Rule (ODR) allows the resulting duplicates to be merged at link time.

Depending on the compiler, we can place templates inside modules. Recent versions of GCC handle the template instantiation.

For example, the module_tmpl.cpp file declares a class and a function template.

module;

#include <iostream>

export module my_module;

namespace my_namespace {

// export a class template
export template <typename T>
struct PublicTemplateClass {
    void print(T param) {
        std::cout << "PublicTemplate::print(" << param << ")" << std::endl;
    }
};

// export a function template
export template <typename T>
void PublicTemplateFunction(T param) {
    std::cout << "PublicTemplateFunction(" << param << ")" << std::endl;
}

} // namespace my_namespace

And the main.cpp file uses those templates.

#include <ostream>

import my_module;

int main() {

    my_namespace::PublicTemplateClass<int> pubtmpl;
    pubtmpl.print(14);

    my_namespace::PublicTemplateFunction(1.2);
    my_namespace::PublicTemplateFunction("a string");

    return 0;
}

There is one caveat though. If the templates use templates themselves, we have to instantiate the templates that they use. The compiler doesn’t know ahead of time which specializations to create during the module compilation. Explicit template instantiation can generate these specializations.
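As a minimal sketch of what explicit template instantiation looks like, the snippet below uses plain C++ without any module syntax so that it builds with any compiler. Describe is a hypothetical helper template, not part of the example above, standing in for a template that an exported template depends on. Inside a module unit, the two explicit instantiation lines would follow the template definition in the same way. Whether they are actually needed depends on the compiler and its version.

```cpp
#include <sstream>
#include <string>

namespace my_namespace {

// Hypothetical helper template (an assumption for illustration), standing
// in for a template that another exported template relies on.
template <typename T>
std::string Describe(T param) {
    std::ostringstream out;
    out << "Describe(" << param << ")";
    return out.str();
}

// Explicit instantiation definitions: they force the compiler to generate
// the int and double specializations in this translation unit, even if
// nothing here calls them.
template std::string Describe<int>(int);
template std::string Describe<double>(double);

} // namespace my_namespace
```

The trade-off is that the list of explicitly instantiated types has to be maintained by hand, so this works best when the set of expected template arguments is small and known in advance.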

Compiling modules containing templates is the same as compiling regular code.

$ g++ -std=c++20 -c module_tmpl.cpp -fmodules-ts -o module_tmpl.o
$ g++ -std=c++20 -c main.cpp -fmodules-ts -o main.o
$ g++ -std=c++20 main.o module_tmpl.o -o program
$ ./program
PublicTemplate::print(14)
PublicTemplateFunction(1.2)
PublicTemplateFunction(a string)

Summary

Modules offer a more modern solution to source code organization. They centralize visibility control in the compiler. This contrasts with header files, which rely on the preprocessor and the linker, tools outside the language scope.

Module units are compiled only once. In contrast, a header file is recompiled in every translation unit that includes it.

The order of module imports doesn’t matter because importing a module leaks no macros or other side effects. Headers, on the other hand, are susceptible to various side effects, such as macros hiding member functions.
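The macro problem is easy to reproduce in a single file. In the sketch below, the two "headers" are simulated inline, and the names Report, debug_print, and CallRenamed are all hypothetical. A macro defined by one header silently renames a member function declared in a header included after it.

```cpp
#include <string>

// Simulated "first header": leaks an object-like macro into everything
// that is included after it.
#define print debug_print

// Simulated "second header": its author wrote a member function called
// print, but the preprocessor rewrites the name before the compiler sees it.
struct Report {
    std::string print() const { return "Report::print()"; }
};

// Cleaning up the macro afterwards does not undo the renaming.
#undef print

// From here on, Report has no member named print. Only the name chosen by
// the macro exists, so r.print() would fail to compile.
std::string CallRenamed(const Report& r) {
    return r.debug_print();
}
```

Swapping the order of the two simulated headers would make the problem disappear, which is exactly the kind of inclusion order dependency that named module imports avoid.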

Because names that are not exported stay internal to their module, Modules also prevent the collisions that occur when an identifier defined in one header clashes with its use in another.

Finally, modules are a promising mechanism to reduce the obnoxious template compilation times.

Guilherme Ferreira

I'm a passionate C/C++ developer with long experience in a range of network applications, from the device driver level to the application layer.