Rust vs C++. A Performance Comparison. Part 3. Effective abstractions

Dmytro Gordon
Published in Rustaceans
15 min read · May 26, 2024

In preceding articles, we explored particular aspects like aliasing, move semantics, dispatch and memory layout. However, achieving optimal performance isn’t solely about fine-tuning specific implementations; it’s also about constructing abstractions with minimal or zero overhead.

In prior chapters, we’ve seen many things that Rust copes with better than C++. Yet, this chapter will reveal that many tools for constructing efficient abstractions are either entirely absent or severely limited in Rust, especially in its stable version. Disagree? Let’s explore further.

Abstractions, abstractions, abstractions

To ensure our abstraction achieves the best possible performance, the compiler must be able to optimize each instantiation as if it were manually implemented for the specific parameters. This implies that the abstraction should be compile-time, or static.

I’ll start with an important disclaimer. In many scenarios, using runtime abstractions has minimal impact on your application’s performance. A virtual function call isn’t typically a concern if the function is heavy or never appears on a hot path. However, in this article, we operate under the assumption that you’re constructing the fastest possible abstraction from smaller components. Here, dynamic dispatch not only introduces a substantial overhead but also prevents the compiler from applying optimizations like inlining and reordering.

In various programming languages, several tools help you build such abstractions:

  • compile-time computations
  • generics or templates
  • compile-time reflection
  • macros

The sequence in which I’ve arranged these items isn’t arbitrary. In my view, they’re ranked based on code readability and maintainability. For example, generics are considered more readable than macros and should be prioritized over them whenever feasible. The order of the last two items may provoke debate, but I have a personal aversion to macros, so please forgive me for this. Additionally, there’s another option missing from the list: code generation. However, I believe there’s no significant difference between languages in this aspect, so it falls outside the scope of this article.

Let’s compare what C++ and Rust offer for each of these items.

Compile-time calculations

The fastest calculation is the one that is done at compile-time. Compilers are usually pretty good at evaluating something that depends solely on constants. However, you can’t fully rely on that, especially in complex cases. Also, if you need to use the result of a calculation in a context where only a compile-time value is applicable (such as a generic parameter), you have no other choice but to explicitly mark the computation as compile-time.
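To make the last point concrete, here is a minimal Rust sketch (the function name is my own): an array length accepts only compile-time values, so the computation has to be explicitly marked as const.

```rust
// `square` must be a `const fn` so its result can appear in a
// position that accepts only compile-time values: an array length.
const fn square(n: usize) -> usize {
    n * n
}

// The length `square(4)` is evaluated entirely at compile time.
const BUFFER: [u8; square(4)] = [0u8; square(4)];
```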

Let’s begin with C++. As of C++23, there are two important keywords for performing compile-time calculations: constexpr and consteval. The major difference between them is that the latter can be used exclusively in a static context, while the former can be utilized for both compile-time and runtime calculations. They offer numerous capabilities:

  • Define functions.
constexpr int fibonacci(int n) {
    if (n <= 1) {
        return n;
    } else {
        return fibonacci(n - 1) + fibonacci(n - 2);
    }
}

In C++11, there were severe limitations on the expressions you could use. However, since C++14, these limitations have been relaxed. Now, you are free to use almost anything you can do in normal functions — call other functions (if they are constexpr, of course), use if and loop statements, and so on.

  • Define classes with constexpr methods, operators, constructors and destructors (since C++20)
class MyType {
public:
    constexpr MyType(int val): val(val) {}
    constexpr ~MyType() {}

    constexpr operator bool() const {
        return val != 0;
    }

    constexpr void set(int new_val) {
        val = new_val;
    }

private:
    int val;
};

Since C++20, even virtual functions can be constexpr, enabling dynamic dispatch to happen at compile time, which may seem weird.

  • Allocate memory and use structures that rely on dynamic memory allocation (since C++20)
#include <vector>

constexpr int f() {
    std::vector<int> vals{1, 2, 3, 4, 5};

    auto result = 0;
    for (const auto& v : vals) {
        result += v;
    }

    return result;
}

constexpr auto value = f();

You can’t store something containing the result of dynamic memory allocation in a constexpr variable. However, you can use temporary objects that require memory allocation, including standard library containers. And this is quite impressive! Additionally, there is the constinit specifier (introduced in C++20), which ensures that a variable is initialized with a pre-computed constant without any runtime overhead.

  • constexpr variables can be defined anywhere
constexpr int f() {
    return 42;
}

constexpr auto value_1 = f();

void foo() {
    constexpr auto value_2 = f();
    static constexpr auto value_3 = f();
}

struct Bar {
    static constexpr auto value_4 = f();
};

In fact, the language’s support for compile-time evaluation seems to surpass expectations. There’s no need for any tricks to calculate factorial at compile time. The only issue you may encounter is when a third-party library fails to mark a function you want to use as constexpr.

Let’s explore what Rust offers for similar purposes. At first glance, it appears quite similar: you can annotate free functions and methods as const to indicate that they can be evaluated at compile time:

const fn fibonacci(n: u64) -> u64 {
    if n <= 1 {
        n
    } else {
        fibonacci(n - 1) + fibonacci(n - 2)
    }
}

struct Foo {
    bar: u64,
}

impl Foo {
    const fn new() -> Self {
        Self {
            bar: fibonacci(12),
        }
    }
}

const FOO: Foo = Foo::new();

However, there are several limitations that significantly reduce the power of const functions compared to their counterparts in C++.

  • Trait support is fundamental in Rust, serving as the primary mechanism for static and dynamic polymorphism. However, in stable Rust, traits do not play well with const: trait implementations cannot be marked as const, making trait methods unusable in const functions. Even if the implementation is trivial, it cannot be used in a const context. Consider, for instance, if you wanted to make the Fibonacci calculation generic:
// Doesn't compile
const fn fibonacci<T: From<u64> + Add<Output = T>>(n: u64) -> T {
    if n <= 1 {
        T::from(n)
    } else {
        fibonacci::<T>(n - 1) + fibonacci::<T>(n - 2)
    }
}

But that doesn’t compile; the compiler has no way of knowing whether the implementations of From::from and Add::add are const, and you have no way to require that they are. Due to this limitation, you can’t use a for loop in const functions, because it calls Iterator::next under the hood:

const fn foo(mut n: u64) -> u64 {
    // Doesn't compile. This loop must be replaced with `while`
    for i in 0..3 {
        n *= n;
    }

    n
}
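Rewritten with a while loop, the same computation avoids the hidden Iterator::next call and compiles on stable Rust:

```rust
// Stable-Rust version of the function above: a `while` loop involves
// no trait calls, so it is allowed inside a `const fn`.
const fn foo(mut n: u64) -> u64 {
    let mut i = 0;
    while i < 3 {
        n *= n;
        i += 1;
    }
    n
}

// Evaluated entirely at compile time: ((2^2)^2)^2 = 256.
const VALUE: u64 = foo(2);
```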

The inability to use trait methods is a significant limitation. Fortunately, there’s an unstable feature that addresses this:

#![feature(const_trait_impl)]
#![feature(effects)]

// Mark traits that may have `const` implementations
#[const_trait]
trait Foo {
    fn foo() -> Self;
}

// Tell the compiler that all methods of this trait implementation are const
impl const Foo for u64 {
    fn foo() -> Self {
        42
    }
}

// Require that `T` implements `Foo` in a const way
const fn bar<T: ~const Foo>() -> T {
    T::foo()
}

However, this won’t make our Fibonacci implementation compile, because From is not annotated with the #[const_trait] attribute.
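On stable Rust, the practical fallback (my workaround, not something the language helps with) is to give up on genericity: drop the trait bounds and write the const fn for one concrete type, duplicating it per type if needed.

```rust
// Monomorphic stand-in for the generic version: no `From` or `Add`
// bounds, so nothing blocks const evaluation on stable Rust.
const fn fibonacci_u128(n: u64) -> u128 {
    if n <= 1 {
        n as u128
    } else {
        fibonacci_u128(n - 1) + fibonacci_u128(n - 2)
    }
}

const FIB_20: u128 = fibonacci_u128(20);
```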

  • Another limitation at compile time is performing floating-point arithmetic:
const fn add(lhs: f64, rhs: f64) -> f64 {
    // Doesn't compile, error[E0658]
    lhs + rhs
}

There’s a reason for this limitation — ensuring that const and non-const versions of the same functions return identical results is challenging. However, there is an unstable feature available to enable this.

  • Dynamic memory allocation is not possible in stable Rust. However, in nightly builds, there is a const_allocate function available for this purpose. It’s unclear why a special function was introduced instead of just making std::alloc::alloc const. This decision makes it impossible to use Box, Vec, or String in a const context, even after all the necessary methods are marked const. To make it work, you’ll need another unstable feature: allocators. I found a Reddit thread where the author implemented a const allocator, but that’s undoubtedly quite burdensome. Additionally, all code that uses dynamic memory allocation must have the allocator type parameter propagated everywhere to be usable in a const context. This stands in stark contrast to C++, where you can simply use std::vector or std::string!
  • Dynamic dispatch in a const context is, as far as I know, not supported even on nightly.

In conclusion, the support for compile-time calculations in stable Rust is severely limited. Enabling some unstable features does address some of the issues, but const functions in Rust still significantly lag behind what is available in C++20.

Generics or templates

Generic code allows you to write a function or a class with type (or non-type) parameters that must be specified at compile time. This is a popular concept found in many programming languages. Let’s compare how it is implemented in C++ and Rust.

C++ templates were introduced in the language back in 1993, and generic programming in C++ differs from what we see in other languages. Function and class templates are not fully checked for correctness until they are instantiated (when all template parameters are substituted with known values). The language allows you not only to create a template implementation for the class or function but also to add partial (for classes only) or full (both for classes and functions) specializations. These specializations are completely independent implementations that can even have different memory layouts and interfaces. Each instantiation is a unique case where a complex lookup is done to find the best specialization for the concrete set of template parameters. If substituting the parameters into a candidate fails to compile, the compiler silently discards it and tries the next one: the famous SFINAE rule.

Thanks to this flexibility, C++ templates form a Turing-complete language, empowering you to accomplish nearly any task at compile time. However, templates are infamous for their complexity and the verbose error messages they generate, often making the process of resolving compiler issues feel like solving a crime. Fortunately, part of this challenge was mitigated by the introduction of constraints and concepts in C++20, offering the capability to specify necessary (but not sufficient) conditions for a successful instantiation.

In Rust, the concept of generics differs. Generic code is expected to be fully checked before instantiation occurs (well, almost fully). Traits are used to define the sufficient constraints for the input type parameters. This approach resolves the issue of dealing with mysterious errors during instantiation. The code may fail to compile for two reasons:

  • The generic method itself is invalid, a fact known prior to substituting concrete types.
  • The types you’re trying to instantiate a struct or function with do not meet the constraints.
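A minimal sketch of the second failure mode (names are my own):

```rust
use std::fmt::Display;

// The generic body is checked against the `Display` bound before any
// instantiation happens; call sites then only need to satisfy the bound.
fn show<T: Display>(value: T) -> String {
    format!("-> {value}")
}

// `show(42)` compiles because `i32: Display`.
// Calling `show` with a type that lacks `Display` fails with a short
// error[E0277] naming the missing bound, at the call site.
```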

And that seems like a really good idea and saves a lot of time. However, when attempting to implement more complicated cases, it becomes apparent that generics in Rust are less flexible than C++ templates. Let’s examine several points.

Variadics

One of the new features introduced in C++11 was variadic templates. Now you can write code that operates on an arbitrary number of types:

#include <iostream>

void print() {
    std::cout << std::endl;
}

template <typename T>
void print(const T& t) {
    std::cout << t << std::endl;
}

template <typename Head, typename... Tail>
void print(const Head& head, const Tail&... tail) {
    std::cout << head << ", ";
    print(tail...);
}

Using fold expressions, introduced in C++17, one can eliminate the recursion in the last overload:


#include <iostream>

void print() {
    std::cout << std::endl;
}

template <typename T>
void print(const T& t) {
    std::cout << t << std::endl;
}

template <typename Head, typename... Tail>
void print(const Head& head, const Tail&... tail) {
    std::cout << head;
    ((std::cout << ", " << tail), ...);
    std::cout << std::endl;
}

The availability of variadic templates enabled the implementation of features such as std::tuple, std::function, std::format, and many others. This feature is extensively used in numerous libraries and projects, facilitating tasks that were previously accomplished only through the use of macros or cumbersome template tricks.

Well, is there something similar in Rust? Nope. Perhaps some unstable feature in nightly? Nope. Here is a document with design sketches of how it could be implemented, and that’s all for the moment. Are variadic generics really needed in Rust? I would say yes:

  1. You can’t write generic code that operates on arbitrary tuples. Typically, you’ll end up using macros to define the functionality for tuples up to a certain length. This approach is common in many libraries.
  2. FnOnce, FnMut, and Fn cannot be turned into “regular traits” until we have variadic generics.
  3. Some advanced concepts, such as heterogeneous vectors, etc.
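The macro workaround from point 1 looks like this (the trait and macro names are hypothetical, but the pattern is the one many libraries use):

```rust
// Hypothetical trait we would like to implement for tuples of any arity.
trait Describe {
    fn arity() -> usize;
}

// Without variadic generics, a macro stamps out one impl per tuple
// length, up to a hand-picked limit.
macro_rules! impl_describe_for_tuples {
    ($($len:expr => ($($t:ident),+));+ $(;)?) => {
        $(
            impl<$($t),+> Describe for ($($t,)+) {
                fn arity() -> usize { $len }
            }
        )+
    };
}

impl_describe_for_tuples! {
    1 => (A);
    2 => (A, B);
    3 => (A, B, C)
}
```

Every tuple length beyond the limit simply isn’t supported, which is exactly the ceiling variadic generics would remove.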

Specializations

Specializations greatly enhance the power of generic programming. With them, you can have a generic class or function whose implementation depends on the specific generic parameters, making the implementation highly flexible.

In C++, you can create partial specializations for template classes and full specializations for functions.

For classes, each specialization can have a different set of data members and methods. Let’s try to reinvent the Optional class we mentioned in the previous part. The generic implementation would have a layout similar to the following:

#include <type_traits>

template <typename T>
class Optional {
private:
    std::aligned_storage_t<sizeof(T), alignof(T)> _data;
    bool _initialized;
};

However, if T is a reference, we’re simply wasting extra space on a boolean flag (which likely doubles the size of Optional due to alignment). Therefore, we can create a partial specialization to handle this case:

template <typename T>
class Optional<T&> {
private:
    T* _ptr;
};

Additionally, in some generic code, it may be beneficial for Optional<void> to work properly, yet our generic implementation doesn’t support it. Not a problem — full specialization resolves this:

template <>
class Optional<void> {
private:
    bool _initialized;
};

It’s worth mentioning that you have to provide implementations of all the methods for each specialization separately, which can be a bit cumbersome if you want to maintain multiple specializations of a type implementing a certain interface. However, apart from that, it’s a really powerful mechanism for building effective abstractions! You’re free to create almost any kind of specialization, and as long as the compiler can find the one that is strictly more specific than the others during instantiation for certain template arguments, everything is good. In cases where several specializations match but the “most specific” one can’t be chosen, you’ll get a compiler error. As always in C++, with plenty of lines of error output at the instantiation moment.

As for functions, only full specializations are allowed, and I’ll simply leave that topic out of the scope of this article.

Let’s explore Rust’s approach. The type system in Rust differs, so there are several places where specialization could occur:

  1. Specializing struct layout based on generic arguments is impossible in both stable and nightly Rust. Additionally, there seem to be no plans for supporting it in the future.
  2. Specializing methods of a struct based on its generic arguments. You can introduce multiple impl blocks with distinct constraints for the same struct. Consider a scenario where you have a generic math vector type:
struct Vec<T, const N: usize>([T; N]);

You can implement the cross product only for the generic parameters for which this operation makes sense.

impl<T: Sub<Output = T> + Mul<Output = T> + Copy> Vec<T, 3> {
    fn cross_product(lhs: &Self, rhs: &Self) -> Self {
        Self([
            lhs.0[1] * rhs.0[2] - lhs.0[2] * rhs.0[1],
            lhs.0[2] * rhs.0[0] - lhs.0[0] * rhs.0[2],
            lhs.0[0] * rhs.0[1] - lhs.0[1] * rhs.0[0],
        ])
    }
}

However, the compiler won’t permit you to declare the same method in impl blocks that can match the same input generic parameters. Consider if we want to implement a method that adds a vector to itself:

impl<T: Add<Output = T> + Copy, const N: usize> Vec<T, N> {
    fn add_self(&self) -> Self {
        Self(self.0.map(|v| v + v))
    }
}

However, for vectors of boolean values (treated as elements of the GF(2) field), we know that this operation always returns a vector of false elements.

#[derive(Default, Clone, Copy)]
struct F2(bool);

impl Add for F2 {
    type Output = Self;

    fn add(self, rhs: Self) -> Self {
        Self(self.0 ^ rhs.0)
    }
}

impl<const N: usize> Vec<F2, N> {
    // error: duplicate definitions for `add_self`
    fn add_self(&self) -> Self {
        Self([F2::default(); N])
    }
}

Both stable and nightly Rust won’t allow you to do that; no specializations are permitted for structure method implementations. In some situations, there can be a workaround using the unstable negative_bounds feature.

#![feature(negative_bounds)]

...

trait IsF2 {}

impl<T: Add<Output = T> + Copy + !IsF2, const N: usize> Vec<T, N> {
    fn add_self(&self) -> Self {
        Self(self.0.map(|v| v + v))
    }
}

...

impl IsF2 for F2 {}

impl<const N: usize> Vec<F2, N> {
    fn add_self(&self) -> Self {
        Self([F2::default(); N])
    }
}

3. Trait generic implementations present a similar situation in stable Rust to what we have with structure implementations: all implementations of the same trait must cover non-intersecting sets of input types. Consider a scenario where we have a trait that returns the number of elements the trait value represents:

trait NumElements {
    fn num_elements(&self) -> usize;
}

impl<T> NumElements for T {
    fn num_elements(&self) -> usize {
        1
    }
}

impl<T> NumElements for [T] {
    fn num_elements(&self) -> usize {
        self.len()
    }
}

So far, so good. However, if we attempt to add an implementation for a tuple, we’ll encounter a conflicting implementations error:

impl<T, U> NumElements for (T, U) { // conflicting implementation for `(_, _)`
    fn num_elements(&self) -> usize {
        2
    }
}

The good news is that there is an unstable feature called min_specialization, which allows you to resolve that issue:


#![feature(min_specialization)]

trait NumElements {
    fn num_elements(&self) -> usize;
}

// NOTE `default` here
impl<T> NumElements for T {
    default fn num_elements(&self) -> usize {
        1
    }
}

impl<T> NumElements for [T] {
    fn num_elements(&self) -> usize {
        self.len()
    }
}

impl<T, U> NumElements for (T, U) {
    fn num_elements(&self) -> usize {
        2
    }
}

The min_specialization feature is a subset of a larger feature, unsurprisingly called specialization. This is a crucial language feature with significant implications. You can find the RFC here, which includes many examples explaining why it’s essential for building zero-cost abstractions, such as enabling efficient overloads for length-aware iterators and other functionalities. However, min_specialization exists because the current implementation of specialization is unsound: it can produce undefined behavior in code that isn’t marked unsafe.

4. Generic free function specializations are forbidden, which seems reasonable given that Rust doesn’t allow function overloading in the first place.

Constant parameters support

Not only types can serve as template or generic parameters; integer values can as well.

C++ has really good support for non-type template parameters. You can use any constexpr calculation when passing values there. Let’s use the following simple example for comparison: a 2D field with a scale method that creates a field of double the size.

#include <cstddef>

template <std::size_t M, std::size_t N, typename T>
class Field {
public:
    Field<2 * M, 2 * N, T> scale() const {
        // implementation
    }

private:
    T _data[M][N];
};

It works fine in C++, as do many far more complex examples. But what about Rust? The const generics feature was introduced in Rust 1.51.0 back in 2021. However, the problem is that you can’t really perform any arithmetic with it, not even the simplest operations like in the example above:

struct Field<const N: usize, const M: usize, T> {
    data: [[T; N]; M],
}

impl<const N: usize, const M: usize, T> Field<N, M, T> {
    // error: generic parameters may not be used in const operations
    // cannot perform const operation using `N`
    fn scale(self) -> Field<{ 2 * N }, { 2 * M }, T> {
        todo!()
    }
}

It doesn’t compile. Also, you have almost no way to use constant generic parameters in bounds (apart from a trick with [T; N]: Sized). There is an unstable generic_const_exprs feature that allows generic constant expressions, but it’s far from stabilization.

Afterword

So we’ve seen that C++ looks much better when dealing with compile-time calculations and generic code, especially when compared against stable Rust. One may think that most of the examples above showing Rust’s limitations are artificial, but believe me: as soon as you start implementing library or utility code for a big project, you’ll unavoidably face them. It is really upsetting that a language with so many mentions of “zero-cost abstractions” in its description puts so many obstacles in the way of creating your own zero-cost abstractions. Rust’s standard library is built with some unstable features enabled, several of which relate directly to generic/const functionality: const_trait_impl, min_specialization, negative_impls. That’s the best proof that there is demand for this functionality to create efficient abstractions. But is stabilizing and implementing these kinds of features a high priority? Taking a look at the Roadmap 2024, we can see some of the related items there:

  • generic associated types (already implemented)
  • const generics and constant evaluation
  • making the Fn traits “normal” traits
  • variadic tuples and variadic generics

Yes, at least some of the mentioned problems are identified as major, but not all of them.

In the next article, we’ll complete our list and delve into more exotic topics like compile-time reflection and macros. Stay tuned for more!
