Async C++/Rust Interoperability

Aida Getoeva
Apr 3, 2024

This article is a write-up of my talk for the Rust Nation UK 2024 conference.

Motivation: Why C++?

Rust seems like a great and safe alternative to C++. The desire to rewrite everything in Rust can be compelling: no foot-guns and all the benefits of a systems programming language. However, there are many reasons why there is still a place for C++. Let’s look at some of them:

  • Extra responsibilities. If there is a perfect solution out there, that is being actively developed and well supported, it’s very hard to find a reason for another one. Writing your own library, apart from the complexity of it, means taking all the responsibilities for maintaining and updating said library.
  • New code, new bugs. Designing a new solution from scratch may solve some existing problems but bring many new ones.
  • Limited resources. When resources are scarce, it’s hard to justify a C++ -> Rust rewrite. Yes, it can potentially improve maintainability and reduce tech debt, but it will also have a negative short-term impact.

Interoperability Tools

There are at least three main solutions for C++/Rust interoperability: Bindgen, CXX and the cpp! macro. We’ll focus on Bindgen and CXX, but the issues discussed below will most probably be relevant to any binding tool that doesn’t support seamless async.

Before we jump into the specifics, it’s important to keep in mind that C++ code is complex and none of the mentioned tools is currently capable of correctly interpreting inheritance and heavy use of templates. That means Rust may not know the size and structure of some of the exposed C++ types and cannot pass them by value.

That said, there are reasons why you may choose CXX over Bindgen.

Bindgen

Bindgen is the oldest and most widely used tool on the market. It can consume whole C++ libraries or lists of header files and produce extern “C” code compatible with Rust.

  • Supports seamless conversion between primitive types and raw pointers but not for the common complex types like string or vector.
  • No support for the C++ smart pointers means unsafe memory handling and lack of implicit destructors.
  • Bindgen handles neither constructors nor destructors and frequently requires writing a simplified C-like API around the types and functions being exposed to Rust.
  • Generated extern “C” functions are unsafe.
  • Bindgen operates on a best-effort basis: it processes a huge amount of code in one go and may silently produce wrong bindings that only fail at runtime.

CXX

The idea behind CXX is to provide safe and idiomatic bindings without much overhead. It takes a somewhat more manual approach, requiring the engineer to explicitly specify the types, functions and methods that need to be exposed between C++ and Rust in a special module annotated with #[cxx::bridge].

  • Supports seamless translation between vectors, strings and smart pointers.
  • The bridge is a two-way road between C++ and Rust: it lets you define types shared between both languages as well as types exposed from Rust to C++ and vice versa.
  • CXX employs static analysis to check the external function signatures to provide safe external bindings, i.e. extern “C” functions are safe.
  • CXX does only what it’s told to do, no silent errors, no surprises.
  • Smart pointers! Support for smart pointers reduces the burden of watching memory safety to a minimum and eliminates most of the potential issues.

Binding examples

Both CXX and Bindgen come with comprehensive, easy-to-follow guides on writing bindings.

Let’s assume that we want to bind a simple C++ library for MySQL. Here is what it might look like:

#include <memory>
#include <string>
#include <utility>

#include <folly/futures/Future.h>
#include "rust/cxx.h"

// represents a connection to the DB, allows running queries
class Connection;
// options for the connection: connection timeout, etc.
class ConnectionOptions;

// the client allows acquiring connections to the DB concurrently
class MysqlClient {
  // address of the DB shard we're connecting to
  std::string shard_;

 public:
  explicit MysqlClient(std::string db) : shard_(std::move(db)) {}
  ~MysqlClient();

  // connects to the DB and returns a Connection instance;
  // const so that CXX can bind it as `self: &MysqlClient`
  std::unique_ptr<Connection> connect_sync(const ConnectionOptions& opts) const;

  // asynchronously connects to the DB
  folly::SemiFuture<std::unique_ptr<Connection>>
  connect(const ConnectionOptions& opts);
};

// a simple wrapper that creates a client instance wrapped in a
// shared pointer; rust::Str is how CXX passes a Rust &str to C++
std::shared_ptr<MysqlClient> new_mysql_client(rust::Str db) {
  return std::make_shared<MysqlClient>(std::string(db));
}

Here is how the Rust side might look using CXX.
We will only expose synchronous code for now.

#[cxx::bridge]
mod ffi {
    unsafe extern "C++" {
        // header path is a placeholder for wherever the C++ API lives
        include!("mysql/client.h");

        // opaque C++ types
        type MysqlClient;
        type Connection;
        type ConnectionOptions;

        fn new_mysql_client(db: &str) -> SharedPtr<MysqlClient>;

        // client member function
        fn connect_sync(
            self: &MysqlClient,
            opts: &ConnectionOptions,
        ) -> UniquePtr<Connection>;
    }
}

// a simple wrapper type for the client
struct RustMysqlClient {
    inner: cxx::SharedPtr<ffi::MysqlClient>,
}

impl RustMysqlClient {
    fn new(db: &str) -> Self {
        Self {
            inner: ffi::new_mysql_client(db),
        }
    }

    // the options are an opaque C++ object, so the caller passes them in
    fn connect(&self, opts: &ffi::ConnectionOptions) -> cxx::UniquePtr<ffi::Connection> {
        self.inner.connect_sync(opts)
    }
}

Async Programming

Why is asynchronous programming tricky? What is so different between the C++ and Rust async APIs that they need special treatment?

An asynchronous function call is not just another sequential block of code we can run. It’s a task that is going to be executed at some point in time “independently” from the main flow of the program, concurrently with it and with other tasks within the program. Let’s look at the specifics for C++ and Rust.

In the examples below I’m using Folly (C++ library) and Tokio (Rust crate) for the async runtime, but the content of this article is library-agnostic.

In C++ the result of an async function call is not a value, it’s a promise of a value. You can think of it like a container that is empty for now and is fulfilled as soon as the execution completes. The call immediately schedules a task to be executed but it doesn’t let you know how much time it’s actually going to take, instead it gives you an API to access the result when ready.

Async Rust works similarly in a way, but has one interesting difference: a call to the async function does not really promise you anything... just yet. The resulting object, a future, is not going to do much unless it’s specifically polled, for example, as a part of an await call. Only then does it promise to complete and return the value.
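
To make the laziness concrete, here is a minimal std-only sketch. The names connect, NoopWaker and lazy_demo are made up for illustration, and the single hand-written poll stands in for what a runtime such as Tokio does when you .await.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing; enough to poll a future that never suspends.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical async function: records that its body actually started.
async fn connect(started: Arc<AtomicBool>) -> u32 {
    started.store(true, Ordering::SeqCst);
    42
}

/// Returns (started before polling, started after polling, returned value).
fn lazy_demo() -> (bool, bool, u32) {
    let started = Arc::new(AtomicBool::new(false));

    // Calling the async fn only builds the future; the body has not run yet.
    let fut = connect(Arc::clone(&started));
    let before = started.load(Ordering::SeqCst);

    // Polling is what drives the body forward.
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    let value = match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => value,
        // This future has no await points, so one poll completes it.
        Poll::Pending => unreachable!("connect never suspends"),
    };

    (before, started.load(Ordering::SeqCst), value)
}
```

Calling lazy_demo() shows the flag flipping only after the poll, which is exactly the gap a C++ future does not have: it is already scheduled by the time the call returns.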

The challenge with async bindings is that neither the Rust nor the C++ runtime is aware of the other one. With concurrent tasks running in both of them, we need to come up with a way to share the state of the futures across the runtime border.

Running Async C++ from Rust

The big question now is: how do we make the Rust and C++ runtimes talk? Assume we want to connect to the MySQL server using an async method:

// async connect
folly::SemiFuture<std::unique_ptr<Connection>>
MysqlClient::connect(const ConnectionOptions& opts);

As mentioned, a non-waiting call to the async C++ method will schedule the future to be executed, so let us do just that. Let’s make a sync call on the Rust side to the FFI connect and leave it at that.

What do we have now? There is a future on the C++ side that is trying to connect to Mysql, but we don’t have any idea when that will actually happen.

We want to make the Rust runtime aware of the C++ future’s completion

A naive way to make it work is to expose the C++ future type to Rust together with all the API necessary to poll the future and check for its completion. We can check whether the future object is ready in a loop and sleep in between the calls. The problem though… it doesn’t feel very asynchronous.

The reason we’re dealing with async here is that we want to keep things async in Rust, for example, due to heavy IO within the program. We can make the sleep async (see tokio::time::sleep), which solves some of the problems, but not all. The amount of boilerplate and a not-so-sound way of communicating make this approach tedious.

One way to do it is using the good old ready?-sleep! loop
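
As a rough sketch of that loop: CppFutureHandle below is a hypothetical stand-in for the exposed C++ future, and a plain thread plays the role of the C++ runtime. Nothing here is a real binding API.

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a C++ future handle exposed over FFI.
struct CppFutureHandle {
    slot: Arc<Mutex<Option<String>>>,
}

impl CppFutureHandle {
    /// Simulates the C++ runtime finishing the work on its own thread.
    fn start(delay: Duration) -> Self {
        let slot: Arc<Mutex<Option<String>>> = Arc::new(Mutex::new(None));
        let writer = Arc::clone(&slot);
        thread::spawn(move || {
            thread::sleep(delay);
            *writer.lock().unwrap() = Some("connection".to_string());
        });
        Self { slot }
    }

    /// Non-blocking check, the "ready?" half of the loop.
    fn try_take(&self) -> Option<String> {
        self.slot.lock().unwrap().take()
    }
}

/// The naive wait: check, sleep, repeat.
/// Returns the result and how many checks it took.
fn wait_by_polling(handle: &CppFutureHandle, interval: Duration) -> (String, u32) {
    let mut checks = 0;
    loop {
        checks += 1;
        if let Some(value) = handle.try_take() {
            return (value, checks);
        }
        // In async code this would be an async sleep instead,
        // but the busy checking itself stays.
        thread::sleep(interval);
    }
}
```

The interval is a trade-off with no good answer: a short one burns CPU on useless checks, a long one adds latency after the future has already completed.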

Let’s now think about something out of a Computer Science class, more specifically inter-process communication. How would you implement a game of ping-pong between two processes?
That’s right, using a channel!

We can define an async channel on the Rust side and ask C++ to send the result of the future back through the channel. Oneshot is a good candidate: it lets us synchronously send a single message and asynchronously receive it.

This means that, from the Rust perspective, the first connect call stays sync, while the async part happens while waiting on the receiver!

I guess we could also use a pigeon post…
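
To show the shape of that handshake without any bindings, here is a toy std-only sketch. The oneshot, block_on and fake_cpp_connect names are all invented for the example; a real project would use an existing oneshot channel (for instance the one shipped with Tokio) and a real runtime. Unlike a production channel, this toy one does not report a dropped sender.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::Duration;

struct Shared<T> {
    value: Option<T>,
    waker: Option<Waker>,
}

struct Sender<T>(Arc<Mutex<Shared<T>>>);
struct Receiver<T>(Arc<Mutex<Shared<T>>>);

fn oneshot<T>() -> (Sender<T>, Receiver<T>) {
    let shared = Arc::new(Mutex::new(Shared { value: None, waker: None }));
    (Sender(Arc::clone(&shared)), Receiver(shared))
}

impl<T> Sender<T> {
    /// Synchronous send: store the value, then wake the waiting task.
    fn send(self, value: T) {
        let mut shared = self.0.lock().unwrap();
        shared.value = Some(value);
        let waker = shared.waker.take();
        drop(shared); // release the lock before waking
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

impl<T> Future for Receiver<T> {
    type Output = T;

    /// Asynchronous receive: ready once the sender has stored a value.
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut shared = self.0.lock().unwrap();
        if let Some(value) = shared.value.take() {
            return Poll::Ready(value);
        }
        shared.waker = Some(cx.waker().clone());
        Poll::Pending
    }
}

/// Wakes the blocked thread; part of the tiny executor below.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal executor: poll, park until woken, poll again.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Stand-in for the C++ side: works on its own thread, then reports back.
fn fake_cpp_connect(tx: Sender<Result<String, String>>) {
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(10));
        tx.send(Ok("connection".to_string()));
    });
}

async fn connect() -> Result<String, String> {
    let (tx, rx) = oneshot();
    fake_cpp_connect(tx); // sync call, returns immediately
    rx.await // the async part: suspended until the "C++" thread sends
}
```

Running block_on(connect()) parks the calling thread until the other thread sends, with no readiness checks in between.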

Now it’s time to talk about the “boring” stuff: how is it actually implemented?

As you may have noticed, C++ doesn’t know anything about the Rust channel type. And the fun thing: it doesn’t need to. We can define a couple of simple callbacks in Rust that will know how to talk to the channel and expose those callbacks to C++ as a couple of function-typed arguments.

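use cxx::{CxxString, UniquePtr};
use tokio::sync::oneshot;

#[cxx::bridge]
mod ffi {
    extern "Rust" {
        // opaque Rust type, handed to C++ only behind a Box
        type Transmitter;
    }

    unsafe extern "C++" {
        // header path is a placeholder for wherever the shim lives
        include!("mysql/shim.h");

        // in a real crate these live in the same bridge as the sync API
        type MysqlClient;
        type Connection;
        type ConnectionOptions;

        fn connect(
            client: SharedPtr<MysqlClient>,
            options: &ConnectionOptions,
            ok: fn(Box<Transmitter>, UniquePtr<Connection>),
            fail: fn(Box<Transmitter>, &CxxString),
            tx: Box<Transmitter>,
        ) -> Result<()>;
    }
}

// Rust-side wrapper around the C++ connection
struct Connection(UniquePtr<ffi::Connection>);
// result of the connect call, with String as the error type
struct ResultPayload(Result<Connection, String>);
// a simple wrapper for the sender part of the channel
pub struct Transmitter(oneshot::Sender<ResultPayload>);

// callback in case of a successful result
fn ok(tx: Box<Transmitter>, conn: UniquePtr<ffi::Connection>) {
    // send only fails if the receiver is already gone
    let _ = tx.0.send(ResultPayload(Ok(Connection(conn))));
}

// callback in case of an exception
fn fail(tx: Box<Transmitter>, error: &CxxString) {
    let message = error.to_string_lossy().into_owned();
    let _ = tx.0.send(ResultPayload(Err(message)));
}

// now we can async connect to MySQL
async fn connect(
    client: &RustMysqlClient,
    opts: &ffi::ConnectionOptions,
) -> Result<Connection, String> {
    // create a channel
    let (tx, rx) = oneshot::channel();

    // sync connect to start the future execution
    ffi::connect(client.inner.clone(), opts, ok, fail, Box::new(Transmitter(tx)))
        .map_err(|e| e.to_string())?;
    // async wait on the channel to get the future result back to Rust
    let ResultPayload(result) = rx.await.map_err(|e| e.to_string())?;

    result
}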

In C++ we have to implement a simple shim API that knows what to do with the callbacks:

// `Transmitter` is declared in the header CXX generates for the bridge

// definition for `ok` callback type
using OkCallback = ::rust::Fn<void(
    ::rust::Box<Transmitter>,
    std::unique_ptr<Connection>
)>;
// definition for `fail` callback type
using ErrCallback = ::rust::Fn<void(
    ::rust::Box<Transmitter>,
    const std::string&
)>;

void connect(
    std::shared_ptr<MysqlClient> client,
    const ConnectionOptions& opts,
    OkCallback ok,
    ErrCallback fail,
    ::rust::Box<Transmitter> tx
) {
  client->connect(opts)
      .via(get_executor()) // schedule the future on the executor
      // capturing `client` keeps it alive until the future completes
      .thenTry([client, ok, fail, tx = std::move(tx)](auto&& res) mutable {
        if (res.hasValue()) {
          // successful result, let's send it back through `ok`
          (*ok)(std::move(tx), std::move(res.value()));
        } else {
          // not such a successful one...
          (*fail)(std::move(tx), std::string("Failed!"));
        }
      });
}

Issues Within Async C++/Rust

There wouldn’t be this talk here if things were all sunshine and roses :)
If I have ever hit a segmentation fault in Rust, it was while working on the async bindings for C++. So… let’s look at some of the things that may (and probably will) go wrong.

Thread safety

Send and Sync traits represent two fundamental concepts in async Rust:

  • Send means the object is safe to send between threads.
  • Sync means the object is safe to share between threads.

Send/Sync are unsafe traits. Although they’re automatically implemented for composite types made entirely of Send/Sync parts, there are times when you have to decide for yourself whether to implement the traits for your type.

A type that is not Send/Sync can still be used in async Rust code, but it significantly limits the async patterns available. For example, a future holding a non-Send value cannot be spawned as a task that may run on another thread.

The tricky part within the bindings is that the engineer is fully responsible for figuring out whether the C++ type being exposed is Send and Sync safe.

Unfortunately, the only way out of this is to have a comprehensive understanding of the C++ code, or very good comments and some trust :)

It’s easy to copy-paste stuff, and it is common for types to implement both the Send and Sync traits. Keep in mind that having one does not imply the other. For example, a MySQL connection is perfectly Send-safe but definitely not Sync-safe.

Copy-paste bugs are the worst :/
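
A small sketch of a type that is Send but deliberately not Sync. RawConnection here is a Rust stand-in for an opaque C++ object, and the safety comment states an assumption about that imaginary C++ type, not a fact about any real library. With CXX the same decision is usually expressed as an unsafe impl Send on the opaque bridge type.

```rust
use std::ptr::NonNull;
use std::thread;

/// Stand-in for an opaque C++ connection object (hypothetical).
struct RawConnection {
    queries_run: u32,
}

/// Owning wrapper around a raw pointer. Raw pointers are neither Send
/// nor Sync, so the compiler derives neither trait for this type.
struct Connection {
    ptr: NonNull<RawConnection>,
}

// SAFETY (assumed for this sketch): the connection has no thread affinity,
// so moving ownership to another thread is fine. There is intentionally no
// `unsafe impl Sync`: two threads using one connection at once would race.
unsafe impl Send for Connection {}

impl Connection {
    fn open() -> Self {
        let raw = Box::into_raw(Box::new(RawConnection { queries_run: 0 }));
        Self {
            ptr: NonNull::new(raw).expect("Box::into_raw never returns null"),
        }
    }

    /// Takes `&mut self`, so only the current owner can run queries.
    fn query(&mut self) -> u32 {
        // SAFETY: `ptr` is valid and uniquely owned by `self`.
        let raw = unsafe { self.ptr.as_mut() };
        raw.queries_run += 1;
        raw.queries_run
    }
}

impl Drop for Connection {
    fn drop(&mut self) {
        // SAFETY: the pointer came from Box::into_raw and is freed once.
        unsafe { drop(Box::from_raw(self.ptr.as_ptr())) };
    }
}

/// Compile-time check: only compiles if T is Send.
fn assert_send<T: Send>() {}

/// Uses the connection here, then moves it to another thread.
fn run_on_other_thread() -> u32 {
    assert_send::<Connection>();
    let mut conn = Connection::open();
    conn.query();
    thread::spawn(move || conn.query()).join().unwrap()
}
```

Wrapping the same Connection in an Arc and sharing it between two threads would be rejected by the compiler, which is the behaviour we want for a type that is Send only.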

Lost pointers

Now, let’s assume that either we’re using Bindgen, which doesn’t support smart pointers, or MysqlClient::connect(..) returns a raw pointer. Either way, we have to send a raw pointer over the channel.

What if the channel receiver drops before it reads the message? This can very well happen when you race multiple tasks and one of them decides to panic or when you set a time limit on the async task. The receiver is being destroyed, but not the C++ side. C++ still continues to execute the future and sends the result over the channel. Because the receiver is gone, the objects within the channel will be destroyed too.

So… what’s going to happen to the newly acquired connection object?
It’s leaked!

It goes straight into the abyss of leaked memory

The object within the channel is a raw pointer, meaning it’s just an address. Rust doesn’t know anything about it, neither size/structure nor which destructors should be used to free the memory. When the channel starts to unwind, the connection pointer will be destroyed but not the connection itself. Thus it’s leaked.

Apart from leaking memory, this can also lead to nasty deadlocks.

A connection pool is a way to reuse connections to the server and save on round-trips. If there is an open and free connection in the pool, it will be returned on connect(..). If all the connections are busy, connect(..) will wait until one gets freed.

Now, let’s assume we have a pool of two connections. We are trying to acquire two connections in a row, limiting each task to a maximum of 5ms. If the connect(..) future takes longer to execute, it will be dropped before the connection is ready, dropping the receiver and whatever is within the channel along with it.

When we call connect(..) again, we’ll hit a deadlock: all connections in the pool are leaked and not freed, meaning the pool still thinks they’re in use.

At this point we’ll be waiting either until one of the connections expires (if there is an expiration time set) or… forever (:

Deadlock on connection pool ☠️

How do we deal with it?
Smart pointers. Smart pointers or explicit destructors:

  • If there is a way to use smart pointers, use them. When a smart pointer is being freed, it does call the destructor.
  • Otherwise (for example, in Bindgen), wrap the raw pointer into a struct that knows how to free the memory. You are required to expose the destructor and explicitly call it from Drop::drop().

Be smart, drop things
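
A sketch of the second option, using only the standard library. connection_new and connection_free are written in Rust here but stand in for functions a C++ library would expose, and a std mpsc channel plays the part of the oneshot. All names are invented for the example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};

/// Stand-in for the C++ object; its destructor updates a live counter.
struct RawConnection {
    live: Arc<AtomicUsize>,
}

impl Drop for RawConnection {
    fn drop(&mut self) {
        self.live.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Stand-in for the exposed C++ constructor.
fn connection_new(live: &Arc<AtomicUsize>) -> *mut RawConnection {
    live.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(RawConnection { live: Arc::clone(live) }))
}

/// Stand-in for the exposed C++ destructor.
unsafe fn connection_free(ptr: *mut RawConnection) {
    // SAFETY: the caller passes a pointer from connection_new, exactly once.
    drop(unsafe { Box::from_raw(ptr) });
}

/// Owning wrapper: the struct that knows how to free the memory.
struct OwnedConnection(*mut RawConnection);

impl Drop for OwnedConnection {
    fn drop(&mut self) {
        // SAFETY: the wrapper is the only owner of the pointer.
        unsafe { connection_free(self.0) };
    }
}

/// Sends a bare pointer to a receiver that is already gone.
/// Returns how many connections are still alive afterwards.
fn live_after_send_raw() -> usize {
    let live = Arc::new(AtomicUsize::new(0));
    let (tx, rx) = mpsc::channel::<*mut RawConnection>();
    drop(rx); // e.g. the awaiting task timed out
    let _ = tx.send(connection_new(&live)); // the pointer is dropped, the object is not
    live.load(Ordering::SeqCst)
}

/// Same scenario, but the message owns the connection.
fn live_after_send_owned() -> usize {
    let live = Arc::new(AtomicUsize::new(0));
    let (tx, rx) = mpsc::channel::<OwnedConnection>();
    drop(rx);
    let _ = tx.send(OwnedConnection(connection_new(&live))); // Drop runs the destructor
    live.load(Ordering::SeqCst)
}
```

The raw variant leaves one connection alive with nobody left to free it, while the owning wrapper brings the count back to zero even though the message was never received.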

Lifetime issue

Probably the most important feature of Rust is being memory safe. Objects in Rust are either exclusively owned, read-only shared or explicitly declared and safe for the write-read sharing.

What if C++ gets access to an object owned exclusively by Rust?

Let’s consider this scenario. The C++ connect(..) function takes a raw pointer to the client as an argument. The MySQL client object is exclusively owned by Rust, it is Sync-safe, and we would like to access the client and acquire multiple connections concurrently. That means we cannot give up ownership when calling C++, but we must provide a pointer to C++ anyway.

async fn connect(client: Arc<MysqlClient>) -> Result<Connection, Error> {
    // C++ receives a bare pointer and knows nothing about the Arc
    ffi::connect(Arc::as_ptr(&client), ok, fail)
    ...
}

C++ accesses the client object exclusively owned by Rust

What if Rust runtime starts to shut down?
The stack unwinds, the objects are being destroyed, the memory deallocated.

The C++ runtime is not bound to Rust. If the connect future hasn’t finished executing, it will continue running until… until the client object suddenly goes away.

SEGFAULT!

Have you seen SEGFAULT in Rust? Now you have!

This happens because Rust thinks that it has exclusive ownership over the client object; it is not aware of the C++ side using the client. On shutdown, as the client users go away, the reference count of the Arc<..> drops to 0 and the MySQL client within it gets freed.
But it is still used by C++!

How do we make it work?

Well… smart shared pointers! If it’s not possible (Bindgen), please read further :)

We need to fix the issue at its root: bind the Rust runtime to the C++ one. Make Rust aware that the client is used by C++ and extend its lifetime for as long as the C++ future is running. The easiest way to do that is to explicitly share a clone of the object with C++, meaning the reference count will never drop to 0 while the C++ future is still executing.

Share things responsibly

You may think that this is not possible… how would C++ understand how to use Arc<..>? The answer, it cannot understand it and doesn’t need to. We can share a clone of the object just as a “luggage”, an additional weight with the only goal: to be alive until C++ is done with the future.

Basically, C++ will still take a raw pointer to the client, but as another argument it will receive a blob, a pointer to some Rust object (clone of Arc). C++ only needs to send this blob back to Rust over the channel as soon as it’s ready. Then Rust will free it and proceed with whatever it’s doing.

This is an example of Rust code sending a clone over to C++:

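use std::ffi::c_void;
use std::sync::Arc;

async fn connect(client: Arc<MysqlClient>) -> Result<Connection, Error> {
    let (tx, rx) = oneshot::channel();
    // the "luggage": the sender plus a clone that keeps the client alive
    let context = Box::new((tx, client.clone()));

    ffi::connect(
        Arc::as_ptr(&client),
        Box::into_raw(context) as *mut c_void,
        ok,
        fail,
    )?;

    // the luggage comes back over the channel together with the result
    let ResultPayload(result, _context) = rx.await?;
    result
}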
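
The snippet above leaves out what happens to the blob on its way back. Here is a self-contained sketch of the whole round trip, with no real FFI involved: PendingCppFuture is a made-up stand-in for the C++ future, a std mpsc channel replaces the oneshot, and in this variant the callback simply drops the clone once the result is sent.

```rust
use std::ffi::c_void;
use std::sync::{mpsc, Arc};

struct MysqlClient {
    shard: String,
}

/// The "luggage": the sender plus a clone that keeps the client alive.
type Luggage = (mpsc::Sender<String>, Arc<MysqlClient>);

/// Callback handed to "C++": rebuilds the box, sends the result and
/// only then lets go of the client clone.
fn ok(context: *mut c_void, message: String) {
    // SAFETY: `context` comes from Box::into_raw on a Box<Luggage>
    // and is consumed exactly once.
    let luggage = unsafe { Box::from_raw(context as *mut Luggage) };
    let (tx, _client) = *luggage;
    let _ = tx.send(message);
    // `_client` is dropped here, after C++ is done with the raw pointer.
}

/// Stand-in for a running C++ future that only holds raw pointers.
struct PendingCppFuture {
    client: *const MysqlClient,
    context: *mut c_void,
    ok: fn(*mut c_void, String),
}

impl PendingCppFuture {
    fn complete(self) {
        // SAFETY: the luggage guarantees the client is still alive.
        let shard = unsafe { (*self.client).shard.clone() };
        (self.ok)(self.context, format!("connected to {shard}"));
    }
}

/// Returns (owners while pending, received message, owners afterwards).
fn luggage_demo() -> (usize, String, usize) {
    let client = Arc::new(MysqlClient { shard: "shard-1".to_string() });
    let (tx, rx) = mpsc::channel();

    let context: Box<Luggage> = Box::new((tx, Arc::clone(&client)));
    let pending = PendingCppFuture {
        client: Arc::as_ptr(&client),
        context: Box::into_raw(context) as *mut c_void,
        ok,
    };

    // Rust's own handle goes away, e.g. during shutdown.
    let weak = Arc::downgrade(&client);
    drop(client);
    let owners_while_pending = weak.strong_count();

    pending.complete();
    let message = rx.recv().unwrap();
    (owners_while_pending, message, weak.strong_count())
}
```

While the future is pending, the only remaining owner of the client is the luggage, so the raw pointer stays valid; once the callback has run, the count drops to zero and the client is freed on the Rust side.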

The End

…and thus C++ and Rust lived happily ever after.
