Forbidden diaries of Pandemonium. Part 1. Tips for calling C from Rust.
In this series of posts, I will share my experience building
libPandæmonium. In this first chapter, I will talk about interesting quirks of FFI in Rust. They apply to libraries outside of FreeBSD as well.
What is Pandæmonium?
It’s easier to describe the goal of Pandæmonium: simplified interaction with the FreeBSD ecosystem. The end goal of this project is to have a jail-based container environment. The stretch goal is to make it work with k8s. Something like this:
Linking to C library
A common pattern is to create a
<libname>-sys crate that does nothing but surface the C API and tell the linker what to link against. You have to use a separate
-sys crate because at most one crate in the dependency graph can link to a given native library.
There are three ways of doing it:
- Use bindgen at compile time
- Use the bindgen CLI tool
- Write the bindings yourself
bindgen didn’t work for me, so I wrote them myself.
Fresh Object Pattern
A common pattern in C is that the caller doesn’t need to know the internal structure of an object and only holds a pointer to it. In that case the library usually has create and
destroy calls. For example, the C header:
The corresponding Rust code:
The C create function returns a pointer to some
nvlist_t structure. In this case, we don't care about the internals of
nvlist_t. However, we want to have some type safety, and using a bare untyped pointer would give even worse type safety than C.
In this case, it’s very common to create an empty enum, because you can’t instantiate it from Rust. Another common way is to create a structure that holds a zero-size array. I don’t recommend it because, unless the field is private, you can instantiate such structures and mess things up. Whichever you pick, only ever handle the type through raw pointers, never through references.
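As a minimal sketch of the opaque-type idea, the block below declares nvlist_t as an empty enum and gives nvlist_create and nvlist_destroy the shapes I would expect a -sys crate to expose. The signatures are my assumption of what the header looks like, and the Rust function bodies (plus the Hidden struct) are hypothetical stand-ins so the example runs without the C library.

```rust
use std::os::raw::c_int;

/// Opaque handle. An empty enum has no values, so safe Rust can never build
/// one. It should only ever appear behind a raw pointer.
#[allow(non_camel_case_types)]
pub enum nvlist_t {}

// In a real -sys crate these two would be plain declarations (assumed shape):
//
//     extern "C" {
//         pub fn nvlist_create(flags: c_int) -> *mut nvlist_t;
//         pub fn nvlist_destroy(nvl: *mut nvlist_t);
//     }
//
// The Rust bodies below are stand-ins for the C side.

/// Hypothetical stand-in for whatever the C library keeps behind the pointer.
struct Hidden {
    _flags: c_int,
}

/// Stand-in for the C create call: hands out an opaque pointer.
pub unsafe extern "C" fn nvlist_create(flags: c_int) -> *mut nvlist_t {
    Box::into_raw(Box::new(Hidden { _flags: flags })) as *mut nvlist_t
}

/// Stand-in for the C destroy call: frees what create allocated.
pub unsafe extern "C" fn nvlist_destroy(nvl: *mut nvlist_t) {
    if !nvl.is_null() {
        drop(unsafe { Box::from_raw(nvl as *mut Hidden) });
    }
}
```

Rust code can pass the *mut nvlist_t around, but it has no way to look inside or to construct an nvlist_t of its own.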
You might think just calling that method will take care of everything and call it a day. Well…no. There are a number of reasons C might fail to initialize your thing — in our case it can only happen if allocation failed a.k.a.
Create and Destroy
Alright, now you have unsafe functions to create and destroy an object. How do you actually create it?
First, you have to create a safe wrapper structure around that unsafe pointer:
Again, we don’t care about the internals — all we care about is the pointer to that structure.
We make an unsafe call and check whether the pointer is actually pointing somewhere. If it is, we return our wrapper; if not, we return an error.
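A sketch of that wrapper and constructor, under some assumptions: NvError and the LIVE counter are hypothetical names I introduce for illustration, and nvlist_create is a Rust stand-in for the C function that merely counts allocations.

```rust
use std::os::raw::c_int;
use std::sync::atomic::{AtomicUsize, Ordering};

#[allow(non_camel_case_types)]
pub enum nvlist_t {}

/// Hypothetical counter: how many lists the stand-in "C side" has allocated.
pub static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the C create call; the real one returns null on failure.
pub unsafe extern "C" fn nvlist_create(_flags: c_int) -> *mut nvlist_t {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(0u64)) as *mut nvlist_t
}

/// Hypothetical error type for the safe API.
#[derive(Debug)]
pub enum NvError {
    AllocationFailed,
}

/// Safe wrapper: all it holds is the raw pointer.
pub struct NvList {
    raw: *mut nvlist_t,
}

impl NvList {
    pub fn new(flags: c_int) -> Result<NvList, NvError> {
        let raw = unsafe { nvlist_create(flags) };
        if raw.is_null() {
            Err(NvError::AllocationFailed)
        } else {
            Ok(NvList { raw })
        }
    }

    pub fn as_ptr(&self) -> *mut nvlist_t {
        self.raw
    }
}
```

With only this much code, letting an NvList go out of scope leaves LIVE at 1: nothing ever hands the pointer back to the library.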
Nope, we just created a memory leak. If
NvList goes out of scope, Rust drops the pointer, but the structure it was pointing to lives on. You have to implement the
Drop trait on it:
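A sketch of what that could look like. As before, nvlist_create and nvlist_destroy are Rust stand-ins for the C functions, and the LIVE counter is a hypothetical helper that only exists to make the effect of Drop visible.

```rust
use std::os::raw::c_int;
use std::sync::atomic::{AtomicUsize, Ordering};

#[allow(non_camel_case_types)]
pub enum nvlist_t {}

/// Hypothetical counter of lists currently alive on the stand-in "C side".
pub static LIVE: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the C create call.
pub unsafe extern "C" fn nvlist_create(_flags: c_int) -> *mut nvlist_t {
    LIVE.fetch_add(1, Ordering::SeqCst);
    Box::into_raw(Box::new(0u64)) as *mut nvlist_t
}

/// Stand-in for the C destroy call.
pub unsafe extern "C" fn nvlist_destroy(nvl: *mut nvlist_t) {
    if !nvl.is_null() {
        drop(unsafe { Box::from_raw(nvl as *mut u64) });
        LIVE.fetch_sub(1, Ordering::SeqCst);
    }
}

pub struct NvList {
    raw: *mut nvlist_t,
}

impl NvList {
    pub fn new(flags: c_int) -> Option<NvList> {
        let raw = unsafe { nvlist_create(flags) };
        if raw.is_null() {
            None
        } else {
            Some(NvList { raw })
        }
    }
}

impl Drop for NvList {
    fn drop(&mut self) {
        // `new` never stores a null pointer, so `raw` is valid here and is
        // handed back to the library exactly once.
        unsafe { nvlist_destroy(self.raw) };
    }
}
```

Now the list is destroyed automatically at the end of the scope that owns the wrapper.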
Flags in C are usually just bits, meaning flag_a is
0b001 and flag_b is
0b010 and both of them together are
0b011. It's up to you how to implement it: use the
bitflags crate or just write it by hand:
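A minimal hand-written version might look like the sketch below. The Flags type and the FLAG_A and FLAG_B constants are illustrative names that mirror the bit values in the text, not the library's real flag names.

```rust
use std::ops::BitOr;
use std::os::raw::c_int;

/// Hand-rolled flag set wrapping the integer that C expects.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Flags(c_int);

impl Flags {
    pub const NONE: Flags = Flags(0);
    pub const FLAG_A: Flags = Flags(0b001);
    pub const FLAG_B: Flags = Flags(0b010);

    /// The raw value to hand to the C function.
    pub fn bits(self) -> c_int {
        self.0
    }

    /// True if every bit set in `other` is also set in `self`.
    pub fn contains(self, other: Flags) -> bool {
        self.0 & other.0 == other.0
    }
}

impl BitOr for Flags {
    type Output = Flags;

    fn bitor(self, rhs: Flags) -> Flags {
        Flags(self.0 | rhs.0)
    }
}
```

Callers combine flags with `Flags::FLAG_A | Flags::FLAG_B` and the wrapper passes `bits()` across the FFI boundary.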
Errors in C
Unlike Rust, there is no
Result in C. There are a few error-reporting patterns used in C; some of them:
- The structure stores its own error code
- The library keeps a global error code
- The function takes a pointer to where it will write the error code
The library I’m interacting with uses the first one. C header:
We pass a pointer to the structure to the library, and the library gives us an error number, where 0 means no error.
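On the Rust side, the "0 means no error" convention maps naturally onto Result. The sketch below assumes the error numbers are errno-style values, which is my assumption rather than something stated above; check is a hypothetical helper name.

```rust
use std::io;
use std::os::raw::c_int;

/// Turn a C-style error number (0 = success) into a `Result`.
/// Assumes the code is an errno-style value.
pub fn check(code: c_int) -> Result<(), io::Error> {
    if code == 0 {
        Ok(())
    } else {
        Err(io::Error::from_raw_os_error(code))
    }
}
```

A safe wrapper would call the library's error getter after each operation and feed the returned number through a helper like this.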
Passing things like i32, bool, and null is very easy. Please note that in my case the library performs a copy. That means the Rust side can safely drop the value after insertion. However, the library also has corresponding move methods that take ownership of the memory, which means it’s on you to make sure Rust won’t free that memory too early. That’s how you get a use-after-free.
If you thought strings in Rust are hard and complicated…Well, welcome to C…
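*const i8 is just a pointer to the bytes of a CString (C’s char is unsigned on some platforms, so std::os::raw::c_char is the portable spelling). Strings in C are NUL-terminated, which means a
CString is the same as a Rust string followed by
\x00. If your string contains a NUL byte, it will fail conversion to CString.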
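A small sketch of the conversion. c_length is a hypothetical Rust stand-in for any C function that takes a const char *, so the example runs without a C library; CString::new and as_ptr are the real standard-library calls.

```rust
use std::ffi::{CStr, CString, NulError};
use std::os::raw::c_char;

/// Stand-in for a C function taking `const char *`: reads up to the first NUL.
pub unsafe extern "C" fn c_length(s: *const c_char) -> usize {
    let c_str = unsafe { CStr::from_ptr(s) };
    c_str.to_bytes().len()
}

/// Safe wrapper: converts the Rust string, then lends the pointer to "C".
pub fn length(s: &str) -> Result<usize, NulError> {
    // Fails if `s` contains a NUL byte anywhere.
    let c_string = CString::new(s)?;
    // `c_string` stays alive for the whole call, so the pointer is valid.
    Ok(unsafe { c_length(c_string.as_ptr()) })
}
```

Keeping the CString in a named variable for the duration of the call is what makes the pointer safe to hand over.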
Null pointer in Rust
Rust does have null pointers. It’s just impossible to run into one without unsafe code or explicitly using raw pointer types. To make a safe wrapper around
nvlist_add_null in Rust:
nvlist_add_null returns nothing, so to check whether the insertion actually went through you have to manually check the error code:
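A sketch of such a wrapper. Everything on the "C side" here (Hidden, the function bodies, the choice to report a duplicate name as EEXIST) is a Rust stand-in I made up so the example runs. Only the overall shape, call the function and then ask the structure for its error code, is taken from the text.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

const EEXIST: c_int = 17; // assumed errno value, for illustration
const EINVAL: c_int = 22; // assumed errno value, for illustration

#[allow(non_camel_case_types)]
pub enum nvlist_t {}

/// Stand-in for the C structure, which stores its own error code.
struct Hidden {
    names: Vec<String>,
    error: c_int,
}

pub unsafe extern "C" fn nvlist_create(_flags: c_int) -> *mut nvlist_t {
    Box::into_raw(Box::new(Hidden { names: Vec::new(), error: 0 })) as *mut nvlist_t
}

pub unsafe extern "C" fn nvlist_destroy(nvl: *mut nvlist_t) {
    drop(unsafe { Box::from_raw(nvl as *mut Hidden) });
}

/// Returns nothing; failures are recorded inside the structure.
pub unsafe extern "C" fn nvlist_add_null(nvl: *mut nvlist_t, name: *const c_char) {
    let list = unsafe { &mut *(nvl as *mut Hidden) };
    let name = unsafe { CStr::from_ptr(name) }.to_string_lossy().into_owned();
    if list.names.contains(&name) {
        list.error = EEXIST;
    } else {
        list.names.push(name);
    }
}

pub unsafe extern "C" fn nvlist_error(nvl: *const nvlist_t) -> c_int {
    unsafe { (*(nvl as *const Hidden)).error }
}

pub struct NvList {
    raw: *mut nvlist_t,
}

impl NvList {
    pub fn new() -> NvList {
        let raw = unsafe { nvlist_create(0) };
        assert!(!raw.is_null());
        NvList { raw }
    }

    pub fn add_null(&mut self, name: &str) -> Result<(), c_int> {
        let c_name = CString::new(name).map_err(|_| EINVAL)?;
        unsafe {
            nvlist_add_null(self.raw, c_name.as_ptr());
            // The call itself reports nothing, so ask the list afterwards.
            match nvlist_error(self.raw) {
                0 => Ok(()),
                code => Err(code),
            }
        }
    }
}

impl Drop for NvList {
    fn drop(&mut self) {
        unsafe { nvlist_destroy(self.raw) };
    }
}
```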
Passing booleans and numbers to C
C didn’t have a boolean type before C99; older code uses an integer and a type alias instead. C99’s bool is compatible with Rust’s bool, so in most cases you can just write
bool in the unsafe declaration and use that. To make a safe wrapper around our unsafe function:
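A tiny sketch of both situations. c_takes_bool and c_takes_int_bool are hypothetical stand-ins written in Rust, not functions from the library: one models a C99 bool parameter, the other an older int-as-boolean API.

```rust
use std::os::raw::c_int;

/// Stand-in for a C function declared with a C99 `bool` parameter.
pub unsafe extern "C" fn c_takes_bool(value: bool) -> c_int {
    if value { 1 } else { 0 }
}

/// Stand-in for an older C API that models booleans as `int`.
pub unsafe extern "C" fn c_takes_int_bool(value: c_int) -> c_int {
    (value != 0) as c_int
}

/// Safe wrapper: `bool` crosses the boundary unchanged.
pub fn pass_bool(value: bool) -> c_int {
    unsafe { c_takes_bool(value) }
}

/// Safe wrapper: convert explicitly when the C side wants an integer.
pub fn pass_int_bool(value: bool) -> c_int {
    unsafe { c_takes_int_bool(c_int::from(value)) }
}
```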
Passing any primitive number works exactly the same way; just change bool to i8, i16, i32 or whatever the function expects.
Passing arrays of primitives
In order to pass an array, you have to pass a pointer to the first element, typed so that it says what the elements are, plus the number of elements. The way C is going to read it is: “read M values of N bytes each”, where N is the size of your type and M is the length of the array.
Very easy with primitives:
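A sketch with a slice of u64. c_sum is a hypothetical Rust stand-in for a C function that takes a pointer and a count; the part that matters is the as_ptr() and len() pair in the wrapper.

```rust
/// Stand-in for a C function shaped like
/// `uint64_t sum(const uint64_t *values, size_t nitems)`.
pub unsafe extern "C" fn c_sum(values: *const u64, nitems: usize) -> u64 {
    let slice = unsafe { std::slice::from_raw_parts(values, nitems) };
    slice.iter().sum()
}

/// Safe wrapper: pointer to the first element plus the element count.
pub fn sum(values: &[u64]) -> u64 {
    // The borrowed slice outlives the call, so the pointer stays valid.
    unsafe { c_sum(values.as_ptr(), values.len()) }
}
```

A Vec works the same way because `&vec` coerces to a slice.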
If you have a Vec then just get a slice of it and then get the pointer. If you want to transfer ownership — use
Passing an Array of Strings
This one is a bit weird because the signature of the method is:
Read it as: a pointer to a pointer to i8.
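A sketch of the two-step conversion. c_total_len is a hypothetical Rust stand-in for a C function that takes a const char * const * and a count; the wrapper shows the part that carries over to a real binding.

```rust
use std::ffi::{CStr, CString, NulError};
use std::os::raw::c_char;

/// Stand-in for a C function taking an array of C strings and its length.
/// Returns the total number of bytes, not counting the NUL terminators.
pub unsafe extern "C" fn c_total_len(values: *const *const c_char, nitems: usize) -> usize {
    let ptrs = unsafe { std::slice::from_raw_parts(values, nitems) };
    let mut total = 0;
    for &p in ptrs {
        let c_str = unsafe { CStr::from_ptr(p) };
        total += c_str.to_bytes().len();
    }
    total
}

pub fn total_len(values: &[&str]) -> Result<usize, NulError> {
    // Step 1: own the NUL-terminated copies so they live for the whole call.
    let owned = values
        .iter()
        .map(|s| CString::new(*s))
        .collect::<Result<Vec<CString>, NulError>>()?;

    // Step 2: build the array of pointers into those still-living strings.
    let ptrs: Vec<*const c_char> = owned.iter().map(|s| s.as_ptr()).collect();

    // `owned` and `ptrs` are both dropped only after the call returns.
    Ok(unsafe { c_total_len(ptrs.as_ptr(), ptrs.len()) })
}
```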
That’s a lot of code. First we convert our slice of strings into a Vec of CString. A common mistake would be to convert to CString and call
as_ptr() in one expression. That gives you a dangling pointer, because Rust drops the temporary CString, which in effect calls
free() on the string you just created, and still lets you keep the pointer. That's why it's a two-step operation. Other than that, it's exactly the same as passing an array of bools or integers.
The same goes for everything else. Get a vector, make another vector with pointers to the elements of the first vector, and pass the pointer to the second vector to C:
That’s it for today. In Chapter 2 I will show how to read things from C. I planned to put it all in one post, but this chapter is already >1600 words. The code of the library is available here.