Writing Rust NIFs for your Elixir code with the Rustler package
Elixir is a language that has positioned itself as suitable for high-concurrency workloads. It does so by running on the Erlang VM (BEAM), which is well known for its implementation of the actor model. Actors are realized as BEAM processes, which are lightweight compared to OS threads; more importantly, each process has its own stack and heap and shares nothing with other processes. Processes communicate by message passing, which in practice means copying values around.
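To make the "share nothing, copy on send" idea concrete, here is a minimal sketch in plain Rust. It is only an analogy: an OS thread and a standard-library channel stand in for a BEAM process and its mailbox, and the function name sum_in_worker is made up for illustration.

```rust
use std::sync::mpsc;
use std::thread;

/// Sends a *copy* of `values` to a worker thread and returns the sum it computes.
/// The caller keeps its own data; the worker only ever sees the copy.
fn sum_in_worker(values: &[i64]) -> i64 {
    let (tx, rx) = mpsc::channel::<Vec<i64>>();

    // The worker owns the receiving end and nothing else from the caller.
    let worker = thread::spawn(move || {
        let received = rx.recv().expect("sender dropped before sending");
        received.iter().sum::<i64>()
    });

    // `to_vec` makes the copy that is moved into the channel.
    tx.send(values.to_vec()).expect("worker hung up");
    worker.join().expect("worker panicked")
}
```

Calling sum_in_worker(&[1, 2, 3]) hands the worker its own vector, so the two sides never touch the same memory.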
Process execution is handled by schedulers, of which BEAM will usually spawn one per logical core on the system. Each scheduler has a run queue to which processes are sent for execution. Each process runs for a certain number of reductions before being rescheduled. "Reduction" is a somewhat opaque term in the Erlang docs, but the figure frequently quoted is that 2000 reductions take roughly under 1 ms, which is also the time slice a process gets before rescheduling. This preemptive behavior is what gives BEAM its low latency and soft real-time capabilities.
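The reduction budget can be pictured with a toy round-robin queue. This is a simplified model and not how BEAM is actually implemented: the Process struct, the run_queue function and the idea of measuring a whole process as a fixed amount of remaining reductions are all assumptions made for the sketch.

```rust
use std::collections::VecDeque;

/// Toy stand-in for a BEAM process: a name plus the work it still has to do,
/// measured in reductions.
struct Process {
    name: &'static str,
    remaining: u32,
}

/// Runs the processes round-robin, letting each one use at most `budget`
/// reductions per turn. Returns the order in which time slices were handed out.
fn run_queue(processes: Vec<Process>, budget: u32) -> Vec<&'static str> {
    assert!(budget > 0, "a zero budget would never make progress");
    let mut queue: VecDeque<Process> = processes.into();
    let mut trace = Vec::new();

    while let Some(mut process) = queue.pop_front() {
        trace.push(process.name);
        let used = process.remaining.min(budget);
        process.remaining -= used;
        // Unfinished work goes to the back of the queue: the process was preempted.
        if process.remaining > 0 {
            queue.push_back(process);
        }
    }
    trace
}
```

With a budget of 2000, a process needing 5000 reductions is interrupted twice, so a short process queued behind it gets to run after the first slice instead of waiting for the long one to finish.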
What this means is that a process needs to be interruptible at more or less any point in its code. Achieving this requires trade-offs in other areas, which tend to show up in CPU-heavy computation. That isn’t to say that BEAM is slow: it sits pretty comfortably ahead of languages like Ruby and Python, but lags behind languages like Java and Go. Even then, BEAM does really well when it can take advantage of its strengths, binary pattern matching for example.
Nevertheless, there will be times when code runs slowly and Erlang/Elixir optimizations will only go so far. BEAM has several ways to interface with foreign code, the fastest being Native Implemented Functions (NIFs), whose API expects them to be written in C. Speaking frankly, the last time I worked with C involved a lengthy debugging session that boiled down to the lack of type safety, so I’d rather not repeat that experience. This is why Rust is such a compelling language: it has a robust type system with type inference, pattern matching, and many more features. On top of that, it can expose a C-compatible ABI.
This is where the Rustler project comes in. In its own words it provides a safe bridge for writing Erlang NIFs. One of its safety guarantees is catching panics before they reach the C code. One of the nice things about Rust is that if the code compiles, you can be reasonably sure you won’t run into a wide range of memory safety related bugs, among others.
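The idea of stopping a panic at the boundary can be illustrated with the standard library alone. The sketch below is not Rustler's implementation; it only shows the general technique with std::panic::catch_unwind, and guarded_divide is a hypothetical function invented for the example.

```rust
use std::panic;

/// Performs an integer division that panics when `b` is zero, but converts
/// the panic into an `Err` instead of letting it unwind past this function.
fn guarded_divide(a: i64, b: i64) -> Result<i64, String> {
    panic::catch_unwind(|| a / b)
        .map_err(|_| String::from("panic caught before reaching foreign code"))
}
```

A caller sees Ok(2) for guarded_divide(6, 3) and an Err for guarded_divide(1, 0). The default panic hook still prints the panic message to stderr, and this only works when the crate is built with the default unwinding panic strategy.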
Rustler provides a Hex package as well as a crate for Cargo (Rust’s package manager). To get started, create a new mix project and add rustler to the dependencies:
defp deps do
  [
    {:rustler, "~> 0.16.0"}
  ]
end
And then run mix deps.get to grab the dependencies. Rustler then provides a mix task we can invoke, mix rustler.new. Here’s what that looks like
There are two interactive prompts. The first asks for the Elixir module in which to load the NIF. Note that if your base project module name is Foo and the module in which you wish to load the NIF is Foo.NifModule, then the full name Foo.NifModule is what you need to enter at the prompt. I point it out because I have made that mistake before. The second prompt asks for the name of the crate it will generate. Leaving it blank makes it default to a name that will probably work just fine.
Once it is done running, we can see that it created a directory called native with a generated Rust project located in it. The generated README tells us to make some changes to our mix.exs.
The full listing is as follows
The additions are straightforward: we have a rustler_crates keyword in the mix project, and an entry for Rustler in Mix.compilers. The value of MIX_ENV will dictate how our crate is compiled, debug or release. The README gives us a boilerplate module example to load the NIFs into BEAM
Note that the generated boilerplate has use Rustler, otp_app: [otp app], crate: "niftest" so we need to remember to substitute the actual atom for our OTP app. As explained by the comment, the function is a fallback to guard against the case where the NIF is not loaded. The add function has a corresponding Rust implementation, which Rustler generated along with the crate. Before we take a look at the Rust code, let’s try the NIF out in a quick iex session
and indeed 1+2=3.
Now let’s take a look at what the lib.rs file that Rustler generated for us looks like
After the usual imports at the top of the file, we see two macro calls. The first defines the atoms that we want to use; the second exports a Rust function and registers it, along with its arity, under the name that will be used to call it from Elixir. The third argument in the exports is of type Option<fn(env: &Env, load_info: Term) -> bool> and, if it’s Some, the function will execute when the NIF is first loaded by BEAM. Here, it is None so nothing will be executed upon loading.
The Rust function is where we get into the substantive part. We see that it is decorated with a lifetime parameter; this is because no value may live longer than env, which is an opaque data type defined in the Erlang NIF API. Every Rust function that is to be exported as a NIF has the same signature: it takes an argument of type Env<'a> and one of type &[Term<'a>]. The latter contains all the arguments passed when calling from Elixir. The return type is always NifResult<Term<'a>>, where NifResult is an alias for Result with Rustler’s own error type.
The body is fairly straightforward to read, but I want to point out the pattern. When extracting the value from a Term we need to decode it into a variable with a type annotation. When we return data, we need to encode it. The returned data here is a Rust tuple with an atom in the first slot and the sum of the two arguments in the second slot, encoded into a Term and wrapped in an Ok() to satisfy the function signature.
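The decode, compute, encode shape described above can be imitated without Rustler at all. The following is a toy model only: the Term enum, the Decode trait and this add function are stand-ins written for the sketch, not Rustler's real types, and there is no env argument because nothing here talks to BEAM.

```rust
/// Toy stand-in for an Erlang term; NOT Rustler's `Term` type.
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Int(i64),
    Atom(&'static str),
    Tuple(Vec<Term>),
}

/// Hypothetical trait for Rust types that can be read out of a toy `Term`.
trait Decode: Sized {
    fn from_term(term: &Term) -> Result<Self, String>;
}

impl Decode for i64 {
    fn from_term(term: &Term) -> Result<Self, String> {
        match term {
            Term::Int(n) => Ok(*n),
            other => Err(format!("expected an integer, got {:?}", other)),
        }
    }
}

impl Term {
    /// The target type is picked by the annotation on the receiving variable.
    fn decode<T: Decode>(&self) -> Result<T, String> {
        T::from_term(self)
    }
}

/// Mirrors the shape of the NIF: decode both arguments, then build `{:ok, sum}`.
/// Like the real thing, it assumes the caller passed exactly two arguments.
fn add(args: &[Term]) -> Result<Term, String> {
    let num1: i64 = args[0].decode()?;
    let num2: i64 = args[1].decode()?;
    Ok(Term::Tuple(vec![Term::Atom("ok"), Term::Int(num1 + num2)]))
}
```

The type annotation on num1 and num2 is what tells decode which conversion to run, which is why the pattern needs it. Passing something that is not an integer produces an Err rather than a crash.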
When we encoded a Rust tuple, we got an Elixir tuple back when we called the NIF. The takeaway is that Rust types will more or less correspond to similar Elixir types. Let’s give it a shot with some simple examples
and the corresponding guards
and a quick iex -S mix session shows
iex(1)> Rustnif.NifTest.return_string()
"Hello world!"
iex(2)> Rustnif.NifTest.return_list()
[1, 2, 3, 4]
So far so good, but what about maps? Let’s try with another simple function like this (don’t forget to register it in the rustler_export_nifs! macro and Elixir module)
If we try to compile, this is the error we get
error[E0599]: no method named `encode` found for type `std::collections::HashMap<std::string::String, {integer}>` in the current scope
  --> src/lib.rs:46:12
   |
46 |     Ok(map.encode(env))
   |            ^^^^^^
which we could have seen coming if we checked the Rustler documentation. The Encoder trait is not implemented for Rust’s HashMap.
We’re not completely out of luck: there technically is a way to return a map if we really want to. We can create one like this, let map = rustler::types::map::map_new(env). And now we can call encode on map. But this is essentially a Rust representation of an Erlang map, and probably much less efficient than Rust’s HashMap.
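Another possible workaround, offered here only as a sketch, is to keep using HashMap on the Rust side and flatten it into a vector of key/value pairs just before returning. This assumes the Rustler version in use can encode a Vec of tuples, which the list and tuple examples earlier suggest but which is worth checking; to_pairs is a hypothetical helper name, and only standard-library code is shown.

```rust
use std::collections::HashMap;

/// Flattens a map into a list of `(key, value)` pairs.
/// Sorting is optional; it just makes the output order deterministic,
/// since `HashMap` iteration order is unspecified.
fn to_pairs(map: HashMap<String, i64>) -> Vec<(String, i64)> {
    let mut pairs: Vec<(String, i64)> = map.into_iter().collect();
    pairs.sort();
    pairs
}
```

On the Elixir side a list of two-element tuples can be turned back into a map with Map.new/1, at the cost of one more pass over the data.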
The reason for mentioning this is that there is a real possibility for non-trivial overhead to manifest itself when serializing data between Erlang and C/Rust and back again, depending on the type and size of the data. There is also a tiny overhead associated with running a NIF on a dirty scheduler.
Speaking of dirty NIFs, I would be remiss if I did not include an example showing how to mark a NIF as such. It is done entirely in the rustler_export_nifs! macro
and the other flag is DirtyIo. The flags are defined as follows
pub enum SchedulerFlags {
    Normal = 0,
    DirtyCpu = 1,
    DirtyIo = 2,
}
The examples above are a small sample of what is possible. For a more comprehensive list, I recommend the test directory in the repository. For example, you can return an Option and the value will get unwrapped when it is called inside Elixir.
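As a rough picture of what "unwrapped" means, here is a toy model in plain Rust. The Term enum and encode_option are stand-ins made up for this sketch, not Rustler's API, and mapping None to the atom nil reflects my understanding of Rustler's behavior rather than anything shown above, so treat that part as an assumption.

```rust
/// Toy stand-in for an Erlang term; NOT Rustler's `Term` type.
#[derive(Debug, PartialEq)]
enum Term {
    Int(i64),
    Atom(&'static str),
}

/// Models how an `Option` could cross the boundary: `Some(n)` becomes the
/// bare value, and `None` becomes the atom `nil` (assumed, see the text above).
fn encode_option(value: Option<i64>) -> Term {
    match value {
        Some(n) => Term::Int(n),
        None => Term::Atom("nil"),
    }
}
```

Under that model the Elixir caller never sees a wrapper: it gets either the plain integer or nil.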