Implementing structs by value in Dart FFI

A deep dive into API design and native calling conventions

Daco Harkes
Dart
8 min read · Jun 8, 2021

In the Dart 2.12 release, we extended our C-interop feature, Dart FFI, with the ability to pass structs by value. This article talks about what it took to add this feature to the Dart SDK. If you’re interested in low-level language implementation details or in platform conventions for passing structs by value, keep reading.

This article talks about both developing the API and figuring out the ABI (Application Binary Interface) for the struct-by-value feature. During the two years we worked on this feature (and other Dart FFI features), we discovered many constraints that required changing the API. The ABI journey was equally interesting, illustrating that you can take multiple approaches to nailing down the details of a hard problem.

Pass by value and pass by reference in C/C++

Here’s a quick refresher if you don’t write code in C every day. Suppose that we have the following struct and functions in C:

Then, we can use these functions in some simple C code. Let’s say we have a local variable c1:

Coord c1 = {10.0, 10.0, nullptr};

If we pass c1 to TranslateByValue, then the argument is passed by value, which makes the callee effectively operate on a copy of the struct:

Coord c2 = TranslateByValue(c1);

This means that c1 stays unchanged.

However, if we pass c1 by reference with a pointer to the memory containing c1, then c1 gets mutated in place:

TranslateByPointer(&c1);

c1.x now contains 20.0.
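
The refresher above can be condensed into a short, self-contained sketch. The field types and the translation offset of 10.0 are assumptions inferred from the initializer and from c1.x ending up as 20.0; CheckTranslate is a hypothetical helper that only exists to exercise the two functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: two doubles and a link to another Coord. */
typedef struct Coord {
  double x;
  double y;
  struct Coord *next;
} Coord;

/* By value: the callee works on its own copy of the struct. */
Coord TranslateByValue(Coord c) {
  c.x += 10.0;
  c.y += 10.0;
  return c;
}

/* By pointer: the callee mutates the caller's struct in place. */
void TranslateByPointer(Coord *c) {
  c->x += 10.0;
  c->y += 10.0;
}

/* Hypothetical helper that walks through the scenario from the text. */
void CheckTranslate(void) {
  Coord c1 = {10.0, 10.0, NULL};

  Coord c2 = TranslateByValue(c1);
  assert(c1.x == 10.0); /* c1 is unchanged */
  assert(c2.x == 20.0); /* only the copy moved */

  TranslateByPointer(&c1);
  assert(c1.x == 20.0); /* c1 was mutated in place */
}
```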

The API design journey

The original Dart FFI prototype already had support for passing pointers to structs. However, we redesigned the API multiple times to accommodate various use cases and constraints.

Initial design

Our initial design enabled allocating structs in memory, passing those pointers to C, and modifying the fields of the structs. With that approach, the Struct class extended the Pointer class:

Dart FFI users wrote the preceding snippet, and Dart FFI internals generated an implementation of sizeOf and getter and setter implementations for x, y, and next.

However, two years ago we realized that this design had an issue. Because Coordinate extended Pointer, we could not distinguish between Coordinate and Coordinate*.

Distinguishing between Coordinate and Coordinate*

We introduced Struct to Dart FFI and made structs extend this class:

Now a Pointer<Coordinate> in Dart represents a Coordinate* in C, and a Coordinate in Dart represents a Coordinate in C.

This meant that the next field had the type Pointer<Coordinate>, which made the @Pointer annotation redundant. So, we got rid of Pointer annotations.

Because we now represented pointers to structs as Pointer objects, we started using the allocate factory on Pointer:

final c = Pointer<Coordinate>.allocate();

To get access to the fields of a Pointer<Coordinate>, we need an object of type Coordinate, because that object has the fields x, y, and next. For this, we had the load method on Pointer already.

c.load<Coordinate>().x = 10.0;

Of course, having to write <Coordinate> when calling load is verbose. (The same type argument was needed when loading a Dart int out of a Pointer<Uint8>.) We need this type argument on load to tell the Dart type system the return type of the method.

Extension methods to the rescue

Dart 2.7 introduced extension methods. With extension methods, we could pattern match on the type argument T in Pointer<T>:

Pattern matching on the type argument enabled us to get rid of the verbosity on call sites:

c.ref.y = 10.0; // ref is pattern matched to be of type Coordinate.

We could also use the extension method pattern matching to make the type argument of Struct<S> redundant, changing the definition of user structs to:

Before, the type argument <S> constrained the Struct field Pointer<S> addressOf. Instead, we changed the field to an extension getter:

Stop leaking backing storage

When returning a struct by value from C to Dart, we don’t want to malloc C memory to save the struct, because that would be slow and burden the user with freeing it. So, instead, the struct is copied to a TypedData, and the Coordinate can have either a Pointer or a TypedData as backing storage.

However, addressOf, which was introduced in the first redesign, had type Pointer. This type conveyed that it was always backed by C memory, but this was no longer true.

So, we deprecated addressOf.

For optimizations

The last step is to require invocations of various Dart FFI methods, including the ones related to structs, to have compile-time constant type arguments:

Requiring compile-time constant type arguments on these invocations allows us to better optimize the code and is more aligned with C semantics.

Note that this last change triggers deprecation notices in Dart 2.12, and the change is enforced in Dart 2.13.

The ABI discovery journey

Now that the API is in place, the next question is: Where does C expect these structs when passed or returned by value? The answer is defined by the Application Binary Interface (ABI) of each platform.

Documentation

The natural thing is to look for documentation. ARM provides Procedure Call Standard for the Arm Architecture — ABI 2019Q1 and Procedure Call Standard for the ARM 64-bit Architecture (AArch64). However, the x86 and x64 official documentation fell off the internet, resulting in people searching for this information and resorting to unofficial mirrors or reverse engineering.

A quick glance at the documentation shows a variety of locations for passing structs by value:

  • In multiple CPU and FPU registers.
  • On the stack.
  • A pointer to a copy. (The copy is on the caller’s stack frame.)
  • Partially in CPU registers and partially on the stack.

When passed on the stack, there are some further questions about what the required alignment is and whether all unused CPU and FPU registers are blocked off or backfilled.

When returning a struct by value, the struct can be passed back in two locations:

  • In multiple CPU and FPU registers.
  • Written to a memory location by the callee, in which case the caller passes in a pointer to that memory location. (This reserved memory is also on the caller’s stack frame.)

When a pointer to the result location is passed in, a further question is whether this conflicts with a normal CPU argument register.
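
The second return convention can be pictured as a source-level rewrite. This is only a sketch of the idea, not what any particular compiler emits: Big, MakeBig, MakeBigLowered, and CheckMakeBig are hypothetical names, and whether a 32-byte struct is actually returned through memory depends on the ABI.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical struct, too large for the return registers on many ABIs. */
typedef struct {
  int64_t a, b, c, d;
} Big;

/* What the programmer writes: return by value. */
Big MakeBig(int64_t start) {
  Big result = {start, start + 1, start + 2, start + 3};
  return result;
}

/* Roughly the same function with the result location made explicit:
 * the caller reserves the memory and passes its address as an argument. */
void MakeBigLowered(Big *out, int64_t start) {
  out->a = start;
  out->b = start + 1;
  out->c = start + 2;
  out->d = start + 3;
}

/* Both forms produce the same struct contents. */
void CheckMakeBig(void) {
  Big by_value = MakeBig(1);
  Big slot; /* reserved in the caller's stack frame */
  MakeBigLowered(&slot, 1);
  assert(by_value.a == slot.a && by_value.b == slot.b);
  assert(by_value.c == slot.c && by_value.d == slot.d);
}
```

Seen this way, the question about argument registers becomes concrete: the out pointer has to travel somewhere, and each ABI decides whether it takes the place of a regular argument register.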

Refactor Dart FFI compilation

This initial investigation was enough to realize that we had to reengineer a part of the Dart FFI compiler pipeline. We used to reuse the Location type, which was originally intended for compiling Dart code to assembly.

However, in the Dart ABI, we never use non-word-aligned stack locations or more than two registers at the same time. An experiment trying to extend the Location type to support these extra locations ended in a huge complicated diff because Location is used a lot in the Dart virtual machine.

So, instead, we replaced the compilation pipeline for Dart FFI.

Explore the native ABIs

Let’s explore the ABIs a bit.

Suppose that we have the following struct and C function signature:

How do various ABIs pass these structs in MyFunction?

In Linux on x64, there are 6 CPU argument registers. The struct is small enough to fit in a single register, so the first 6 arguments go into the 6 CPU argument registers, and the last 2 go on the stack. The stack arguments are aligned to 8 bytes. And, the return value also fits in a CPU register.

So, what happens on Windows?

It’s completely different. Windows on x64 has only 4 argument registers, and the first of them is used to pass the pointer to the memory location where the return value is written. And, all arguments are passed by pointer to a copy, because the size of the struct is 3 bytes, which is not a power of 2.
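
For reference, a struct and signature matching this description could look like the following sketch. The uint8_t fields and the count of eight parameters are assumptions inferred from the text (3 bytes in size, six arguments in registers plus two on the stack); the function body is made up so that the example does something observable.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: three 1-byte fields, so no padding is needed. */
typedef struct {
  uint8_t a0;
  uint8_t a1;
  uint8_t a2;
} Struct3Bytes;

static_assert(sizeof(Struct3Bytes) == 3, "expected a 3-byte struct");

/* Eight by-value arguments and a by-value result.
 * The body (field-wise sum, wrapping modulo 256) is illustrative only. */
Struct3Bytes MyFunction(Struct3Bytes p0, Struct3Bytes p1, Struct3Bytes p2,
                        Struct3Bytes p3, Struct3Bytes p4, Struct3Bytes p5,
                        Struct3Bytes p6, Struct3Bytes p7) {
  Struct3Bytes args[8] = {p0, p1, p2, p3, p4, p5, p6, p7};
  Struct3Bytes sum = {0, 0, 0};
  for (int i = 0; i < 8; i++) {
    sum.a0 = (uint8_t)(sum.a0 + args[i].a0);
    sum.a1 = (uint8_t)(sum.a1 + args[i].a1);
    sum.a2 = (uint8_t)(sum.a2 + args[i].a2);
  }
  return sum;
}
```

The C source is identical on both platforms; only the compiled calling sequence differs, which is exactly what Dart FFI has to reproduce.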

Let’s look at another example: ARM32 on Linux and Android. Suppose that we have the following struct and C function signature:

These specific types of structs are called homogeneous composites, because all of their elements have the same type. And, homogeneous float structs with up to 4 members are treated differently from normal structs. In this case, Linux uses floating point registers for the individual floats in the struct.

On Android, SoftFP is used instead of HardFP. This means that floats are passed in integer registers rather than floating point registers. Moreover, a pointer for the result is passed in as well. This results in a curious situation in which the first argument is partially passed in integer registers and partially passed on the stack.
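
A homogeneous float struct of the kind described above might be declared as follows. The struct name, the four float fields, and the function are assumptions; the comments restate the behavior described in the text and are not compiler output.

```c
#include <assert.h>

/* Assumed layout: four floats, the largest homogeneous float struct
 * that gets the special treatment. */
typedef struct {
  float a0;
  float a1;
  float a2;
  float a3;
} Struct16BytesFloat;

static_assert(sizeof(Struct16BytesFloat) == 16, "four 4-byte floats, no padding");

/*
 * ARM32 Linux (HardFP), per the text: the individual floats travel in
 * floating point registers.
 * ARM32 Android (SoftFP), per the text: a pointer for the result and the
 * floats all compete for integer registers, so the first argument ends up
 * partly in integer registers and partly on the stack.
 */
Struct16BytesFloat AddFloats(Struct16BytesFloat x, Struct16BytesFloat y) {
  Struct16BytesFloat r = {x.a0 + y.a0, x.a1 + y.a1, x.a2 + y.a2, x.a3 + y.a3};
  return r;
}
```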

Getting any of this wrong will likely lead to segmentation faults at runtime. So, it’s paramount to get all the corner cases of the ABI on every hardware and OS combination correct.

Explore through godbolt.org

Because the documentation is very terse, we figured out many corner cases through the compiler explorer godbolt.org. The compiler explorer shows C code and compiled assembly side by side:

Figure: a screenshot of godbolt.org showing that the assembly code for sizeof(Struct3Bytes) returns 3 in the return register.

The preceding screenshot shows that on Windows x86 sizeof(Struct3Bytes) is 3 bytes, because 3 is moved into the return register eax.

When we change the struct slightly, we can inspect whether the size is still 3:

The size is not 3: mov eax, 4. Because the int16 must be 2-byte aligned, the struct must be 2-byte aligned. That means that when allocating an array of these structs, there is 1 byte of padding after every struct to ensure that the next struct is 2-byte aligned. Hence, this struct is 4 bytes in the native ABI.
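
The same layout rule can be checked from C itself instead of reading the assembly. In this sketch the struct name and field order are assumptions, and ArrayStride is a hypothetical helper; the asserts hold on mainstream ABIs where int16_t is 2-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  int16_t a0; /* offset 0, requires 2-byte alignment */
  int8_t a1;  /* offset 2 */
              /* 1 byte of tail padding at offset 3 */
} Struct3BytesInt;

static_assert(offsetof(Struct3BytesInt, a1) == 2, "a1 directly follows a0");
static_assert(_Alignof(Struct3BytesInt) == 2, "struct inherits the int16_t alignment");
static_assert(sizeof(Struct3BytesInt) == 4, "3 bytes of fields plus 1 byte of padding");

/* Distance in bytes between two consecutive array elements. */
size_t ArrayStride(void) {
  Struct3BytesInt array[2];
  return (size_t)((char *)&array[1] - (char *)&array[0]);
}
```

ArrayStride returns sizeof(Struct3BytesInt), which is why the padding has to be part of the struct size: every element of an array must start at a 2-byte aligned address.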

Explore through generated tests

Unfortunately, the compiler explorer doesn’t support macOS and iOS. So, to make exploring manually more efficient (and to have a nice and huge test suite for this feature), we wrote a test generator.

The main idea is to generate tests in such a way that if they crash it’s possible to use GDB to see what’s wrong.

One way to make it easier to see what is going wrong when hitting a segmentation fault is to make all arguments have predictable and easy-to-recognize values. For example, the following test uses consecutive integers, so that these integer values can be easily spotted in registers and on the stack:
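
As a rough illustration of that idea (not the generator's actual output), the C side of such a test might sum its arguments while the caller fills them with consecutive integers. Struct4Bytes, SumStructs, and RunConsecutiveTest are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
  int16_t a0;
  int16_t a1;
} Struct4Bytes;

/* Callee under test: the sum is a cheap check that no argument was garbled. */
int64_t SumStructs(Struct4Bytes s0, Struct4Bytes s1, Struct4Bytes s2) {
  return (int64_t)s0.a0 + s0.a1 + s1.a0 + s1.a1 + s2.a0 + s2.a1;
}

/* Caller: the fields get the values 1, 2, 3, ... so that they stand out
 * when inspecting registers or the stack in a debugger. */
int64_t RunConsecutiveTest(void) {
  Struct4Bytes args[3];
  int16_t next = 1;
  for (int i = 0; i < 3; i++) {
    args[i].a0 = next++;
    args[i].a1 = next++;
  }
  int64_t result = SumStructs(args[0], args[1], args[2]);
  assert(result == 1 + 2 + 3 + 4 + 5 + 6);
  return result;
}
```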

Another way to make finding problems easier is to add prints everywhere. For example, if we don’t hit a segmentation fault during the transition from Dart to C, but we manage to garble all the arguments, then printing the arguments helps:
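
A minimal sketch of such a printing callee, assuming a hypothetical Struct4Bytes struct and SumAndPrint function rather than the generator's real output:

```c
#include <stdint.h>
#include <stdio.h>

typedef struct {
  int16_t a0;
  int16_t a1;
} Struct4Bytes;

/* Prints what actually arrived before doing anything else with it, so
 * that garbled arguments are visible even when nothing crashes. */
int64_t SumAndPrint(Struct4Bytes s0, Struct4Bytes s1) {
  printf("SumAndPrint((%d, %d), (%d, %d))\n", s0.a0, s0.a1, s1.a0, s1.a1);
  int64_t result = (int64_t)s0.a0 + s0.a1 + s1.a0 + s1.a1;
  printf("result = %lld\n", (long long)result);
  return result;
}
```

If the arguments were filled with consecutive integers on the calling side, any value in the printed line that breaks the sequence points at the argument that was placed in the wrong location.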

Adding a test is as easy as adding a function type in the configuration file. The ability to add tests quickly has resulted in a huge test suite.

Sure enough, this test suite caught another curious case in a native ABI, this time on iOS on ARM64. There, non-struct arguments on the stack aren’t aligned to word size but to their own size. Structs are aligned to word size, except that a homogeneous struct containing only floats is aligned to the size of the float.
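
To see what the difference between the two alignment rules means for stack layout, the following sketch computes argument offsets under both. It is a simplified model, not the Dart VM's implementation: AlignUp and StackOffsets are hypothetical helpers, the word size of 8 bytes is an assumption for ARM64, and only power-of-two argument sizes are handled.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds offset up to the next multiple of alignment (a power of two). */
size_t AlignUp(size_t offset, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (offset + alignment - 1) & ~(alignment - 1);
}

/*
 * Fills offsets[i] with the stack offset of argument i.
 * word_aligned != 0: every argument is aligned to the 8-byte word size.
 * word_aligned == 0: every argument is aligned to its own size.
 * Returns the total number of stack bytes used.
 */
size_t StackOffsets(const size_t *sizes, size_t count, int word_aligned,
                    size_t *offsets) {
  const size_t kWordSize = 8;
  size_t offset = 0;
  for (size_t i = 0; i < count; i++) {
    size_t alignment = word_aligned ? kWordSize : sizes[i];
    offset = AlignUp(offset, alignment);
    offsets[i] = offset;
    offset += sizes[i];
  }
  return offset;
}
```

For three stack arguments of 1, 2, and 4 bytes, the word-aligned rule yields offsets 0, 8, and 16, while the own-size rule yields 0, 2, and 4. Reading an argument from the wrong one of these offsets is the kind of mistake the generated tests are designed to expose.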

Summary

This concludes our journey through the API design and ABI discovery. With a good test suite and thorough code reviews, we landed support for passing structs by value in Dart FFI in December 2020 on the master branch, and it is available in Dart 2.12! If you’re interested in using Dart FFI, you can get started with the C interop documentation on dart.dev. If you have any questions or comments on the API design and ABI discovery, feel free to leave a comment below. We’d love to hear from you!

Thanks to the Dart language team and the (rest of the) Dart virtual machine team for their contributions to this Dart FFI feature, and thanks to Kathy Walrath and Michael Thomsen for shaping this blog post!

Daco Harkes is a Software Engineer at Google on the Dart compiler and virtual machine team.