Rust to assembly: Arrays, Tuples, Box, and Option handling

Map the Rust array, tuple, Box, and Option to assembly. See how the Rust compiler inlines functions.

EventHelix
Software Design
3 min read · Jun 4, 2022


We have already seen how Rust handles enums under the hood. We also looked at the code generation for the Box smart pointer. Here we put these pieces together in a Rust example that shows how arrays, tuples, the Option enum, and Box smart pointer allocations are handled at the assembly level.

Code example

We will be dissecting the assembly generated for the following code. The code is annotated to explain the Rust code.
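The listing itself is not reproduced in this text. As a stand-in, here is a minimal sketch that is consistent with the function names and types discussed in the rest of the post. The type aliases (Coordinate as a pair of f64 values, Line as a pair of coordinates), the parameter lists, and the function bodies are all assumptions made for illustration, not the original source.

```rust
// Hypothetical reconstruction: aliases, parameters, and bodies are assumptions.
pub type Coordinate = (f64, f64);
pub type Line = (Coordinate, Coordinate);

/// Returns the four corners in a heap-allocated array,
/// or None if any corner is missing.
pub fn make_quad_coordinates(
    a: Option<Coordinate>,
    b: Option<Coordinate>,
    c: Option<Coordinate>,
    d: Option<Coordinate>,
) -> Option<Box<[Coordinate; 4]>> {
    // Each `?` returns None early when the corresponding corner is None.
    Some(Box::new([a?, b?, c?, d?]))
}

/// Consumes the boxed quad and returns its two diagonals.
pub fn cross_lines_from_quad_coordinates(
    a: Option<Coordinate>,
    b: Option<Coordinate>,
    c: Option<Coordinate>,
    d: Option<Coordinate>,
) -> Option<(Line, Line)> {
    let quad = make_quad_coordinates(a, b, c, d)?;
    // The Box is dropped (and the heap memory freed) when `quad` goes out of scope.
    Some(((quad[0], quad[2]), (quad[1], quad[3])))
}

fn main() {
    let lines = cross_lines_from_quad_coordinates(
        Some((0.0, 0.0)),
        Some((1.0, 0.0)),
        Some((1.0, 1.0)),
        Some((0.0, 1.0)),
    );
    println!("{:?}", lines);
}
```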

Assembly code for make_quad_coordinates

The assembly code for the make_quad_coordinates function is shown below. The code has been annotated to explain the mapping to Rust code. Key points in the generated code are:

  • Since the function returns a Box smart pointer, the assembly code allocates memory on the heap by calling __rust_alloc.
  • If the heap allocation fails, the code calls the allocation error handler, which does not return. The ud2 instruction that typically follows the call marks that point as unreachable; it would raise an invalid-opcode trap if it were ever executed. This is an abort path, not an exception being thrown.
  • Rust enums typically carry a discriminant value that selects the variant. The Rust compiler optimizes Option<Box<T>> by using the NULL pointer value to represent None, so no separate discriminant is stored.
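The allocation path described in the first two points can be approximated by hand with the std::alloc API. This is only a sketch of roughly what Box::new expands to, not the compiler's actual lowering; the helper name box_quad_by_hand and the Coordinate alias are assumptions introduced for illustration.

```rust
use std::alloc::{alloc, handle_alloc_error, Layout};

// Assumed alias: a coordinate is a pair of f64 values.
type Coordinate = (f64, f64);

/// Hypothetical helper: allocate, check for failure, write the value, wrap in a Box.
fn box_quad_by_hand(quad: [Coordinate; 4]) -> Box<[Coordinate; 4]> {
    let layout = Layout::new::<[Coordinate; 4]>();
    unsafe {
        // Global allocator entry point; this is the call that shows up
        // as __rust_alloc in the generated assembly.
        let ptr = alloc(layout) as *mut [Coordinate; 4];
        if ptr.is_null() {
            // Never returns (its return type is `!`). By default it aborts
            // the process, so the compiler treats what follows as unreachable.
            handle_alloc_error(layout);
        }
        ptr.write(quad);
        // The Box now owns the memory and frees it with the same layout on drop.
        Box::from_raw(ptr)
    }
}

fn main() {
    let quad = box_quad_by_hand([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]);
    println!("{:?}", quad);
}
```

Because handle_alloc_error diverges, the failure branch has no path back into the function, which is why a trap instruction after the call is enough.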

Understanding the assembly code will be aided by understanding the memory layout of several data types used in the code.

Representation of Option<Coordinate>

The memory layout of the Option<Coordinate> type is shown below. Byte offset 0 is the discriminant used to distinguish between the variants Some and None. The Coordinate tuple is stored in the next two entries.
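The extra space taken by the discriminant can be observed with size_of. The sketch below assumes Coordinate is a pair of f64 values; since f64 has no unused bit patterns that the compiler could borrow for None, the Option needs separate storage for the discriminant. On x86-64 one would expect 16 and 24 bytes, although the layout of tuples and enums is not formally guaranteed.

```rust
use std::mem::size_of;

// Assumed alias: a coordinate is a pair of f64 values.
type Coordinate = (f64, f64);

/// Returns (size of Coordinate, size of Option<Coordinate>) in bytes.
fn option_coordinate_sizes() -> (usize, usize) {
    (size_of::<Coordinate>(), size_of::<Option<Coordinate>>())
}

fn main() {
    let (payload, option) = option_coordinate_sizes();
    println!("Coordinate: {} bytes", payload);
    println!("Option<Coordinate>: {} bytes", option);
}
```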

Representation of Option<Box<[Coordinate; 4]>>

The memory layout of the Option<Box<[Coordinate; 4]>> type is shown below. There are two memory locations in the Option<Box<[Coordinate; 4]>> type.

The first is the pointer to the array of coordinates. The second is the array of Coordinate objects.

Option<Box> on the stack

The Rust compiler optimizes the Option<Box<T>> type to a single pointer on the stack. The pointer serves both as the address of the array of coordinates and as the discriminant. If the pointer is NULL, the Option variant is None. A nonzero pointer indicates that the Option variant is Some.
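This null-pointer optimization can be checked from Rust itself. The sketch below compares the size of the Option with the size of a pointer and then reads the raw bits of a None and a Some value. The Quad alias and the raw_bits helper are hypothetical names introduced here.

```rust
use std::mem::size_of;

// Assumed alias: a coordinate is a pair of f64 values.
type Coordinate = (f64, f64);
type Quad = Option<Box<[Coordinate; 4]>>;

/// Hypothetical helper: reads the pointer-sized bit pattern of a Quad.
fn raw_bits(q: &Quad) -> usize {
    // Option<Box<T>> for a sized T has the same size and alignment as a pointer,
    // so reading it as a usize stays within the bounds of the value.
    unsafe { *(q as *const Quad as *const usize) }
}

fn main() {
    let none: Quad = None;
    let some: Quad = Some(Box::new([(0.0, 0.0); 4]));
    println!("size of Quad: {} bytes", size_of::<Quad>());
    println!("None bits: {:#x}", raw_bits(&none));
    println!("Some bits: {:#x}", raw_bits(&some));
}
```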

[Coordinate; 4] array on the heap

The [Coordinate; 4] array is allocated on the heap. The heap pointer is stored in the Box pointer. The Box pointer points to the memory shown below. The array contains four Coordinate objects.
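The contiguous layout of the boxed array can be illustrated by computing the byte offset of each element from the start of the allocation. This is a sketch under the assumption that Coordinate is a pair of f64 values; element_offsets is a hypothetical helper. On x86-64 the offsets would be expected to step by 16 bytes.

```rust
use std::mem::size_of;

// Assumed alias: a coordinate is a pair of f64 values.
type Coordinate = (f64, f64);

/// Hypothetical helper: byte offset of each element from the start of the array.
fn element_offsets(quad: &[Coordinate; 4]) -> [usize; 4] {
    let base = &quad[0] as *const Coordinate as usize;
    [0usize, 1, 2, 3].map(|i| &quad[i] as *const Coordinate as usize - base)
}

fn main() {
    let quad: Box<[Coordinate; 4]> = Box::new([(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]);
    // &quad coerces from &Box<[Coordinate; 4]> to &[Coordinate; 4].
    println!("element size: {} bytes", size_of::<Coordinate>());
    println!("offsets: {:?}", element_offsets(&quad));
}
```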

IEEE 754 floating point standard

Annotated assembly

Assembly code for cross_lines_from_quad_coordinates

The assembly code for cross_lines_from_quad_coordinates really surprised us. We were expecting to see a heap allocation for the return value of the call to make_quad_coordinates. Since the Box is consumed in the function, we were also expecting to see the heap memory being de-allocated before the function returns. What we see instead is very efficient generated code that inlines make_quad_coordinates and eliminates the Box altogether, saving a memory allocation and de-allocation.

The key points in the generated assembly code are:

  • The compiler inlines the make_quad_coordinates function. This results in deep optimization of the code.
  • The compiler eliminates the Box allocation and de-allocation.
  • The generated code also optimizes the memory writes by joining together two 64-bit writes into a single 128-bit write.
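The effect of the first two points can be approximated in source form. The sketch below places a boxed version next to a hand-written version with the Box removed, which is roughly what the optimizer reduces the function to. Both functions, the type aliases, and the choice of which corners form each line are assumptions made for illustration.

```rust
// Assumed aliases, not taken from the original listing.
type Coordinate = (f64, f64);
type Line = (Coordinate, Coordinate);

/// As written: the corners pass through a heap-allocated array.
fn cross_lines_boxed(
    a: Option<Coordinate>,
    b: Option<Coordinate>,
    c: Option<Coordinate>,
    d: Option<Coordinate>,
) -> Option<(Line, Line)> {
    let quad: Box<[Coordinate; 4]> = Box::new([a?, b?, c?, d?]);
    Some(((quad[0], quad[2]), (quad[1], quad[3])))
}

/// After inlining and Box elimination: the values go straight to the result.
fn cross_lines_unboxed(
    a: Option<Coordinate>,
    b: Option<Coordinate>,
    c: Option<Coordinate>,
    d: Option<Coordinate>,
) -> Option<(Line, Line)> {
    let (a, b, c, d) = (a?, b?, c?, d?);
    Some(((a, c), (b, d)))
}

fn main() {
    let args = (Some((0.0, 0.0)), Some((1.0, 0.0)), Some((1.0, 1.0)), Some((0.0, 1.0)));
    let boxed = cross_lines_boxed(args.0, args.1, args.2, args.3);
    let unboxed = cross_lines_unboxed(args.0, args.1, args.2, args.3);
    println!("same result: {}", boxed == unboxed);
}
```

The allocation has no effect that is visible outside the function, so the compiler is free to remove it once the callee has been inlined.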

Understanding the representation of Option<(Line, Line)> will help in following the flow of the assembly code.

Representation of Option<(Line, Line)>

Annotated assembly

Explore more

Visit the Rust to assembly: Arrays, Tuples, Box, and Option handling post on EventHelix.com to learn more. Examine the Rust to assembly mapping interactively in the Compiler Explorer.
