Rust bool vector to a string slice vector

Understand the assembly code for a Vec<bool> to Vec<&'static str> transformation.

EventHelix
Software Design

--

We have already looked at assembly code generated for a vector iteration. We will build on that knowledge to understand the assembly code generated when mapping a Rust vector to a string slice vector.

Example code: Map a vector of bools to a vector of string slices with static lifetime

The following code shows two functions:

  • The convert<A,B> function accepts a vector of type A and a closure that maps an element of type A to an element of type B, and returns a vector of type B. The conversion is done by calling the map function on the vector iterator. Because convert<A,B> is generic, it generates no code by itself; code is generated only when it is used with concrete types.
  • The convert_bool_vec_to_static_str_vec function accepts a vector of bools and returns a vector of string slices. It calls convert<A,B> with a closure that converts a bool to a string slice.
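A minimal sketch of the two functions, reconstructed from the description above. The function names come from the article; the exact signatures, the impl Fn bound, and the use of into_iter and collect are assumptions and may differ from the original listing.

```rust
/// Generic conversion: consumes a Vec<A> and maps each element to a B.
/// (Assumed signature; the article only describes the behavior.)
pub fn convert<A, B>(v: Vec<A>, f: impl Fn(A) -> B) -> Vec<B> {
    v.into_iter().map(f).collect()
}

/// Maps each bool to the static string slice "true" or "false".
pub fn convert_bool_vec_to_static_str_vec(v: Vec<bool>) -> Vec<&'static str> {
    convert(v, |b| if b { "true" } else { "false" })
}
```

Calling convert_bool_vec_to_static_str_vec(vec![true, false]) yields a vector holding "true" and "false".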

Visualizing the input and output vectors

Let’s understand the input and output vectors of the convert_bool_vec_to_static_str_vec function. This will aid in understanding the assembly code generated.

The input vector passed to the convert_bool_vec_to_static_str_vec function is a vector of bools. As discussed in the vector iteration article, the memory organization of a vector is as follows:

  • An 8-byte data array pointer points to the start of the data array on the heap. The field is highlighted in light green to indicate that it contains a heap address. The heap address points to a bool array. Each bool is stored in a single byte.
  • An 8-byte capacity field contains the number of elements the allocated data array on the heap can hold.
  • An 8-byte length field contains the number of elements in the vector.

The output vector of the convert_bool_vec_to_static_str_vec function is a vector of string slices. The memory organization of this vector is shown below.

  • An 8-byte data array pointer points to the start of the data array on the heap. The field is highlighted in light green to indicate that it contains a heap address. The heap address points to an array of &'static str entries. A &'static str is a string slice with a static lifetime. Each &str is represented as a pointer to the string data followed by the length of the string in bytes, so every entry occupies 16 bytes.
  • An 8-byte capacity field contains the number of elements the allocated data array on the heap can hold.
  • An 8-byte length field contains the number of elements in the vector.
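The sizes quoted above can be checked with std::mem::size_of. The helper name layout_sizes is hypothetical and exists only for this illustration. On a 64-bit target it should report 24 bytes for the vector header (three 8-byte fields), 1 byte per bool, and 16 bytes per string slice. The order of the three header fields is not guaranteed by the language, so the sketch checks sizes only.

```rust
use std::mem::size_of;

/// Returns (size of the Vec<bool> header, size of one bool, size of one &'static str).
/// `layout_sizes` is a hypothetical helper for illustration only.
pub fn layout_sizes() -> (usize, usize, usize) {
    (
        size_of::<Vec<bool>>(),    // pointer + capacity + length
        size_of::<bool>(),         // one byte per bool
        size_of::<&'static str>(), // pointer + length in bytes
    )
}
```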

String slice vector generation overview

The following figure gives an overview of the generated assembly code for the convert_bool_vec_to_static_str_vec function. A few key points to note here are:

Overflow and memory allocation failure handling

  • The generated code computes the size in bytes of the output array from the input vector length. If this computation would overflow, the generated code panics and drops the input vector. Otherwise, it allocates an array of the computed size.
  • If the allocation fails, the generated code reports the failure and drops the input vector.
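The size check can be sketched in safe Rust. The generated code performs the check inline; this sketch swaps in std::alloc::Layout::array as a stand-in, which returns an error when the total size overflows or exceeds isize::MAX. The function name output_array_bytes is hypothetical.

```rust
use std::alloc::Layout;

/// Number of bytes needed for an output array of `len` string slices,
/// or None if the size computation would overflow.
/// `output_array_bytes` is a hypothetical helper for illustration only.
pub fn output_array_bytes(len: usize) -> Option<usize> {
    Layout::array::<&'static str>(len)
        .ok()
        .map(|layout| layout.size())
}
```

A None result corresponds to the overflow path described above; a Some result is the size handed to the allocator.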

Length based optimization

The generated code makes the following decisions based on the input vector length:

  • If the input vector length is 0, the output vector length is 0. No heap allocation is required for the output vector.
  • For non-zero input vector length, the output vector length is the same as the input vector length. The output vector array is allocated on the heap.
  • If the input vector length is odd, the generated code handles one element on its own and then falls through to the even-length handling for the remaining elements.
  • For an even number of remaining elements, the loop handles two elements per iteration. This unrolling halves the number of loop branches taken.
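The length-based decisions above can be mimicked by hand. The following is an illustrative sketch of the control flow only, not the code the compiler emits; the name map_unrolled is hypothetical, and the sketch borrows the input as a slice to keep the example short.

```rust
/// Hand-written sketch of the control flow: one element is peeled off
/// when the length is odd, then the loop handles two elements per pass.
/// `map_unrolled` is a hypothetical name for illustration only.
pub fn map_unrolled(input: &[bool]) -> Vec<&'static str> {
    let to_str = |b: bool| if b { "true" } else { "false" };

    // A capacity of 0 does not allocate, matching the empty-input case.
    let mut out = Vec::with_capacity(input.len());

    let mut i = 0;
    if input.len() % 2 == 1 {
        // Odd length: handle one element, leaving an even remainder.
        out.push(to_str(input[0]));
        i = 1;
    }

    // Even remainder: two elements per loop iteration.
    while i < input.len() {
        out.push(to_str(input[i]));
        out.push(to_str(input[i + 1]));
        i += 2;
    }
    out
}
```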

Preparing the string slice

  • Each entry in the output vector array is a string slice that contains a pointer to the static "true" or "false" string along with the length of the string slice.
  • The generated code removes the bool if condition from the loop body. This is achieved using the following techniques:
      • The pointer to the static "true" or "false" string is selected with a conditional move (cmove) instruction and then stored in the output vector array.
      • The length is computed as an XOR between 5 (101 binary) and the bool value (0 or 1). This gives 5 for false and 4 for true.
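The XOR trick is easy to verify in isolation. Rust has no direct way to request a conditional move, so the sketch below swaps in a two-entry table lookup to select the string without an if; this is a stand-in for the cmove, not what the compiler emits. Both function names are hypothetical.

```rust
/// Length of the selected string without a branch:
/// 5 ^ 0 = 5 ("false"), 5 ^ 1 = 4 ("true").
pub fn branchless_len(b: bool) -> usize {
    5 ^ (b as usize)
}

/// Selects the static string by indexing a two-entry table with the bool.
/// The table lookup stands in for the conditional move in the assembly.
pub fn branchless_str(b: bool) -> &'static str {
    const TABLE: [&str; 2] = ["false", "true"];
    TABLE[b as usize]
}
```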

Cleaning up the input vector on exit from the function

The compiler generates a call to __rust_dealloc to free the heap array of the input vector. The function takes the input vector by value, so it owns the vector and is responsible for freeing the input vector array just before the function returns.
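To make the ownership point concrete, the sketch below contrasts an owning version with a hypothetical borrowing variant. Both function names are assumptions introduced for illustration. In the owning version the input vector is dropped inside the function; in the borrowing version the caller keeps the input and remains responsible for freeing it.

```rust
/// Takes ownership: the input vector is dropped (and its heap array freed)
/// before this function returns.
pub fn owning(v: Vec<bool>) -> Vec<&'static str> {
    v.into_iter().map(|b| if b { "true" } else { "false" }).collect()
}

/// Hypothetical borrowing variant: the input is only read, so the caller
/// still owns it after the call and frees it later.
pub fn borrowing(v: &[bool]) -> Vec<&'static str> {
    v.iter().map(|&b| if b { "true" } else { "false" }).collect()
}
```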

Flow chart describing the generated assembly code

Annotated assembly code for the convert_bool_vec_to_static_str_vec function

The generated assembly code has been annotated to help understand the mapping from Rust code.

Explore more

Visit the Rust bool vector to a string slice vector post on EventHelix.com to learn more. Examine the Rust to assembly mapping interactively in the Compiler Explorer.
