Rust & The Machine

Concretely subdue the compiler and make it work for you

Prolific K

--

The intended audience is newly started Rust users who have learned the syntax and are starting to fight the compiler. The focus on a machine-oriented mental model will be especially helpful if you have mostly been relying on garbage collection to manage memory recently. I promise this way is faster than fighting the answers out of the compiler case by case.

Rust’s memory management is slick enough that you barely notice you’re doing any, which can make compiler errors feel very opaque. If you are already familiar with, or willing to learn, bare-minimum microarchitecture (just the basics of function execution and memory usage), this article is an efficient crash course in finding the low-resistance paths to writing natural Rust.

Prerequisite Mental Models

You absolutely must have these concepts hot in your cache to understand bird’s-eye machine Rust.

  • Function call mechanics in assembly. Just know the call stack, stack pointer, and stack memory. Get out your copy of Micro-Architecture: YouTube Summer Edition.
  • Heap memory, which is a fragmented blob. Stacks are organized, linear, and only hot on one side. The heap is noncontiguous garbage.
  • Multiple threads means a stack for each thread, but there is still only ever one heap.
  • Instructions and static data are over there somewhere and don’t move, so we can always borrow static data.
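
To make those concrete, here is a minimal sketch of where values end up (the variable names are purely illustrative):

    fn main() {
        let on_stack: u64 = 7;                // lives in main's stack frame
        let on_heap: Box<u64> = Box::new(7);  // the Box handle is on the stack, the 7 is on the heap
        let forever: &'static str = "static"; // string data sits in the static/read-only segment and never moves
        println!("{} {} {}", on_stack, on_heap, forever);
    }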

Good Stack-Keeping

Ownership is the abstraction that covers several behaviors, such as managing memory and using pointers. When the compiler enforces ownership, it also ensures that we never create invalid pointers or leak memory. The memory management part is actually quite simple and rarely results in compiler fights. If there were no pointers, the drop mechanics of RAII would be completely trivial.

However, we sometimes want to see the same memory in multiple locations to write an elegant algorithm or we want to avoid copying large values into every function call. We need pointers. Pointer invalidation occurs when things are moved or deallocated with outstanding pointers. Excluding moves, everything else about valid ownership rules boils down to preserving pops-before relationships in stack memory.

First, we’re going to look at how ownership and references work for stack values. Second, by reasoning that every heap segment has an owner on the stack, we will map all lifetime & ownership onto the stack. This neat mapping then makes it clear that the time to drop every piece of memory is decidable up front and that at no point will we hold invalid references.

Starting With Correct Pops-Before Relationships

Note: Machine calling conventions (including on x86) grow stacks downward in memory addresses, the opposite of conceptual CS stacks. This article uses “upstream” and “downstream” semantics, where calling a function is going “downstream” and returning is going “upstream”, back where you came from.

References naturally travel downstream and refer to upstream. &T always points through upstream stack memory to a T and always pops before the T it refers to.

  • In the trivial (easy) case, &T lives downstream and points directly to an upstream T (sketched after this list)
  • Ownership composition in structs preserves the trivial case. While &T can refer to a field T on a struct, and while we can move &T into a different struct later, whatever owns &T will be downstream of whatever owns the T it refers to
  • Consequently, T always gets popped after all outstanding &T’s
  • It can be difficult to convince the compiler that the composed case doesn’t break the law, but the trivial case will always be pretty easy to pull off
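
A minimal sketch of the trivial case and the composed case (Holder is a hypothetical struct used only for illustration):

    // Trivial case: &T is created downstream of the T it refers to.
    fn downstream(r: &u32) -> u32 {
        *r // the reference points upstream to the caller's data
    }

    // Composed case: whatever owns the &T must still be downstream of the T.
    struct Holder<'a> {
        val: &'a u32,
    }

    fn main() {
        let owned: u32 = 42;             // upstream T
        let copied = downstream(&owned); // &T passed downstream
        let h = Holder { val: &owned };  // &T moved into a struct, still popped before `owned`
        println!("{} {}", copied, h.val);
    }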

Preserving Pops-Before Relationships

The compiler will not allow program execution to produce references upstream of what they refer to. You could try to do this by returning a borrow of current-scope data, moving a reference into a struct that will outlive what the reference refers to, or moving a value with an outstanding reference.

References and owned values can be passed into functions as arguments or returned according to the following rules (a sketch follows the list):

  • &T’s are passed into functions, downstream, referring to upstream caller data
  • T’s can be passed downstream, into functions. They will either be returned in the products or dropped when the stack is popped
  • T’s created in the current function can be returned upstream to the caller (moved into the caller’s frame) or dropped
  • &T’s referring to data of the current scope cannot be returned back upstream to the caller. The data they would refer to has to be moved to be returned. The data therefore must be either dropped or moved and the &T can’t remain valid either way
  • &T’s passed into the function refer to upstream data and can be returned upstream
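
A sketch of those rules in one function, with the disallowed case left as a comment:

    fn takes_and_returns(owned: String, borrowed: &str) -> String {
        // `owned` was passed downstream: return it in the products or let it drop here.
        // `borrowed` refers upstream and could be handed back upstream untouched.
        format!("{}{}", owned, borrowed)
    }

    // fn broken() -> &String {
    //     let local = String::from("local");
    //     &local // rejected: a reference to current-scope data cannot go back upstream
    // }

    fn main() {
        let caller_owned = String::from("hello ");
        let caller_borrowed = "world";
        let result = takes_and_returns(caller_owned, caller_borrowed);
        println!("{}", result);
    }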

Because these relationships are preserved through function calls, they become transitive all the way through the call stack. In general, all of your references point upstream and are sent or carried downstream. You can also remember this as just two rules for function calls:

  • References and owned values in
  • Owned values and references to still-alive upstream data out

There you have it. We have defined the correct order for all stack memory to pop and a set of rules that guarantee we will never break that order. Now let’s look at the heap.

Boxes etc are Owned Heap

Box<T> is an owned pointer to a segment of heap with some value inside. Box<T> is a combination of a Box on the stack and a T on the heap. It’s an actual real-life heap segment pointer that you own. You can pass it, move it, mutate what it points to, or return it. When it dies, the associated heap segment and whatever it owns are deallocated. Box references (&Box<T>) can be passed downstream but, like any other reference, cannot be returned upstream of the Box they refer to.

You put that T on the heap when you call Box::new(myT). It was on the stack before you made a box. It is on the heap after you box it up. Arc::new(myT) is another container type that owns heap data, placing data on the heap by moving it during construction. Vec::with_capacity(usize) is a growable segment of heap where you put your owned data onto the heap a few at a time, using methods like push. You own the Vec. The Vec owns its heap.
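
A minimal sketch of stack values moving onto the heap (Big is a hypothetical type, just large enough to care about):

    struct Big {
        data: [u8; 64], // a plain stack value; Big is not Copy
    }

    fn main() {
        let on_stack = Big { data: [0; 64] }; // starts in main's frame
        let boxed = Box::new(on_stack);       // moved into a heap segment; the Box handle stays on the stack
        let mut v: Vec<u8> = Vec::with_capacity(4);
        v.push(1);                            // owned data lands in the Vec's heap segment
        v.push(2);
        println!("{} {}", boxed.data.len(), v.len());
    } // boxed and v pop here; the heap segments they own are freed with them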

  • Stack Owns Heap
  • When the Stack Owner Dies, Its Heap Dies

The Box holding the heap will live until the thing (or the last copy of the thing) keeping it alive dies. For the most part, your heap values are just appendages of stack values, and reasoning about stack lifetime has already been shown to be very easy.

Even if you do something like Box<…Box<Box<T>>…> and move pointers themselves over to the heap, it will ultimately be kept alive by the outer-most Box, which is still on the stack, still in a function scope that will pop the box, and all the boxes it holds, when it returns.

Even if you start tossing references up onto the heap by constructing Box::new(&myT), you will not be able to do this with data from the current scope and still return the box. Owning a heap location that stores a reference to current scope data doesn’t create a way to return &T’s to current scope data back upstream.

The completeness of this abstraction lets us get away with almost never allocating manually or freeing explicitly. You can allocate by hand via std::alloc, but it’s not the usual way.

Reference Counted Lifetimes

Rc and Arc (atomically reference counted) deserve special focus in the context of heap and stack. We don’t know how long these values will live, so they are reference counted and live on the heap. Arc is counted atomically and can be shared across threads. The Rc guts live on the heap; the Rc you pass and return on the stack is just a handle. The handle can be cloned numerous times, and only the last drop will trigger the drop of the inner owned data. Neither Rc nor Arc implies any shared mutability, but you can put mutability inside them.
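
A minimal sketch of handles versus guts (strong_count just reports how many handles are currently alive):

    use std::rc::Rc;

    fn main() {
        let first = Rc::new(String::from("shared")); // the String guts live on the heap
        let second = Rc::clone(&first);              // clones the handle, not the data
        println!("{} handles", Rc::strong_count(&first)); // 2
        drop(second);
        println!("{} handles", Rc::strong_count(&first)); // 1; the last drop frees the heap
    }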

Borrow Lifetime = Immobile Lifetime

You cannot move a value that has an outstanding borrow. A borrowing reference to a moved value would be a dangling pointer under the hood. Instead, either copy / clone the value, or clone & pass a reference counter, an Rc or Arc of the value. Another option is to move the owned value upstream so that what you were trying to move becomes just another reference.
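
A sketch of the immobile case and two of the escapes mentioned above:

    fn main() {
        let owned = String::from("data");
        let borrow = &owned;
        // let moved = owned;       // rejected: cannot move `owned` while `borrow` is outstanding
        println!("{}", borrow);     // last use of the borrow
        let moved = owned;          // fine now: no outstanding borrow
        let cloned = moved.clone(); // or clone instead of moving the original
        println!("{} {}", moved, cloned);
    }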

Ownership Composition Leads to Lifetime Annotations

If you create something, which has lifetime, but it owns a reference to something else that can have a different lifetime, you will begin needing annotations. This is composition of ownership. You must both maintain pops-before relationships and convince the compiler that you have done so.

The most obvious example is a struct that contains a reference to borrowed data. Immediately the compiler starts asking for lifetimes. If you have two references, the compiler wants to know whether you intend both of those borrows to expect and track the same lifetime or a different lifetime for each. Annotations just help it decide this accounting.
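
A minimal sketch, with a hypothetical struct, of what those annotations look like:

    // Two references, two independent lifetimes; the compiler tracks each separately.
    struct Report<'name, 'body> {
        name: &'name str,
        body: &'body str,
    }

    fn main() {
        let body = String::from("body text");
        let name = String::from("title");
        let report = Report { name: &name, body: &body };
        println!("{}: {}", report.name, report.body);
    } // report pops before name and body: pops-before preserved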

Function Calls Need Space

Compiler warnings about unsized types are a direct consequence of needing to plan the linear memory layout of stack data. The compiler is capable of handling immense amounts of calling convention and address alignment details. What it cannot do is decide how much space to put on the stack for types that could be any size.

  • Passing data in of unknown size? Use a &, Box, or another container with known stack size
  • Returning data of unknown size? We can’t return references to current scope data. Use a Box
  • References are just pointers and also have fixed size:
    fn foo(arg: &dyn Trait) { /* ... */ }
  • Array size only known at runtime? Size might need to change? You need heap. You need a Box (or a Vec if it has to grow)
  • Multiple types of values possible? Need more than one in the same container? Vec<Box<dyn Trait>> can hold several things implementing Trait simultaneously. fn foo(t: &dyn Trait) is a signature that accepts anything implementing Trait
  • Vec push and other container methods are functions and need to decide their argument sizes. Remember this when planning data structures.

All of these situations can cause compiler errors about unsized types. Box is a sized type. Box is an owner. Use Box to handle the most basic cases of unsized data that belongs on the heap.
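
A minimal sketch of the usual fix: the callee cannot size a bare dyn value, but a Box of one has a known stack size.

    use std::fmt::Display;

    // We cannot return a bare `dyn Display` (unsized), but Box<dyn Display> is sized.
    fn pick(flag: bool) -> Box<dyn Display> {
        if flag {
            Box::new(7u32)
        } else {
            Box::new("seven")
        }
    }

    fn main() {
        println!("{}", pick(true));
        println!("{}", pick(false));
    }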

Structs also Need Space

Since you are already building values that get returned from functions and passed into functions, you can basically figure this out. Read up on exotically sized types for details on structs that keep their variable-sized data in the last field.

Importing OOP Tactics

In garbage collected languages, you can write a procedure and then happily cram it into a class to isolate the API and internal state. In Rust, the introduction of borrowing adds some extra wrinkles that require knowledge of both Sized and lifetimes. C and C++ users will be used to many of these issues from pointers and moving structs.

I have structs. Structs have methods. I see &self, self, and Self. There is something called trait objects. Although there is no inheritance, can I kind of OOP?

Yes and no. Don’t try to encapsulate data from lifetimes that need to be independent. If a reference points to something that needs to die during the lifetime of your struct, you will start needing funky annotations, and it might turn out to be impossible. Start with very, very small structs that live with their data and whose methods essentially describe a set of procedures you will run with similar arguments in different places.

DontBorrowFromConstructorScopeValues::new()

A constructor is just another function. Remember the T and &T passing rules for functions when trying to write a complex constructor. Do not create something in the scope and try to borrow from it and hang it on the struct. You need to own the value or have originally borrowed the reference from upstream if you want to put references on structs.

You shouldn’t normally need to, and definitely shouldn’t prefer to, construct a struct that borrows data from itself during construction. Such a struct would be immovable without some voodoo to update the self-reference and move in one step. If members borrow this way, you have a lifetime mismatch. You want to pass borrowing references downstream, into inner scopes, so send the thing being borrowed to an outer scope or send the borrower to an inner scope.
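
A sketch, with a hypothetical Parser struct, of borrowing from upstream rather than from the constructor’s own scope:

    struct Parser<'a> {
        source: &'a str,
    }

    impl<'a> Parser<'a> {
        // Wrong shape (won't compile): creating the String inside new() and
        // borrowing it would hang a reference to constructor-scope data on the struct.
        // fn broken() -> Parser<'a> {
        //     let local = String::from("data");
        //     Parser { source: &local } // `local` is dropped when new() returns
        // }

        // Right shape: the caller owns the data upstream; the struct only borrows it.
        fn new(source: &'a str) -> Parser<'a> {
            Parser { source }
        }
    }

    fn main() {
        let owned = String::from("source text"); // upstream owner
        let parser = Parser::new(&owned);
        println!("{}", parser.source);
    }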

Do I need Polymorphism?

If you need to iterate over several things that all have a method from the same trait (the closest thing to an interface you will find in Rust), you can do so with trait objects. Trait objects occur when you erase the concrete type and keep only the trait, using the dyn keyword.

fn polycaller(thing: Box<dyn Trait>) { /* ... */ thing.trait_method(); }

This style makes the call, in effect, polymorphic, using a vtable and all that under the hood. Vec<Box<dyn Trait>> will store such a list of objects implementing a trait. The rules on what you cannot do with structs you intend to use as trait objects, object safety, are not terribly hard to abide by.

Surprise: you have to use either references or heap containers for this variable-sized argument. The compiler can’t set up a function call with a variable stack size, and the function can’t be compiled to call the method on a varying concrete type without dynamic dispatch (by vtable lookup).
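
A minimal sketch of the whole pattern (Speak, Dog, and Robot are hypothetical names for illustration):

    trait Speak {
        fn speak(&self) -> String;
    }

    struct Dog;
    struct Robot {
        id: u32,
    }

    impl Speak for Dog {
        fn speak(&self) -> String {
            "woof".to_string()
        }
    }

    impl Speak for Robot {
        fn speak(&self) -> String {
            format!("beep {}", self.id)
        }
    }

    fn main() {
        // Different concrete types, one sized handle each: Box<dyn Speak>.
        let things: Vec<Box<dyn Speak>> = vec![Box::new(Dog), Box::new(Robot { id: 3 })];
        for thing in &things {
            println!("{}", thing.speak()); // dispatched through the vtable at runtime
        }
    }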

Container Types That Are Almost Just Syntax

You end up thinking of these types almost entirely according to how they affect memory location, prevent data races, afford well-defined mutability, or manage lifetime. Borrowing is syntax, and is zero cost, but enacting the runtime mechanisms of these types does require explicit or implicit function calls. A Box in C would be just a pointer, but in Rust you call Box::new(myT) instead of malloc. This is why they are not just syntax and instead are actual types.

TL;DR edition of containers you see and will use a lot:

Single Thread

  • multiple potential owners / managed heap lifetime: Rc<T>
  • take immutable references yet mutate them / interior mutability: Cell<T>
  • interior mutability for non-Copy types and those that need &mut self: RefCell<T>

Multi-Threaded

  • multiple potential owners / managed heap lifetime: Arc<T>
  • Multi-owned, exclusive read-write: Arc<Mutex<T>>
  • Multi-owned, exclusive write or concurrent reads: Arc<RwLock<T>>

The best article on Rust containers (current Rust Book breaks it up now)
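
A minimal sketch putting the single-thread and multi-thread rows side by side:

    use std::cell::RefCell;
    use std::rc::Rc;
    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Single thread: shared ownership plus interior mutability.
        let single = Rc::new(RefCell::new(0u32));
        *single.borrow_mut() += 1;

        // Multiple threads: atomically counted ownership plus a lock.
        let shared = Arc::new(Mutex::new(0u32));
        let worker = {
            let shared = Arc::clone(&shared);
            thread::spawn(move || *shared.lock().unwrap() += 1)
        };
        worker.join().unwrap();

        println!("{} {}", *single.borrow(), *shared.lock().unwrap());
    }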

Multi-Threaded Rust Has Fewer Concepts

The domain of values you can move into threads and send across them has a significantly reduced type surface. Your choices are more straightforward across thread boundaries than when managing stack & heap memory within one. This only gets hard if you start trying to find the right combination of single-threaded primitives to convince the compiler to do multi-threaded things.

Threads Don’t Share Stack

Threads can’t, without significant voodoo, reason about when they will die relative to their siblings. You can’t rely on pointers to another thread’s stack. Therefore shared memory is virtually always somewhere in the heap.

While your Arc might be on the stack (and it’s just a clone), the data it reference counts and the reference counter value itself will unsurprisingly be found in the heap, where threads can live & die independently without interference.

All pointers (references) that need to be shared across threads need to point to heap data, not stack data.

Sending is not Sharing

Sync is the trait for sharing across threads
Send is the trait for moving data to another thread

Sending is, mechanically, copying the value off the stack into a buffer (a move) and then having it picked up by the other thread and stored on its own stack. Sending a Box that owns some heap is not some clever evasion of this; the handle moves across, and the heap it owns simply changes owner.

When using types from crates, a handle for shared memory will likely be Send + Sync, but an API can also choose to expose only wrapper types that implement neither Send nor Sync, hiding the thread-safety of the underlying types so that you don’t try to break their design constraints.

Guaranteed One Owner + Move or Copy ≈ Can Send Across Threads

The keyword is send. Whether it’s a Box, a struct, or a Cell, single-owner, movable data can be packaged up and thrown into a buffer inside an std::sync::mpsc::channel, etc.

Copy data, which typically has an identity that doesn’t matter (one 777u32 is as good as another), can basically always be sent.

Currently borrowed data can’t be moved and consequently can’t be sent. Some odd types can prevent moves, and if a value can’t move, it can’t be sent.

Rc is an example of indeterminate ownership. You don’t know whether it still has only one owner, so unless you can downgrade it to an owned value, you can’t send it to another thread. Its reference counter is non-atomic, so it is not safe for multiple threads that might drop the data to keep track of it.
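
A sketch of moving single-owner data across a channel; an Rc could not make the same trip:

    use std::sync::mpsc;
    use std::thread;

    fn main() {
        let (tx, rx) = mpsc::channel();
        let boxed = Box::new(vec![1u32, 2, 3]); // single owner, owns some heap

        thread::spawn(move || {
            // Moving this closure to the new thread requires everything it captures to be Send.
            // A captured Rc would be rejected here: its reference count is non-atomic.
            tx.send(boxed).unwrap(); // the Box handle moves; the heap it owns goes with it
        });

        let received = rx.recv().unwrap();
        println!("{:?}", received);
    }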

Threads Only Share Ownership and Mutability Atomically

You have to be assured that the value will still be there even if it’s read-only. There can be no races on this decision, or you’ve let in undefined behavior. You can’t reference the paperweight on another thread’s stack that keeps the heap alive, so you pretty quickly arrive at using an Arc, which you can clone and pass to another stack.

These atomic types are allowed to do what they do by the ownership and lifetime system because they can enforce atomic ordering of reads and writes, eliminating race conditions and allowing race-free abstractions to be built on top of them.

Multi-Threaded Shared ≈ Atomic-Heap-Something

Whenever you start sharing data across threads, since Rust will not happily allow you to introduce race conditions, you will inevitably at some level find the least resistance is to use an atomic type or something backed by atomic types such as Arc, AtomicBool, Mutex etc.

The best atomic heap something is an Arc. Need to watch a boolean atomically? Just use an Arc<AtomicBool> or an Arc around a structure containing an atomic. AtomicBool and the other types in the std::sync::atomic module are really useful as you start multi-threading.
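
A minimal sketch of watching a flag across threads:

    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::Arc;
    use std::thread;

    fn main() {
        let done = Arc::new(AtomicBool::new(false)); // the bool lives on the heap
        let done_for_worker = Arc::clone(&done);     // a second handle for the worker thread

        let worker = thread::spawn(move || {
            // ... do some work ...
            done_for_worker.store(true, Ordering::Release);
        });

        worker.join().unwrap();
        println!("done: {}", done.load(Ordering::Acquire));
    }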

There is very little to discuss about multi-threaded shared data in the context of low-level memory dynamics in safe Rust. Its availability and mutability are summarized by stating that it will be behind an atomic type for reads of mutable data, for mutations, and for drops, and that it will be on the heap. The most versatile memory from a thread-safety perspective is unsurprisingly the most restricted in terms of API.

Why the Machine?

The Rust book does tend to describe lifetimes as if they were a theory-driven language feature that you should begin learning as behavioral trivia, far removed from execution mechanics. There are mentions of the heap, but to a reader both familiar with low-level execution and accustomed to high-level languages, the treatment seems extremely ambiguous. It’s not clear whether a low-level machine model or a high-level fluffy model is more natural until well after sufficient compiler-directed education.

The low-level machine-like mental model is a shortcut. While a high-level treatment will eventually get the job done, I personally don’t like the cost-benefit ratio. A machine-oriented treatment’s pre-requisite knowledge is cheap to learn and delivers high value. A systems programmer needs to know the fine details of execution mechanics, so it’s not a bad idea for an application programmer to be using simplifications of execution mechanics as a model of reasoning.

Some may argue that this presentation glosses over too many things that the compiler can accomplish, or that are valid in the completely abstract treatment but are counter-intuitive or difficult to express as execution mechanics because they rely on valid transformations. Execution mechanics can still provide sound explanations that get people started quickly without telling lies. Also, if the abstractions of ownership & lifetime were more fundamental than execution mechanics, we would not check the completeness of their transformations against simple abstract machines. Starting with the abstractions is, for similar reasons, probably not a good idea.

Programmers with C++ and C experience are expected to find a lot of this information pedantic. But to the Java or .NET practitioner, let alone users brought up on Python and JavaScript, the rules around what can or cannot be borrowed, used as an argument or return value, or sent across threads are thoroughly lacking in context and apparent justification without re-rooting the conversation in the mechanics of the machine.

I’m aware that I left out lots of exception cases such as scoped threads. I’m pretending unsafe Rust doesn’t exist in this guide. If it’s obviously inappropriate for a beginner audience, it can be left out. I removed the section about breaking reference cycles with Weak<T>’s for this same reason.

--
