Ownership in Rust, Part 2
When we looked at ownership in Rust last time, we looked at how Rust uses scope to determine when a resource/data in memory should be dropped or freed.
We saw that for types that implement the Copy trait (typically simple types whose data lives entirely on the stack), the ownership model behaves much like it does in languages that use a different paradigm, such as garbage collection. For types without this trait, we needed to be more conscious of the ownership rules.
Ownership may introduce some design compromises, but it makes up for them with flexibility, explicitness, and safety.
Ownership and Functions
In the first example, we pass a string literal (a &str, whose pointer and length are stored on the stack) into a function, foo(). In the second example, we pass a String (which stores its data on the heap) into a different version of foo(). In both implementations of foo(), we print the memory address of the variable in their respective scopes.
In the first example, we see behavior similar to when we copied a variable’s value and bound it to a new variable. This happens because the reference to a string literal lives on the stack: the size needed to store its pointer is known at compile time, so Rust can cheaply copy its value and push it onto the stack.
This means that each of the functions, main() and foo(), owns its own copy of the pointer stored in string. When foo()’s scope is over, foo() is responsible for dropping its own string, and when main()’s scope is over, it too is responsible for dropping the string that it owns.
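As a rough sketch of what the first example could look like (the return value of foo() is an assumption added here so the two addresses can be compared; it is not necessarily how the original example was written):

```rust
// Hypothetical version of foo(): takes a string slice and reports
// the address of the text it points at.
fn foo(string: &str) -> *const u8 {
    println!("foo:  {:p}", string.as_ptr());
    string.as_ptr()
}

fn main() {
    let string = "hello"; // a string literal, i.e. a &'static str
    println!("main: {:p}", string.as_ptr());

    // The reference (pointer + length) is copied into foo(); nothing is moved.
    let seen_in_foo = foo(string);

    // Both copies point at the same text, and main() can still use its copy.
    assert_eq!(string.as_ptr(), seen_in_foo);
    println!("still valid in main: {}", string);
}
```

Because &str is Copy, the call to foo() leaves the variable in main() untouched.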
In the second example, on the other hand, main() is moving ownership of string to foo(). This means that main() no longer has ownership of the string variable, i.e. the place in memory that it points to. If we tried to access string from inside main() after it has been moved, we would receive a compile-time error.
Rather than copying, which could be expensive, Rust makes foo() responsible for the data at the memory address 0x7efced01c010, as indicated in the comments of the example. Now, Rust frees the memory at that address only when foo() goes out of scope, and because the variable in main() was invalidated by the move, nothing else can free it a second time. Again, we do this to avoid a double free error.
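A minimal sketch of the move described above, assuming a foo() that takes a String by value (returning the address is an illustrative addition, and the address printed on your machine will differ from the one quoted in the text):

```rust
// Hypothetical version of foo(): takes ownership of the String.
fn foo(string: String) -> *const u8 {
    let addr = string.as_ptr();
    println!("foo:  {:p}", addr);
    addr
} // `string` goes out of scope here, so its heap buffer is freed

fn main() {
    let string = String::from("hello");
    let addr_in_main = string.as_ptr();
    println!("main: {:p}", addr_in_main);

    let addr_in_foo = foo(string); // ownership moves into foo()

    // println!("{}", string); // error[E0382]: borrow of moved value: `string`

    // The move did not copy the heap data: both scopes saw the same address.
    assert_eq!(addr_in_main, addr_in_foo);
}
```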
For the second example, if we did want to copy the value of string, so that both main() and foo() own their own copies, similar to when using the string literal on the stack, we could make a “deep copy” by using the clone() method.
Here, as indicated by the comments, main() and foo() have ownership of their respective copies of string. Although this is a valid solution, it is not the most efficient, since Rust has to perform a new heap allocation and copy the data each time. And sometimes you actually do want both functions to interact with the same piece of data! (More on that later.)
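One way to sketch the deep copy, assuming the same by-value foo() as before and using String's clone() method:

```rust
// Hypothetical version of foo(): owns whatever String it is handed.
fn foo(string: String) -> *const u8 {
    println!("foo:  {:p}", string.as_ptr());
    string.as_ptr()
}

fn main() {
    let string = String::from("hello");
    println!("main: {:p}", string.as_ptr());

    // clone() allocates a new heap buffer and copies the bytes into it.
    let addr_in_foo = foo(string.clone());

    // main() still owns the original, which lives at a different address.
    assert_ne!(string.as_ptr(), addr_in_foo);
    println!("still valid in main: {}", string);
}
```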
Just as ownership is taken by calling another function and passing in a variable, a function can be given ownership via a return from a different function:
foo() now gives ownership to main() by returning string to where foo() was called. As expected, only when main()’s scope ends will Rust free that memory.
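A short sketch of returning ownership, assuming foo() creates the String itself:

```rust
// Hypothetical version of foo(): creates a String and hands it to the caller.
fn foo() -> String {
    let string = String::from("hello");
    string // ownership moves out of foo() to wherever foo() was called
}

fn main() {
    let string = foo(); // main() is now the owner
    println!("{}", string);
} // `string` is dropped here, at the end of main()'s scope
```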
Give & Take
If we follow this trend, it makes sense that we can both give ownership and then have that ownership returned to us by accepting and returning the same String type in a single function.
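Sketched out, giving and taking back ownership might look like this (the body of foo() is an assumption for illustration):

```rust
// Hypothetical version of foo(): takes ownership, then gives it back.
fn foo(string: String) -> String {
    println!("foo has: {}", string);
    string
}

fn main() {
    let string = String::from("hello");
    let string = foo(string); // moved into foo(), then moved back out
    println!("main has it back: {}", string);
}
```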
But this seems like a lot of headache just to pass values in and out of functions. Luckily, this is a headache that the Rust maintainers have taken into account:
Taking ownership and then returning ownership with every function is a bit tedious. What if we want to let a function use a value but not take ownership? It’s quite annoying that anything we pass in also needs to be passed back if we want to use it again, in addition to any data resulting from the body of the function that we might want to return as well. Luckily for us, Rust has a feature for this concept, called references.
References & Borrowing
Ownership accommodates the sharing and passing of data, but you’ve got to follow a few rules.
Borrowing looks like this:
main() gives foo() access to string, but (as indicated by the label) main() is still the owner of string. This means that at the end of foo()’s scope, string will not be dropped from memory; main() is still responsible for string’s space in memory.
Here’s how we would write that interaction in Rust:
Just like our drawing, we would say that main() passes a reference to string, and foo() accepts a String type reference. This is indicated by the & symbol. After the end of foo()’s scope, execution returns to its caller, main(), where string is still valid. foo() doesn’t have to return ownership because it was never given ownership; it only borrowed.
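A minimal sketch of that interaction, assuming a foo() that only reads the borrowed String (returning the length is an illustrative addition):

```rust
// Hypothetical version of foo(): borrows the String instead of owning it.
fn foo(string: &String) -> usize {
    println!("foo borrows: {}", string);
    string.len()
} // the borrow ends here; nothing is dropped

fn main() {
    let string = String::from("hello");
    let len = foo(&string); // pass a reference, keep ownership
    println!("{} is still valid in main (length {})", string, len);
}
```

In everyday Rust a parameter like this is usually written as &str, but &String is kept here to match the text.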
Ampersands indicate references, which allow the passing of values without giving up ownership! Rust knows that when we’re passing a reference, the ownership, and therefore the responsibility of deallocating that space in memory, still belongs to the original owner.
Rust allows us to create any number of references:
No matter how many times we pass around a reference to string, ownership stays with its original owner. (In this case, the owner is the scope where string was originally instantiated, but remember, we could have passed ownership first and then created a reference.)
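As a sketch of several shared references in play at once (foo() and bar() here are hypothetical helpers, not functions from the original example):

```rust
// Both helpers only borrow; bar() is called directly and through foo().
fn bar(string: &String) -> usize {
    string.len()
}

fn foo(string: &String) -> usize {
    bar(string) // a reference can be handed on as many times as we like
}

fn main() {
    let string = String::from("hello");
    let first = &string;
    let second = &string;

    let total = foo(first) + bar(second);
    println!("total length seen through references: {}", total);

    // main() never gave up ownership, so `string` is still usable.
    println!("{}", string);
}
```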
The last thing to mention is mutability. Rust is often written in a functional style, but the writers are very pragmatic and understand that modern languages aren’t always so black-and-white. Thus, Rust accommodates mutability.
Rust allows us to use the mut keyword to make values mutable. Notice the change in memory address, which indicates that string had to be reallocated on the heap in order to fit its new contents.
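A small sketch of a mutable binding, assuming a hypothetical helper build_greeting(). Whether the address actually changes depends on the allocator and on the spare capacity of the buffer, so the printed addresses may or may not differ on a given run:

```rust
// Hypothetical helper: grows a String through a `mut` binding.
fn build_greeting() -> String {
    let mut string = String::from("hello");
    println!("before: {:p}, capacity {}", string.as_ptr(), string.capacity());

    // Appending may force the String to move to a larger heap buffer.
    string.push_str(", world");
    println!("after:  {:p}, capacity {}", string.as_ptr(), string.capacity());

    string
}

fn main() {
    let greeting = build_greeting();
    println!("{}", greeting);
}
```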
Now that we have a mutable variable, we can make a mutable reference!
The syntax here is a bit specific. First, we declare a mutable variable with let mut string. Then we pass a mutable reference using &mut. Finally, we use &mut in the function’s signature to explicitly state that our function accepts a mutable reference.
Now we can still ensure that only main() is responsible for deallocating string, while also allowing other functions to mutate it.
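Putting those three pieces together, a sketch might look like this (the body of foo() is an assumption for illustration):

```rust
// Hypothetical version of foo(): accepts a mutable reference and edits in place.
fn foo(string: &mut String) {
    string.push_str(", world");
}

fn main() {
    let mut string = String::from("hello"); // 1. a mutable variable
    foo(&mut string);                       // 2. a mutable reference
    println!("{}", string);                 // main() still owns the String
}
```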
Those familiar with memory management might see how this could be dangerous if left unchecked. What happens if several functions hold a mutable reference and try to update the same memory location at the same time, for example when using threads? This leads to a data race.
A race condition occurs when two or more threads can access shared data and they try to change it at the same time. Because the thread scheduling algorithm can swap between threads at any time, you don’t know the order in which the threads will attempt to access the shared data. Therefore, the result of the change in data is dependent on the thread scheduling algorithm, i.e. both threads are “racing” to access/change the data.
This problem can be exacerbated when working with a low-level language such as Rust. Rust gives us access to raw pointers, which can lead to a lot of unsafe scenarios.
This is the kind of thing ownership is designed to protect against, and it does so by enforcing this rule: “at any given time, you can have either one mutable reference or any number of immutable references.”
The benefit of having this restriction is that Rust can prevent data races at compile time. A data race is similar to a race condition and happens when these three behaviors occur:
- Two or more pointers access the same data at the same time.
- At least one of the pointers is being used to write to the data.
- There’s no mechanism being used to synchronize access to the data.
Data races cause undefined behavior and can be difficult to diagnose and fix when you’re trying to track them down at runtime; Rust prevents this problem from happening because it won’t even compile code with data races!
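The rule can be seen in a single-threaded sketch like the one below (shout() is a hypothetical helper). The code compiles as written; the commented-out line shows one way to break the rule:

```rust
// Hypothetical helper that needs exclusive (mutable) access.
fn shout(string: &mut String) {
    string.push('!');
}

fn main() {
    let mut string = String::from("hello");

    // Any number of immutable references may coexist.
    let first = &string;
    let second = &string;
    println!("{} {}", first, second); // last use of the shared borrows

    // A single mutable reference is fine once the shared ones are no longer used.
    let exclusive = &mut string;
    // let another = &mut string; // error[E0499]: second mutable borrow while
    //                            // `exclusive` is still used below
    shout(exclusive);

    println!("{}", string);
}
```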
Rust’s ownership rules come to the rescue again; they are emphasized as the core safety feature that Rust provides over other systems languages. This means that Ruby programmers, like myself, still don’t have to be intimately acquainted with the inner workings of memory management!
One last thing: when passing references, there is another condition that can cause bugs, called a dangling reference.
Dangling references are pointers to data that has been deallocated, for example:
In this example, foo() returns a reference to string. However, once foo()’s scope ends, the memory for string is deallocated, which means the reference would point to an invalid place in memory!
Rust prevents this at compile time by reporting an error.
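A sketch of the problem and one common way around it. The rejected version is kept in comments so the block still compiles; returning the String itself is just one possible fix, assumed here for illustration:

```rust
// This version is rejected by the compiler
// (error[E0106]: missing lifetime specifier):
//
// fn foo() -> &String {
//     let string = String::from("hello");
//     &string // `string` would be dropped at the end of foo()
// }

// One fix: return the String itself, moving ownership to the caller.
fn foo() -> String {
    let string = String::from("hello");
    string
}

fn main() {
    let string = foo();
    println!("{}", string);
}
```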
Rustaceans can enjoy the benefits of the ownership model without understanding the protections it provides. However, being able to comprehend the problems that ownership solves only helps you write better code without fighting against the compiler.
There’s still a bit more left to uncover about Rust ownership, but with these two posts, hopefully you’re left with enough to get started working with this elegant solution to an otherwise unwieldy problem.