Rust: memory safety without a garbage collector
(This post has been ported from my blog. It was originally posted in May 2015, so some of the information in it might have changed since then.)
I’ve spent time with Rust at various points in the past, and since it was a language in development, it was no surprise that every time I looked there were breaking changes, and even the documentation looked very different at every turn!
Fast forward to May 2015: it has now hit the 1.0 milestone, so things are stable and it’s a good time to start looking into the language in earnest.
The web site is looking good, and there is an interactive playground where you can try the language out without installing Rust. The documentation has been beefed up and is readily accessible through the web site. I personally find Rust by Example useful for getting started quickly.
The big idea that came out of Rust was the notion of “borrowed pointers”, though the documentation doesn’t use that particular term anymore. Instead, it talks more broadly about an ownership system and “zero-cost abstractions”.
The abstractions we’re talking about here are much lower level than what I’m used to: pointers, polymorphic functions, traits, type inference, etc.
Its pointer system, for example, gives you memory safety without needing a garbage collector, and Rust pointers compile to standard C pointers without additional tagging or runtime checks.
It guarantees memory safety for your application through the ownership system, which we’ll be diving into shortly. All the analysis is performed at compile time, hence the “zero cost” at runtime.
Let’s get a couple of basics out of the way first.
Note that in Rust, println is implemented as a macro, hence the bang (!).
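As a quick illustration of these basics, here is a minimal sketch showing an immutable binding, a mutable binding, a function, and the println! macro. The add helper and the values are made up for this example.

```rust
// Hypothetical helper, used only for illustration.
fn add(a: i32, b: i32) -> i32 {
    a + b // the last expression, without a semicolon, is the return value
}

fn main() {
    let x = 5;      // immutable binding; the type i32 is inferred
    let mut y = 10; // `mut` makes the binding mutable
    y += 1;

    // println! is a macro, hence the bang.
    println!("{} + {} = {}", x, y, add(x, y)); // prints "5 + 11 = 16"
}
```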
When you bind a variable to something in Rust, the binding claims ownership of the thing it’s bound to. E.g.
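For instance, here is a minimal sketch in which the binding v owns a vector inside a function foo. The vector's contents and the returned length are assumptions, added so the example has something to print.

```rust
fn foo() -> usize {
    // `v` owns the vector; its elements live on the heap.
    let v = vec![1, 2, 3];
    v.len()
} // `v` goes out of scope here and the vector's memory is reclaimed

fn main() {
    println!("{}", foo()); // prints "3"
}
```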
When v goes out of scope at the end of foo(), Rust will reclaim the memory allocated for the vector. This happens deterministically, at the end of the scope.
When you pass v to a function or assign it to another binding then you have effectively moved the ownership of the vector to the new binding. If you try to use v again after this point then you’ll get a compile time error.
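A minimal sketch of such a move, assuming a function take that receives the vector by value (the return value is an addition for illustration). The offending line is commented out so the sketch still compiles.

```rust
// `take` receives the vector by value, so ownership moves into it.
fn take(v: Vec<i32>) -> usize {
    v.len()
} // the vector is freed here, at the end of take()

fn main() {
    let v = vec![1, 2, 3];
    let n = take(v); // ownership moves to the `v` inside take()

    // Uncommenting the next line should be rejected by the compiler,
    // because the outer `v` has been moved and no longer owns anything.
    // println!("{}", v[0]);

    println!("take() saw {} elements", n);
}
```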
This ensures there’s only one active binding to any heap-allocated memory at a time, which eliminates data races.
There is a ‘data race’ when two or more pointers access the same memory location at the same time, where at least one of them is writing, and the operations are not synchronized.
Primitive types such as i32 (i.e. int32) are stack allocated and exempt from this restriction. They’re passed by value, so a copy is made when you pass one to a function or assign it to another binding.
The compiler knows to make a copy of n because i32 implements the Copy trait (a trait is roughly the equivalent of an interface in .NET/Java).
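A small sketch of this copy behaviour, using an i32 binding named n. The double helper is hypothetical.

```rust
// Hypothetical helper that takes an i32 by value.
fn double(n: i32) -> i32 {
    n * 2
}

fn main() {
    let n = 21;
    let m = n;         // copies the value; `n` remains usable
    let d = double(n); // another copy is passed to double()
    println!("n = {}, m = {}, d = {}", n, m, d); // prints "n = 21, m = 21, d = 42"
}
```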
You can extend this behaviour to your own types by implementing the Copy trait:
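One way to do this is with a derive attribute, sketched below. The Point type and the sum function are hypothetical; Copy requires Clone, so both are derived.

```rust
// Hypothetical type for illustration. Copy requires Clone, so both are derived.
#[derive(Debug, Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// Takes a Point by value.
fn sum(p: Point) -> i32 {
    p.x + p.y
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let total = sum(p); // `p` is copied, not moved
    println!("{:?} sums to {}", p, total); // `p` is still usable here
}
```

Without the Copy derive, the call to sum(p) would move p, and the println! line would no longer compile.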
Don’t worry about the syntax for now; the point here is to illustrate the difference in behaviour when dealing with a type that implements the Copy trait.
The general rule of thumb is: if your type can implement the Copy trait then it should.
For types that can’t, you can make an explicit copy by cloning, but cloning is expensive and not always possible.
In the earlier example:
- ownership of the vector has been moved to the binding v in the scope of take();
- at the end of take() Rust will reclaim the memory allocated for the vector;
- but it can’t, because we tried to use v in the outer scope afterwards, hence the error.
What if we borrow the resource instead of moving its ownership?
A real-world analogy: if I bought a book from you, then it’s mine to shred or burn after I’m done with it; but if I borrowed it from you, then I have to make sure I return it to you in pristine condition.
In Rust, we do this by passing a reference as an argument.
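A minimal sketch of borrowing, reusing the name take for a function that now accepts a reference (the return value is an addition for illustration):

```rust
// `take` now borrows the vector instead of taking ownership of it.
fn take(v: &Vec<i32>) -> usize {
    v.len()
} // the borrow ends here; nothing is freed

fn main() {
    let v = vec![1, 2, 3];
    let n = take(&v); // pass a reference with `&`
    println!("{} elements, first is {}", n, v[0]); // `v` is still usable
}
```

In practice a slice parameter (&[i32]) is usually preferred over &Vec<i32>; the latter is used here only to mirror the earlier take() example.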
References are also immutable by default.
But just as you can create mutable bindings, you can create mutable references with &mut.
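The difference between the two kinds of reference can be sketched as follows. The push_four function is hypothetical, and the commented-out line is one the compiler is expected to reject.

```rust
// Takes a mutable reference, so it may modify the caller's vector.
fn push_four(v: &mut Vec<i32>) {
    v.push(4);
}

fn main() {
    let mut v = vec![1, 2, 3]; // the binding itself must be mutable

    {
        let r = &v; // immutable reference: reading is fine
        println!("len = {}", r.len());
        // r.push(4); // would not compile: cannot mutate through `&`
    }

    push_four(&mut v); // mutable reference via `&mut`
    println!("{:?}", v); // prints "[1, 2, 3, 4]"
}
```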
There are a couple of rules for borrowing:
1. the borrower’s scope must not outlast the owner’s
2. you can have one of the following, but not both:
2.1. zero or more immutable references to a resource; or
2.2. exactly one mutable reference
Rule 1 makes sense since the owner needs to clean up the resource when it goes out of scope.
For a data race to exist we need to have:
a. two or more pointers to the same resource
b. at least one is writing
c. the operations are not synchronized
Since the ownership system aims to eliminate data races at compile time, there’s no need for runtime synchronization, so condition c always holds.
When you have only readers (immutable references) then you can have as many as you want (rule 2.1) since condition b does not hold.
If you have a writer, then you need to ensure that condition a does not hold, i.e. there is only one mutable reference and no other references (rule 2.2).
Therefore, rule 2 ensures that data races cannot exist.
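Rule 2 can be sketched as follows. This is a minimal illustration with a hypothetical demo function; the commented-out lines are ones the compiler is expected to reject.

```rust
fn demo() -> i32 {
    let mut x = 5;

    {
        // Rule 2.1: any number of immutable references may coexist.
        let a = &x;
        let b = &x;
        println!("{} {}", a, b);
    }

    {
        // Rule 2.2: exactly one mutable reference, with no others alongside it.
        let m = &mut x;
        *m += 1;
        // let r = &x;        // would not compile: `m` is still in use below
        // println!("{}", r);
        *m += 1;
    }

    x
}

fn main() {
    println!("{}", demo()); // prints "7"
}
```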
Here are some issues that borrowing prevents.
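Two such issues, iterator invalidation and use after free, are sketched below. The choice of these two cases and the demo function are assumptions for illustration; the lines the compiler is expected to reject are commented out so the sketch still runs.

```rust
fn demo() -> Vec<i32> {
    let mut v = vec![1, 2, 3];

    // Iterator invalidation: the loop holds an immutable borrow of `v`,
    // so mutating `v` inside it is rejected at compile time.
    for i in &v {
        println!("{}", i);
        // v.push(34); // would not compile: `v` is already borrowed
    }
    v.push(34); // fine once the loop's borrow has ended

    // Use after free: a reference may not outlive the value it points to.
    // let r;
    // {
    //     let x = 5;
    //     r = &x; // would not compile: `x` does not live long enough
    // }
    // println!("{}", r);

    v
}

fn main() {
    println!("{:?}", demo()); // prints "[1, 2, 3, 34]"
}
```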
There are lots of other things to like about Rust: immutability by default, pattern matching, macros, etc.
Even from these basic examples, you can see the influence of functional programming, especially in immutability by default, which fits well with Rust’s goal of combining safety with speed.
Rust also has a good concurrency story (pretty much mandatory for any modern language), which has been discussed in detail in this post.
Overall I enjoy coding in Rust, and the ownership system is pretty eye-opening too. With both Go and Rust coming of age and targeting a similar space around systems programming, it’ll be very interesting to watch this space develop.