Rust for Python Developers: Ownership and Borrowing

Raja Sekar
9 min read · Apr 9, 2019


Following the previous post, this one introduces two key concepts of Rust: ownership and borrowing.

Note: Ownership is the toughest concept you will encounter when you start programming in Rust. Unlike in other languages, there is no way to postpone it to a later stage. It hits you right in the face even in the most trivial programs. So please bear with me for now. Once you get comfortable with ownership and borrowing, it becomes much easier to code in Rust. In addition, you will come to appreciate these concepts when you code in other languages too.

credit: https://www.reddit.com/r/rustjerk/comments/9kjw8s/fearless_rust_parallelism/

Quick Glance at Fundamentals:

Good Old Stack and Heap:

In languages like Python, you never have to think about stack frames and the heap. But when it comes to low-level languages, you need a basic understanding of how the stack and the heap work during program execution. Even in a relatively high-level language like Go, these concepts are made explicit.

The stack is responsible for program execution. A runtime call stack consists of one or more stack frames, each of which corresponds to a function call. A stack frame holds all the information relating to a particular call, such as its local variables and return address. One important thing to note about the stack is that it is a contiguous, array-like structure and, just like its literal meaning, you can add or remove objects only at the top (Last In, First Out, or LIFO). Only objects whose size is known at compile time can be stored on the stack. A stack frame exists only during the function's execution, so you can’t access its objects after the function returns.

Stack vs Heap

So for creating objects that you need to access independently of the function call, and for creating dynamically sized objects (remember that the size of stack objects must be known at compile time), something called the heap is used. The heap doesn’t have any particular memory layout, and heap memory is independent of the function call stack. Objects created on the heap can therefore be accessed even after the function finishes. In many languages, heap objects are created with a keyword such as new.
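As a rough illustration in Rust, the sketch below contrasts a fixed-size value that lives in a stack frame with a String whose contents live on the heap and survive the function that created them. The function name make_greeting and everything inside it are made up for this example.

```rust
// Hypothetical helper: builds a heap-allocated String and hands it to the caller.
fn make_greeting() -> String {
    let count: i32 = 3;           // fixed size, stored in this function's stack frame
    let mut text = String::new(); // the handle is on the stack, the bytes are on the heap
    for _ in 0..count {
        text.push_str("hi ");     // the heap buffer grows at run time
    }
    text // the heap data outlives this stack frame
}

fn main() {
    let greeting = make_greeting();
    println!("{}", greeting);
}
```

The stack frame of make_greeting is gone by the time main prints the string, yet the text is still reachable because it was never stored in that frame.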

Memory Management:

Since objects created on the heap exist independently of the runtime stack, we have to clear them once we are done with them. This is where two broad categories of languages diverge: one uses manual memory management and the other uses automatic memory management. In languages with automatic memory management, a garbage collector clears heap objects that are no longer in use. It keeps track of the references to each heap object and runs from time to time, typically pausing program execution while it clears unused objects. This makes program execution slightly non-deterministic, since it is generally difficult to predict when garbage collection will kick in. To give an example, Python is a fully object-oriented language. Every piece of data you create in Python is stored on the heap; only the addresses of the objects are stored in the stack frame. Python uses a combination of reference counting and a tracing garbage collector to clear heap objects. I won’t go into these terms here, to stay on the subject of this article.

In manual memory management, programmers are responsible for deleting heap objects when they are no longer needed, using constructs like delete, free, etc. Here you have precise control over when objects get deleted. In addition, when resources are constrained, garbage collection can take a non-trivial amount of time to trace and clear objects. This makes languages with manual memory management well suited to programs where performance and latency matter. But it also comes with a severe drawback: programmers must free objects at the appropriate places to avoid memory leaks. A memory leak is not a severe problem in many cases, but accessing the address of an object that has already been deleted (a dangling pointer) makes the program exhibit undefined behavior and can lead to security vulnerabilities.

This is where Rust comes into the picture. It gives the programmer complete control over when an object is deleted and still prevents invalid references (dangling pointers) in safe code. It introduces two radically new concepts, ownership and borrowing, to make this happen. The same concepts also rule out data races in concurrent Rust code. This makes Rust as fast as C (theoretically, please no flame war here) without introducing the typical memory vulnerabilities.

Ownership:

Let’s look at this innocent-looking code:
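The original listing is not preserved here, so what follows is a reconstruction based on the compiler output shown afterwards. The function name print_func, the parameter name and the println! in main come from that output and from the later discussion; the string contents and the message printed inside the function are assumptions. The offending line is commented out so that the sketch compiles as written.

```rust
fn print_func(name: String) {
    println!("print inside function {:?}", name);
}

fn main() {
    let a = String::from("Rust");
    print_func(a);
    // println!("print inside main function {:?}", a); // uncomment to reproduce the error
}
```

With the last println! uncommented, the compiler rejects the program.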

It produces Rust’s trademark error:

Compiling playground v0.0.1 (/playground)
error[E0382]: borrow of moved value: `a`
--> src/main.rs:8:49
|
7 | print_func(a);
| - value moved here
8 | println!("print inside main function {:?}", a);
| ^ value borrowed here after move
|
= note: move occurs because `a` has type `std::string::String`, which does not implement the `Copy` trait

error: aborting due to previous error

For more information about this error, try `rustc --explain E0382`.
error: Could not compile `playground`.

To learn more, run the command again with --verbose.

Note: You can use the Rust playground to experiment with short snippets online and share the link with others on forums in case you need help.

This is Rust’s ownership at play.

What is ownership?

  • Every value in Rust has a single owner, which is the variable it is bound to.
  • Data lives as long as its owner lives.
  • Data gets deleted precisely when the owner goes out of scope, that is, at the closing curly brace } of the block in which the owner was declared.
  • There can be only one owner at a time for a particular piece of data.

Let’s take a look at this Python code, which binds a list to a and then assigns b = a and c = b:

The list has three owners, named a, b and c, and each of them can do whatever it wants with the list. In Python, each value carries a reference count, which is the total number of references pointing to that value. Assigning one variable to another simply increases the reference count of the value, so many variables can own a single piece of data. Here, the ref count of the list is 1 after the first line and 3 at the end of the third line. When these variables go out of scope, the ref count is decremented, and when it reaches 0, the value is dropped. To check the refcount, you can use sys.getrefcount, which reports one more than you might expect because the call itself holds a temporary reference to its argument.

Stack and Heap at the end of c = b

In C++, in the same situation, the entire list is copied on assignment; if you don’t want a deep copy of the data, you have to take a reference to it explicitly.

In Rust, the mechanism is entirely different. When you allocate an object, the variable becomes its owner. When you assign this variable to another variable, the ownership is transferred (also known as moved) to the new variable and the former variable becomes uninitialized, so you can no longer use it.
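A minimal sketch of a move on assignment. The helper name transfer, the variable names and the string contents are made up for this example.

```rust
// Hypothetical helper, named for this sketch only.
fn transfer() -> String {
    let a = String::from("hello"); // `a` owns the heap-allocated string
    let b = a;                     // ownership moves to `b`; `a` is now uninitialized
    // println!("{}", a);          // would not compile: borrow of moved value `a`
    b                              // ownership moves once more, this time to the caller
}

fn main() {
    let s = transfer();
    println!("{}", s);
}
```

No string data is copied at any point here; only the ownership changes hands.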

Stack and Heap Diagram

The error message we saw earlier was conveying exactly this. Ownership of the string is transferred from the variable “a” to the variable “name” inside the function. When the function completes, “name” goes out of scope and the string is dropped. So when you try to use “a” again in println, the compiler reports an error.

There are three ways to make the code compile.

  • The C++ way: copy the string. The entire string gets copied and sent to the function, so the variable “a” still owns the original string. The clone() method can be used to create a copy of the data.
  • Transferring ownership back. This is unique to Rust. The function returns “name”, which transfers the ownership to “a” again. This requires the mut keyword: all variables in Rust are immutable by default, and since we re-assign to “a”, it must be declared mutable.
  • Using a reference. It is similar to a C++ reference but with a significant difference, which we will discuss shortly.
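The three options can be sketched side by side as follows. Only print_func is named in the article; print_and_return and print_ref, the variable names and the printed messages are hypothetical and exist only for this illustration.

```rust
// Takes ownership of its argument; the string is dropped when the function returns.
fn print_func(name: String) {
    println!("owned: {:?}", name);
}

// Takes ownership and hands it back to the caller.
fn print_and_return(name: String) -> String {
    println!("owned, then returned: {:?}", name);
    name
}

// Borrows the string; the caller stays the owner.
// (`&str` is the more common parameter type in practice.)
fn print_ref(name: &String) {
    println!("borrowed: {:?}", name);
}

fn main() {
    // 1. Clone: the function gets its own copy, `a` keeps the original.
    let a = String::from("Rust");
    print_func(a.clone());
    println!("after clone: {:?}", a);

    // 2. Transfer ownership there and back; `b` must be `mut` to be re-assigned.
    let mut b = String::from("Rust");
    b = print_and_return(b);
    println!("after round trip: {:?}", b);

    // 3. Reference: nothing is moved or copied.
    let c = String::from("Rust");
    print_ref(&c);
    println!("after borrow: {:?}", c);
}
```

Cloning costs an extra allocation, the round trip makes every signature carry the value back, and the reference avoids both, which is why borrowing is the usual choice.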

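Not all values behave like this, though. Types that implement the Copy trait, such as integers, floats, booleans and chars, are cheap to duplicate because they live entirely on the stack, so they are copied instead of moved. Types that own heap data, like String and Vec, are moved. The following code compiles without any error.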
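A small sketch of that behavior, assuming an integer in place of the String used earlier. The function name print_num is made up for this example.

```rust
// Hypothetical counterpart of print_func that takes an integer.
fn print_num(num: i32) {
    println!("print inside function {}", num);
}

fn main() {
    let a = 5;    // `i32` implements `Copy`
    print_num(a); // `a` is copied into the function, not moved
    println!("print inside main function {}", a); // `a` is still valid here
}
```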

Borrowing:

It is tedious to transfer ownership from place to place. In addition, many functions might need to access the same data. The proper way to handle this is to use references. In Python, ref counting solves this problem and frees data correctly. In C/C++, when you create pointers to data, it’s up to you to make sure you don’t use a pointer after you free the object. Rust enforces this discipline at compile time.

In Rust, the process of creating a reference is called borrowing. You temporarily borrow from the owner, and when you are done, you have to give it back. A reference therefore cannot outlive its owner. Let’s see what this means. Look at the following code.
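The original listing is not preserved here; the sketch below is a reconstruction from the compiler output that follows, so its line numbers will not match that output exactly. The vector contents and the println! inside the block are assumptions, and the failing line is commented out so that the sketch compiles.

```rust
fn main() {
    let r;
    {
        let v = vec![1, 2, 3];
        r = &v[2];
        println!("inside the block: {}", r); // fine: `v` is still alive
    } // `v` is dropped here
    // println!("{}", r); // uncomment to reproduce error E0597
}
```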

Note: In Rust, you can create arbitrary blocks like this “{}” to control how long a particular object lives.

The Rust compiler complains:

Compiling playground v0.0.1 (/playground)
error[E0597]: `v` does not live long enough
--> src/main.rs:6:13
|
6 | r = &v[2];
| ^ borrowed value does not live long enough
7 | }
| - `v` dropped here while still borrowed
8 | println!("{}", r);
| - borrow later used here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0597`.
error: Could not compile `playground`.

To learn more, run the command again with --verbose.

What this implies is that the owner (‘v’) lives only within the inner block, which is its lifetime, after which it gets dropped. But the variable in which the reference is stored (‘r’) has a lifetime that extends beyond this scope. Rust therefore rejects the code as invalid.

The following code works fine since the lifetime of the reference is enclosed within the lifetime of the owner.
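One possible shape for such code, with made-up values, keeps the reference inside a block that ends before the owner does:

```rust
fn main() {
    let v = vec![1, 2, 3]; // the owner lives until the end of `main`
    {
        let r = &v[2];     // the borrow starts here
        println!("{}", r);
    } // `r` goes out of scope first
    println!("{:?}", v);   // the owner is still valid
}
```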

This is how Rust ensures that you can’t use references after the value gets dropped.

There are two types of references in Rust. One is the shared reference (&T) we saw above and the other is the mutable reference (&mut T).

  • You can have as many shared references as you want, as long as they respect the lifetime constraints we saw earlier. But you cannot modify the original data through them. In other words, you can have as many readers as you want.
  • You can have only one mutable reference to a piece of data at a time. While the data is borrowed mutably, you can’t have any other borrow of it, shared or mutable. In addition, you cannot borrow mutably through a shared reference. In other words, you can have only one writer at a time, and meanwhile you cannot have any readers from the owner either. But you can have multiple shared references derived from the mutable reference.

This code with a mutable reference compiles.
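A minimal sketch, reusing the function names add_to_vec and display_vec that appear in the compiler output further down. Their bodies and the vector contents are assumptions.

```rust
// Needs exclusive (mutable) access because it modifies the vector.
fn add_to_vec(v: &mut Vec<i32>) {
    v.push(4);
}

// Only reads the vector, so a shared reference is enough.
fn display_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let mut v = vec![1, 2, 3];
    add_to_vec(&mut v); // the mutable borrow lasts only for this call
    display_vec(&v);    // a shared borrow is allowed again
}
```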

But this code doesn’t.
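A reconstruction of the failing program from the compiler output that follows. The function bodies, the vector contents and the first display_vec call are assumptions, and the offending line is commented out so that the sketch compiles.

```rust
fn add_to_vec(v: &mut Vec<i32>) {
    v.push(4);
}

fn display_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let mut v = vec![1, 2, 3];
    display_vec(&v);     // fine: this shared borrow ends with the call
    let r = &mut v;      // the mutable borrow starts here
    add_to_vec(r);
    // display_vec(&v);  // uncomment to reproduce E0502: `r` is still used below
    println!("{:?}", r); // the mutable borrow is still in use here
}
```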

Compiling playground v0.0.1 (/playground)
error[E0502]: cannot borrow `v` as immutable because it is also borrowed as mutable
--> src/main.rs:13:16
|
11 | let r = &mut v;
| ------ mutable borrow occurs here
12 | add_to_vec(r);
13 | display_vec(&v);
| ^^ immutable borrow occurs here
14 | println!("{:?}",r);
| - mutable borrow later used here

error: aborting due to previous error

For more information about this error, try `rustc --explain E0502`.
error: Could not compile `playground`.

To learn more, run the command again with --verbose.

It might seem a bit constraining at first. But this is how Rust ensures correct concurrent code without data races. When you borrow something mutably, just make sure you finish everything that uses that borrow before you reach a point where you need to borrow from the owner again. With a small change, the code compiles.

I have to say that the situation is a lot better than when I started learning Rust. The compiler complains a lot less now.

This is all there is to ownership and borrowing. They are rather simple concepts, but their implications are significant. In the next article, I will explore the patterns you can use to work around the constraints imposed by ownership and borrowing. Once you get comfortable with these two concepts, coding in Rust becomes much easier. Please explore and experiment with them until you get comfortable.
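One way to make that change, sketched with the same assumed function bodies as before, is to move the shared borrow after the last use of the mutable reference:

```rust
fn add_to_vec(v: &mut Vec<i32>) {
    v.push(4);
}

fn display_vec(v: &Vec<i32>) {
    println!("{:?}", v);
}

fn main() {
    let mut v = vec![1, 2, 3];
    let r = &mut v;
    add_to_vec(r);
    println!("{:?}", r); // last use of the mutable borrow
    display_vec(&v);     // a shared borrow is fine from here on
}
```

The borrow held by r ends at its last use rather than at the end of the enclosing block, which is what allows display_vec(&v) on the following line.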

Let’s meet again soon!!!


Raja Sekar

Deep Learning practitioner, Distributed Systems enthusiast and a newbie entrepreneur