Rust Lifetimes — The Key to Writing Robust and Efficient Code

Kevin Carvalho
8 min read · Jul 4, 2024

In this article I cover a frequently misunderstood Rust topic, lifetimes: how they work and why they're not as scary as they seem. I also touch on their role in code performance and give some advice on how to use them effectively.

What are lifetimes?

Coming from other programming languages, I found the very idea of lifetimes in Rust quite confusing, and you may too. That's common: most programming languages have no explicit notion of lifetimes, so the concept is new to most people.

Fortunately, they're not as complicated as they seem at first. They are just a way for Rust's borrow checker to ensure that references do not outlive the data they refer to.

Think of it this way: a reference in Rust is a kind of pointer, so Rust needs to know how long that pointer may be held. Otherwise it might let you use a pointer to memory that has already been deallocated, a dangling pointer, which causes undefined behavior.

That's a very common issue in languages like C, where you manage memory manually and the compiler does not check lifetimes. There, you as the programmer need to keep track of the lifetimes of your pointers yourself, which of course gets much harder as your project grows.

Let's see a very simple example of such a case:

#include <stdio.h>

int main(void) {
    int *a;
    {
        int y = 10;
        int *ref_y = &y;
        a = ref_y;
    } /* y goes out of scope here, so a now dangles */

    /* undefined behavior: a points to an object whose lifetime has ended */
    printf("a: %d\n", *a);
    return 0;
}

In this simple example, we declare a pointer a to an integer, then open a new scope in which we create a local variable y and store its address in a. Once the scope has ended, we try to read the value through a.

Clearly that's a problem: y stops existing when its scope ends, so a would be pointing at an invalid memory location.

The C compiler does not prevent me from compiling and running this program, even though executing it is undefined behavior!

Let's write the same program in Rust and see if the compiler can catch this silly mistake of ours.



fn main() {
    // lifetime of a starts here
    let a: &i32;

    {
        // lifetime of y starts here
        let y = 10;
        a = &y;
    }
    // lifetime of y ends here

    println!("a: {}", a);
    // lifetime of a ends here
}

When trying to compile this program, we get:

error[E0597]: `y` does not live long enough
  --> src/main.rs:8:13
   |
7  |         let y = 10;
   |             - binding `y` declared here
8  |         a = &y;
   |             ^^ borrowed value does not live long enough
9  |     }
   |     - `y` dropped here while still borrowed
...
12 |     println!("a: {}", a);
   |                       - borrow later used here

For more information about this error, try `rustc --explain E0597`.

Wow, the compiler really didn't like it! Let's see what it tells us:

The error message says that the borrowed value &y does not live long enough to be used outside the scope in which y is defined. Exactly what we've discussed!

So the question is, how could Rust figure that out at compile time without any help?

The answer is: lifetimes! We didn't write any lifetime annotations here, because Rust could figure them out on its own. The borrow &y has to stay valid for as long as a is used, which includes the println! call, but y itself only lives until the end of the inner scope. Since the borrow would have to outlive y, Rust rejects the program at compile time.

See? It's not magic, even though it may seem like it.
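
For comparison, here is a minimal sketch of one way to make the program above compile: declare y in the same scope as a, so the value outlives every use of the reference. The helper name read_through_ref is hypothetical and only exists to make the example easy to call.

```rust
// Hypothetical helper: borrows `y` and reads it back through a reference.
fn read_through_ref() -> i32 {
    // `y` now lives in the same scope as `a`...
    let y = 10;
    let a: &i32;

    {
        // ...so borrowing it inside a nested scope is fine.
        a = &y;
    }

    // `y` is still alive here, so `a` is valid.
    *a
}

fn main() {
    println!("a: {}", read_through_ref());
}
```

The borrow is the same as before; the only change is that y now lives at least as long as the last use of a.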

Why bother about lifetimes?

So, after the previous discussion, you may be asking yourself: why should I bother learning about lifetimes and references at all?

Lifetimes are important because they come with every reference in Rust. And references are very important, because in Rust we have primarily two ways to pass data around: we can pass the value itself, which moves its ownership to the function being called, or we can pass only a reference to the data, in which case ownership is not transferred.

So, why would I care about it?

Let's say you have a very big struct that holds immutable data that's supposed to be shared between several parts of your program. If you don't pass a reference, your only option is to move ownership of this struct into the function you call. But wait a second: several functions need the data, and once the first one takes ownership and returns, the struct is dropped.

So how could you share that data with the other functions? Without references (or shared-ownership smart pointers), your only option is to clone the data every time you call a function that requires it!

Do you really wanna be copying a very big struct around every time? What if these functions get called millions of times?

In this case, it's clear that you should only pass a reference to the struct containing the data you wanna read from!

Another case is data that's been allocated on the heap. Think about a Vec, for example.

You already have the data allocated for that vector, so why would you clone it every time you use it? It's better to pass a reference to it instead; otherwise every clone allocates a new buffer and copies all the elements into it, even though you're effectively using the same data!
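
To make the difference concrete, here is a minimal sketch comparing the two ways of passing data. The Config struct and the functions sum_owned and sum_borrowed are hypothetical names made up for this illustration.

```rust
// Hypothetical "big" struct used only for illustration.
struct Config {
    values: Vec<u64>,
}

// Takes ownership: the caller can no longer use the value afterwards.
fn sum_owned(config: Config) -> u64 {
    config.values.iter().sum()
}

// Borrows: the caller keeps ownership and nothing is cloned.
fn sum_borrowed(config: &Config) -> u64 {
    config.values.iter().sum()
}

fn main() {
    let config = Config { values: (1..=1_000).collect() };

    // The same value can be borrowed as many times as we like.
    let first = sum_borrowed(&config);
    let second = sum_borrowed(&config);
    assert_eq!(first, second);

    // Moving works only once; `config` is unusable after this call.
    let third = sum_owned(config);
    assert_eq!(first, third);

    println!("sum: {}", first);
}
```

Calling sum_owned a second time would only be possible by cloning the struct first, which is exactly the cost that borrowing avoids.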

Explicit lifetimes

In many cases, Rust can automatically figure out which lifetimes to assign to the references in our programs, but sometimes a function signature alone does not tell it how the lifetimes of our references relate, so we need to tell the compiler explicitly.

Elision rules

Rust's elision rules are what allow it to infer lifetimes in function signatures without explicit annotations. There are 4 rules and they're quite simple:

  1. Each elided lifetime in input position becomes a distinct lifetime parameter.
  2. If there is exactly one input lifetime position (elided or not), that lifetime is assigned to all elided output lifetimes.
  3. If there are multiple input lifetime positions, but one of them is &self or &mut self, the lifetime of self is assigned to all elided output lifetimes.
  4. Otherwise, it is an error to elide an output lifetime.

So effectively, each reference parameter gets its own distinct lifetime. If there is just one reference parameter and we return a reference, the returned reference gets the same lifetime as that parameter.

If there is more than one input lifetime, we need to tell the compiler explicitly how the lifetimes of our references relate.

The exception is methods on a struct: if the method takes a reference to self, the returned reference gets the lifetime of self by default.

So, why is the lifetime of the returned reference tied to one of the parameters?

This is because we cannot return references to data that's been created inside our function. Think about it: as soon as our function ends, that data is dropped and we'd get a dangling reference! (The one exception is data that lives for the entire program, which has the 'static lifetime.)

So, let's see some examples to make it clearer:
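
Here is a small sketch of that situation. The function name shout is hypothetical; the rejected version is left in a comment so that the block still compiles.

```rust
// Rejected by the compiler: `upper` is a local value that is dropped when the
// function returns, so the returned reference would dangle.
//
// fn shout(input: &str) -> &str {
//     let upper = input.to_uppercase();
//     &upper
// }

// One way out: return the owned value, moving it to the caller.
fn shout(input: &str) -> String {
    input.to_uppercase()
}

fn main() {
    let loud = shout("hello");
    println!("{}", loud);
}
```

Returning an owned String hands the data itself to the caller, so there is nothing left behind in the function for a reference to dangle on.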

1. Function with only one reference parameter and a return reference

// following the first two elision rules, Rust can figure out the lifetimes
fn some_fn(x: &i32) -> &i32 {
    todo!()
}

// we could explicitly define them, though
// (renamed so both versions can live in the same file)
fn some_fn_explicit<'a>(x: &'a i32) -> &'a i32 {
    todo!()
}
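
As a runnable illustration of this case, here is a minimal sketch with a real function body. The names first_word and first_word_explicit are hypothetical; both signatures mean the same thing to the compiler.

```rust
// Elided form: the single input lifetime is assigned to the output.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}

// The same signature with the lifetime written out.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

fn main() {
    let sentence = String::from("hello world");
    let word = first_word(&sentence);
    assert_eq!(word, first_word_explicit(&sentence));
    println!("{}", word);
}
```

The returned slice points into sentence, so it can be used for as long as sentence is alive and not a moment longer.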

2. Function with more than one reference parameter and a return reference

fn some_fn(x: &i32, y: &i32) -> &i32 {
    todo!()
}

In this case, the code does not compile, and the compiler asks us to specify the lifetimes manually:

error[E0106]: missing lifetime specifier
 --> src/main.rs:1:33
  |
1 | fn some_fn(x: &i32, y: &i32) -> &i32 {
  |               ----     ----     ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `x` or `y`
help: consider introducing a named lifetime parameter
  |
1 | fn some_fn<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
  |           ++++     ++          ++          ++

So, why is that?

Think about it for a moment: we're getting 2 references through our function parameters, and we're returning a reference. Remember that each reference parameter gets its own lifetime! So, x and y have different lifetimes.

How could Rust know which lifetime the returned reference should be tied to? It simply can't determine that from the signature, so it asks us to define it explicitly. Let's say the function may return either x or y. Then the returned reference must not outlive either of them, so we can give all three the same lifetime:

fn some_fn<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    todo!()
}

Now we've declared that the lifetimes of x, y and the return value are tied together. That effectively means the returned reference is valid only as long as both x and y are!
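
A minimal sketch of such a function with a real body might look like this. The name larger is hypothetical; the point is only that the result can come from either argument.

```rust
// Hypothetical example: the result may come from either `x` or `y`,
// so both inputs and the output share the lifetime `'a`.
fn larger<'a>(x: &'a i32, y: &'a i32) -> &'a i32 {
    if x >= y { x } else { y }
}

fn main() {
    let a = 3;
    let b = 7;
    // `result` borrows from `a` or `b`, so both must outlive its last use.
    let result = larger(&a, &b);
    println!("larger: {}", result);
}
```

Because the compiler cannot know at compile time which branch is taken, it treats the result as borrowing from both arguments.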

3. Impl blocks with one reference and a return reference

struct SomeStruct {}

impl SomeStruct {
    fn some_fn(&self, x: &i32) -> &i32 {
        todo!()
    }
}

In this case, even though we have 2 lifetimes, the one of self and the one of x, Rust automatically sets the return lifetime to be the same as that of self.
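
The sketch below shows what that default means in practice. The Counter struct and its methods are hypothetical names used only for illustration.

```rust
// Hypothetical struct used only for illustration.
struct Counter {
    value: i32,
}

impl Counter {
    // Elided form: the output borrows from `self`, not from `_other`.
    fn value_ref(&self, _other: &i32) -> &i32 {
        &self.value
    }

    // Returning the other parameter needs explicit lifetimes, because the
    // default (borrowing from `self`) would be wrong here.
    fn pick_other<'a>(&self, other: &'a i32) -> &'a i32 {
        other
    }
}

fn main() {
    let counter = Counter { value: 1 };
    let n = 42;
    println!("{} {}", counter.value_ref(&n), counter.pick_other(&n));
}
```

If pick_other were written without the 'a annotations, the compiler would assume the result borrows from self and reject the body that returns other.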

When should I start using lifetimes?

That's a complicated question, and there's no real best practice for when you should start using explicit lifetimes in your programs.

Clearly, lifetimes do add a bit of complexity to a project, so starting out with them may slow you down a bit.

I personally only use them in the early stages of my projects when I'm 100% sure that they make sense in that context. Otherwise, I just clone the data, which makes my life a little easier. Then, when I'm done with the design of my solution, I come back and remove the clones by introducing references, smart pointers and lifetimes!

I find that to be one of the most effective ways to explore a new implementation that you're uncertain about without adding unnecessary complexity, but that of course is up to you in the end!

Conclusion

In this article I've explored Rust's lifetimes: I've tried to give an overview of how they work and when to use them, and explained the most common problems you may run into when using them.

There's still a lot to cover on the topic, but to keep this article from getting too long, that's it for today!
