Implementing a Linked List in Rust
I’ve been pretty dedicated to learning and sharing about Rust lately. I also have an ongoing thing with data structures — they’re like little puzzle boxes; like a conglomerate of abstract simple-machines whose unique structures have inherent advantages and disadvantages. And even more so in Rust, since I get to think about how they actually are laid out in memory. So I decided I’d try to implement a Linked List in Rust flavor. On this Linked List journey I’ll touch on
std::mem functions, Iterators and IntoIterators, and recursive types (and therefore pointers), and I’ll attempt to satisfy the dreaded borrow checker while I’m at it!
(I assume the reader has some minimal basic knowledge of Rust!)
(EDIT: It was brought to my attention that i could probably drop the Tail type. Here’s an updated Rust Playground!)
Before we begin laying out our Linked List data structures, let’s discuss what a linked list is exactly. It’s a list composed of Nodes (we’ll call ours Links), which can exist in one of three states:
- An empty node,
- a node with no linked
nextnode, but a
value(a Tail node),
- or a Link node — one with both a
As you push values into the list, new nodes are created and linked to the tail end of the list, transforming the previous tail into a link with our new value as its
next. Then, when we want to consume the list later, we should be able to leverage the links’
next values to iterate through the list.
Structurally speaking, that sounds like a good case for a Rust
Now we have three possible states defined roughly like I described. None and Tail look fairly straightforward — but what’s with the Link’s
Box<Link<T>>? Well, Rust’s compiler can’t determine how much space to serve up for your instances if they may contain instances of themselves as fields (because it could contain infinite references to itself!). That’s what is known as a Recursive Type — one that stores an instance itself somewhere in it’s type signature. To remedy this, we add a layer of indirection: the Box. Boxes are just pointers basically. They live on the stack but store an address to an object on the heap. When you dereference them they unbox their address’ contents by following the pointer. They may seem intimidating at first, but I’ll offer up some nifty ways to work with them as we go.
OK, so that’s simple enough. Now let’s get to the implementation. Typically lists have some kind of add and remove functions. I’ll be calling these functions
push a try:
First we open a generic
impl block. We add the trait bound
T. For our implementation, I assume we’ll be dealing with things like
i32's and other
Copyable structures. But we can of course implement
Copy for any type that can live on the stack exclusively.
push function, we’ll want to use a nice idiomatic-Rust match statement. But what goes in our blocks? Since our
enum variants each take up the same amount of space (measured by Rust’s compiler to equal the largest size of its variants), we should be able to transform them in place when our list gets pushed into. Let’s think about what should happen if our current state is
Self::None in this case. We would want to transform our None value into a Tail — a node with just a value. We could use a function like this to do exactly that:
to_tail does exactly what it says it will do — turn our current Link instance into its Link::Tail variant. If it’s already a Tail, it will panic (since this means I, the developer, am using it in the wrong way). We can imagine how to do this same process to make a
to_link function too:
This time, converting anything other than a Tail will panic, because we can never turn an empty node into a link, and we shouldn’t ever replace a Link with a Link. But maybe I’ll reconsider this later when I try to implement
insert or a similar function.
Now we have something to do in our
push function’s match arms:
Except we still don’t know what to do with a Link. But when you think about it, it’s actually quite simple what we do with a link — we take its
next value, and just call
push again to pass
x down the linked list. No matter how long our list gets,
x will just keep getting pushed until it finds a tail, appending that node only.
There’s our complete
push implementation. But how about getting values back out of our List? That’s what
pop will be for:
Yet another puzzle. It’s like
push but in reverse, and we need to spit the value back as a return. So I’ve opted to return an Option type, so that we can handle our list being empty (and later it will help us implement Iterator). We also have a mutable reference to self, which means moving values out will be difficult. We also need to consider we may be converting our Link to it’s None variant. So we should write another conversion function,
Link::None is a unit-like struct, we’ll just replace its current value in memory with a new instance of
Link::None. Plus it gives us a fun opportunity to use a
replace. It does just what you’d expect — replaces what’s in address A with the provided contents of address B.
Now let’s write the
Link::Tail match arm for
We utilize our new
to_none function to toss the current contents of
self. But not before moving item out into a new variable, and then wrapping and returning that variable as an Option. One arm to go. Link is more complicated. It has a
next value to consider, and that value may be another link or the tail. We should make another conversion function, this time
to_next — whatever the value of next is. For flexibility, let’s make
nxt a regular Link, instead of a boxed one. We’ll handle unboxing before we call it.
But our actual
next values are Boxes and these Boxes are behind mutable references (
&mut self). Since Box can’t implement Copy, we can’t move it. Uh oh, that looks like the end of the line for this approach. But wait! We can pull an Indiana Jones on our memory:
We take a similar approach to moving our Copy-able
item out of
self. But we also create a mutable empty pointer (or rather, a pointer to an empty Link). Then we can call
mem::swap on our
next and a reference to our empty Box, switching their places in memory. We can then dereference our box to get the Link value and pass it to our
to_next conversion function. Now whatever next was, we don’t really need to care because we can simply replace our
self with it. Sweet! Now let’s give it a taste test:
This should print 1, 2, 3 on newlines in the console (the
.unwrap() is to discard the Option layer, we know our values are Some). Just as expected! So…now what? Well, it wouldn’t be much of a collection if we couldn’t iterate over it’s items. Let’s think about how to implement this:
- We have currently no way of indexing our Links.
popping our Links consumes them — not ideal.
- We are favoring mutation, Iterator doesn’t much like that.
So it’s time to put on our thinking caps and get creative. Let’s imagine a separate struct for reading the links. It holds onto a reference to a Link. Imagine that it is a little cursor pointing to the currently selected Link in the sequence. That’s exactly what we’ll call this imaginary struct — a Cursor:
We can derive Clone for this type if we also append Link with the Clone Trait. We can use
curr to determine what will be the next
curr by looking at its
next value, very much like our
pop implementation. Let’s implement IntoIterator for Link, which will spawn a Cursor to iterate through our Link sequence:
Each item in that block is necessary.
Item tells the compiler what to expect for iterator values,
IntoIter is our custom Cursor type, and
into_iter must return whatever
IntoIter refers to (in this case, our Cursor). That’s one piece of the iteration puzzle put in place. Now we need to implement Iterator for the Cursor itself (since that’s what will be used to do the actual iterating).
Although this code looks incredibly similar to
pop, it is unfortunately not possible to simply
self.curr value. We have to walk through it again, matching on each type and using our swap-the-box technique from before. Since
self is referring to an instance of Cursor in this context, our Link variants in our match arms are not mutably borrowed, they’re just values. This gives us an opportunity to use some interesting syntax though: look at
Link::Link’s match arm. Notice that we prefix
ref mut? In a destructuring pattern that implies that we are taking a mutable reference to the prefixed field. We finally return our Option-wrapped value,
We are ready to give it a test drive again. Make sure you append
#[derive(Clone)] to your Link
enum. Let’s see how it goes.
Implementing Iterator has given us some superpowers! Since our iterator and value (Link<T>) now implement Clone, we can clone our whole list for easy comparisons. We also don’t have to consume the list to use it’s values this way. More importantly, this gives us the built in ability to use
enumerate() on our list. The enumerate function transforms our iteration into tuples of type
(usize, Link<T>). So now we can utilize index in our for loops.
We get other benefits, too! We get
map (and other transformational functions) for free by implementing Iterator.
Through the process of building this Linked List implementation, we’ve experimented with how we can implement recursive types — and thus we’ve experimented with Boxes and heap allocation. We have learned a lot about the power of implementing Traits (you get free functionality!). We’ve also gotten to play with some nifty memory tricks like
I leave you, my Readers, with this Rust Playground Link. Hack away!
Hope you enjoyed another portion of my Rust journey, and until next time FP on, rustaceans!