Better Programming

Advice for programmers.

Follow publication

Learn Rust by Building a Linked List

Paolo Rechia
Better Programming
Published in
13 min readApr 10, 2023

Photo by Fahrul Azmi on Unsplash

Recently, I’ve been studying the Rust programming language again. It has been a couple of years since I read about it for the first time and built a simple side project in it.

It’s an amazing language but has a steep learning curve. This is especially true when writing data structures in it, as the Rust safety system does not play well with the typical pointer operations required for implementing data structures.

In many cases, it’s necessary to tap the “unsafe” Rust to code an ergonomic API for your data structure. I’m sharing my learning process as I try to build an ergonomic Linked List.

First, we’ll build a Linked List exclusively using the safe Rust API and see where we get stuck. Then we’ll apply some changes that we’ll require using unsafe Rust.

I’ll also go over the basics of a linked list in case you never built one before.

You can find the entire source code for this article in this GitHub repository.

Disclaimer: I’m by no means a Rust expert. A much better Linked List implementation can be found in the Rust standard library, which I used as a reference to learn how to code certain unsafe pointer manipulations.

The Linked List

A brief introduction

If you never coded a linked list, I’d briefly explain it here. It’s a simple data structure typically taught in the first year of a computer science curriculum. It’s not that useful in real usage scenarios, but it’s very simple to implement and makes a great introduction to pointers and more advanced data structures.

Graphically, this is one way of representing a linked list:

A linked list representation
A linked list with three nodes

The first node is typically called the head, and we typically store its memory address somewhere to access the list. We call this stored memory address a pointer, and it’s typically referenced as a variable in the source code.

To access an element in the middle or the end of the list, we must always start at the head and iterate over the pointers until we reach our desired target.

It’s also possible to define a double-linked list, where it’s possible to traverse in the opposite direction, but we’ll keep it simple for now.

We can define a few operations on this data structure. In our case, our goal is to use the linked list to later implement a stack. Therefore, we’ll not implement all possible operations but instead, focus only on a couple of useful operations for our implementation. Hence, we will implement the following:

  1. adding an element to the left side of the list
  2. removing an element from the left side of the list

With this, you should be ready to start diving into some Rust code.

Our initial node definition

We’ll start with a naive definition. We’ll have a single struct that can either have a value or not and either have a next pointer or not. You can think of the struct as a class in other languages like Java or Python.

pub struct LinkedList<T> {
pub val: Option<T>,
pub next: Option<Box<LinkedList<T>>>,
}

Defining a value that may be null is defined by the Option enum in Rust, and it’s how Rust avoids allowing null pointers in safe code, as you need to explicitly tell Rust how to inspect a “nullable” variable.

The <T> in the definition is how we express a generic data type. Our struct may have several implementations defined for different types, reusing the same struct. We’ll keep it simple later on, and implement some methods only on i32 type (integer).

Also important in the definition is the usage of Box. This flags that the LinkedList struct is allocated on the heap instead of the stack. If you don’t know the difference between stack and heap, it’ll suffice to know here that variables assigned to the stack only allow data of fixed size that is known at compile time. We must use the heap for anything of dynamic size, e.g., allocated during run time.

If we remove the Box from our definition, the compiler will throw us the following error:

error[E0072]: recursive type `LinkedList` has infinite size
--> src/lib.rs:1:1
|
1 | pub struct LinkedList<T> {
| ^^^^^^^^^^^^^^^^^^^^^^^^
2 | pub val: Option<T>,
3 | pub next: Option<LinkedList<T>>,
| ------------- recursive without indirection
|
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
|
3 | pub next: Option<Box<LinkedList<T>>>,
| ++++ +

For more information about this error, try `rustc --explain E0072`.
error: could not compile `mini-linked-list` due to previous error

Finally, we make sure our struct and fields are public by prepending the definitions with pub.

Adding elements to the left side

Let’s also first define a test before implementing our first method. This will fail to compile, but it’s fine:

#[cfg(test)]
mod tests {
use super::*;

#[test]
fn test_linked_list_push_left() {
let mut list: LinkedList<i32> = LinkedList::<i32>::new();
list.push_left(1);
list.push_left(2);
list.push_left(3);
list.push_left(4);
assert_eq!(list.collect(), vec![4,3,2,1]);
}
}

We’ve defined three methods here. Our constructor, new, the actual logic we want to implement, that we called push_left, and finally, a helper called collect which transforms the list into a vector, so we can test that the items are stored in the correct order.

The constructor

We can create a new block below our struct definition to define an implementation for a specific type. We’ll implement it for i32 as this makes an effort significantly simpler for Rust beginners like we are.

impl LinkedList<i32> {
/// # Creates an empty LinkedList that may hold i32 values
///
/// # Example
/// ```
/// let list = mini_linked_list::LinkedList::<i32>::new();
/// ```
pub fn new() -> LinkedList<i32>{
LinkedList {
val: None,
next: None,
}
}

As you can notice, the constructor is just a static method that has no reference to the self structure, and returns a new instance of a LinkedList<i32> with its fields set to None.

We add a doctest above the constructor, which shows how to use the constructor. We called our package mini_linked_list, and we reference the crate (our Rust package) name in the example to showcase the crate public API. If we comment out our previous test that doesn’t compile, we can run cargo test to execute this doctest, yielding our first passing test:

   Compiling mini-linked-list v0.1.1 (/Users/paolorechia/dev/learn-rust/linked-list)
Finished test [unoptimized + debuginfo] target(s) in 0.38s
Running unittests src/lib.rs (target/debug/deps/mini_linked_list-e6f8b67db5aa6053)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Doc-tests mini-linked-list

running 1 test
test src/lib.rs - LinkedList<i32>::new (line 18) ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.40s

Our push_left function

Our objective with this function is going from this state:

list:LinkedList goes to None and None
The initial state of our list

To this:

list:LinkedList goes to None and Pointer, which go to 1 and None
The desired state after calling push left

Notice that we don’t store a value in our head node, instead, we use it as a sentinel (e.g., placeholder) node. It makes the logic in our other functions simpler to implement.

Our first attempt may look like this:

    pub fn push_left(mut self, x: i32) {
let node = LinkedList::<i32> {
val: Some(x),
next: Some(Box::new(self))
};
self = node;
}

We would, however, get this compiling error:

error[E0382]: use of moved value: `list`
--> src/lib.rs:80:9
|
78 | let mut list: LinkedList<i32> = LinkedList::<i32>::new();
| -------- move occurs because `list` has type `LinkedList<i32>`, which does not implement the `Copy` trait
79 | list.push_left(1);
| ------------ `list` moved due to this method call
80 | list.push_left(2);
| ^^^^ value used here after move

Graphically, what we’re trying to do is this in this step:

Two pointers to the same node

The Rust ownership safety system prevents us from doing this. When we do this, the original pointer (here called list: LinkedList) becomes invalid! Right, so we can’t have two owners of self simultaneously, and when we did next: Some(Box::new(self)) we moved the ownership of self to somewhere else.

Could we try borrowing instead? No, it doesn’t help, as we’re effectively changing the memory address of the head in the last line, and the compiler gives us back the following:

error[E0507]: cannot move out of `self.next` which is behind a mutable reference
--> src/lib.rs:43:28
|
43 | next: Some(self.next.unwrap())
| ^^^^^^^^^ -------- `self.next` moved due to this method call
| |
| help: consider calling `.as_ref()` or `.as_mut()` to borrow the type's contents
| move occurs because `self.next` has type `Option<Box<LinkedList<i32>>>`, which does not implement the `Copy` trait
|

A naive solution is instead to relax our public API and make it return a new list instead:

    pub fn push_left(self, x: i32) -> LinkedList<i32> {
let node= LinkedList::<i32> {
val: Some(x),
next: Some(Box::new(self))
};
node
}

Then in our test, we may run this as:


#[test]
fn test_linked_list_push_left() {
let mut list: LinkedList<i32> = LinkedList::<i32>::new();
list = list.push_left(1);
list = list.push_left(2);
list = list.push_left(3);
list.push_left(4);
}

Yes, this will work. But we’ve made using our library significantly more cumbersome since the user must always reassign the list to somewhere else.

Let’s try using a couple of other two options to implement our list:

  1. Annotated lifetimes
  2. Unsafe Rust

Using annotated lifetimes

You might be thinking that maybe we need another struct to hold the head inside. For instance, you might try this so we don’t try to mutate self directly, but just self.next.

pub fn push_left(&mut self, x: i32) {
if self.next.is_none() {
let node= LinkedList::<i32> {
val: Some(x),
next: None
};
self.next = Some(Box::new(node));
} else {
let node= LinkedList::<i32> {
val: Some(x),
next: Some(self.next.unwrap())
};
self.next = Some(Box::new(node));
}
}

That also doesn’t work, however, as we’ll run into the same issue nonetheless, e.g., the self.next.unwrap() takes ownership of the self.next, and we’re no longer allowed to assign to it in the next statement.

Unless, of course, you don’t take ownership of a new Box, but rather just borrow it. Let’s say our push_left function doesn’t allocate memory anymore but just borrows the address of a box.

To make this work, we need to annotate our elements with lifetime, mentioning that the passed box will live as long as the parent Linked List. This allows the compiler to check that all references will be valid. Our implementation could look like this:

pub struct LinkedList<'a, T> {
pub val: Option<T>,
pub next: Option<&'a Box<LinkedList<'a, T>>>,
}

impl<'a> LinkedList<'a, i32> {
pub fn new() -> LinkedList<'a, i32>{
LinkedList {
val: None,
next: None,
}
}

pub fn push_left(&mut self, node: &'a mut Box<LinkedList<'a, i32>>) {
if self.next.is_none() {
self.next = Some(node);
} else {
node.next = self.next;
self.next = Some(node);
}
}

pub fn create_box(&self, x: i32) -> Box<LinkedList<'a, i32>> {
Box::new(
LinkedList {
val: Some(x),
next: None
}
)
}
}

Then in our tests, we would define the following:

fn test_linked_list_push_left() {
let mut list: LinkedList<i32> = LinkedList::<i32>::new();
let mut box_ = list.create_box(1);
let mut box_2 = list.create_box(2);
let mut box_3 = list.create_box(3);
let mut box_4 = list.create_box(4);

list.push_left(&mut box_);
list.push_left(&mut box_2);
list.push_left(&mut box_3);
list.push_left(&mut box_4);
}

This also works! However, now the caller must manage the memory allocations himself. This actually looks even worse than the first implementation. Let’s now look at our third option.

Using unsafe Rust

Let’s rewind to our first definition and modify it slightly:

use core::ptr::NonNull;


pub struct LinkedList<T> {
pub val: Option<T>,
pub next: Option<NonNull<LinkedList<T>>>,
}

Instead of using the Box type, we’d be using NonNull, which is one way of accessing raw pointers in Rust.

We can then create a Box and use Box::leak() to get its memory address in a way that will outlive our current function definition.

Let’s first handle the case where our list is initially empty:


pub fn push_left(&mut self, x: i32) {
// allocate on the heap
let node = Box::new(LinkedList::<i32> {
next: None,
val: Some(x)
});

if self.next.is_none() {
// our list is empty
// take control over the raw pointer
let pter: NonNull<LinkedList<i32>> = Box::leak(node).into();

// head is now the new pointer
self.next = Some(pter);

} else {
// our list already has an element
// we should do something else here...
}

Notice that we did not need to define any unsafe blocks yet! Using raw pointer and Box::leak is perfectly safe. However, beware, you may not drop the reference to its data, or else you’ll incur a memory leak. We’ll have to remember this when deleting this node.

Now, let’s look into the else block:

else {
// our list already has an element
// take control over the raw pointer, as mutable
let mut pter: NonNull<LinkedList<i32>> = Box::leak(node).into();
unsafe {
// new node should point to current head
pter.as_mut().next = self.next;
}
// head is now the new pointer
self.next = Some(pter);
}

Notice the unsafe block here. We’re calling pter.as_mut(), which is unsafe as it is dereferencing a raw pointer, which in theory, could be pointing to garbage.

Do notice something interesting, however: at some point in this program, we have two owners of the contents of self.next!

Recall this diagram?

Two pointers to the same node

This is exactly what we managed to do with the unsafe block. Of course, once we call self.next = Some(pter); we’ll finish our job and end up in the desired state:

list:LinkedList goes to None and Pointer, which go to 1 and None
The desired state after calling push left

Building the pop_left function

Because we’ve already seen that using unsafe blocks lead to a more comfortable-to-use public API for our library, we’ll jump straight into using the unsafe implementation for the pop function.

Let’s first define our unit test:

#[test]
fn test_linked_list_pop_left() {
let mut list: LinkedList<i32> = LinkedList::<i32>::new();

list.push_left(1);
list.push_left(2);
list.push_left(3);
list.push_left(4);

let x = list.pop_left();
assert_eq!(x.unwrap(), 4);

let x = list.pop_left();
assert_eq!(x.unwrap(), 3);

let x = list.pop_left();
assert_eq!(x.unwrap(), 2);

let x = list.pop_left();
assert_eq!(x.unwrap(), 1);

let x = list.pop_left();
assert_eq!(x.is_none(), true);
}

We have to handle three cases:

  1. List is empty
  2. List has one element
  3. List has two or more elements

Let’s start with the first, easy one. Our list looks like this:

All we need to do is check that our next node is empty and return None:

    pub fn pop_left(&mut self) -> Option<i32> {
// empty list
if self.next.is_none() {
return None
}
else {
// cases 1 and 2
...
}

In our second case, we have exactly one nonnull value, and graphically it looks like this:

This is one way to handle it:

else {
let mut next = self.next.unwrap();

// list only has one element
let only_one: bool = unsafe {next.as_mut().next.is_none()};
if only_one == true {
// pull the next element from the raw pointer
// and store it in a box, so memory free can work correctly
let next_box = unsafe {
Box::from_raw(next.as_ptr())
};
self.next = None;
next_box.val
} else {
// case 3, ignore for now
}
}

We first check whether the next element of next is none, so we can determine if we fall into case 2 or 3. Notice that here an unsafe block is required. It’s recommended to keep the unsafe blocks as small as possible. Thus, we do our unsafe job and return the boolean before continuing.

Then, we do three statements.

First, load the raw pointer into a Box, so the Rust compiler can correctly call the drop function on this pointer to deallocate the memory once we leave the function call. Again, we’re dereferencing a raw pointer, which is unsafe.

Second, we set our next to None, so our list goes back to case 1 (empty list).

And finally, we return the value, which is an option.

Our third and last case looks like this:

  // list has two or more elements
else {
let next_of_next = unsafe { next.as_mut().next };
let next_box = unsafe {
Box::from_raw(next.as_ptr())
};
self.next = next_of_next;
next_box.val
}

This is very similar to case 2, but instead, we go one extra pointer ahead and find the next of next node, which in this case, refers to the last node in the diagram (the one with the number 2).

Again, our list head must be loaded into a Box so we can correctly deallocate its memory. We set our head to the next of next node and return the head’s value.

This is what our list looks like after handling this case:

list:LinkedList goes to None and Pointer, which go to 2 and pointer

We’re almost done. We just need to finish writing our helper function.

A small extra: our collect function

Next, let’s implement our helper collect function. Inside the same impl block we’ve defined above, we add a new function. I’ll not go over details about this one. In summary, it pushes all elements to a Vector, so we can test the behavior of our list.

  pub fn collect(&self) -> Vec<i32> {
let mut result = Vec::<i32>::new();
if self.next.is_none() {
return result
}

let mut node = self.next.unwrap();
unsafe {
result.push(node.as_mut().val.unwrap());
while node.as_mut().next.is_some() {
node = node.as_mut().next.unwrap();
result.push(node.as_mut().val.unwrap());
}
}
return result;
}

That’s it! With this, we can now have our two unit tests passing! Now we’re ready to implement a Stack using this Linked List. The best part is that the stack will probably not require any unsafe calls! I’ll leave that as an exercise to you.

Conclusion

Admittedly, explaining these steps took longer than I initially thought it would!

I hope it was easier for you to understand than it was for me at first. I struggled quite a bit to implement a Linked List since I was initially unaware that I needed unsafe constructs for it.

I learned by looking at the standard library source code, which is a great way to learn on your own if you don’t find a good tutorial!

Also, if you notice any mistakes or problems with this implementation, please let me know. I’m still a beginner with Rust.

Sign up to discover human stories that deepen your understanding of the world.

Free

Distraction-free reading. No ads.

Organize your knowledge with lists and highlights.

Tell your story. Find your audience.

Membership

Read member-only stories

Support writers you read most

Earn money for your writing

Listen to audio narrations

Read offline with the Medium app

Paolo Rechia
Paolo Rechia

Written by Paolo Rechia

Software Developer / Data Engineer - Connect with me on Linkedin: https://www.linkedin.com/in/paolo-rechia/

Write a response