Unwrapping Rust’s errors
The Rust programming language is loved for its safety and speed and is weirdly fun to use, but sometimes, things go wrong.
In this article, I will focus on only one aspect, one that many encounter from the start and that is the source of much struggle: the .unwrap() method, and all it entails. It’s in all the documentation and sample code, yet people say not to use it. Why’s that? Let me unwind this for you, and some more.
I will use some examples from a toy analysis project on coronavirus data. If you’re interested, you can find the complete source code here: https://github.com/hhamana/COVID-19.
Error handling
At the core of the issue is Rust’s error handling mechanism. Languages have widely different mechanisms for the program to say “hey, something went wrong, I can’t do that”. Java throws exceptions, Python raises Exceptions, and C is kinda dumb and leaves it to return codes that say whether the call was successful or not.
In Rust, error handling is integrated as part of the type system, and is defined as a Result<Value, Error>, Value and Error both being other types, custom or built-in. The idea is that if something can fail, a function can return the Error it needs to, and the part that called it gets the obvious Result type, and can deal with it appropriately. Such a result is sent wrapped within an Ok(value) for successes, or an Err(error) for errors. Quite straightforward, and the compiler will remind you if you forget it, in its usual tight embrace.
Some specialized Result types, like std::io::Result<>, imply the error type (here, std::io::Error), so you only need to define your own success type. They are built on the standard Result through a type alias such as type Result<T> = std::result::Result<T, std::io::Error>, so the compiler already knows the error type, and it is abstracted away from your own signatures.
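As a minimal sketch of the two ideas above (a function returning Ok or Err, and an alias that fixes the error type), here is a standard-library-only example. The names AgeError, AgeResult and parse_age are made up for illustration and are not part of the project.

```rust
// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum AgeError {
    NotANumber,
    TooOld(u32),
}

// An alias in the spirit of std::io::Result<T>: the error type is fixed,
// only the success type varies. (std::io names its alias `Result` inside
// its own module; a distinct name is used here to avoid shadowing.)
type AgeResult<T> = std::result::Result<T, AgeError>;

fn parse_age(input: &str) -> AgeResult<u32> {
    let age = match input.trim().parse::<u32>() {
        Ok(n) => n,
        Err(_) => return Err(AgeError::NotANumber),
    };
    if age > 150 {
        return Err(AgeError::TooOld(age));
    }
    Ok(age)
}

fn main() {
    println!("{:?}", parse_age("42"));
    println!("{:?}", parse_age("abc"));
}
```

The caller receives an ordinary Result and decides what to do with each variant.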
Not handling
Alternatively, there’s another way: not handling it, and willingly crashing the program instead. That’s what a panic!() does. Sometimes it’s safer that way, and we can just start again, instead of spending hours on actual error handling work when we’re just experimenting. The panic macro accepts a text message, so a panic!("No file name for file path {}", file_path); will repeat that message, along with the file and line it was called from, for quick debugging. The crash will look like this:
thread 'main' panicked at 'No file name for file path "csse_covid_19_daily_reports/01-21-2020.csv"', src/main.rs:122:21
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
My message makes it easy to remember what the program was trying to do, and Rust adds exactly the file and line where it happened (that’s the file src/main.rs, line 122, column 21), so I can investigate. In this case, I forced it to load files that do not exist. Furthermore, there’s this RUST_BACKTRACE note, which we’ll see a bit later.
Overall, error handling is divided into those two: Result<> for recoverable errors, and running away from the problem with a panic!().
But having to write well-thought-out error handling and recovery code for every little thing is annoying, and sometimes we just know through logic that the problem is not going to happen. This is where unwrap() comes in and blurs the line.
unwrap and expect
When faced with a Result, instead of implementing what should be done to handle the possible error, the .unwrap() method can be called on any Result<>, such as the return value of a function, and just internally calls a panic, with no custom message beyond the Debug output of the error. Quick and dirty.
If the function returns an Ok(value), you will get the value directly. If the function returns an Err(error), the program will panic.
There’s the slightly nicer version, which at least allows you to set a message: .expect("Goodbye world"). It’s still a bit less code than constructing a match to extract the error and calling the panic!() there, with the added benefit of adding some context to the panic. As such, unwrap is the lazy way out. Or is it?
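To make the comparison concrete, here is a small sketch using only the standard library. The helper names parse and parse_or_panic are hypothetical; parse_or_panic spells out by hand roughly what expect does for you.

```rust
use std::panic;

fn parse(input: &str) -> Result<i32, std::num::ParseIntError> {
    input.parse::<i32>()
}

// Roughly what expect() does, written out by hand with a match.
fn parse_or_panic(input: &str) -> i32 {
    match parse(input) {
        Ok(value) => value,
        Err(err) => panic!("Could not parse {:?}: {:?}", input, err),
    }
}

fn main() {
    // Ok case: unwrap() and expect() simply hand back the value.
    assert_eq!(parse("42").unwrap(), 42);
    assert_eq!(parse("42").expect("42 is a valid number"), 42);
    assert_eq!(parse_or_panic("42"), 42);

    // Err case: unwrap() panics. catch_unwind is used here only to observe
    // the panic without ending the demo; it is not an error handling tool.
    let outcome = panic::catch_unwind(|| parse("not a number").unwrap());
    assert!(outcome.is_err());
}
```

Running it prints the panic message on stderr for the last call, which is the default panic hook doing its job.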
?
There is an even lazier way to bail out of handling a Result<> possibly returning an Error, and this one isn’t even as potentially dangerous, due to its limitations. A simple ? after a function call that returns a Result will propagate the Error upwards in the stack. As such, the function using it must itself return a Result<>, which just shifts the burden of error handling further up. This will not cause the program to crash (well, as long as the upstream function doesn’t call an unwrap or panic! itself).
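fn load_csv_data(file_path: PathBuf) -> Result<HashData, csv::Error> {
    // Opening the file can fail; `?` returns the csv::Error to the caller.
    let mut rdr = csv::Reader::from_path(file_path)?;
    for result in rdr.deserialize() {
        // Deserializing a row can fail too, with the same error type.
        let record: RowData = result?;
        // ...
    }
    // ... (the collected data is returned wrapped in Ok)
}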
In this code sample, I am loading a CSV file, using the csv crate and its serde compatibility to directly convert each row into the struct representing the data. Creating the csv reader and extracting the data can both return an error, luckily both of the type csv::Error, a custom type made by the csv library to group various kinds of errors in a single type.
My whole function can then just return those to the caller, and isn’t cluttered with error handling code. The function upstream will handle both of these errors, as you can see later. No panic, and as a bonus, it handles 2 possible errors in one strike. The caller could still match on the error to extract the exact csv error kind if it was needed.
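For readers who want to try the ? operator without the csv crate, here is a self-contained sketch of the same shape using only the standard library. The function sum_fields is a hypothetical stand-in, not code from the project.

```rust
use std::num::ParseIntError;

// Sums comma-separated integers. Every parse can fail, and `?` hands the
// first failure straight back to the caller instead of panicking.
fn sum_fields(line: &str) -> Result<i64, ParseIntError> {
    let mut total: i64 = 0;
    for field in line.split(',') {
        let value = field.trim().parse::<i64>()?;
        total += value;
    }
    Ok(total)
}

fn main() {
    println!("{:?}", sum_fields("1, 2, 3"));
    println!("{:?}", sum_fields("1, two, 3"));
}
```

As in the CSV loader, the loop body stays free of error handling code, and the caller gets a single Result to deal with.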
Library examples
Most Rust libraries (crates) provide examples on how to use them. But if you just go around copy-pasting them, you’ll quickly notice that many of those examples make very liberal use of unwrap or expect. Take for example serde, one of the most used libraries in the ecosystem, which uses unwrap() in the 2nd active line of its first example.
fn main() {
    let point = Point { x: 1, y: 2 };
    // Convert the Point to a JSON string.
    let serialized = serde_json::to_string(&point).unwrap();
    // ... etc ...
}
The reason why library authors provide examples using unwrap is simple: it shows less code, and goes straight to the point instead. That’s what an example is for. For beginners trying to learn, that’s by far the best. A quick script that you make in your free time? Cool, unwrap to your heart’s content, and don’t feel guilty.
You won’t even notice it, as this example cannot crash: the Point struct was constructed manually, so it was already checked valid at compile time. If this is all the program ever does, unwrap is perfectly justified, since bad things logically cannot happen. It can only go wrong when reading data from another source at run time.
You just have to know that the unwrap is there to make the library’s usage easier to understand, but when reading this code, you should make a mental note: “this function returns a Result, I will have to handle it properly instead of just copy-pasting”.
What should I do then?
Like in the example above, there are many cases where you know that compile checks, previous validations and logic were done right, so the error cases cannot happen, but you still have a Result<> to deal with. In those cases, calling unwrap is like calling the compiler’s bluff. It can be a valid solution. Or at least use an expect to add some sort of error message in case something unexpected really does happen.
Unfortunately, those cases are rare. Interesting software doesn’t run in a perfectly controlled vacuum, and proper error handling is an important part of engineering a reliable system. Compared to other languages like Python, which lets you ignore an Exception and crashes at runtime if it is not caught, it’s a nice luxury to even have such a straightforward view of where potential crashes are. But for high quality software, having those all over the place is certainly not a good idea.
match
A proper match on the Result values is generally the way to deal with it.
In the ? example above, I returned errors to the caller. Here’s how the caller deals with the Result, unfolding success and error values with the match syntax.
let data = match load_csv_data(file_path) {
    Ok(csv_data) => csv_data,
    Err(err) => {
        println!("{:?}", err);
        return None;
    }
};
In this example, the csv_data that gets returned with an Ok() is put directly in the data variable, to be processed a bit more later. In the Err() case though, I am bailing out prematurely of this whole function with the return, sending a None value with it, after printing the error information to leave some visible trace. Note on the {:?}: unlike the regular {} syntax for string formatting, which uses the Display trait that has to be implemented manually for custom structs, this will call the Debug implementation, which is generally derived automatically.
The match syntax allows us to deal with so many things in Rust, avoiding endless if / else if / else chains, while still making sure all cases are covered. The use of early returns in such matches creates guards that allow a smooth continuation of the normal process, pruning out defects on the way to finish with the clean result at the end of the function.
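The guard pattern described above can be tried in isolation with the following sketch. It assumes nothing from the project: load_value and double_value are hypothetical names, and a number parser stands in for the CSV loader.

```rust
fn load_value(input: &str) -> Result<u32, std::num::ParseIntError> {
    input.trim().parse::<u32>()
}

// Hypothetical caller: unfolds the Result with a match, bails out early
// with None on error, and continues with a plain value otherwise.
fn double_value(input: &str) -> Option<u32> {
    let value = match load_value(input) {
        Ok(value) => value,
        Err(err) => {
            // {:?} uses the Debug implementation of the error.
            eprintln!("{:?}", err);
            return None;
        }
    };
    // Past the guard, `value` is a plain u32. checked_mul returns None
    // instead of overflowing.
    value.checked_mul(2)
}

fn main() {
    println!("{:?}", double_value("21"));
    println!("{:?}", double_value("oops"));
}
```

Everything after the match reads like the happy path, because the error case has already left the function.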
map_err
In the earlier ? example, I mentioned we were lucky both results have the same csv::Error error type. Sometimes a single function has different error types to handle, but if you still try to send them both upwards with a ?, the compiler will complain that it cannot convert one error type into the other, as the function signature can only define a single Error type in its Result. You have to make them all the same in your function.
You could deal with this with a match and an early return such as return Err(SomeError::ErrorValue), but this is still a lengthy construct, and it equates to reimplementing exactly what a single ? would do. Results offer a map_err method that lets you write a closure to define another type of Error. It also gives you the original error as the closure argument, so you can do more processing to define a different error depending on the original one. Then you can send it upwards with a ?. This allows a construct like this: .map_err(|_e| SomeError::ErrorValue)?;
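Here is a minimal, standard-library-only sketch of that construct, with two different underlying error types funneled into one. InputError and parse_bytes are hypothetical names chosen for the illustration.

```rust
use std::str;

// Hypothetical unified error type for this function.
#[derive(Debug, PartialEq)]
enum InputError {
    NotUtf8,
    NotANumber(String),
}

fn parse_bytes(raw: &[u8]) -> Result<i32, InputError> {
    // Utf8Error -> InputError, ignoring the original error.
    let text = str::from_utf8(raw).map_err(|_| InputError::NotUtf8)?;
    // ParseIntError -> InputError, keeping the original message.
    let number = text
        .trim()
        .parse::<i32>()
        .map_err(|err| InputError::NotANumber(err.to_string()))?;
    Ok(number)
}

fn main() {
    println!("{:?}", parse_bytes(b"42"));
    println!("{:?}", parse_bytes(&[0xff, 0xfe]));
    println!("{:?}", parse_bytes(b"abc"));
}
```

The second closure shows how the original error can still feed into the new one, here by keeping its message.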
unwrap variants
Rust offers other methods. If this is about getting data, you can use a static value with unwrap_or(other_value), which will give you that other_value in case of error. It will effectively be the default value. But talking about defaults, there’s a trait for this, Default, and if it is implemented for the type you were expecting, you can use unwrap_or_default(), and it will use it directly. Neat way to avoid cluttering the business logic with too much error code. If it isn’t and you still need some sort of backup value, or your value depends on the error you got, you can use .unwrap_or_else(|err| { /* write your closure here */ }); which takes a closure on the spot, and calls it with the Error. Be careful with all those methods: the type system makes it so the fallback value, or the value returned by the closure, has to be of the same type an Ok() would hold.
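The three variants side by side, as a small sketch on a number parser. The function names are hypothetical, and the fallback values are arbitrary choices for the demonstration.

```rust
use std::num::{IntErrorKind, ParseIntError};

fn parse(input: &str) -> Result<u32, ParseIntError> {
    input.trim().parse::<u32>()
}

// Fixed fallback value.
fn or_fixed(input: &str) -> u32 {
    parse(input).unwrap_or(7)
}

// Fallback taken from the Default implementation of u32, which is 0.
fn or_default(input: &str) -> u32 {
    parse(input).unwrap_or_default()
}

// Fallback computed from the error itself.
fn or_computed(input: &str) -> u32 {
    parse(input).unwrap_or_else(|err| match err.kind() {
        IntErrorKind::Empty => 0,
        _ => u32::MAX,
    })
}

fn main() {
    println!("{} {} {}", or_fixed("oops"), or_default("oops"), or_computed("oops"));
}
```

In all three cases the fallback has to be a u32, the same type the Ok() variant holds.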
Failure
There is some community effort to improve the error handling paradigm by changing it to a derivable trait, notably with the failure crate (since deprecated in favor of crates such as anyhow and thiserror). Rather than recovery logic or error handling per se, it provides a more convenient way to format and use custom errors. This gets useful for larger systems that have many different types of errors to expose to an end user, as such applications tend to have their own struct and enum to manage them. Implementing or deriving Fail on such types should make for a smoother experience.
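To give an idea of the boilerplate such crates aim to reduce, here is a sketch of a custom error written by hand with the standard library only. This is not the failure crate's API; DataError and parse_count are hypothetical names.

```rust
use std::error::Error;
use std::fmt;
use std::num::ParseIntError;

// Hypothetical application error grouping several failure causes.
#[derive(Debug)]
enum DataError {
    MissingField(&'static str),
    BadCount(ParseIntError),
}

// Display is the user-facing message; it has to be written manually.
impl fmt::Display for DataError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DataError::MissingField(name) => write!(f, "missing field: {}", name),
            DataError::BadCount(err) => write!(f, "invalid count: {}", err),
        }
    }
}

impl Error for DataError {
    // Exposes the underlying error, if any.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            DataError::MissingField(_) => None,
            DataError::BadCount(err) => Some(err),
        }
    }
}

// With From implemented, `?` converts a ParseIntError automatically.
impl From<ParseIntError> for DataError {
    fn from(err: ParseIntError) -> Self {
        DataError::BadCount(err)
    }
}

// Reads the second comma-separated field of a row as a count.
fn parse_count(row: &str) -> Result<u32, DataError> {
    let field = row
        .split(',')
        .nth(1)
        .ok_or(DataError::MissingField("count"))?;
    Ok(field.trim().parse::<u32>()?)
}

fn main() {
    println!("{:?}", parse_count("Tokyo, 12"));
    match parse_count("Tokyo") {
        Ok(count) => println!("{}", count),
        Err(err) => println!("{}", err),
    }
}
```

Derive-based crates generate most of the Display, Error and From parts from annotations on the enum.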
Library writers
This article is mainly intended for application writers, but if you are writing a library, do not panic. Don’t unwrap. In the overwhelming majority of cases, you should return a Result and let the API user decide what to do, so they are aware of the potential issue instead of crashing in edge cases. Even if panics are documented somewhere, code design makes better documentation.
As far as libraries are concerned, unwrap and panics generally belong in example code, as a placeholder for application-dependent error handling. There, on the contrary, showing off error handling might distract from the point of showing how to use your library.
Now for some more intermediate-level Rust.
Stack Unwinding
The panic macro that ends up being called by this unwrap performs stack unwinding by default. When a panic happens, the runtime walks the call stack backwards: which function called this one, then which function called that one, and so on until the start. The compiler includes the tables and symbol information needed for this directly in the binary, and the resulting call stack trace can be printed by setting the RUST_BACKTRACE environment variable to a value of 1. We saw that one earlier. It will print out the information while crashing, for debugging purposes.
For example, I am calling let files = get_data_files().expect("Failed to get CSV files list"); when trying to load the list of files. As “expected”, it fails to load the files when I give it an incorrect path (within the get_data_files function, which does the actual loading). With RUST_BACKTRACE on, this is what it gives me:
thread 'main' panicked at 'Failed to get CSV files list: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:201:17
stack backtrace:
0: backtrace::backtrace::libunwind::trace
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
1: backtrace::backtrace::trace_unsynchronized
at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
2: std::sys_common::backtrace::_print_fmt
at src/libstd/sys_common/backtrace.rs:77
3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
at src/libstd/sys_common/backtrace.rs:59
4: core::fmt::write
at src/libcore/fmt/mod.rs:1052
5: std::io::Write::write_fmt
at src/libstd/io/mod.rs:1426
6: std::sys_common::backtrace::_print
at src/libstd/sys_common/backtrace.rs:62
7: std::sys_common::backtrace::print
at src/libstd/sys_common/backtrace.rs:49
8: std::panicking::default_hook::{{closure}}
at src/libstd/panicking.rs:204
9: std::panicking::default_hook
at src/libstd/panicking.rs:224
10: std::panicking::rust_panic_with_hook
at src/libstd/panicking.rs:472
11: rust_begin_unwind
at src/libstd/panicking.rs:380
12: core::panicking::panic_fmt
at src/libcore/panicking.rs:85
13: core::option::expect_none_failed
at src/libcore/option.rs:1199
14: core::result::Result<T,E>::expect
at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libcore/result.rs:991
15: covid::main
at src/main.rs:201
16: std::rt::lang_start::{{closure}}
at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libstd/rt.rs:67
17: std::rt::lang_start_internal::{{closure}}
at src/libstd/rt.rs:52
18: std::panicking::try::do_call
at src/libstd/panicking.rs:305
19: __rust_maybe_catch_panic
at src/libpanic_unwind/lib.rs:86
20: std::panicking::try
at src/libstd/panicking.rs:281
21: std::panic::catch_unwind
at src/libstd/panic.rs:394
22: std::rt::lang_start_internal
at src/libstd/rt.rs:51
23: std::rt::lang_start
at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libstd/rt.rs:67
24: main
25: __libc_start_main
26: _start
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Of course, now we get a wall of text, with a whole lot of stuff related to the unwinding process itself and application startup. It even tells us there can be more details by setting it to “full”, but those details will be instruction memory addresses and function build hashes. It’s more for debugging the compiler itself. As far as I am concerned, only frames 24 and 15 are useful: it gets to the main, and since the expect is called from the main, it doesn’t show us much more. The actual problem happens within get_data_files, which propagated the error upwards with a ?, so we don’t see it here. Let’s change it and move the expect there instead. Here’s the relevant stack information it gives me, this time running with the full backtrace for the curious.
thread 'main' panicked at 'Failed to get CSV files list: Os { code: 2, kind: NotFound, message: "No such file or directory" }', src/main.rs:78:27
stack backtrace:
/../
14: 0x55e966f952d9 - core::result::Result<T,E>::expect::hc4ef2b3b06495828
at /rustc/b8cedc00407a4c56a3bda1ed605c6fc166655447/src/libcore/result.rs:991
15: 0x55e966f8e895 - covid::get_data_files::he58c20a479bd6dd6
at src/main.rs:78
16: 0x55e966f9173c - covid::main::hc80eff7212fd03b3
at src/main.rs:201
/../
25: 0x55e966f9278a - main
Frame 15 is now indeed the call to the function. The call at a deeper level shows me a little bit more of the stack. We would have seen this anyway in a proper debugging session.
The stack trace itself is collected at run time, but the information it relies on (unwinding tables and function symbols) is generated at compile time and included in the executable binary. On top of enabling this debug-friendly call stack history (which is also used by IDE debuggers), unwinding runs the destructors of every value on the way up in order to free resources, preventing the panic from leaking memory, and ultimately terminates the process. If you are using multi-threaded code, a panic in a spawned thread only terminates that thread, and the application as a whole can keep running. This is what web frameworks in the ecosystem tend to rely on, so requests going bad won’t take out the whole server.
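Both effects, destructors running during unwinding and a panic staying confined to its thread, can be observed with a short sketch. It assumes the default panic = "unwind" behavior; Guard and run_panicking_worker are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// Sets a shared flag when dropped, to make the destructor call observable.
struct Guard(Arc<AtomicBool>);

impl Drop for Guard {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

// Returns (did the worker panic, did the destructor run).
fn run_panicking_worker() -> (bool, bool) {
    let dropped = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&dropped);
    let handle = thread::spawn(move || {
        let _guard = Guard(flag);
        let result: Result<u32, &str> = Err("request went bad");
        result.unwrap() // panics inside the worker thread only
    });
    // join() turns the worker's panic into an Err for the spawning thread.
    let panicked = handle.join().is_err();
    (panicked, dropped.load(Ordering::SeqCst))
}

fn main() {
    let (panicked, dropped) = run_panicking_worker();
    println!("worker panicked: {}, destructor ran: {}", panicked, dropped);
    println!("main thread is still running");
}
```

The main thread keeps going after the worker dies, which is the property servers rely on.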
Embedded messages
This process uses a little bit of extra binary size, and is dependent on the OS you are using. But Rust is also used on micro-controllers, where storage space can be extremely limited, and more importantly, there is no OS to speak of.
When developing for embedded devices, i.e. in a #![no_std] environment, the compiler will tell you that you need to set up your own panic behavior, or more realistically, use one of the many community-provided panic handlers on crates.io. After adding panic-semihosting = "0.5" to my Cargo.toml, a single line allows me to use it, with no other code beyond it.
use panic_semihosting;
I do like semihosting when developing for embedded: it’s quite easy to set up once you have a hardware debugger, and lets you see the result on your development machine. It does not work unplugged however, and should be switched to another panic provider on production devices, such as panic-halt or panic-ramdump, depending on your use case.
If you are concerned with the binary size, there is a way to remove a good chunk of it. The Cargo.toml, where one generally defines dependencies, can also be used to configure this behavior, with these two lines.
[profile.release]
panic = "abort"
You are probably aware that building with cargo build --release enables many optimizations, resulting in the fast executable we all love. It follows the “release” profile, instead of the default “dev” profile. By setting panic = "abort", we are telling the compiler not to add any stack unwinding information when compiling for release. You can add the same setting to [profile.dev] if you also want to remove it in debug builds, at the cost of making your own debugging work harder. On this small project, adding this configuration shaved 21_764 bytes off, about 2% of the total binary size. "abort" is the only alternative to the default behavior, called "unwind".
Conclusion
Rust is a very strict language, in that it will not even let you compile things it knows can lead to errors, just to try them out. unwrap allows you to experiment and fix the problem later, so you can see if your logic is right before getting in too deep.
When the time comes for production, Rust has many tools to help you remove those, and leaves you, the engineer, in control, armed with the confidence that there are no surprise issues. Such extensive error handling might seem unwieldy, but it is the herald of quality.