Chapter 7. Error Handling

I knew if I stayed around long enough, something like this would happen.

George Bernard Shaw on dying

Error handling in Rust is just different enough to warrant its own short chapter. There aren’t any difficult ideas here, just ideas that might be new to you. This chapter covers the two different kinds of error-handling in Rust: panic and Results.

Ordinary errors are handled using Results. These are typically caused by things outside the program, like erroneous input, a network outage, or a permissions problem. That such situations occur is not up to us; even a bug-free program will encounter them from time to time. Most of this chapter is dedicated to that kind of error. We’ll cover panic first, though, because it’s the simpler of the two.

Panic is for the other kind of error, the kind that should never happen.

Panic

A program panics when it encounters something so messed up that there must be a bug in the program itself. Something like:

Out-of-bounds array access
Integer division by zero
Calling .unwrap() on an Option that happens to be None
Assertion failure

(There’s also the macro panic!(), for cases where your own code discovers that it has gone wrong, and you therefore need to trigger a panic directly. panic!() accepts optional println!()-style arguments, for building an error message.)

What these conditions have in common is that they are all—not to put too fine a point on it—the programmer’s fault. A good rule of thumb is: “Don’t panic”.

But we all make mistakes. When these errors that shouldn’t happen, do happen—what then? Remarkably, Rust gives you a choice. Rust can either unwind the stack when a panic happens, or abort the process. Unwinding is the default.

Unwinding

When pirates divvy up the booty from a raid, the captain gets half of the loot. Ordinary crew members earn equal shares of the other half. (Pirates hate fractions, so if either division does not come out even, the result is rounded down, with the remainder going to the ship’s parrot.)

fn pirate_share(total: u64, crew_size: usize) -> u64 {
    let half = total / 2;
    half / crew_size as u64
}



This may work fine for centuries until one day it transpires that the captain is the sole survivor of a raid. If we pass a crew_size of zero to this function, it will divide by zero. In C++, this would be undefined behavior. In Rust, it triggers a panic, which typically proceeds as follows:


An error message is printed to the terminal:
thread 'main' panicked at 'attempt to divide by zero', pirates.rs:3780
note: Run with `RUST_BACKTRACE=1` for a backtrace.
If you set the RUST_BACKTRACE environment variable, as the message suggests, Rust will also dump the stack at this point.
The stack is unwound. This is a lot like C++ exception handling.
Any temporary values, local variables, or arguments that the current function was using are dropped, in the reverse of the order they were created. Dropping a value simply means cleaning up after it: any Strings or Vecs the program was using are freed, any open Files are closed, and so on. User-defined drop methods are called too; see “Drop”. In the particular case of pirate_share(), there’s nothing to clean up.
Once the current function call is cleaned up, we move on to its caller, dropping its variables and arguments the same way. Then that function’s caller, and so on up the stack.
Finally, the thread exits. If the panicking thread was the main thread, then the whole process exits (with a nonzero exit code).


Perhaps panic is a misleading name for this orderly process. A panic is not a crash. It’s not undefined behavior. It’s more like a RuntimeException in Java or a std::logic_error in C++. The behavior is well-defined; it just shouldn’t be happening.
Panic is safe. It doesn’t violate any of Rust’s safety rules; even if you manage to panic in the middle of a standard library method, it will never leave a dangling pointer or a half-initialized value in memory. The idea is that Rust catches the invalid array access, or whatever it is, before anything bad happens. It would be unsafe to proceed, so Rust unwinds the stack. But the rest of the process can continue running.
Panic is per thread. One thread can be panicking while other threads are going on about their normal business. In Chapter 19, we’ll show how a parent thread can find out when a child thread panics and handle the error gracefully.
There is also a way to catch stack unwinding, allowing the thread to survive and continue running. The standard library function std::panic::catch_unwind() does this. We won’t cover how to use it, but this is the mechanism used by Rust’s test harness to recover when an assertion fails in a test. (It can also be necessary when writing Rust code that can be called from C or C++, because unwinding across non-Rust code is undefined behavior; see Chapter 21.)
Ideally, we would all have bug-free code that never panics. But nobody’s perfect. You can use threads and catch_unwind() to handle panic, making your program more robust. One important caveat is that these tools only catch panics that unwind the stack. Not every panic proceeds this way.
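As a minimal sketch of the point that panic is per thread, we can run pirate_share() on a child thread and let the parent observe the outcome. The helper share_on_child_thread() is a hypothetical name introduced only for this illustration, and the sketch assumes the default unwinding behavior.

```rust
use std::thread;

fn pirate_share(total: u64, crew_size: usize) -> u64 {
    let half = total / 2;
    half / crew_size as u64 // panics if crew_size is zero
}

/// Hypothetical helper: compute the share on a child thread.
/// Returns None if the child panicked instead of producing a value.
fn share_on_child_thread(total: u64, crew_size: usize) -> Option<u64> {
    // join() returns Err if the child thread panicked; the parent
    // thread keeps running either way.
    thread::spawn(move || pirate_share(total, crew_size))
        .join()
        .ok()
}
```

Calling share_on_child_thread(100, 0) still prints the panic message to the terminal, but only the child thread unwinds; the caller simply receives None.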




Aborting

Stack unwinding is the default panic behavior, but there are two circumstances in which Rust does not try to unwind the stack.

If a .drop() method triggers a second panic while Rust is still trying to clean up after the first, this is considered fatal. Rust stops unwinding and aborts the whole process.
Also, Rust’s panic behavior is customizable. If you compile with -C panic=abort, the first panic in your program immediately aborts the process. (With this option, Rust does not need to know how to unwind the stack, so this can reduce the size of your compiled code.)
This concludes our discussion of panic in Rust. There is not much to say, because ordinary Rust code has no obligation to handle panic. Even if you do use threads or catch_unwind(), all your panic-handling code will likely be concentrated in a few places. It’s unreasonable to expect every function in a program to anticipate and cope with bugs in its own code. Errors caused by other factors are another kettle of fish.




Result

Rust doesn’t have exceptions. Instead, functions that can fail have a return type that says so:

fn get_weather(location: LatLng) -> Result<WeatherReport, io::Error>
The Result type indicates possible failure. When we call the get_weather() function, it will return either a success result Ok(weather), where weather is a new WeatherReport value, or an error result Err(error_value), where error_value is an io::Error explaining what went wrong.
Rust requires us to write some kind of error handling whenever we call this function. We can’t get at the WeatherReport without doing something to the Result, and you’ll get a compiler warning if a Result value isn’t used.
In Chapter 10, we’ll see how the standard library defines Result and how you can define your own similar types. For now, we’ll take a “cookbook” approach and focus on how to use Results to get the error-handling behavior you want.


Catching Errors

The most thorough way of dealing with a Result is the way we showed in Chapter 2: use a match expression.

match get_weather(hometown) {
    Ok(report) => {
        display_weather(hometown, &report);
    }
    Err(err) => {
        println!("error querying the weather: {}", err);
        schedule_weather_retry();
    }
}


This is Rust’s equivalent of try/catch in other languages. It’s what you use when you want to handle errors head-on, not pass them on to your caller.
match is a bit verbose, so Result<T, E> offers a variety of methods that are useful in particular common cases. Each of these methods has a match expression in its implementation. (For the full list of Result methods, consult the online documentation. The methods listed here are the ones we use the most.)


result.is_ok() and result.is_err() return a bool telling if result is a success result or an error result.

result.ok() returns the success value, if any, as an Option<T>. If result is a success result, this returns Some(success_value); otherwise, it returns None, discarding the error value.

result.err() returns the error value, if any, as an Option<E>.

result.unwrap_or(fallback) returns the success value, if result is a success result. Otherwise, it returns fallback, discarding the error value.

// A fairly safe prediction for Southern California.
const THE_USUAL: WeatherReport = WeatherReport::Sunny(72);

// Get a real weather report, if possible.
// If not, fall back on the usual.
let report = get_weather(los_angeles).unwrap_or(THE_USUAL);
display_weather(los_angeles, &report);


This is a nice alternative to .ok() because the return type is T, not Option<T>. Of course, it only works when there’s an appropriate fallback value.


result.unwrap_or_else(fallback_fn) is the same, but instead of passing a fallback value directly, you pass a function or closure. This is for cases where it would be wasteful to compute a fallback value if you’re not going to use it. The fallback_fn is called only if we have an error result.

let report =
    get_weather(hometown)
    .unwrap_or_else(|_err| vague_prediction(hometown));

(Chapter 14 covers closures in detail.)


result.unwrap() also returns the success value, if result is a success result. However, if result is an error result, this method panics. This method has its uses; we’ll talk more about it later.

result.expect(message) is the same as .unwrap(), but lets you provide a message that it prints in case of panic.


Lastly, two methods for borrowing references to the value in a Result:


result.as_ref() converts a Result<T, E> to a Result<&T, &E>, borrowing a reference to the success or error value in the existing result.

result.as_mut() is the same, but borrows a mutable reference. The return type is Result<&mut T, &mut E>.


One reason these last two methods are useful is that all of the other methods listed here, except .is_ok() and .is_err(), consume the result they operate on. That is, they take the self argument by value. Sometimes it’s quite handy to access data inside a result without destroying it, and this is what .as_ref() and .as_mut() do for us. For example, suppose you’d like to call result.ok(), but you need result to be left intact. You can write result.as_ref().ok(), which merely borrows result, returning an Option<&T> rather than an Option<T>.
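A short sketch may make the difference between consuming and borrowing concrete. The function names port_or_default() and peek() are hypothetical, chosen only for this example; the methods they call are the ones described above.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: parse a port number, falling back to a
/// default when the text is not a valid u16.
fn port_or_default(text: &str) -> u16 {
    text.parse::<u16>().unwrap_or(8080)
}

/// Hypothetical helper: look at the success value without consuming
/// the Result it lives in.
fn peek(result: &Result<u16, ParseIntError>) -> Option<&u16> {
    // as_ref() produces a Result<&u16, &ParseIntError>; ok() consumes
    // only that temporary, leaving the caller's `result` intact.
    result.as_ref().ok()
}
```

After calling peek(&r), the caller can still use r, for example by calling r.is_ok() or by consuming it with r.unwrap_or(0).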




Result Type Aliases

Sometimes you’ll see Rust documentation that seems to omit the error type of a Result:

fn remove_file(path: &Path) -> Result<()>

This means that a Result type alias is being used.
A type alias is a kind of shorthand for type names. Modules often define a Result type alias to avoid having to repeat an error type that’s used consistently by almost every function in the module. For example, the standard library’s std::io module includes this line of code:
pub type Result<T> = result::Result<T, Error>;

This defines a public type std::io::Result<T>. It’s an alias for Result<T, E>, but with std::io::Error hardcoded as the error type. In practical terms, this means that if you write use std::io; then Rust will understand io::Result<String> as shorthand for Result<String, io::Error>.
When something like Result<()> appears in the online documentation, you can click on the identifier Result to see which type alias is being used and learn the error type. In practice, it’s usually obvious from context.
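You can follow the same pattern in your own modules. The sketch below is only an illustration: the module tides, the error type TideError, and the function moon_count() are all hypothetical names, not part of any library.

```rust
mod tides {
    /// Hypothetical error type used by every function in this module.
    #[derive(Debug, PartialEq)]
    pub struct TideError(pub String);

    /// Module-wide alias: inside `tides`, `Result<T>` is shorthand for
    /// `std::result::Result<T, TideError>`.
    pub type Result<T> = std::result::Result<T, TideError>;

    /// The signature omits the error type, just as `io::Result<()>` does.
    pub fn moon_count(planet: &str) -> Result<u32> {
        match planet {
            "earth" => Ok(1),
            "mars" => Ok(2),
            other => Err(TideError(format!("moon not found for {}", other))),
        }
    }
}
```

Code outside the module refers to the alias as tides::Result<u32>, exactly as io::Result<String> is used with std::io.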



Printing Errors

Sometimes the only way to handle an error is by dumping it to the terminal and moving on. We already showed one way to do this:

println!("error querying the weather: {}", err);

The standard library defines several error types with boring names: std::io::Error, std::fmt::Error, std::str::Utf8Error, and so on. All of them implement a common interface, the std::error::Error trait, which means they share the following features:

They’re all printable using println!(). Printing an error with the {} format specifier typically displays only a brief error message. Alternatively, you can print with the {:?} format specifier, to get a Debug view of the error. This is less user-friendly, but includes extra technical information.

// result of `println!("error: {}", err);`
error: failed to lookup address information: No address associated with
hostname

// result of `println!("error: {:?}", err);`
error: Error { repr: Custom(Custom { kind: Other, error: StringError(
"failed to lookup address information: No address associated with
hostname") }) }

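err.description() returns an error message as a &str. (Current Rust deprecates this method; the {} output, available as err.to_string(), serves the same purpose.)
err.cause() returns an Option<&Error>: the underlying error, if any, that triggered err. (Current Rust deprecates .cause() in favor of .source(), which plays the same role.)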
For example, a networking error might cause a banking transaction to fail, which could in turn cause your boat to be repossessed. If err.description() is "boat was repossessed", then err.cause() might return an error about the failed transaction; its .description() might be "failed to transfer $300 to United Yacht Supply", and its .cause() might be an io::Error with details about the specific network outage that caused all the fuss. That third error is the root cause, so its .cause() method would return None.
Since the standard library only includes rather low-level features, this is usually None for standard library errors.

Printing an error value does not also print out its cause. If you want to be sure to print all the available information, use this function:
use std::error::Error;
use std::io::{Write, stderr};

/// Dump an error message to `stderr`.
///
/// If another error happens while building the error message or
/// writing to `stderr`, it is ignored.
fn print_error(mut err: &dyn Error) {
    let _ = writeln!(stderr(), "error: {}", err);
    // source() is the current replacement for the deprecated cause().
    while let Some(cause) = err.source() {
        let _ = writeln!(stderr(), "caused by: {}", cause);
        err = cause;
    }
}

The standard library’s error types do not include a stack trace, but the error-chain crate makes it easy to define your own custom error type that supports grabbing a stack trace when it’s created. It uses the backtrace crate to capture the stack.
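To see a chain of causes in action, here is a rough sketch built around the boat example. TransferError and describe_chain() are hypothetical names invented for this illustration. The sketch walks the chain with source(), which current Rust provides in place of the older cause(), and it collects the text into a String rather than writing to stderr so that the result is easy to inspect.

```rust
use std::error::Error;
use std::fmt;
use std::io;

/// Hypothetical high-level error that remembers its low-level cause.
#[derive(Debug)]
struct TransferError {
    cause: io::Error,
}

impl fmt::Display for TransferError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to transfer $300 to United Yacht Supply")
    }
}

impl Error for TransferError {
    // Expose the underlying error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Hypothetical helper: describe an error and all of its causes,
/// one per line.
fn describe_chain(mut err: &dyn Error) -> String {
    let mut text = format!("error: {}", err);
    while let Some(cause) = err.source() {
        text.push_str(&format!("\ncaused by: {}", cause));
        err = cause;
    }
    text
}
```

The loop has the same shape as the one in print_error(); only the destination of the text differs.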



Propagating Errors

In most places where we try something that could fail, we don’t want to catch and handle the error immediately. It is simply too much code to use a 10-line match statement every place where something could go wrong.

Instead, if an error occurs, we usually want to let our caller deal with it. We want errors to propagate up the call stack.
Rust has a ? operator that does this. You can add a ? to any expression that produces a Result, such as the result of a function call:

let weather = get_weather(hometown)?;

The behavior of ? depends on whether this function returns a success result or an error result:


On success, it unwraps the Result to get the success value inside. The type of weather here is not Result<WeatherReport, io::Error> but simply WeatherReport.
On error, it immediately returns from the enclosing function, passing the error result up the call chain. To ensure that this works, ? can only be used in functions that have a Result return type.


There’s nothing magical about the ? operator. You can express the same thing using a match expression, although it’s much wordier:
let weather = match get_weather(hometown) {
    Ok(success_value) => success_value,
    Err(err) => return Err(err)
};

The only differences between this and the ? operator are some fine points involving types and conversions. We’ll cover those details in the next section.
In older code, you may see the try!() macro, which was the usual way to propagate errors until the ? operator was introduced in Rust 1.13.
let weather = try!(get_weather(hometown));

The macro expands to a match expression, like the one above.
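Here is a small self-contained sketch that puts the two spellings side by side. The function add_text() is a hypothetical example, not something from the standard library; it uses ? for its first argument and the hand-written match for its second.

```rust
use std::num::ParseIntError;

/// Hypothetical example: add two numbers supplied as text.
fn add_text(a: &str, b: &str) -> Result<i64, ParseIntError> {
    // With `?`: on error, return from add_text() immediately.
    let x = a.trim().parse::<i64>()?;

    // The same step written out by hand.
    let y = match b.trim().parse::<i64>() {
        Ok(value) => value,
        Err(err) => return Err(err),
    };

    Ok(x + y)
}
```

Either argument can trigger the early return, and the caller sees the same ParseIntError in both cases.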

It’s easy to forget just how pervasive the possibility of errors is in a program, particularly in code that interfaces with the operating system. The ? operator sometimes shows up on almost every line of a function:

use std::fs;
use std::io;
use std::path::Path;

fn move_all(src: &Path, dst: &Path) -> io::Result<()> {
    for entry_result in src.read_dir()? {  // opening dir could fail
        let entry = entry_result?;         // reading dir could fail
        let dst_file = dst.join(entry.file_name());
        fs::rename(entry.path(), dst_file)?;  // renaming could fail
    }
    Ok(())  // phew!
}




Working with Multiple Error Types

Often, more than one thing could go wrong. Suppose we are simply reading numbers from a text file.

use std::io::{self, BufRead};

/// Read integers from a text file.
/// The file should have one number on each line.
fn read_numbers(file: &mut dyn BufRead) -> Result<Vec<i64>, io::Error> {
    let mut numbers = vec![];
    for line_result in file.lines() {
        let line = line_result?;         // reading lines can fail
        numbers.push(line.parse()?);     // parsing integers can fail
    }
    Ok(numbers)
}

Rust gives us a compiler error:
numbers.push(line.parse()?);     // parsing integers can fail
             ^^^^^^^^^^^^^ the trait `std::convert::From<std::num::ParseIntError>`
                           is not implemented for `std::io::Error`
The terms in this error message will make more sense when we reach Chapter 11, which covers traits. For now, just note that Rust is complaining that it can’t convert a std::num::ParseIntError value to the type std::io::Error.
The problem here is that reading a line from a file and parsing an integer produce two different potential error types. The type of line_result is Result<String, std::io::Error>. The type of line.parse() is Result<i64, std::num::ParseIntError>. The return type of our read_numbers() function only accommodates io::Errors. Rust tries to cope with the ParseIntError by converting it to an io::Error, but there’s no such conversion, so we get a type error.
There are several ways of dealing with this. For example, the image crate that we used in Chapter 2 to create image files of the Mandelbrot set defines its own error type, ImageError, and implements conversions from io::Error and several other error types to ImageError. If you’d like to go this route, try the aforementioned error-chain crate, which is designed to help you define good error types with just a few lines of code.
A simpler approach is to use what’s built into Rust. All of the standard library error types can be converted to the type Box<dyn std::error::Error>, which represents “any error.” (Current Rust requires the dyn keyword when naming a trait object type.) So an easy way to handle multiple error types is to define these type aliases:
type GenError = Box<dyn std::error::Error>;
type GenResult<T> = Result<T, GenError>;

Then, change the return type of read_numbers() to GenResult<Vec<i64>>. With this change, the function compiles. The ? operator automatically converts either type of error into a GenError as needed.
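For reference, the complete function after that change might look like the following sketch. It repeats the two aliases so that it stands on its own, and it spells the trait object types with the dyn keyword that current Rust requires.

```rust
use std::io::BufRead;

type GenError = Box<dyn std::error::Error>;
type GenResult<T> = Result<T, GenError>;

/// Read integers from a text file.
/// The file should have one number on each line.
fn read_numbers(file: &mut dyn BufRead) -> GenResult<Vec<i64>> {
    let mut numbers = vec![];
    for line_result in file.lines() {
        let line = line_result?;      // io::Error is converted to GenError
        numbers.push(line.parse()?);  // ParseIntError is converted to GenError
    }
    Ok(numbers)
}
```

One convenient way to try it without a real file is to wrap a string in std::io::Cursor, which implements BufRead.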

Incidentally, the ? operator does this automatic conversion using a standard method that you can use yourself. To convert any error to the GenError type, call GenError::from():
let io_error = io::Error::new(         // make our own io::Error
    io::ErrorKind::Other, "timed out");
return Err(GenError::from(io_error));  // manually convert to GenError

We’ll cover the From trait and its from() method fully in Chapter 13.
The downside of the GenError approach is that the return type no longer communicates precisely what kinds of errors the caller can expect. The caller must be ready for anything.

If you’re calling a function that returns a GenResult, and you want to handle one particular kind of error, but let all others propagate out, use the generic method error.downcast_ref::<ErrorType>(). It borrows a reference to the error, if it happens to be the particular type of error you’re looking for:
loop {
    match compile_project() {
        Ok(()) => return Ok(()),
        Err(err) => {
            if let Some(mse) = err.downcast_ref::<MissingSemicolonError>() {
                insert_semicolon_in_source_code(mse.file(), mse.line())?;
                continue;  // try again!
            }
            return Err(err);
        }
    }
}

Many languages have built-in syntax to do this, but it turns out to be rarely needed. Rust has a method for it instead.
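The loop above relies on functions we haven’t defined, so here is a smaller self-contained sketch of the same technique. The MissingSemicolonError type shown here is a hypothetical stand-in with a single field, and line_to_fix() is a made-up helper; only downcast_ref() itself comes from the standard library.

```rust
use std::error::Error;
use std::fmt;

type GenError = Box<dyn Error>;

/// Hypothetical stand-in for the error type used in the text.
#[derive(Debug)]
struct MissingSemicolonError {
    line: usize,
}

impl fmt::Display for MissingSemicolonError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "missing semicolon on line {}", self.line)
    }
}

impl Error for MissingSemicolonError {}

/// Hypothetical helper: if `err` is a MissingSemicolonError, return
/// the line to patch; otherwise hand the error back unchanged.
fn line_to_fix(err: GenError) -> Result<usize, GenError> {
    if let Some(mse) = err.downcast_ref::<MissingSemicolonError>() {
        return Ok(mse.line);
    }
    Err(err)
}
```

Errors of any other type pass through untouched, so the caller can keep propagating them with ?.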



Dealing with Errors That “Can’t Happen”

Sometimes we just know that an error can’t happen. For example, suppose we’re writing code to parse a configuration file, and at one point we find that the next thing in the file is a string of digits:

if next_char.is_digit(10) {
    let start = current_index;
    current_index = skip_digits(&line, current_index);
    let digits = &line[start..current_index];
    ...

We want to convert this string of digits to an actual number. There’s a standard method that does this:
let num = digits.parse::<u64>();

Now the problem: the str.parse::<u64>() method doesn’t return a u64. It returns a Result. It can fail, because some strings aren’t numeric.
"bleen".parse::<u64>()  // ParseIntError: invalid digit
But we happen to know that in this case, digits consists entirely of digits. What should we do?

If the code we’re writing already returns a GenResult, we can tack on a ? and forget about it. Otherwise, we face the irritating prospect of having to write error-handling code for an error that can’t happen. The best choice then would be to use .unwrap(), a Result method we mentioned earlier.
let num = digits.parse::<u64>().unwrap();

This is just like ? except that if we’re wrong and the error can happen after all, the program panics instead of returning the error to the caller.

In fact, we are wrong about this particular case. If the input contains a long enough string of digits, the number will be too big to fit in a u64.
"99999999999999999999".parse::<u64>()    // overflow error
Using .unwrap() in this particular case would therefore be a bug. Bogus input shouldn’t cause a panic.
That said, situations do come up where a Result value truly can’t be an error. For example, in Chapter 18, you’ll see that the Write trait defines a common set of methods (.write() and others) for text and binary output. All of those methods return io::Results, but if you happen to be writing to a Vec<u8>, they can’t fail. In such cases, it’s acceptable to use .unwrap() or .expect(message) to dispense with the Results.
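As a minimal sketch of the Vec<u8> case, the hypothetical helper below formats some text into an in-memory buffer. The write!() call returns an io::Result because that is what the Write trait promises, but appending to a vector does no I/O.

```rust
use std::io::Write;

/// Hypothetical helper: build a greeting in an in-memory buffer.
fn greeting_bytes(name: &str) -> Vec<u8> {
    let mut buffer: Vec<u8> = Vec::new();
    // Writing to a Vec<u8> only appends to memory, so we treat an
    // error here as a bug and say so in the panic message.
    write!(buffer, "hello, {}!", name).expect("writing to a Vec<u8> cannot fail");
    buffer
}
```

Using .expect() with a message, rather than a bare .unwrap(), records why we believe the error can’t happen.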

These methods are also useful when an error would indicate a condition so severe or bizarre that panic is exactly how you want to handle it.
fn print_file_age(filename: &Path, last_modified: SystemTime) {
    let age = last_modified.elapsed().expect("system clock drift");
    ...
}

Here, the .elapsed() method can fail only if the system time is earlier than when the file was created. This can happen if the file was created recently, and the system clock was adjusted backward while our program was running. Depending on how this code is used, it’s a reasonable judgment call to panic in that case, rather than handle the error or propagate it to the caller.



Ignoring Errors

Occasionally we just want to ignore an error altogether. For example, in our print_error() function, we had to handle the unlikely situation where printing the error triggers another error. This could happen, for example, if stderr is piped to another process, and that process is killed. As there’s not much we can do about this kind of error, we just want to ignore it; but the Rust compiler warns about unused Result values:

writeln!(stderr(), "error: {}", err);  // warning: unused result
The idiom let _ = ... is used to silence this warning:
let _ = writeln!(stderr(), "error: {}", err);  // ok, ignore result



Handling Errors in main()

In most places where a Result is produced, letting the error bubble up to the caller is the right behavior. This is why ? is a single character in Rust. As we’ve seen, in some programs it’s used on many lines of code in a row.

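But if you propagate an error long enough, eventually it reaches main(), and that’s where this approach has to stop. As declared here, main() can’t use ? because its return type is not Result. (Since Rust 1.26, main() may instead be declared to return a Result; the rest of this section assumes the plain form.)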

fn main() {
    calculate_tides()?;  // error: can't pass the buck any further
}


The simplest way to handle errors in main() is to use .expect().
fn main() {
    calculate_tides().expect("error");  // the buck stops here
}


If calculate_tides() returns an error result, the .expect() method panics. Panicking in the main thread prints an error message, then exits with a nonzero exit code, which is roughly the desired behavior. We use this all the time for tiny programs. It’s a start.
The error message is a little intimidating, though:
$ tidecalc --planet mercury
thread 'main' panicked at 'error: "moon not found"', /buildslave/rust-buildbot/s
lave/nightly-dist-rustc-linux/build/src/libcore/result.rs:837
note: Run with `RUST_BACKTRACE=1` for a backtrace.
The error message is lost in the noise. Also, RUST_BACKTRACE=1 is bad advice in this particular case. It pays to print the error message yourself:
fn main() {
    if let Err(err) = calculate_tides() {
        print_error(&err);
        std::process::exit(1);
    }
}


This code uses an if let expression to print the error message only if the call to calculate_tides() returns an error result. For details about if let expressions, see Chapter 10. The print_error function is listed in “Printing Errors”.

Now the output is nice and tidy:
$ tidecalc --planet mercury
error: moon not found



Declaring a Custom Error Type

Suppose you are writing a new JSON parser, and you want it to have its own error type. (We haven’t covered user-defined types yet; that’s coming up in a few chapters. But error types are handy, so we’ll include a bit of a sneak preview here.)

Approximately the minimum code you would write is:
// json/src/error.rs

#[derive(Debug, Clone)]
pub struct JsonError {
    pub message: String,
    pub line: usize,
    pub column: usize,
}

This struct will be called json::error::JsonError, and when you want to raise an error of this type, you can write:
return Err(JsonError {
    message: "expected ']' at end of array".to_string(),
    line: current_line,
    column: current_column
});

This will work fine. However, if you want your error type to work like the standard error types, as your library’s users will expect, then you have a bit more work to do:
use std::fmt;

// Errors should be printable.
impl fmt::Display for JsonError {
    fn fmt(&self, f: &mut fmt::Formatter) -> Result<(), fmt::Error> {
        write!(f, "{} ({}:{})", self.message, self.line, self.column)
    }
}

// Errors should implement the std::error::Error trait.
// The default method definitions suffice; description() is
// deprecated in current Rust, so it is not overridden here.
impl std::error::Error for JsonError {}

Again, the meaning of the impl keyword, self, and all the rest will be explained in the next few chapters.



Why Results?

Now we know enough to understand what Rust is getting at by choosing Results over exceptions. Here are the key points of the design:


Rust requires the programmer to make some sort of decision, and record it in the code, at every point where an error could occur. This is good because otherwise, it’s easy to get error handling wrong through neglect.
The most common decision is to allow errors to propagate, and that’s written with a single character, ‘?’. Thus error plumbing does not clutter up your code the way it does in C and Go. Yet it’s still visible: you can look at a chunk of code and see at a glance all places where errors are propagated.
Since the possibility of errors is part of every function’s return type, it’s clear which functions can fail and which can’t. If you change a function to be fallible, you’re changing its return type, so the compiler will make you update that function’s downstream users.
Rust checks that Result values are used, so you can’t accidentally let an error pass silently (a common mistake in C).
Since Result is a data type like any other, it’s easy to store success and error results in the same collection. This makes it easy to model partial success. For example, if you’re writing a program that loads millions of records from a text file, and you need a way to cope with the likely outcome that most will succeed, but some will fail, you can represent that situation in memory using a vector of Results.
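The last point, modeling partial success with a vector of Results, can be sketched in a few lines. The function names parse_records() and summarize() are hypothetical, and the records here are plain integers to keep the example short.

```rust
use std::num::ParseIntError;

/// Hypothetical loader: parse every record, keeping each success or
/// failure in its original position.
fn parse_records(lines: &[&str]) -> Vec<Result<i64, ParseIntError>> {
    lines.iter().map(|line| line.trim().parse::<i64>()).collect()
}

/// Hypothetical summary: the values that loaded, plus a count of the
/// records that failed.
fn summarize(results: &[Result<i64, ParseIntError>]) -> (Vec<i64>, usize) {
    let values: Vec<i64> = results
        .iter()
        .filter_map(|r| r.as_ref().ok().copied())
        .collect();
    let failures = results.iter().filter(|r| r.is_err()).count();
    (values, failures)
}
```

Because each failure stays in the vector alongside the successes, the program can report exactly which records were bad after the whole file has been processed.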


The cost is that you’ll find yourself thinking about and engineering error handling more in Rust than you would in other languages. As in many other areas, Rust’s take on error handling is wound just a little tighter than what you’re used to. For systems programming, it’s worth it.