vojtechkral vojtěchkrál.github.io

Drop order in Rust: It's tricky

Exploring a few strange cases of drop behaviour in Rust

Consider the following (simplified) code:

impl Foo {
    fn do_something(&mut self) {
        let _ = self.lock.lock();
        self.resource.use_somehow();
    }
}

This might be in some FFI- or OS-interfacing code or a similar special situation where one doesn't have the luxury of a MutexGuard which guarantees a resource can only be accessed through a lock guard deref.

In any case (as you can probably tell if you've got Rust experience) the code above doesn't do what it should.
The _ symbol in the first line of the function is not a variable name (or, more technically, a binding). It's a pattern, one that matches anything and discards it. This means the guard returned by lock() is dropped right away on that first line and the lock isn't locked when resource is accessed.

To fix it we just need to give the guard an actual binding:

let _guard = self.lock.lock();

... and now the lock only gets unlocked at the end of the function's scope.
The takeaway is that you should never use _ for RAII guards.
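The difference is easy to check with a toy lock. The following is only a sketch: `RawLock`, `Guard` and `is_locked()` are hypothetical stand-ins made up for illustration, not part of the code above or of any real library.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a raw lock that does not wrap the resource.
struct RawLock {
    locked: Cell<bool>,
}

/// RAII guard: releases the lock when dropped.
struct Guard<'a> {
    lock: &'a RawLock,
}

impl RawLock {
    fn new() -> Self {
        RawLock { locked: Cell::new(false) }
    }

    fn lock(&self) -> Guard<'_> {
        self.locked.set(true);
        Guard { lock: self }
    }

    fn is_locked(&self) -> bool {
        self.locked.get()
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.lock.locked.set(false);
    }
}

/// `_` is a pattern, not a binding: the guard is dropped on this very line.
fn locked_with_underscore(lock: &RawLock) -> bool {
    let _ = lock.lock();
    lock.is_locked()
}

/// `_guard` is a real binding: the guard lives until the end of the function.
fn locked_with_binding(lock: &RawLock) -> bool {
    let _guard = lock.lock();
    lock.is_locked()
}

fn main() {
    let lock = RawLock::new();
    println!("let _      : held = {}", locked_with_underscore(&lock));
    println!("let _guard : held = {}", locked_with_binding(&lock));
}
```

With `let _` the lock is already released by the time the "resource" is touched; with `let _guard` it is still held.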

That example was straightforward enough, but what about this one (a):

impl Foo {
    fn do_something(&mut self) {
        let _guard = self.lock.lock();
        let _ = _guard;
        self.resource.use_somehow();
    }
}

Or this (b):

impl Foo {
    fn do_something(&mut self) {
        match self.lock.try_lock() {
            Ok(_) => self.resource.use_somehow(),
            Err(err) => log_error(err),
        }
    }
}

Or even this (c):

fn do_something(resource: &mut Resource, _: Guard) {
    resource.use_somehow();
}

Can you tell whether the lock is held while the resource is accessed in these examples?

The drop rules

The rules are described in detail in the Rust reference, with some juicy examples.

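A brief, simplified summary: local variables are dropped at the end of their scope, in reverse order of declaration; function arguments are likewise dropped in reverse order when the function exits; and struct fields, tuple fields and array elements are dropped in declaration order.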

However, there's quite a lot of fine print.

For demonstration purposes, let's first define a simple type that prints a string on drop:

struct Token(&'static str);

impl Drop for Token {
    fn drop(&mut self) {
        println!("[drop] {}", self.0);
    }
}

Equipped with this gadget, let's try something:

fn scope() {
    let _ = Token("_ pattern");

    let t1 = Token("1");
    let _ = t1;

    let t2 = Token("2");
    let t2 = t1;
    println!("t2 re-assigned");
}

Everyone's favourite game: what will that code print?

Given that the code compiles, you can already tell that the line let _ = t1; has no effect: we can still use t1 afterwards and re-assign it as t2, shadowing the previous t2. This answers example (a) above: it's correct. How will the shadowing of t2 affect drop order, though?
On this line:

let t2 = t1;

... since Token #2 is no longer accessible in any way, maybe it can be dropped right here?
Well, no, that's not how that works: both tokens are dropped at the end of the scope. But in what order? This code prints:

[drop] _ pattern
t2 re-assigned
[drop] 1
[drop] 2

So, what happened is that the re-assignment let t2 = t1 only swapped the drop order of the two tokens. Why on Earth would that be the case?
From the drop perspective, the re-assignment of t2 is as if a different binding were introduced; let's call it t2'. We now effectively have three variables in the scope: t1, t2, and t2'. And in fact they get dropped in the reverse order of declaration, exactly as the reference says:

  1. t2' is dropped, its drop() implementation prints "[drop] 1" (because it holds Token #1).
  2. t2 is dropped, its drop() implementation prints "[drop] 2".
  3. t1 is dropped, its drop() function is not run, because it's been moved-from in the course of the function. (When you move a value in Rust, whatever is the new owner of the value also acquires the responsibility to eventually run drop().)
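It may help to contrast shadowing (a second let) with a real assignment to a mut variable, which does drop the old value on the spot. The following is a sketch only: the `Log` alias and the `Token::new` constructor are assumptions made here so that the drop order can be collected into a list instead of printed.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log (an assumption of this sketch, replacing println!).
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Token {
    name: &'static str,
    log: Log,
}

impl Token {
    fn new(name: &'static str, log: &Log) -> Self {
        Token { name, log: Rc::clone(log) }
    }
}

impl Drop for Token {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// `let t2 = t1;` introduces a new binding; the old `t2` lives on until scope end.
#[allow(unused_variables)]
fn shadowing() -> Vec<&'static str> {
    let log = Log::default();
    {
        let t1 = Token::new("1", &log);
        let t2 = Token::new("2", &log);
        let t2 = t1;
        log.borrow_mut().push("after let");
    }
    let events: Vec<&'static str> = log.borrow().clone();
    events
}

/// `t2 = t1;` overwrites the variable; the old value is dropped immediately.
#[allow(unused_variables, unused_assignments)]
fn assignment() -> Vec<&'static str> {
    let log = Log::default();
    {
        let t1 = Token::new("1", &log);
        let mut t2 = Token::new("2", &log);
        t2 = t1;
        log.borrow_mut().push("after assignment");
    }
    let events: Vec<&'static str> = log.borrow().clone();
    events
}

fn main() {
    println!("shadowing:  {:?}", shadowing());
    println!("assignment: {:?}", assignment());
}
```

So if the intent really is to get rid of Token #2 at that point, a plain assignment (or an explicit drop(t2) before shadowing) does that, while a second let does not.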

Expressions: match, if let, and if

Now let's do something similar to example (b) from above, but also with if let and a regular if:

fn exprs() {
    fn make_token(s: &'static str) -> Result<Token, ()> {
        Ok(Token(s))
    }

    match make_token("matched token") {
        Ok(_) => println!("match arm"),
        Err(_) => unreachable!(),
    }
    println!("after match");

    if let Ok(_) = make_token("if let token") {
        println!("if let body");
    }
    println!("after if let");

    if make_token("if token").is_ok() {
        println!("if body");
    }
    println!("after if");
}

Long story short, in the case of match and if let, the value from make_token() gets bound to the scope of the whole match/if let expression and is only dropped after the final }, despite the pattern matching with _. (So the (b) example from above is correct as well!)

However, the regular if is different. In that case, the value is only bound to the scope of the conditional boolean expression and is in fact dropped before the if's body is executed.
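To make the three cases easy to compare, here is a variation of the code above that records events into a list rather than printing them. It's a sketch under assumptions: the `Log` alias and the extra `log` parameter of `make_token` are additions made here, not part of the original example.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log (an assumption of this sketch, replacing println!).
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Token {
    name: &'static str,
    log: Log,
}

impl Drop for Token {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn make_token(name: &'static str, log: &Log) -> Result<Token, ()> {
    Ok(Token { name, log: Rc::clone(log) })
}

fn exprs_events() -> Vec<&'static str> {
    let log = Log::default();

    // match: the token outlives the arm and is dropped after the final `}`.
    match make_token("matched token", &log) {
        Ok(_) => {
            log.borrow_mut().push("match arm");
        }
        Err(()) => unreachable!(),
    }
    log.borrow_mut().push("after match");

    // if let: same story, the token is still alive inside the body.
    if let Ok(_) = make_token("if let token", &log) {
        log.borrow_mut().push("if let body");
    }
    log.borrow_mut().push("after if let");

    // plain if: the token is dropped as soon as the condition is evaluated.
    if make_token("if token", &log).is_ok() {
        log.borrow_mut().push("if body");
    }
    log.borrow_mut().push("after if");

    let events: Vec<&'static str> = log.borrow().clone();
    events
}

fn main() {
    for event in exprs_events() {
        println!("{}", event);
    }
}
```

In the recorded list, the token's name shows up after the body for match and if let, but before the body for the plain if.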

There are also drop scoping rules about match arms, there's while let and a few more nooks and crannies of the language, but I didn't find any of them particularly weird, so I'm not going to go into more detail here regarding expressions.

Function arguments

Ok, so what about (c)? Let's do something along those lines but, like, more evil:

fn fn_args() {
    fn takes_args(t1: Token, (_, t2): (Token, Token), _: Token) {
        println!("function body");
    }

    takes_args(Token("t1"), (Token("t2.0"), Token("t2.1")), Token("_"));
}

The first thing you'll notice when running this one is that all of the arguments are dropped at the function exit, none before the body, not even the one matched as _.
The actual drop order ends up being:

[drop] _
[drop] t2.1
[drop] t2.0
[drop] t1

This seems mostly what you'd expect, except that t2.1 gets dropped before t2.0, which seems weird. Shouldn't tuples be dropped in order?

The reason why the order is this way is that arguments get passed to functions in a sort of a two-step manner. All the arguments get moved inside the function and then the pattern matching happens inside, but at that point the arguments are already bound to the scope of the whole function. So, in effect, the above function could be de-sugared to something akin to this:

fn takes_args(arg0: Token, arg1: (Token, Token), arg2: Token) {
    let t1 = arg0;
    let __invisible0 = arg1;
    let t2 = __invisible0.1;
    let __invisible1 = arg2;
    {
        println!("function body");
    }
}

In summary, function arguments do generally get dropped in reverse order like the reference says, but with pattern matching you can alter the order in non-obvious ways.

Corollary: The (c) example is correct too.
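The corollary can be checked with a toy lock as well. As before, this is just a sketch: `RawLock`, `Guard`, `is_locked()` and `use_resource()` are hypothetical names invented for the illustration.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a raw lock.
struct RawLock {
    locked: Cell<bool>,
}

/// RAII guard: releases the lock when dropped.
struct Guard<'a> {
    lock: &'a RawLock,
}

impl RawLock {
    fn new() -> Self {
        RawLock { locked: Cell::new(false) }
    }

    fn lock(&self) -> Guard<'_> {
        self.locked.set(true);
        Guard { lock: self }
    }

    fn is_locked(&self) -> bool {
        self.locked.get()
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.lock.locked.set(false);
    }
}

/// The guard is matched with `_`, yet it is only dropped when the function exits.
fn use_resource(lock: &RawLock, _: Guard<'_>) -> bool {
    // Stands in for `resource.use_somehow()`: report whether the lock is held.
    lock.is_locked()
}

fn main() {
    let lock = RawLock::new();
    let guard = lock.lock();
    println!("held inside the body: {}", use_resource(&lock, guard));
    println!("held after the call:  {}", lock.is_locked());
}
```

Unlike `let _ = ...`, a `_` in argument position does not release the guard early.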

Function final value

When creating a temporary as part of a final expression of a function:

fn fn_final_value() {
    #[allow(unused)]
    fn final_value(t1: Token) -> &'static str {
        let t2 = Token("in body");
        Token("final value").0
    }

    let res = final_value(Token("arg"));
    println!("{}", res);
}

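... the value gets dropped after all local variables (but before arguments). That is the behaviour in editions up to 2021; the 2024 edition changed the rule so that temporaries in a final expression are dropped before the local variables.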
This is kind of strange. But so far I haven't found a way in which this could bite someone in the rear.
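If you want to observe the order on your own toolchain, a variation that records the drops is sketched below. The `Log` alias, `Token::new` and the extra `log` parameter are assumptions of this sketch. The relative order of "in body" and "final value" depends on the edition the code is compiled with, so no particular order is claimed here; the argument should come last either way.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// Shared event log (an assumption of this sketch, replacing println!).
type Log = Rc<RefCell<Vec<&'static str>>>;

struct Token {
    name: &'static str,
    log: Log,
}

impl Token {
    fn new(name: &'static str, log: &Log) -> Self {
        Token { name, log: Rc::clone(log) }
    }
}

impl Drop for Token {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn final_value(_t1: Token, log: &Log) -> &'static str {
    let _t2 = Token::new("in body", log);
    // A temporary created in the final expression; only its name is copied out.
    Token::new("final value", log).name
}

fn final_value_events() -> Vec<&'static str> {
    let log = Log::default();
    let res = final_value(Token::new("arg", &log), &log);
    assert_eq!(res, "final value");
    let events: Vec<&'static str> = log.borrow().clone();
    events
}

fn main() {
    // Prints the three drops in the order they happened.
    println!("{:?}", final_value_events());
}
```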

Examples source code

All the above examples can be found in this gist.
You can also run the code in the Playground.

Preventing drops or enforcing custom order

In my experience the need to prevent drops altogether or enforce some non-standard drop order mostly comes up in FFI code.

The rule of thumb in those cases is to use ManuallyDrop over other things, particularly over mem::forget().

Let's consider a typical case, a callback called from C code where the C API holds an opaque pointer to a user-supplied context:

extern "C" fn ffi_on_data(data: Data, opaque: *mut c_void) {
    let this: Box<Foo> = unsafe { Box::from_raw(opaque as _) };
    this.on_data(data);
    // Problem: `this` must not be dropped, it would de-allocate the
    // opaque data too early!
}

Traditionally (at least before ManuallyDrop was around), one would use mem::forget(this) to make sure the instance is not dropped. However, using mem::forget() like that is fragile, especially once the function gets more complex. There may be a number of return points or someone might add one in future... And it becomes easy to forget to forget() :-)

With ManuallyDrop there's no such problem:

extern "C" fn ffi_on_data(data: Data, opaque: *mut c_void) {
    let this: Box<Foo> = unsafe { Box::from_raw(opaque as _) };
    let this = ManuallyDrop::new(this);
    this.on_data(data);
    // The `Box` contained in `this` doesn't get dropped.
}
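For a version that can actually be run, here is a self-contained sketch with the C side simulated in Rust. Everything beyond the callback's shape is an assumption made for the demo: `Foo` with its `calls` and `drops` fields, `u32` in place of `Data`, and the `simulate_c_caller()` helper that plays the role of the C library.

```rust
use std::cell::Cell;
use std::ffi::c_void;
use std::mem::ManuallyDrop;
use std::rc::Rc;

// Hypothetical user context; `drops` counts how often it has been dropped.
struct Foo {
    calls: u32,
    drops: Rc<Cell<u32>>,
}

impl Foo {
    fn on_data(&mut self, _data: u32) {
        self.calls += 1;
    }
}

impl Drop for Foo {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

extern "C" fn ffi_on_data(data: u32, opaque: *mut c_void) {
    // SAFETY (assumed for this sketch): `opaque` came from `Box::into_raw`
    // on a `Box<Foo>` and has not been freed yet.
    let this: Box<Foo> = unsafe { Box::from_raw(opaque as *mut Foo) };
    // From here on, no return path can drop the box by accident.
    let mut this = ManuallyDrop::new(this);
    this.on_data(data);
}

/// Plays the role of the C library: registers a context, fires the callback
/// twice, then tears the context down.
/// Returns (calls seen, drops after the callbacks, drops after teardown).
fn simulate_c_caller() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let foo = Box::new(Foo { calls: 0, drops: Rc::clone(&drops) });
    let opaque = Box::into_raw(foo) as *mut c_void;

    ffi_on_data(1, opaque);
    ffi_on_data(2, opaque);
    let drops_after_callbacks = drops.get();

    // Teardown: this time we do want the box to be dropped.
    // SAFETY: same pointer as above, reclaimed exactly once.
    let foo = unsafe { Box::from_raw(opaque as *mut Foo) };
    let calls = foo.calls;
    drop(foo);

    (calls, drops_after_callbacks, drops.get())
}

fn main() {
    println!("{:?}", simulate_c_caller());
}
```

The context survives both callbacks and is dropped exactly once, at teardown.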