Drop order in Rust: It's tricky
Exploring a few strange cases of drop behaviour in Rust
Consider the following (simplified) code:
impl Foo {
    fn do_something(&mut self) {
        let _ = self.lock.lock();
        self.resource.use_somehow();
    }
}
This might be in some FFI- or OS-interfacing code or a similar
special situation where one doesn't have the luxury of a MutexGuard
which guarantees a resource can only be accessed through a lock guard deref.
In any case (as you can probably tell if you've got Rust experience)
the code above doesn't do what it should.
The _ symbol in the first line of the function is not a variable name (or, more technically, a binding).
It's a pattern, one that matches anything but binds nothing.
This means the guard returned by lock() is dropped right away, at the end of that first line, and the lock isn't held when resource is accessed.
To fix it we just need to give the guard an actual binding:
let _guard = self.lock.lock();
... and now the lock only gets unlocked at the end of the function's
scope.
The takeaway is that you should never use _ for RAII guards.
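To see the difference for yourself, here is a minimal sketch. Lock and Guard are hypothetical stand-ins (not taken from any real library) for the kind of lock the example assumes: lock() sets a flag and the guard clears it again when dropped.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a lock that does not wrap the data it protects.
struct Lock {
    locked: Cell<bool>,
}

/// RAII guard: the lock counts as held for as long as this value is alive.
struct Guard<'a> {
    lock: &'a Lock,
}

impl Lock {
    fn new() -> Self {
        Lock { locked: Cell::new(false) }
    }

    fn lock(&self) -> Guard<'_> {
        self.locked.set(true);
        Guard { lock: self }
    }

    fn is_locked(&self) -> bool {
        self.locked.get()
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.lock.locked.set(false);
    }
}

/// `_` does not bind, so the guard is a temporary dropped at the end of the `let`.
fn held_with_underscore(lock: &Lock) -> bool {
    let _ = lock.lock();
    lock.is_locked()
}

/// `_guard` is a real binding, so the guard lives until the function returns.
fn held_with_binding(lock: &Lock) -> bool {
    let _guard = lock.lock();
    lock.is_locked()
}
```

Called on a fresh Lock::new(), held_with_underscore() reports that the lock is no longer held, while held_with_binding() reports that it still is.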
That example was straightforward enough, but what about this one (a):
impl Foo {
    fn do_something(&mut self) {
        let _guard = self.lock.lock();
        let _ = _guard;
        self.resource.use_somehow();
    }
}
Or this (b):
impl Foo {
    fn do_something(&mut self) {
        match self.lock.try_lock() {
            Ok(_) => self.resource.use_somehow(),
            Err(err) => log_error(err),
        }
    }
}
Or even this (c):
fn do_something(resource: &mut Resource, _: Guard) {
    resource.use_somehow();
}
Can you tell whether the lock is held while the resource is accessed in these examples?
The drop rules
In detail, the rules are described in the reference, with some juicy examples.
A brief, simplified summary:
- Scope variables are dropped in reverse order of declaration.
- Nested scopes are dropped in an inside-out order.
- Function arguments are likewise dropped in reverse order of declaration.
- struct fields are dropped in the same order as declared in the struct (source-code-wise, not memory-layout-wise).
- "Sequenced things", i.e. elements of tuples, enum variants, arrays, and owned slices are dropped in order of the sequence.
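The rules above can be checked with a small sketch. Logged and Pair are hypothetical helpers made up for this illustration; Logged records its name in a shared log when dropped, so the order can be inspected afterwards.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical helper: pushes its name onto the shared log when dropped.
struct Logged {
    name: &'static str,
    log: Log,
}

impl Drop for Logged {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// The fields are never read; they exist only to be dropped.
#[allow(dead_code)]
struct Pair {
    first: Logged,
    second: Logged,
}

fn observed_drop_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    let make = |name: &'static str| Logged { name, log: Rc::clone(&log) };
    {
        let _local = make("local");
        let _pair = Pair { first: make("pair.first"), second: make("pair.second") };
        let _tuple = (make("tuple.0"), make("tuple.1"));
        {
            let _inner = make("inner");
        } // the nested scope ends first, so `_inner` is dropped first
    } // then `_tuple`, `_pair`, `_local`: reverse order of declaration
    let order = log.borrow().clone();
    order
}
```

The returned order is inner, tuple.0, tuple.1, pair.first, pair.second, local: variables go in reverse, while the contents of each tuple or struct go in declaration order.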
However, there's quite a lot of fine print.
For demonstration purposes, let's first define a simple type that prints a string on drop:
struct Token(&'static str);
impl Drop for Token {
    fn drop(&mut self) {
        println!("[drop] {}", self.0);
    }
}
Equipped with this gadget, let's try something:
fn scope() {
    let _ = Token("_ pattern");
    let t1 = Token("1");
    let _ = t1;
    let t2 = Token("2");
    let t2 = t1;
    println!("t2 re-assigned");
}
Everyone's favourite game: what will that code print?
Given that the code compiles, you can already tell that the line let _ = t1; has no effect: t1 is not moved, since we can still use it later and bind it as t2, shadowing the previous t2. This answers example (a) above: the guard stays alive, so that code is correct.
How will the shadowing of t2 affect drop order though?
On this line:
let t2 = t1;
... since Token #2 is no longer accessible in any way, maybe it can be dropped right here?
Well, no, that's not how that works: both tokens are dropped at the end of
the scope. But in what order? This code prints:
[drop] _ pattern
t2 re-assigned
[drop] 1
[drop] 2
So, what happened is that the re-assignment let t2 = t1 only swapped the drop order of the two tokens. Why on Earth would that be the case?
From the drop perspective, the re-assignment of t2 is as if a different binding were introduced, let's call it t2'.
We now effectively have three variables in the scope: t1, t2, and t2'.
And in fact they get dropped in the reverse order of declaration, exactly as the reference says:
- t2' is dropped, its drop() implementation prints "[drop] 1" (because it holds Token #1).
- t2 is dropped, its drop() implementation prints "[drop] 2".
- t1 is dropped, its drop() function is not run, because it's been moved-from in the course of the function. (When you move a value in Rust, whatever is the new owner of the value also acquires the responsibility to eventually run drop().)
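A contrast that may help here (it goes slightly beyond the example above, so treat it as a side note): shadowing with let is not the same as assigning to a mut binding. The sketch below uses a variant of Token that writes to a shared log instead of printing; the Log alias and both function names are made up for this illustration.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Variant of Token that records its drop in a shared log instead of printing.
struct Token(&'static str, Log);

impl Drop for Token {
    fn drop(&mut self) {
        self.1.borrow_mut().push(self.0);
    }
}

/// `let t2 = t1;` introduces a new binding; the old `t2` lives to the end of the scope.
#[allow(unused_variables)]
fn shadowing() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let t1 = Token("1", Rc::clone(&log));
        let t2 = Token("2", Rc::clone(&log));
        let t2 = t1;
        log.borrow_mut().push("after shadowing");
    }
    let events = log.borrow().clone();
    events
}

/// `t2 = t1;` overwrites the same binding, so the old value is dropped on the spot.
#[allow(unused_variables, unused_assignments)]
fn assignment() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    {
        let t1 = Token("1", Rc::clone(&log));
        let mut t2 = Token("2", Rc::clone(&log));
        t2 = t1;
        log.borrow_mut().push("after assignment");
    }
    let events = log.borrow().clone();
    events
}
```

shadowing() logs "after shadowing", then 1, then 2, matching the output above. assignment() logs 2 first, then "after assignment", then 1, because overwriting a binding drops the value it held immediately.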
Expressions: match, if let, and if
Now let's do something similar to example (b) from above, but also with if let and a regular if:
fn exprs() {
    fn make_token(s: &'static str) -> Result<Token, ()> {
        Ok(Token(s))
    }
    match make_token("matched token") {
        Ok(_) => println!("match arm"),
        Err(_) => unreachable!(),
    }
    println!("after match");
    if let Ok(_) = make_token("if let token") {
        println!("if let body");
    }
    println!("after if let");
    if make_token("if token").is_ok() {
        println!("if body");
    }
    println!("after if");
}
Long story short, in case of match and if let, the value from make_token() gets bound to the scope of the whole match/if let expression and is only dropped after the final }, despite the pattern matching with _. (So, the (b) example from above is correct as well!)
However, the regular if is different.
In that case, the value is only bound to the scope of the conditional boolean expression and is in fact dropped before the if's body is executed.
There are also drop scoping rules about match arms, about while let, and about a few more
nooks and crannies of the language, but I didn't find any of them particularly weird,
so I'm not going to go into more detail here regarding expressions.
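Applied to locking, the difference between the three forms can be sketched as follows. Lock, Guard, and try_lock() here are hypothetical stand-ins written for this illustration, not a real library API; each function reports whether the lock is still held inside the body.

```rust
use std::cell::Cell;

/// Hypothetical lock that does not wrap the data it protects.
struct Lock {
    locked: Cell<bool>,
}

/// RAII guard that releases the lock when dropped.
struct Guard<'a> {
    lock: &'a Lock,
}

impl Lock {
    fn new() -> Self {
        Lock { locked: Cell::new(false) }
    }

    /// Fails if the lock is already held.
    fn try_lock(&self) -> Result<Guard<'_>, ()> {
        if self.locked.replace(true) {
            Err(())
        } else {
            Ok(Guard { lock: self })
        }
    }

    fn is_locked(&self) -> bool {
        self.locked.get()
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.lock.locked.set(false);
    }
}

/// The scrutinee temporary lives for the whole `match`, so the arm runs locked.
fn held_in_match_arm(lock: &Lock) -> bool {
    match lock.try_lock() {
        Ok(_) => lock.is_locked(),
        Err(()) => false,
    }
}

/// Same for `if let`: the guard is alive while the body runs.
fn held_in_if_let_body(lock: &Lock) -> bool {
    if let Ok(_) = lock.try_lock() {
        lock.is_locked()
    } else {
        false
    }
}

/// A plain `if` drops the temporary once the condition has been evaluated.
fn held_in_if_body(lock: &Lock) -> bool {
    if lock.try_lock().is_ok() {
        lock.is_locked()
    } else {
        false
    }
}
```

On a fresh lock the first two functions return true and the third returns false, which is the same pattern as the token output: only the plain if lets go of the value before its body runs.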
Function arguments
Ok, so what about (c)? Let's do something along those lines but, like, more evil:
fn fn_args() {
    fn takes_args(t1: Token, (_, t2): (Token, Token), _: Token) {
        println!("function body");
    }
    takes_args(Token("t1"), (Token("t2.0"), Token("t2.1")), Token("_"));
}
First thing you'll notice when running this one is that all of the arguments
are dropped at the function exit, none before the body, not even the one matched as _.
The actual drop order ends up being:
[drop] _
[drop] t2.1
[drop] t2.0
[drop] t1
This seems mostly what you'd expect, except that t2.1 gets dropped before t2.0, which seems weird. Shouldn't tuples be dropped in order?
The reason why the order is this way is that arguments get passed to functions in a sort of a two-step manner. All the arguments get moved inside the function and then the pattern matching happens inside, but at that point the arguments are already bound to the scope of the whole function. So, in effect, the above function could be de-sugared to something akin to this:
fn takes_args(arg0: Token, arg1: (Token, Token), arg2: Token) {
    let t1 = arg0;
    let __invisible0 = arg1;
    let t2 = __invisible0.1;
    let __invisible1 = arg2;
    {
        println!("function body");
    }
}
In summary, function arguments do generally get dropped in reverse order like the reference says, but with pattern matching you can alter the order in non-obvious ways.
Corollary: The (c) example is correct too.
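For the guard case specifically, a minimal sketch along the lines of (c) is shown below. Lock and Guard are again hypothetical stand-ins, and use_resource() and release_early() are names invented for this illustration.

```rust
use std::cell::Cell;

/// Hypothetical lock that does not wrap the data it protects.
struct Lock {
    locked: Cell<bool>,
}

/// RAII guard that releases the lock when dropped.
struct Guard<'a> {
    lock: &'a Lock,
}

impl Lock {
    fn new() -> Self {
        Lock { locked: Cell::new(false) }
    }

    fn lock(&self) -> Guard<'_> {
        self.locked.set(true);
        Guard { lock: self }
    }

    fn is_locked(&self) -> bool {
        self.locked.get()
    }
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.lock.locked.set(false);
    }
}

/// A parameter matched with `_` still lives until the function returns.
fn use_resource(lock: &Lock, _: Guard<'_>) -> bool {
    lock.is_locked()
}

/// Releasing early needs a named parameter and an explicit `drop()`.
fn release_early(lock: &Lock, guard: Guard<'_>) -> bool {
    drop(guard);
    lock.is_locked()
}
```

With a fresh lock, use_resource(&lock, lock.lock()) returns true and release_early(&lock, lock.lock()) returns false; in both cases the lock is free again once the call has returned.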
Function final value
When creating a temporary as part of a final expression of a function:
fn fn_final_value() {
    #[allow(unused)]
    fn final_value(t1: Token) -> &'static str {
        let t2 = Token("in body");
        Token("final value").0
    }
    let res = final_value(Token("arg"));
    println!("{}", res);
}
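... the value gets dropped after all local variables (but before arguments).
This is kind of strange. But so far I haven't found a way in which this could bite
someone in the rear.
(Note that this describes editions up to 2021. The 2024 edition changed the scope of temporaries in tail expressions, so there they are dropped before the local variables.)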
Examples source code
All the above examples can be found in this gist.
You can also run the code in the Playground.
Preventing drops or enforcing custom order
In my experience the need to prevent drops altogether or enforce some non-standard drop order mostly comes up in FFI code.
The rule of thumb in those cases is to use ManuallyDrop over other things, particularly over mem::forget().
Let's consider a typical case, a callback called from C code where the C API holds an opaque pointer to a user-supplied context:
extern "C" fn ffi_on_data(data: Data, opaque: *mut c_void) {
    let this: Box<Foo> = unsafe { Box::from_raw(opaque as _) };
    this.on_data(data);
    // Problem: `this` must not be dropped, it would de-allocate the
    // opaque data too early!
}
Traditionally (at least before ManuallyDrop was around), one would use mem::forget(this) to make sure the instance is not dropped. However, using mem::forget() like that is fragile,
especially once the function gets more complex. There may be a number of return points
or someone might add one in future... And it becomes easy to forget to forget() :-)
With ManuallyDrop there's no such problem:
extern "C" fn ffi_on_data(data: Data, opaque: *mut c_void) {
    let this: Box<Foo> = unsafe { Box::from_raw(opaque as _) };
    let this = ManuallyDrop::new(this);
    this.on_data(data);
    // The `Box` contained in `this` doesn't get dropped.
}
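A self-contained sketch of the same pattern, including an early return, might look like this. Context, on_data(), and run_callbacks() are hypothetical names, the data argument is simplified to an i32, and the C side is simulated by calling the callback directly from Rust.

```rust
use std::cell::Cell;
use std::ffi::c_void;
use std::mem::ManuallyDrop;
use std::rc::Rc;

/// Hypothetical user context handed to the C side as an opaque pointer.
struct Context {
    calls: u32,
    drops: Rc<Cell<u32>>,
}

impl Drop for Context {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Callback in the style of the example above.
extern "C" fn on_data(data: i32, opaque: *mut c_void) {
    // SAFETY (assumed by this sketch): `opaque` came from `Box::into_raw` on a
    // `Box<Context>` and nothing else accesses the context during the call.
    let mut this = ManuallyDrop::new(unsafe { Box::from_raw(opaque as *mut Context) });
    if data < 0 {
        return; // early return: the Box is still not dropped
    }
    this.calls += 1;
}

/// Drives the callback the way a C library might, then frees the context once.
/// Returns (calls counted, drops seen after the callbacks, drops seen at the end).
fn run_callbacks() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let ctx = Box::new(Context { calls: 0, drops: Rc::clone(&drops) });
    let opaque = Box::into_raw(ctx) as *mut c_void;

    on_data(1, opaque);
    on_data(-1, opaque);
    on_data(2, opaque);
    let drops_after_callbacks = drops.get();

    // SAFETY: ownership is reclaimed exactly once, after the last callback.
    let ctx = unsafe { Box::from_raw(opaque as *mut Context) };
    let calls = ctx.calls;
    drop(ctx);

    (calls, drops_after_callbacks, drops.get())
}
```

run_callbacks() returns (2, 0, 1): two calls were counted, nothing was dropped while the callbacks ran (early return included), and the context was dropped exactly once when ownership was taken back at the end.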