Async Stack Traces in Rust

One neat result of Rust’s futures and async/await design is that all of the async callers are on the stack below the async callees. In most other languages, only the youngest async callee is on the stack, and none of the async callers. Because the youngest frame is most often not where a bug’s root cause lies, this extra context makes debugging async code easier in Rust.
How others do it
Before we look at the extra context Rust gives you in its async stack traces, let’s look at a stack trace from async code in another language. We’ll walk through an example in JavaScript, but this applies equally well to any async model that is callback-driven under the hood.
Here’s our async JavaScript snippet. foo waits for a short timeout asynchronously, then calls bar, which also waits for a short timeout before calling blowUp, which again waits before finally throwing an error. This example, while simple, is nonetheless representative of an async task that involves multiple async functions with multiple await suspension points.
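Here is a sketch matching that description. The function names foo, bar, and blowUp come from the prose; the sleep helper, the 100 ms timeout lengths, and the error message are assumptions:

```javascript
// Sketch of the snippet described above. Helper, timeouts, and error
// message are assumed; only the call shape is taken from the prose.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

async function blowUp() {
  await sleep(100);
  throw new Error("uh oh!");
}

async function bar() {
  await sleep(100);
  await blowUp();
}

async function foo() {
  await sleep(100);
  await bar();
}

// Run the task and print whatever stack trace the engine gives us.
foo().catch(err => console.log(err.stack));
```

Depending on the engine and its “async stacks” support (see the appendix), you may see more than the single callback-level frame discussed next.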
If we run foo, we get an error with a stack trace like this:
blowUp@example.js:22:9
That’s the complete stack: just a single stack frame. It doesn’t give us any help with identifying who called blowUp.
Why do we have just this lonely stack frame? Because what async/await boils down to in JavaScript is, roughly speaking[0], a suspendable generator function and a series of Promise callbacks that keep pumping the generator function to completion. The async function’s desugared generator suspends at await points, yielding the Promise it is waiting on. The runtime takes that Promise and enqueues a callback to resume the generator (by calling next) once the promise is resolved:
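One way to sketch that pumping is with an illustrative drive helper standing in for the engine’s machinery. This is not the actual desugaring (see the footnote), just the shape of it:

```javascript
// Hand-wavy sketch of the desugaring: an async function becomes a
// generator that yields the promises it awaits, plus a driver that
// pumps the generator with Promise callbacks. `drive` is illustrative,
// not part of any real engine API.
function drive(makeGenerator) {
  const gen = makeGenerator();
  function pump(resumeValue) {
    const { value, done } = gen.next(resumeValue);
    if (!done) {
      // `value` is the awaited promise. Resume the generator from a
      // promise callback on a later tick -- with no callers on the stack.
      value.then(pump);
    }
  }
  pump(undefined);
}

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// `async function foo() { await sleep(100); ... }` becomes, roughly:
drive(function* foo() {
  yield sleep(100); // suspension point: yield the awaited promise
  console.log("resumed with an empty stack");
});
```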
Whenever the Promise is resolved, its queued callbacks are run on a new tick of the event loop[1], without any caller on the stack.
If we visualize the call stack over time, with youngest frames on top and oldest frames on bottom, it looks like this:
╱╱ ╱╱ ╱╱
^ ╲╲ ┌─────┐ ╲╲ ┌────────┐ ╲╲
│ ╱╱ │ bar │ ╱╱ │ blowUp │ ╱╱
│ ┌─────┐ ╲╲ ┌┴─────┤ ╲╲ ┌┴────────┤ ╲╲ ┌────────┐
Stack │ foo │ ╱╱ │ foo │ ╱╱ │ bar │ ╱╱ │ blowUp │
│ └─────┘ ╲╲ └──────┘ ╲╲ └─────────┘ ╲╲ └────────┘
│ ╱╱ ╱╱ ╱╱
│ +100 ms +200 ms +300 ms
────────────────Time──────────────────>
After blowUp sleeps for 300 milliseconds and throws an error, it is the only frame on the stack, and there is nothing left pointing to its async callers foo and bar.
How Rust does it
Now let’s look at some async Rust code that is roughly equivalent to our previous JavaScript snippet:
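A sketch of such a program follows. The original would use a real async runtime’s sleep and executor; to keep this self-contained, a toy busy-polling block_on and a toy Sleep future (standard library only, and assumptions of mine) stand in for the runtime:

```rust
// Sketch of the Rust program described in the prose. The toy executor
// and sleep future below are stand-ins for a real async runtime.
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};
use std::time::{Duration, Instant};

/// A future that is Pending until its deadline has passed.
struct Sleep(Instant);

impl Future for Sleep {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        if Instant::now() >= self.0 {
            Poll::Ready(())
        } else {
            // A real sleep future would arrange a wakeup here; our toy
            // executor just polls again in a loop.
            Poll::Pending
        }
    }
}

fn sleep(dur: Duration) -> Sleep {
    Sleep(Instant::now() + dur)
}

async fn blow_up() {
    sleep(Duration::from_millis(100)).await;
    panic!("nested async panic!");
}

async fn bar() {
    sleep(Duration::from_millis(100)).await;
    blow_up().await;
}

async fn foo() {
    sleep(Duration::from_millis(100)).await;
    bar().await;
}

/// Toy executor: repeatedly poll the future with a no-op waker.
fn block_on<F: Future>(fut: F) -> F::Output {
    fn raw_waker() -> RawWaker {
        fn no_op(_: *const ()) {}
        fn clone(_: *const ()) -> RawWaker {
            raw_waker()
        }
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    let waker = unsafe { Waker::from_raw(raw_waker()) };
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(output) = fut.as_mut().poll(&mut cx) {
            return output;
        }
    }
}

fn main() {
    // Run with RUST_BACKTRACE=1 to see foo and bar beneath blow_up.
    block_on(foo());
}
```

A real runtime parks the task and uses wakers instead of busy-polling; the toy executor just keeps the example dependency-free.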
When we compile and run this Rust program with RUST_BACKTRACE=1 to capture a stack trace on panic, we get the following stack trace[2], which has the async callers foo and bar in addition to the youngest async callee blow_up:
thread 'async-task-driver' panicked at 'nested async panic!', src/main.rs:42:5
stack backtrace:
...
8: async_stack_traces::blow_up::{{closure}}
at src/main.rs:42
...
15: async_stack_traces::bar::{{closure}}
at src/main.rs:37
...
22: async_stack_traces::foo::{{closure}}
at src/main.rs:32
...
In Rust, async functions desugar into suspendable generator functions, which in turn desugar into state machine enums. Rather than yielding futures the way JavaScript’s desugared async functions yield promises, Rust’s desugared async functions’ state machines implement the Future trait themselves[3]. Just as futures nest, so that it is common to have MyOuterFuture::poll calling MyInnerFuture::poll, awaiting a future desugars into the generator polling the nested future.
For example, if an activation of our async fn foo is awaiting a task::sleep future, it will poll that future inside its own poll implementation, and if it is awaiting an activation of async fn bar, then it will poll its bar activation inside its own poll implementation:
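To make that nesting concrete, here is a hand-written pair of futures in the same shape. Inner and Outer are illustrative stand-ins of mine, not the compiler’s actual output:

```rust
// Hand-written sketch of nested polling: the outer future polls its
// inner future from inside its own poll, so the outer poll frame sits
// on the stack beneath the inner one.
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll, RawWaker, RawWakerVTable, Waker};

/// Stands in for an inner future, such as a sleep future.
struct Inner;

impl Future for Inner {
    type Output = u32;

    fn poll(self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<u32> {
        Poll::Ready(42)
    }
}

/// Stands in for the state machine generated for an async fn that is
/// awaiting `Inner`.
struct Outer {
    inner: Inner,
}

impl Future for Outer {
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // `self.inner.await` desugars, roughly, into polling the nested
        // future right here -- Outer::poll is on the stack beneath
        // Inner::poll.
        match Pin::new(&mut self.inner).poll(cx) {
            Poll::Ready(n) => Poll::Ready(n + 1),
            Poll::Pending => Poll::Pending,
        }
    }
}

/// A waker that does nothing, just so we can poll by hand.
fn noop_waker() -> Waker {
    fn raw() -> RawWaker {
        fn no_op(_: *const ()) {}
        fn clone(_: *const ()) -> RawWaker {
            raw()
        }
        static VTABLE: RawWakerVTable = RawWakerVTable::new(clone, no_op, no_op, no_op);
        RawWaker::new(std::ptr::null(), &VTABLE)
    }
    unsafe { Waker::from_raw(raw()) }
}

fn main() {
    let waker = noop_waker();
    let mut cx = Context::from_waker(&waker);
    let result = Pin::new(&mut Outer { inner: Inner }).poll(&mut cx);
    println!("{:?}", result); // prints "Ready(43)"
}
```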
With this nested polling, we have the async caller on the stack below its async callee.
To make it visual, here is the stack of Future::poll
trait method calls over
time for our Rust example:
╱╱ ╱╱ ╱╱
^ ╲╲ ╲╲ ┌────────┐ ╲╲
│ ╱╱ ╱╱ │ sleep │ ╱╱
│ ╲╲ ┌───────┐ ╲╲ ┌┴────────┤ ╲╲ ┌─────────┐
│ ╱╱ │ sleep │ ╱╱ │ blow_up │ ╱╱ │ blow_up │
│ ┌───────┐ ╲╲ ┌┴───────┤ ╲╲ ┌┴─────────┤ ╲╲ ├─────────┤
Stack │ sleep │ ╱╱ │ bar │ ╱╱ │ bar │ ╱╱ │ bar │
│ ┌┴───────┤ ╲╲ ┌┴────────┤ ╲╲ ├──────────┤ ╲╲ ├─────────┤
│ │ foo │ ╱╱ │ foo │ ╱╱ │ foo │ ╱╱ │ foo │
│ └────────┘ ╲╲ └─────────┘ ╲╲ └──────────┘ ╲╲ └─────────┘
│ ╱╱ ╱╱ ╱╱
│ +100 ms +200 ms +300 ms
────────────────Time──────────────────>
After blow_up has slept for 300 milliseconds and panics, we still have its async callers bar and foo on the stack, which is super useful for debugging.
Conclusion
Rust’s readiness-based polling model for asynchronous code means that the async caller waiting on an async callee’s completion is always on the stack during polling. Therefore async callers show up in async Rust’s stack traces, not just the youngest async callee like in most other languages. In turn, this property makes debugging async code that much easier in Rust.
I wasn’t the first to notice this property of Rust’s futures and async/await design, nor was I involved in its design process. However, I don’t think most folks have discovered this property yet, so I’m doing my part to help explain it and spread the knowledge.
Thanks to Alex Crichton, Jason Orendorff, and Jim Blandy for providing feedback on an early draft of this blog post. Their feedback only improved this text, and any errors that remain are my own.
Appendix
What about “async stacks” in JavaScript?
To help users debug asynchronous code in JavaScript, engines introduced “async stacks”. The way this works, at least in SpiderMonkey, is to capture the stack when a promise is allocated or when some async operation begins, and then when running the async callbacks append that captured stack to any new stacks captured.
Async stacks come with both time and space overheads. These aren’t small costs, and there has been significant engineering work put into making the captured stack traces compact in SpiderMonkey. There are also limits on how many frames and how often async stacks are captured because of these costs.
Async stacks are also not a zero-cost abstraction, since the engine must capture the async portions of the stack and save them in memory regardless of whether any execution actually ends up using them. On the other hand, since Rust’s async callers appear on the normal call stack, nothing needs to be captured eagerly and stored in memory.
What if I want even more context for my async Rust code?
The tracing facade looks like a promising framework for annotating async tasks, creating a hierarchy between tasks, and recording structured data about program execution.
[0] This is not the actual desugaring, but is close enough to get the point across for our purposes. Read the actual ECMAScript standard if you want to know what is really going on under the hood of async/await in JavaScript. The exact details are too nitpicky for (and largely irrelevant to) this blog post. ↩
[1] I’m ignoring micro vs. macro task queues in this blog post, because (again) those details are irrelevant here. ↩
[2] I edited this stack trace to remove extra stack frames that come from the async runtime library and the stack unwinding library. In Rust, we have the opposite problem compared to other languages: we have too many stack frames! 🤣 ↩
[3] Technically they are wrapped in a std::future::GenFuture via std::future::from_generator, and there is this Future implementation for GenFuture wrappers:
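From memory, that implementation looked roughly like the following. These were nightly-only standard library internals at the time, so the details here are approximate and will not compile on stable Rust:

```rust
// Approximate historical std internals (nightly-only; details may
// differ from the real code).
impl<T: Generator<Yield = ()>> Future for GenFuture<T> {
    type Output = T::Return;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // Pin-project to the inner generator and resume it: this runs the
        // async fn body until its next `await` (a `yield`) or until it
        // returns.
        let gen = unsafe { self.map_unchecked_mut(|s| &mut s.0) };
        set_task_context(cx, || match gen.resume() {
            GeneratorState::Yielded(()) => Poll::Pending,
            GeneratorState::Complete(x) => Poll::Ready(x),
        })
    }
}
```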
But this is an implementation detail that’s mostly irrelevant to this discussion. ↩