Profiling Rust applications

FUJITA Tomonori
Published in nttlabs
3 min read · Feb 23, 2022

Profiling is indispensable for building high-performance software. In this article I look at profilers for async Rust and compare them with Go, which was designed with built-in profilers for CPU, memory, block, mutex, goroutine, and so on. Alas, Rust offers nothing comparable out of the box.

Equivalent to pprof

You found pprof-rs? It looks like the equivalent of Go’s pprof package, which supports all kinds of profilers. Unfortunately, pprof-rs supports only CPU profiling: it collects timer-based samples of the stack trace and stores them in the pprof format (it can also emit the Flame Graph format). The implementation is similar to Go’s CPU profiler. Both run in user mode and use OS timer facilities without depending on any special CPU features.
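To make the idea of timer-based sampling more concrete, here is a minimal sketch of what happens after the samples are collected: identical stacks are counted and written as "folded" lines, the text input that common flame graph tools accept. It uses only the standard library and is not the pprof-rs API; the `Sample` type, the `fold_samples` function, and the frame names are hypothetical, and real profilers capture the stacks from a timer signal rather than receiving them as a list.

```rust
use std::collections::BTreeMap;

/// One sample: the call stack seen at a timer tick, outermost frame first.
/// (Hypothetical type; a real profiler would capture this with a backtrace.)
type Sample = Vec<&'static str>;

/// Aggregate raw samples into folded-stack lines of the form
/// `frame;frame;frame count`.
fn fold_samples(samples: &[Sample]) -> Vec<String> {
    let mut counts: BTreeMap<String, u64> = BTreeMap::new();
    for stack in samples {
        // Identical stacks collapse into one key; the count approximates
        // how much CPU time was spent in that code path.
        *counts.entry(stack.join(";")).or_insert(0) += 1;
    }
    // BTreeMap iterates in key order, so the output is deterministic.
    counts
        .into_iter()
        .map(|(stack, n)| format!("{} {}", stack, n))
        .collect()
}
```

The more often a stack shows up among the samples, the wider its bar in the resulting flame graph, which is why a higher sampling frequency gives a finer picture at the cost of more overhead.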

In addition to CPU profiling, you might need to identify mutex contention, where async tasks are fighting for a mutex. Go’s mutex profiler lets you find where goroutines are fighting for a mutex. Let’s keep searching.

Runtime and profiling

Goroutines and async tasks can be thought of as green threads managed by a runtime in user space, so the mutex code lives in the runtime. Go has one built-in runtime, whereas Rust supports multiple asynchronous runtimes. In this article, I use Tokio, probably the most popular asynchronous runtime.

With Go’s mutex profiler enabled, the mutex lock/release code records how long a goroutine waits on a mutex: from when the goroutine fails to lock the mutex to when the lock is released. Tokio’s mutex code doesn’t implement such a feature. But several months ago, Tracing support was added, which could be used for profiling.
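The bookkeeping described above can be sketched with the standard library alone. The following is a minimal illustration, not Go’s or Tokio’s implementation: `ProfiledMutex` is a hypothetical wrapper around `std::sync::Mutex` (a blocking mutex, not an async one) that starts a timer only when the first lock attempt fails and stops it once the lock is finally acquired.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Mutex, MutexGuard, TryLockError};
use std::time::{Duration, Instant};

/// Hypothetical wrapper that records contention on a standard-library mutex.
struct ProfiledMutex<T> {
    inner: Mutex<T>,
    contended: AtomicU64,    // number of lock() calls that had to wait
    waited_nanos: AtomicU64, // total time spent waiting, in nanoseconds
}

impl<T> ProfiledMutex<T> {
    fn new(value: T) -> Self {
        ProfiledMutex {
            inner: Mutex::new(value),
            contended: AtomicU64::new(0),
            waited_nanos: AtomicU64::new(0),
        }
    }

    fn lock(&self) -> MutexGuard<'_, T> {
        // Fast path: an uncontended acquisition records nothing.
        match self.inner.try_lock() {
            Ok(guard) => return guard,
            Err(TryLockError::Poisoned(e)) => return e.into_inner(),
            Err(TryLockError::WouldBlock) => {}
        }
        // Slow path: the first attempt failed, so measure how long we block.
        let start = Instant::now();
        let guard = self.inner.lock().unwrap_or_else(|e| e.into_inner());
        let waited = start.elapsed().as_nanos() as u64;
        self.contended.fetch_add(1, Ordering::Relaxed);
        self.waited_nanos.fetch_add(waited, Ordering::Relaxed);
        guard
    }

    /// Returns (number of contended acquisitions, total wait time).
    fn stats(&self) -> (u64, Duration) {
        (
            self.contended.load(Ordering::Relaxed),
            Duration::from_nanos(self.waited_nanos.load(Ordering::Relaxed)),
        )
    }
}
```

A real profiler would also record the call stack of the waiter, so that the accumulated wait time can be attributed to a source location rather than to the mutex as a whole.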

Rust Tracing

The Tracing crate is a framework for instrumenting applications to collect structured, event-based diagnostic information. You embed static instrumentation in your application and implement functions that are executed when trace events happen. You can process the event information in various ways: logging it, storing it in memory, sending it over the network, writing it to disk, etc. Tracing is gaining traction, and some popular projects already support it. Tracing support was added to Tokio’s mutex code late last year.

I wrote simple code to print the state changes of a mutex, that is, when it’s locked and released. You could find the source of the contention in a similar manner.

% cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.27s
Running `target/debug/profiling-mutex`
03:34:30.070078: Mutex (file="src/main.rs" line=113) Released
03:34:30.070463: Mutex (file="src/main.rs" line=113) Locked
03:34:33.072371: Mutex (file="src/main.rs" line=113) Released
03:34:33.073031: Mutex (file="src/main.rs" line=113) Locked
03:34:34.076974: Mutex (file="src/main.rs" line=113) Released
03:34:34.077524: Mutex (file="src/main.rs" line=113) Locked

Note that the first line means that the mutex object is created in the unlocked state.
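The program that produced this output is not shown here, and the sketch below is not it, nor does it use the Tracing API. It is a standard-library-only illustration of the same pattern, under the assumption that the instrumentation remembers where the mutex was created and reports every state change to a user-supplied handler. All names (`MutexObserver`, `Recorder`, `TracedMutex`, `TracedGuard`) are hypothetical.

```rust
use std::ops::{Deref, DerefMut};
use std::panic::Location;
use std::sync::{Arc, Mutex, MutexGuard};

/// Receives state-change events; a stand-in for a trace event handler.
trait MutexObserver: Send + Sync {
    fn on_event(&self, created_at: &'static Location<'static>, state: &'static str);
}

/// Observer that formats each event as a text line and keeps it in memory.
#[derive(Default)]
struct Recorder {
    lines: Mutex<Vec<String>>,
}

impl MutexObserver for Recorder {
    fn on_event(&self, at: &'static Location<'static>, state: &'static str) {
        let line = format!("Mutex (file={:?} line={}) {}", at.file(), at.line(), state);
        self.lines.lock().unwrap().push(line);
    }
}

/// Mutex that reports "Locked" and "Released" together with its creation site.
struct TracedMutex<T> {
    inner: Mutex<T>,
    created_at: &'static Location<'static>,
    observer: Arc<dyn MutexObserver>,
}

impl<T> TracedMutex<T> {
    #[track_caller] // makes Location::caller() point at the caller of new()
    fn new(value: T, observer: Arc<dyn MutexObserver>) -> Self {
        let created_at = Location::caller();
        observer.on_event(created_at, "Released"); // a new mutex starts unlocked
        TracedMutex { inner: Mutex::new(value), created_at, observer }
    }

    fn lock(&self) -> TracedGuard<'_, T> {
        let guard = self.inner.lock().unwrap();
        self.observer.on_event(self.created_at, "Locked");
        TracedGuard { guard, owner: self }
    }
}

/// Guard that reports "Released" when it goes out of scope.
struct TracedGuard<'a, T> {
    guard: MutexGuard<'a, T>,
    owner: &'a TracedMutex<T>,
}

impl<'a, T> Deref for TracedGuard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.guard
    }
}

impl<'a, T> DerefMut for TracedGuard<'a, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut self.guard
    }
}

impl<'a, T> Drop for TracedGuard<'a, T> {
    fn drop(&mut self) {
        // Runs just before the inner guard is dropped and the lock is released.
        self.owner.observer.on_event(self.owner.created_at, "Released");
    }
}
```

Swapping `Recorder` for an observer that timestamps each event and pairs every "Locked" with the following "Released" would give the hold time per mutex, which is the kind of information needed to track down contention.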

Already eager to use the Tracing crate? The change to your application is trivial: you specify which trace events you are interested in and what to do when they happen. However, there are some caveats.

  • Tracing support is an unstable feature in Tokio. Recompilation with an option (the tokio_unstable cfg flag) is required.
  • Tracing is still under active development. You will likely need to read the code rather than the documentation.

One concern about using Tracing for profiling is its performance overhead. You could adjust the sampling rate, but the implementation of Tracing is complicated because it is very flexible and can be used for many purposes.
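One simple way to bound the cost of event handling is to do the expensive work for only a fraction of the events. The sketch below is a hypothetical 1-in-N sampler built on the standard library; it is not part of Tracing or Tokio, and it samples deterministically, whereas a profiler may prefer to pick events at random.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical 1-in-N sampler: `should_record` is true for every Nth call.
struct Sampler {
    every: u64,
    seen: AtomicU64,
}

impl Sampler {
    fn new(every: u64) -> Self {
        assert!(every > 0, "sampling interval must be at least 1");
        Sampler { every, seen: AtomicU64::new(0) }
    }

    fn should_record(&self) -> bool {
        // fetch_add returns the previous value, so calls number 0, N, 2N, ...
        // are recorded and all others are skipped.
        self.seen.fetch_add(1, Ordering::Relaxed) % self.every == 0
    }
}
```

An event handler would call `should_record()` first and return early when it yields false, so that skipped events cost little more than one atomic increment.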

Conclusion

Unlike Go, Rust doesn’t have built-in profilers. But the Tracing crate enables you to get diagnostic information that can be used for profiling.

You can also use profilers that rely on kernel facilities, such as perf and uprobes, which work with Rust without difficulty.

FUJITA Tomonori
nttlabs

Janitor at the 34th floor of NTT Tamachi office, had worked on Linux kernel, founded GoBGP, TGT, Ryu, RustyBGP, etc. https://twitter.com/brewaddict