Scalable server design in Rust with Tokio

FUJITA Tomonori
Sep 29, 2021


Tokio is probably the most widely used asynchronous runtime for Rust, although several other runtime implementations exist. Many open-source projects in Rust, including web frameworks such as rocket, warp, and actix, are built on Tokio.

With Tokio, it's easy to implement network server applications. However, the typical code you find in guidebooks and tutorials scales poorly across multiple CPUs. I'll explain why, and how to work around it by changing only a few lines.

Tonic (gRPC greeter server) performance examples

A typical gRPC greeter server written with Tonic, a gRPC framework built on Tokio, looks something like the following. You can implement gRPC server applications with little knowledge of Tokio.

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let addr = "[::1]:50051".parse().unwrap();
    let greeter = MyGreeter::default();

    println!("GreeterServer listening on {}", addr);

    Server::builder()
        .add_service(GreeterServer::new(greeter))
        .serve(addr)
        .await?;

    Ok(())
}

However, with some knowledge of Tokio, you can get more than twice the performance of the typical code.

The server runs on an EC2 c6gn.8xlarge instance (32 vCPUs).

Server design on multiple CPUs

Why does the typical code scale poorly? The figure below, which shows how Tokio works internally, gives some hints.

How the typical code works

Tokio creates roughly as many threads as there are CPUs. Each thread handles multiple clients. Only one thread monitors all the sockets via the epoll syscall. Also, there is only one listen socket for accepting new connections. Both can easily become bottlenecks under heavy load.

The following design, the one that scales better in the graph above, is used by Envoy, Nginx (which uses processes instead of threads), and others. Each thread monitors the sockets of its own clients via the epoll syscall. Each thread also has its own listen socket, and all of them listen on the same port thanks to the SO_REUSEPORT socket option.

By creating the Tokio runtimes by hand, you can move to this design with only a few lines changed. You can find the code here.

How the scalable code works

Conclusion

For even more scalability, you might be interested in an asynchronous runtime designed specifically for it, such as glommio. However, there is a lot of useful open-source software built on Tokio, so I think that improving the scalability of Tokio-based applications a bit can be helpful in some cases.

Corrections, comments, and suggestions would be greatly appreciated.


FUJITA Tomonori

Janitor at the 34th floor of NTT Tamachi office, had worked on Linux kernel, founded GoBGP, TGT, Ryu, RustyBGP, etc. https://twitter.com/brewaddict