Microservices in Rust with actix

Marco Amann
Jun 30 · 18 min read
Photo by Atik sulianami on Unsplash


Why even consider Rust?

To be honest, I think it would be faster to create the application we will develop using Java instead of Rust, at least if we used some of the goodies from the Spring ecosystem. I say this despite Rust being my favorite language, so why even bother trying to create it in Rust? Microservice applications bring, despite their intriguing benefits, a whole range of challenges, some of which I dare to say are a perfect fit to tackle with Rust.

What we will build

Since I want to evaluate how feasible it is to build microservices efficiently with Rust, the following scenario was designed to cover many different aspects typical of such applications.

Our use case
  • The image is downloaded and saved alongside a thumbnail in an S3 bucket (this is the focus of the next post)
  • When the download is done, the user can preview the thumbnail and download the image
  • The user can print a bunch of images and will receive them in their mail

Dividing the system into Services

To be able to build the system in multiple iterations, I chose the following setup:

[Images: diagrams of the service setup]

The print service

Since we want to expose the print-http-endpoint, it is advisable to use a web-framework: In the last post I used rocket, so this time, actix-rs will be used. Actix does not try to hide the asynchronous nature of its foundation as much as rocket does and hence feels a bit more powerful to me. We will see some of the good and bad sides of it over the course of this article.

Service interfaces

The scope of our service is pretty small. Nonetheless, I will only discuss the interesting endpoints, which are:

GET  /print/jobs/{:id}   For the user to see the job status
POST /print/jobs         For the user to create a new print job

Consumed interfaces

A key aspect of microservices is that they cooperate with other services. To cover this in the evaluation, I designed the print service to consume three interfaces: the API of the printer itself, which we have no control over; the management API, written by a different team in a different language; and the S3 API, used to asynchronously download the image.


No swag for us

It would be cool to have the boilerplate code generated directly from its specification, for example by using swagger. Unfortunately, the swagger-codegen for Rust only supports the rather low-level HTTP library ‘hyper’ instead of the more abstract frameworks. ‘Paperclip’ aimed to build an actix-web code generator for OpenAPI specs, but there seems to be no active development. This leaves us with writing the actix code ourselves.

Callbacks from the printer

To handle callbacks from the printer once it has finished printing, correlation IDs are used. They are stored in a database to keep track of active jobs across requests. This requires our Rust code to interact with the database. I chose tokio_postgres, a sub-project of rust-postgres, which integrates cleanly with asynchronous programming using tokio (which we will be using anyway because of actix). But since we want to keep the load on the db caused by connections to a minimum, a connection pool is a sensible choice. The most prominent project, r2d2, unfortunately does not yet support the tokio variants of the rust-postgres crate. Therefore I opted for bb8, which describes itself as “A generic connection pool, designed for asynchronous tokio-based connections”. I really like their choice of crate names.
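To make the correlation-ID idea a bit more concrete, here is a minimal sketch in which an in-memory HashMap stands in for the database table. The names JobTracker, JobStatus, register and complete are hypothetical and only serve the illustration; the real service would issue the equivalent queries through tokio_postgres and bb8, and would probably prefer IDs that are harder to guess than a counter.

```rust
use std::collections::HashMap;

/// Status of a print job as tracked by our service (hypothetical states).
#[derive(Debug, Clone, PartialEq)]
enum JobStatus {
    Printing,
    Done,
}

/// In-memory stand-in for the database table of active jobs.
#[derive(Default)]
struct JobTracker {
    next_id: u64,
    jobs: HashMap<u64, JobStatus>,
}

impl JobTracker {
    /// Stores a new job and returns the correlation ID we hand to the printer.
    fn register(&mut self) -> u64 {
        self.next_id += 1;
        self.jobs.insert(self.next_id, JobStatus::Printing);
        self.next_id
    }

    /// Called when the printer's callback arrives with a correlation ID.
    /// Returns false for IDs we never issued.
    fn complete(&mut self, correlation_id: u64) -> bool {
        match self.jobs.get_mut(&correlation_id) {
            Some(status) => {
                *status = JobStatus::Done;
                true
            }
            None => false,
        }
    }

    /// Looks up the current status, e.g. for the GET endpoint.
    fn status(&self, correlation_id: u64) -> Option<&JobStatus> {
        self.jobs.get(&correlation_id)
    }
}
```

Because the state lives outside the request handler, the callback can arrive on a completely different request (or instance) than the one that created the job.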

Async-Await

When uploading 500 images to several printers in parallel (we assume the print hardware is slow), we do not want to have 500 threads running. To avoid this, Rust provides a fairly new and exciting feature: async/await. Since this feature is huge, the Rust team is writing a whole book about it, similar to the Rustonomicon. Although it is a bit tricky, this feature is what makes Rust such a great fit for parallel execution, so let me quickly introduce the concept.

// Await the response, then await its body; `?` propagates errors to the caller.
let body = reqwest::get(url).await?.text().await?;
println!("Got {}", body);
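To show what happens behind an .await, here is a small sketch that uses only the standard library. An async fn does no work when it is called; it returns a future that something has to poll. The block_on function below is a hypothetical toy executor written for this illustration (tokio and actix bring their own, far more capable ones), and fetch_length merely pretends to be a network call.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the parked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Toy executor: polls a single future until it is ready.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(output) => return output,
            // Nothing to do right now: sleep until the waker unparks us.
            Poll::Pending => thread::park(),
        }
    }
}

/// Stand-in for a slow request; a real one would yield while waiting.
async fn fetch_length(body: &str) -> usize {
    body.len()
}

/// Each `.await` is a point where the executor may switch to other work.
async fn total_length() -> usize {
    let a = fetch_length("hello").await;
    let b = fetch_length("rust").await;
    a + b
}
```

Calling block_on(total_length()) drives the future to completion. A real runtime does the same for many futures at once on a handful of threads, which is why 500 pending uploads do not need 500 threads.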

The Actix-web implementation

Actix-web is based on actix, an actor framework, and organizes most of its functionality around (async) handler functions that create responses when acting on requests. The handlers are executed in a tokio runtime, allowing them to utilize multiple CPU cores. However, since there is only a finite number of CPUs, the handler code must not block, e.g. by waiting synchronously for a database query.

#[derive(Validate, ...)]
struct PrintJob {
    //...
    #[validate(length(min = 1, max = 20))]
    name: String
}

// in the handler:
job.validate().map_err(ErrorBadRequest)?;
// processing code
The obligatory, unrelated image of a server «|» Photo by Taylor Vick on Unsplash

Providing the API with actix

Of the API endpoints mentioned above, GET requests asking for a specific job_id are probably the simplest, so let me quickly walk you through their implementation so you can get a feeling for actix.

GET  /print/jobs/{:id}
async fn get_job_by_id(id: web::Path<u32>) -> impl Responder { ...
  • web::Path<u32> is the type of our id parameter, since it is encoded in the request path. This parameter is defined in the app-routing directive: ...route("/print/jobs/{id}",...) .
  • impl Responder defines our response type as anything that implements the Responder trait, e.g. HttpResponse, but also a Result wrapping it. This means we can directly return Err(HttpResponse::BadRequest) .
async fn get_job_by_id(
    id: web::Path<i32>,
    storage: web::Data<Storage>
) -> impl Responder {
    storage.select_print_job_by_id(id.into_inner()).await;

pub async fn select_print_job_by_id(&self, id: i32)
    -> Result<Option<PrintJob>, PrintJobError> {...}

impl ResponseError for PrintJobError { }
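The return type Result<Option<PrintJob>, PrintJobError> distinguishes three outcomes: the job exists, the job does not exist, or the lookup itself failed. The following sketch spells out that mapping with plain tuples standing in for HttpResponse. The struct fields, the DatabaseUnavailable variant and the chosen status codes are assumptions made for this illustration, since the post does not list the real ones. In the actual service, the ResponseError implementation takes care of the Err arm.

```rust
/// Hypothetical job record; the real PrintJob has more fields.
#[derive(Debug, PartialEq)]
struct PrintJob {
    id: i32,
    name: String,
}

/// Hypothetical error variant; the post does not list the real ones.
#[derive(Debug, PartialEq)]
enum PrintJobError {
    DatabaseUnavailable,
}

/// Maps the storage result to an HTTP status code and a response body.
fn to_response(result: Result<Option<PrintJob>, PrintJobError>) -> (u16, String) {
    match result {
        // The query succeeded and found the job.
        Ok(Some(job)) => (200, format!("job {}: {}", job.id, job.name)),
        // The query succeeded, but there is no such job.
        Ok(None) => (404, String::from("no such job")),
        // The query itself failed, so the client may retry later.
        Err(PrintJobError::DatabaseUnavailable) => (503, String::from("try again later")),
    }
}
```

Since match has to be exhaustive, adding a new error variant later forces us to decide which status code it maps to.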
Let’s fetch some resources «|» Photo by K. Mitch Hodge on Unsplash

Consuming an API with reqwest

When the user wants to print something, they supply a user-id and a file-id. The print service asks the management service whether the requesting user is allowed to print the file and where to find it. There are several ways this user request could go wrong: e.g. the file has been deleted in the meantime, or the user was naughty and tries to print a file they are not allowed to. Perhaps the management service is overloaded or not reachable. Either way, we do not want to answer the user until we have processed the reply of the management service. But we promised actix not to block in the handler, so what to do? Use async of course, here’s how:

POST /print/jobs         For the user to create a new print job
let response = reqwest::Client::new()
    .get("http://127.0.0.1:8090/lookup")
    .json(&job)
    .send()
    .await
    .map_err(|e|...)?;
let result = response.json::<ManagementLookupResponse>()
    .await
    .map_err(|e| ...)?;

Slow backends

If you worry about slow backends, you might be right. Although actix takes care of backpressure quite well and a million waiting tokio tasks do not really pose a problem, two million open TCP connections do. We therefore need to prevent work from piling up and stalling our service. So let me use this challenge to show how you can incorporate low-level system primitives in high-level web frameworks.
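The metrics line further down reads semaphore.available_permits(), so a semaphore appears to be the primitive in question. As a rough sketch of the idea, here is a tiny permit counter built only on the standard library. Limiter and Permit are hypothetical names for this illustration; in the service itself an async semaphore such as the one in tokio::sync would be the natural choice, because it lets tasks wait for a permit instead of turning them away.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical limiter for the number of in-flight backend calls.
#[derive(Clone)]
struct Limiter {
    available: Arc<AtomicUsize>,
}

/// Proof that a slot was taken; gives the slot back when dropped.
struct Permit {
    available: Arc<AtomicUsize>,
}

impl Limiter {
    fn new(permits: usize) -> Self {
        Limiter { available: Arc::new(AtomicUsize::new(permits)) }
    }

    /// Takes a permit if one is free; None means "do not start more work".
    fn try_acquire(&self) -> Option<Permit> {
        self.available
            // Atomically decrement, but never below zero.
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .ok()
            .map(|_| Permit { available: Arc::clone(&self.available) })
    }

    /// Number of free permits, handy as a metric.
    fn available_permits(&self) -> usize {
        self.available.load(Ordering::Acquire)
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        self.available.fetch_add(1, Ordering::AcqRel);
    }
}
```

A handler would acquire a permit before opening a connection to the backend and hold it until the response has been processed. Tying the release to Drop means the permit is returned even if the handler bails out early with an error.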

Testing the service

It obviously makes sense to test our service before we put it in production.

Unit Tests

Rust allows you to write unit-tests directly in the files that contain your functions. That way it is easy to test private functions without performing any hacks.

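use actix_web::{HttpRequest, HttpResponse};

async fn health_probe(_: HttpRequest) -> HttpResponse {
    HttpResponse::Ok().finish()
}

// The tests live in a child module, hence the `super::` prefix below.
#[cfg(test)]
mod tests {
    use actix_web::{http, test};

    #[actix_rt::test]
    async fn test_status_ok() {
        let req = test::TestRequest::default().to_http_request();
        let resp = super::health_probe(req).await;
        assert_eq!(resp.status(), http::StatusCode::OK);
    }
}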

Integration tests

The slightly more complex integration tests of our service can test the “real” application: we can have an app with either the real routing and data attached or mocked ones. We could for example replace the real storage from above with a mock implementation.

#[actix_rt::test]
async fn test_xyz() {
    let storage = ...

    let mut app = test::init_service(
        App::new()
            .data(storage.clone())
            .route(...)
    ).await;

    let req = test::TestRequest::get()
        .uri("/print/jobs/10").to_request();
    let resp = test::call_service(&mut app, req).await;
    assert!(resp.status().is_success());
}
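How the storage could be swapped for a mock is sketched below, using only the standard library. The JobStore trait, MockStore and job_status_code are hypothetical names for this illustration; the real Storage is async and backed by Postgres, so this shows the shape of the approach rather than the actual code of the service.

```rust
use std::collections::HashMap;

/// Hypothetical storage abstraction that both the real and the mock storage implement.
trait JobStore {
    fn job_name(&self, id: i32) -> Option<String>;
}

/// Mock used in tests: answers from a map instead of a database.
struct MockStore {
    jobs: HashMap<i32, String>,
}

impl JobStore for MockStore {
    fn job_name(&self, id: i32) -> Option<String> {
        self.jobs.get(&id).cloned()
    }
}

/// Handler logic written against the trait, so it works with either implementation.
fn job_status_code(store: &dyn JobStore, id: i32) -> u16 {
    if store.job_name(id).is_some() {
        200
    } else {
        404
    }
}
```

A test then fills the map with exactly the jobs it needs, which keeps it fast and independent of a running database.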

Metrics

If we run many instances of our microservice, we need to make sure they are healthy and that we can detect broken ones in order to replace them. In this section we will have a look at how you can collect metrics from the app, aggregate them in a central place and process them further there.

histogram.observe(semaphore.available_permits() as f64 );
Grafana dashboard visualizing the Prometheus metrics during a stress-test
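To give an idea of what happens to a value passed to histogram.observe(...), here is a simplified histogram with cumulative buckets, the form in which Prometheus exposes histogram data. This is a sketch for illustration only, not the implementation of the prometheus crate; the struct and its field names are made up for this example.

```rust
/// Simplified histogram with cumulative buckets (illustration only).
struct Histogram {
    /// Upper bound of each bucket, in ascending order.
    upper_bounds: Vec<f64>,
    /// Number of observations less than or equal to the matching bound.
    bucket_counts: Vec<u64>,
    /// Sum of all observed values.
    sum: f64,
    /// Total number of observations.
    count: u64,
}

impl Histogram {
    fn new(upper_bounds: Vec<f64>) -> Self {
        let bucket_counts = vec![0; upper_bounds.len()];
        Histogram { upper_bounds, bucket_counts, sum: 0.0, count: 0 }
    }

    /// Records one value: every bucket whose bound is >= the value is incremented.
    fn observe(&mut self, value: f64) {
        for (bound, bucket) in self.upper_bounds.iter().zip(self.bucket_counts.iter_mut()) {
            if value <= *bound {
                *bucket += 1;
            }
        }
        self.sum += value;
        self.count += 1;
    }
}
```

Observing the number of free permits this way shows not just the current value but how often the service was close to its limit, which is what a dashboard can then plot over time.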

Possible improvements

Although the service works in its current form, there are a lot of things not covered yet:

  • Reusing connections: It would greatly reduce the load on backend services if we reused the connections to them. However, this requires a slightly more involved implementation, since we need to share the connection pool between requests, like we did with the bb8 database pool.
  • Authentication and authorization were completely neglected in this post. OAuth alone could easily fill a whole blog post.
  • Dynamic configuration: Once the service runs, it should be able to adapt to a changing environment, even without restarting the service instances. This is especially important if we want to adopt dynamic service discovery in the deployment, where services can register themselves to be available for usage.
  • More in-depth testing of the public APIs is needed. In one of the next posts we will come back to this and discuss how we can ensure that different services understand each other.
  • Deployment: Although it is easy to run Rust binaries just as they are, it might be advisable to also create a super lightweight Docker container to be able to run the application on a Kubernetes cluster. Since these binaries have few or no dependencies, multi-stage builds are easily achievable, producing lightweight containers.

Lessons learned

It is possible to create a microservice in Rust and integrate it with an existing environment across language boundaries. In my case, development with Rust is slower than with Java, but it is arguably more fun: the type system of Rust forces developers to think about error handling in advance, saving you from frustrating hours of debugging problems that occur at runtime.

Next Posts

In the next post we will build the download-service using a different technology: Apache Kafka. Stay tuned to read on the different challenges and possibilities that come with loosely coupling the service.


Digital Frontiers — Das Blog

This is the blog of Digital Frontiers GmbH & Co.
