Rust in production, for fun & profit

At Telepass Assicura I had the opportunity to rewrite a legacy application in Rust: this is the report of that journey 🦀

Antonio Bonifacio
Telepass Digital
9 min read · Jan 22, 2024


Intro

I fell in love with the Rust language a few years ago, so I was eagerly looking for an excuse to go beyond the Rust Book and write some “real world” Rust code.

Recently, the perfect opportunity fell into my lap: my team and I were asked to update a legacy microservice by adding a cache layer with Redis, so I stepped in and proposed to “adopt” the code and rewrite it in Rust.

At Telepass Assicura we already have backend services written with heterogeneous technologies, mostly Java and Ruby on Rails. One of those RoR services was quite small: it exposed a single REST API that retrieved data stored on Google BigQuery whenever a customer requested an insurance quotation through the Telepass app or the Telepass Assicura website. The perfect test subject!

Why Rust? Mainly for two reasons: correctness and efficiency. Correctness is really important to me, especially at work: being mainly a Java developer, I am used to strongly typed programming languages, and I am happy to spend some time and accept some friction upfront to make the compiler happy, confident that in return the code will behave as I expect.

The efficiency of Rust comes from its low-level origins, since it is meant to play in the same ballpark as C and C++: the code is compiled to run natively on the machine, without a VM, a JIT, or a garbage collector like Java’s. For this reason, a program written in Rust will usually consume less CPU and memory than an equivalent one written in Java or Ruby. As a byproduct, this efficiency in resource usage puts Rust in an interesting position in a “cloud-first” world, since it can significantly reduce costs: do more with less! A win-win situation, for both economic and environmental reasons.

Building blocks and feature parity

As a first step, I did some scouting to decide which framework from the Rust ecosystem would be the best fit for me, since the scale of this project was pretty small and it would be a solo effort.

In the past, I had taken a look at Rocket, but at that time version 0.5 was still in beta and version 0.4 required a non-stable version of Rust to compile, so for me it was a no-go. Moreover, the Rust community was already moving towards Axum, a new web framework built around Tokio and its technologies to provide an async-ready foundation, with good documentation and lots of posts and tutorials written by the community to get started easily.

The remaining requirement was finding a way to call BigQuery and Redis, possibly through already developed libraries, since I didn’t want to reinvent the wheel 😇. Fortunately, Rust has a superpower: a project and dependency manager called cargo, similar to Maven or npm, and its companion registry crates.io. A quick search and I was able to find the two implementations I needed, which I added to my project with a single command:

cargo add google-cloud-bigquery redis

At this point I had everything I needed to create a service with feature parity with the Ruby on Rails implementation: given a customer code, query BigQuery for its clustering data in JSON format and return it to the caller. One choice I made early on was to handle errors in a clean way, without resorting to .unwrap() everywhere, initially using anyhow and later moving to the crate color-eyre in order to map domain errors. This is, more or less, the initial code I wrote for the first implementation:

// ...

#[tokio::main]
async fn main() -> Result<(), axum::BoxError> {
    let big_query = gcp_bigquery_client::Client::from_application_default_credentials().await?;

    let app = Router::new()
        .route("/api/v2/coefficients", get(find))
        //
        // One nice touch of Axum is the use of State to do dependency injection in the router
        //
        .with_state(big_query);

    let addr = SocketAddr::from(([0, 0, 0, 0], 8080));
    let listener = tokio::net::TcpListener::bind(addr).await?;

    axum::serve(listener, app)
        .with_graceful_shutdown(shutdown_signal())
        .await
        .expect("Cannot start the server!");

    Ok(())
}

async fn find(
    //
    // State extractors let you access your dependencies in a type-safe way
    //
    State(big_query): State<gcp_bigquery_client::Client>,
) -> Result<Json<Vec<Coefficient>>, anyhow::Error> {
    let mut result_set = big_query
        .job()
        .query(
            "project_id",
            QueryRequest::new(format!(
                "SELECT * FROM `{}.{}`",
                "dataset_id", "table_id"
            )),
        )
        .await?;

    let x = if result_set.next_row() {
        result_set
            .get_json_value_by_name("data")?
            .ok_or_else(|| anyhow::anyhow!("Cannot find data"))?
    } else {
        // ...
    };

    let c: Coefficient = serde_json::from_value(x)?;

    Ok(Json(vec![c]))
}

// ...
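One wrinkle worth noting: anyhow::Error does not implement Axum’s IntoResponse, so the error type returned by a handler needs a thin wrapper to be turned into an HTTP response. Here is a minimal sketch of such a wrapper, assuming an AppError along the lines of the one used later in this post (the real one maps domain errors in more detail):

use axum::{
    http::StatusCode,
    response::{IntoResponse, Response},
    Json,
};
use serde::Serialize;

// A minimal sketch: the real AppError distinguishes domain errors,
// but the overall shape is the same.
#[derive(Debug, Serialize)]
pub struct AppError {
    message: String,
}

// Turn an AppError into an HTTP response, so handlers can return
// Result<_, AppError> directly.
impl IntoResponse for AppError {
    fn into_response(self) -> Response {
        (StatusCode::INTERNAL_SERVER_ERROR, Json(self)).into_response()
    }
}

// Let the `?` operator convert library errors into AppError.
impl From<anyhow::Error> for AppError {
    fn from(err: anyhow::Error) -> Self {
        AppError { message: err.to_string() }
    }
}

With the From impl in place, `?` converts library errors into AppError automatically inside the handlers.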

«Thou shalt not query BigQuery»

Having reached feature parity with the old microservice, I started to add new features. The first one was dictated by the need to avoid firing a lot of single queries at BigQuery, since the Google service has a per-query pricing policy and we wanted to reduce that cost.

We came up with this solution: pour all the data that we were reading from BigQuery into a Redis cloud instance, in one shot, then query Redis directly.

Luckily, the crate google-cloud-bigquery supports the BigQuery Storage Read API to read all the data from a table in bulk. So I decided to implement a batch run mode in the service that opens a connection to BigQuery and copies the data into Redis for as long as there are records to stream. To do so, I refactored the code, adding the clap library to read the preferred run mode from the command line: in the default “serve” mode, Axum starts a server and listens for incoming requests; in “import” mode, the service connects to BigQuery, pulls the data, inserts it into Redis, and exits.
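The import itself boils down to streaming batches of rows and writing each batch to Redis. A minimal sketch of the Redis side, with the BigQuery streaming elided; the Row struct and the key layout here are illustrative, not the real schema:

// Illustrative row shape: the real fields come from the BigQuery table.
struct Row {
    customer_code: String,
    data: serde_json::Value,
}

// Write one streamed batch to Redis. Pipelining sends all the SET
// commands in a single round trip instead of one per record.
fn store_batch(redis_client: &redis::Client, rows: &[Row]) -> redis::RedisResult<()> {
    let mut con = redis_client.get_connection()?;

    let mut pipe = redis::pipe();
    for row in rows {
        pipe.set(
            format!("coefficients:{}", row.customer_code),
            row.data.to_string(),
        )
        .ignore();
    }
    pipe.query::<()>(&mut con)?;

    Ok(())
}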

This batch process is dockerized in CI (more on this later) and triggered by a simple cron job on Google Cloud, using the following command:

docker run -t <image> /bin/server -m import

And this is the refactored main() function:

use axum::{response::Result, BoxError};
use clap::Parser;
use google_cloud_bigquery::client::{Client, ClientConfig};

// ...

#[derive(clap::ValueEnum, Clone, Debug, Default)]
enum RunMode {
    #[default]
    Serve,
    Import,
}

#[derive(Parser, Debug)]
struct Args {
    /// Start the REST API server, or run the import from BigQuery and quit
    #[arg(short, long, default_value_t = RunMode::Serve)]
    #[clap(value_enum)]
    mode: RunMode,
}

#[tokio::main]
async fn main() -> Result<(), BoxError> {
    let redis_url = dotenvy::var("REDIS_URL")?;

    info!("Redis URL: {redis_url}");

    let redis_client = redis::Client::open(redis_url)?;

    // clap magic happens here:
    let args = Args::parse();
    match args.mode {
        RunMode::Serve => {
            info!("Service starting...");

            let app = router(redis_client);

            let addr = SocketAddr::from(([0, 0, 0, 0], 8080));
            let listener = tokio::net::TcpListener::bind(addr).await?;
            info!("listening on {}", addr);

            axum::serve(listener, app)
                .with_graceful_shutdown(shutdown_signal())
                .await
                .expect("Cannot start the server!");
        }
        RunMode::Import => {
            let (config, project_id_opt) = ClientConfig::new_with_auth().await?;
            let client = Client::new(config).await?;
            let project_id = project_id_opt.expect("Cannot find project_id!");

            BatchImportService::pull_from_bigquery(&client, project_id.as_str(), &redis_client)
                .await?;
        }
    };

    Ok(())
}

// ...
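In “serve” mode, axum::serve is handed a shutdown_signal() future so that in-flight requests can complete before the process exits. That helper is not shown above; here is a minimal sketch following the usual Tokio signal-handling pattern (assuming Tokio’s signal feature is enabled):

// Resolve when Ctrl+C (or SIGTERM, on Unix) is received, letting
// axum::serve drain in-flight requests before shutting down.
async fn shutdown_signal() {
    let ctrl_c = async {
        tokio::signal::ctrl_c()
            .await
            .expect("failed to install Ctrl+C handler");
    };

    #[cfg(unix)]
    let terminate = async {
        tokio::signal::unix::signal(tokio::signal::unix::SignalKind::terminate())
            .expect("failed to install SIGTERM handler")
            .recv()
            .await;
    };

    #[cfg(not(unix))]
    let terminate = std::future::pending::<()>();

    tokio::select! {
        _ = ctrl_c => {},
        _ = terminate => {},
    }
}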

The performance of this bulk import is very good: in production it was able to move 7,362,734 records from BigQuery to Redis in just 1 hour, with ~30 MiB of constant RAM usage and ~0.25 CPU:

Not bad indeed. 🚀

A little bit of swag

Another feature that I decided to add to the service was some documentation for the route exposed by the application.

As always, crates.io and the Rust community are your friends: the utoipa project lets you document your API with macros that can generate OpenAPI documents and even a Swagger UI. Neat!

#[derive(OpenApi)]
#[openapi(
    paths(coefficients_controller::find),
    components(schemas(
        view_model::coefficient::Coefficient,
        crate::errors::app_error::AppError
    )),
    modifiers(&SecurityAddon),
    tags((name = "coefficients", description = "Coefficients v2 API"))
)]
pub(crate) struct CoefficientsApiDoc;

// ...

#[utoipa::path(
    get,
    path = "/api/v2/coefficients",
    responses(
        (status = 200, description = "List all coefficients for the user", body = [Coefficient]),
        (status = 404, description = "Coefficients not found"),
        (status = 401, description = "Invalid x-service-token")
    ),
    params(("customer_code" = CustomerCodeParam, Query, description = "Customer Code")),
    security(("x-service-token" = []))
)]
pub async fn find(
    State(redis_client): State<redis::Client>,
    customer_code_param: Query<RequestParam>,
) -> Result<Json<Vec<Coefficient>>, AppError> {
    // ...
}
And here is the result: a browsable Swagger UI for the service.
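Serving the generated document and UI is then just a matter of merging an extra route into the Axum router. A minimal sketch, assuming the companion crate utoipa-swagger-ui with its axum feature enabled (the paths are illustrative):

use utoipa::OpenApi;
use utoipa_swagger_ui::SwaggerUi;

// Expose the generated OpenAPI document and a browsable Swagger UI.
fn docs_router() -> axum::Router {
    axum::Router::new().merge(
        SwaggerUi::new("/swagger-ui")
            .url("/api-docs/openapi.json", CoefficientsApiDoc::openapi()),
    )
}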

Middlewares and security

Another aspect of Axum that I appreciated while creating this application is the concept of middleware: basically filters that can be chained, run on every request or response, and modify them. In my case, I used this feature to implement a security filter on the exposed route of the service that reads a custom service token from a header, a simple security measure that we already use with our Java/Quarkus microservices.

In this case, I used the typed_header feature of the crate axum-extra to implement a ServiceToken struct that represents the custom header “x-service-token”, from which we read the JWT and check its validity:

static X_SERVICE_TOKEN: HeaderName = HeaderName::from_static("x-service-token");

#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct ServiceToken {
    token: HeaderValue,
}

impl ServiceToken {
    pub fn new(token: HeaderValue) -> Self {
        ServiceToken { token }
    }

    pub fn token(self) -> HeaderValue {
        self.token
    }
}

impl Header for ServiceToken {
    fn name() -> &'static HeaderName {
        &X_SERVICE_TOKEN
    }

    fn decode<'i, I>(values: &mut I) -> Result<Self, Error>
    where
        Self: Sized,
        I: Iterator<Item = &'i HeaderValue>,
    {
        let value = values.next().ok_or_else(Error::invalid)?;
        Ok(ServiceToken::new(value.clone()))
    }

    fn encode<E: Extend<HeaderValue>>(&self, values: &mut E) {
        values.extend(std::iter::once(self.token.clone()));
    }
}

pub(crate) async fn jwt_service_auth(
    State(groups): State<Vec<&str>>,
    TypedHeader(auth): TypedHeader<ServiceToken>,
    request: Request<Body>,
    next: Next,
) -> Result<Response, AppError> {
    let jwt_key = dotenvy::var("JWT_PUBLIC_KEY").expect("Missing JWT_PUBLIC_KEY");

    let token = auth.token();

    debug!("Validating token {:?}", token);
    match token_is_valid(&token, &jwt_key) {
        Ok(token_data) => {
            if claims_contains_any_group(token_data, &groups) {
                let response = next.run(request).await;
                Ok(response)
            } else {
                let message = format!("Groups {:?} not present in service token!", &groups);
                Err(AppError::from(ManagedError::InvalidToken(message)))
            }
        }
        Err(e) => Err(AppError::from(ManagedError::InvalidToken(e.to_string()))),
    }
}

Using this middleware is just a matter of attaching a route layer to the router definition:

.route_layer(middleware::from_fn_with_state(
    vec!["group1", "group2"],
    jwt_service_auth,
))
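The token_is_valid and claims_contains_any_group helpers are elided above. A minimal sketch of what they could look like with the jsonwebtoken crate, assuming an RS256-signed token and an illustrative claims shape (the real validation is more thorough):

use jsonwebtoken::{decode, Algorithm, DecodingKey, TokenData, Validation};
use serde::Deserialize;

// Illustrative claims shape: the real token layout may differ.
#[derive(Debug, Deserialize)]
struct Claims {
    groups: Vec<String>,
}

// Verify the signature and expiry of an RS256 JWT against a
// PEM-encoded public key. The middleware above converts the
// HeaderValue to &str before calling this.
fn token_is_valid(
    token: &str,
    public_key_pem: &str,
) -> Result<TokenData<Claims>, jsonwebtoken::errors::Error> {
    let key = DecodingKey::from_rsa_pem(public_key_pem.as_bytes())?;
    decode::<Claims>(token, &key, &Validation::new(Algorithm::RS256))
}

// Accept the request if the token carries at least one allowed group.
fn claims_contains_any_group(token_data: TokenData<Claims>, groups: &[&str]) -> bool {
    token_data
        .claims
        .groups
        .iter()
        .any(|g| groups.contains(&g.as_str()))
}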

The middleware can be easily extracted from this project and shared in a private crate meant for internal use, so that future Telepass Rust services could import and reuse this logic, without code duplication.

Build it 🛠 Box it 📦 Ship it! 🚢

The last step to put the application in production was to create a GitHub Action to build a service container that could be run on Kubernetes.

To do it, I created a Dockerfile in the project directory using the docker CLI and just followed the instructions in the wizard:

docker init

Easy-peasy. The only changes I made to the generated Dockerfile were updating the Rust compiler version and installing a few extra packages with apt in the final image stage:

FROM debian:bookworm-slim AS final

# Install in a single layer, so the apt lists cleanup actually shrinks the image
RUN apt-get update \
    && apt-get install -y openssl \
    && rm -rf /var/lib/apt/lists/*

On the CI side, I created a workflow like this one, which caches recent compilation artifacts to reduce the build time on every push of new code:

name: Run build, tests and publish image

on:
  push:
    branches:
      - main

env:
  CARGO_TERM_COLOR: always
  GC_DOCKER_HOSTNAME: ...
  GC_DOCKER_REGISTRY: ...

jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3

      - uses: actions/cache@v3
        with:
          path: |
            ~/.cargo/bin/
            ~/.cargo/registry/index/
            ~/.cargo/registry/cache/
            ~/.cargo/git/db/
            target/
          key: ${{ runner.os }}-cargo-${{ hashFiles('Cargo.lock') }}

      - name: rust-rustfmt-check
        uses: mbrobbel/rustfmt-check@0.7.0
        with:
          token: ${{ secrets.GITHUB_TOKEN }}

      - name: Check
        run: cargo check --verbose

      - name: Run tests
        run: cargo test --verbose

  docker-push:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Get current rust app version from its Cargo.toml
        id: toml-version
        uses: wowjeeez/rust-conf-read@v1.0.3

      - name: Login to gcr
        run: |
          echo '${{ secrets.GCLOUD_JSON_KEY }}' | docker login -u _json_key --password-stdin https://${{ env.GC_DOCKER_HOSTNAME }}

      - name: Docker build and push to gcr
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: ${{ env.GC_DOCKER_REGISTRY }}/rust-application:${{ steps.toml-version.outputs.version }}

That’s all folks (…for now!)

And this is the story of how I sneaked the first Rust project into Telepass 🥸

It has been a pleasant and fun journey, and I finally had the opportunity to taste the maturity and advantages of the Rust ecosystem, from the robustness of the code to the excellent quality of the most used crates published on crates.io.

And I really think the journey will not end here: Rust is a language that, in my opinion, can find its niche in companies like Telepass, at the same table as the more “mundane” technologies we use for our backend, like Java or Ruby. This was just the first proof of concept, showing that it can be done!

This article was written by Antonio Bonifacio, Staff Software Engineer (Backend), and edited by Marta Milasi and Gaetano Matonti, respectively UX Content Lead and Managerial Software Engineer at Telepass. Interested in joining our team? Check out our open roles!
