Road to asynchronous stability
Migrating from Rocket to Actix
When I was starting Bipa, I had only just started learning Rust a couple of months earlier at my previous job. We built a simple API using Rocket that served the results of a more complex piece of code. That small interaction I had with the language and with the Rocket framework was enough to make me want to use it for everything that I could. So, when it came time for me to decide what to use, it was a no-brainer: Rust/Rocket seemed like the perfect match.
Rocket is amazing! It has a very concise and beautiful API that anyone who knows Rust can learn very quickly. Even though it only runs on Rust's nightly build, it was stable and never showed any problems for the entire time we had it in production. It was always flexible enough to let me write whatever I needed while guaranteeing security at compile time.
However, Rocket has two main downsides. It relies on the nightly build, which can be a hurdle when you want to update it, as things are not guaranteed to keep working. It also does not yet support Rust's async/await, which means it misses out on the performance improvements that come from a server able to handle many more connections at the same time.
Back then, Rust had only just stabilized async/await, in version 1.39.0. For me, this meant that many of the libraries I would need might not yet support the new Tokio asynchronous runtime. So it made sense to overlook the fact that Rocket did not support async/await, especially since the team had already started working on it as part of the 0.5 release, which would also remove the need for the nightly build: Rocket compiling on stable Rust would arrive in the same release.
Fast forward to now: Tokio 1.0 has been released, and there is wide support in the community for building on the new runtime. In fact, new versions of most libraries will almost certainly target the new async/await runtime, and I suspect support for the previous blocking approach will slowly fade away. What this all means for us is that we should probably start migrating some of our servers to the new runtime, which will allow us to use the new versions and features of the libraries and frameworks we rely on.
While I would absolutely take a shot at Rocket's 0.5 version and migrate our server to its async/await support, as of today (Feb-2021) it has not been released; there are still some features to be implemented first. For now, we would have to use the `master` branch, which is not something we are willing to do for such a big and important dependency of our software. To be clear, I am not criticizing the Rocket team at all: they have done, and are doing, a great job with Rocket, and I am a big fan of their work.
Migration
With that in mind, we decided to migrate a smaller service of ours, called `pricing`, to one of the other major Rust web frameworks: Actix. As you might imagine, this server is responsible for importing, saving, and serving everything related to the prices of the assets our users trade. It benefits greatly from being able to handle more concurrent requests, since our mobile apps hit it constantly.
We did it in a couple of days and the experience (as I will explain later) was absolutely amazing. The server was deployed with zero changes to our infrastructure, stable Rust improved our build speed, and the results were terrific. The pricing server was now using at least some of the async/await power (we still had more work to do): just by loading the app, I could notice it loading faster than before. Keep in mind this is a very biased observation, since we didn't run any serious benchmark to compare the two. Even though that would be an interesting learning experiment, there are plenty of benchmarks of these frameworks on the web, and you can clearly see that Actix is not just really fast, it's among the fastest web frameworks out there, losing only to C++'s drogon.
While Rocket is still very fast, it loses by a big margin to Actix. Note that these benchmarks test exactly the kind of concurrency Actix is so good at; when Rocket gains async/await support, it will most likely jump many positions on that list.
Once we successfully migrated our `pricing` server and started learning more about Actix and its ecosystem, we realized the migration had other major benefits. For instance, Actix has better support for professional-grade logging, which matters once you have a server that users actually depend on and you need to investigate a bug and understand the steps a user took to trigger it. In addition, Actix was better supported by our crash-reporting tool, Sentry, which also helps us improve the quality of our software by giving us better visibility into problems.
Since the `pricing` migration went so well, we decided to take an even bigger step and move Bipa's main server to Actix as well: a server with 50+ routes that interfaces with a bunch of external services and has many dependencies. Because it was much bigger and interacted with far more external services, it required more work than the pricing server.
This migration took around a week to complete from start to finish, and even though it was successful, it did come with some downsides and risks. Some are inherent to the differences in how the two frameworks do things; others stem from the sheer size of the change.
Database
Rocket's `rocket_contrib` crate provides APIs for creating a pool manager, which makes it much easier to integrate Postgres (or other databases) with the framework. This allowed us to use the `database` macro to create a `diesel_postgres_pool`.
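The definition itself is tiny. A minimal sketch, assuming Rocket 0.4 with `rocket_contrib`'s `diesel_postgres_pool` feature enabled (`Conn` is simply the name we picked):

```rust
#[macro_use]
extern crate rocket_contrib;

use rocket_contrib::databases::diesel;

// The `database` attribute generates the pooling boilerplate and a request
// guard, so `Conn` can be taken directly as an endpoint argument. The string
// must match a database entry in Rocket.toml.
#[database("diesel_postgres_pool")]
pub struct Conn(diesel::PgConnection);
```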
Then all we had to do was attach this `Conn` struct's fairing (Rocket's take on middleware) to the Rocket instance.
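Roughly like so (a sketch; `get_price` is a placeholder route, not one of Bipa's real endpoints):

```rust
fn main() {
    rocket::ignite()
        // The generated fairing manages the pool's lifecycle for us.
        .attach(Conn::fairing())
        .mount("/", routes![get_price])
        .launch();
}
```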
And we would magically receive the `Conn` struct in every endpoint that wants to use it.
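A hypothetical endpoint then just declares `Conn` as an argument (`Price` and `load_latest_price` are made-up stand-ins; the guard mechanism is the point):

```rust
#[get("/price")]
fn get_price(conn: Conn) -> Json<Price> {
    // `conn` dereferences to a pooled diesel::PgConnection.
    Json(load_latest_price(&conn))
}
```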
Very nice, right? This is part of the reason why Rocket is amazing and has a very bright future ahead: its API is wonderfully concise.
Now, how would we implement the same thing with Actix? That's where we needed to do a bit more work than with Rocket. When integrating the database with Actix, there are no magical macros: you need to create the connection pool manager type yourself, leveraging the excellent r2d2 library.
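Something along these lines, assuming the `diesel` crate with its `r2d2` feature (the alias `Db` and the helper name are our own choices):

```rust
use diesel::pg::PgConnection;
use diesel::r2d2::{ConnectionManager, Pool};

// The pool type we hand to Actix; an alias keeps handler signatures short.
pub type Db = Pool<ConnectionManager<PgConnection>>;

pub fn create_pool(database_url: &str) -> Db {
    let manager = ConnectionManager::<PgConnection>::new(database_url);
    Pool::builder()
        .build(manager)
        .expect("failed to build the Postgres connection pool")
}
```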
Then, when starting up the server, you build the pool, which gives you the `Db` type you created, and pass a clone of it as a `data` attachment to Actix's `App`.
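The startup code might look like this (a sketch against actix-web 3; `create_pool` is a hypothetical helper that builds the `Db` pool, and the bind address is illustrative):

```rust
use actix_web::{App, HttpServer};

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    let pool = create_pool("postgres://localhost/pricing");
    // The closure runs once per worker, so each worker clones the pool handle.
    HttpServer::new(move || {
        App::new()
            .data(pool.clone()) // arrives in handlers as Data<Db>
            .service(get_price) // placeholder endpoint
    })
    .bind("0.0.0.0:8080")?
    .run()
    .await
}
```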
This ultimately means Actix will pass a `Data<Db>` value to every endpoint that needs it, just like Rocket did, only a little more verbosely, since our type comes wrapped in `Data`.
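A handler receiving it could look like this (a sketch; `Price` and `load_latest_price` are hypothetical stand-ins):

```rust
use actix_web::{get, web, HttpResponse, Responder};

#[get("/price")]
async fn get_price(db: web::Data<Db>) -> impl Responder {
    // Check a connection out of the pool for this request.
    let conn = db.get().expect("couldn't get a database connection");
    HttpResponse::Ok().json(load_latest_price(&conn))
}
```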
Overall, this downside is fairly painless: the process is well covered by the documentation, and the compiler will catch most of the problems you may run into.
Big change
There was no way to update incrementally; this had to be done in one go. For the pricing server that was not a problem at all, since it only has a handful of endpoints and tables. But the main server required us to touch every single web-related file of the project, and in a server with 50+ routes that change can be very daunting. It helps that most of the changes were literally minor syntax changes, such as switching the route macro from `rocket::get("/my-route")` to `actix_web::get("/my-route")` and updating the response type from `Result<Json<MyType>, BadRequest<Json<MyError>>>` to `actix_web::Result<Json<MyType>>`.
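Side by side, a typical handler diff looked roughly like this (`MyType` and `MyError` as in the text above; bodies elided):

```rust
// Before, with Rocket:
#[rocket::get("/my-route")]
fn my_route() -> Result<Json<MyType>, BadRequest<Json<MyError>>> {
    /* ... */
}

// After, with actix-web (note that handlers also become async):
#[actix_web::get("/my-route")]
async fn my_route() -> actix_web::Result<Json<MyType>> {
    /* ... */
}
```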
External Services
This was only a problem because, in order to talk to the external services Bipa depends on, we were making all of our network requests with `reqwest`'s `blocking` feature. This let us stay on the latest version of the library without having to write asynchronous requests when we didn't want to.
The problem is that when you use reqwest's blocking client, it locks the current thread in order to run the request synchronously. That is fine in a synchronous runtime, but in an asynchronous one you can't just greedily lock a runtime thread for yourself.
This was the hardest problem to spot: the compiler won't know it is an invalid operation, so you have to find it by exercising your software in a real-life scenario, actually making the network requests your server needs in order to function, either by hand or through unit/integration tests.
We worked around the issue by spawning a new `std::thread` every time we needed a blocking call, then using `std::thread::JoinHandle::join` to wait for the thread to finish and extract its result.
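The pattern looks like this; to keep the sketch self-contained, `fetch_price_blocking` is a stand-in for where the real `reqwest::blocking` call would go:

```rust
use std::thread;

// Stand-in for the synchronous HTTP request made with reqwest::blocking.
fn fetch_price_blocking() -> u64 {
    42_000
}

fn current_price() -> u64 {
    // A fresh OS thread lives outside the async runtime's context, so the
    // blocking client no longer trips over it; `join` waits for the result.
    let handle = thread::spawn(fetch_price_blocking);
    handle.join().expect("worker thread panicked")
}

fn main() {
    println!("price: {}", current_price());
}
```

It is still a blocking wait, which is exactly why we treated it as a stop-gap rather than a real solution.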
Future Work
The server has been running successfully since we implemented all the changes described above and has not shown any problems. But, as you saw, some of the work was essentially a quick fix to get the whole thing working, so that we could then incrementally asyncify the entire codebase.
We have already removed the need for the blocking threads in the `pricing` server and in parts of the main server. It has been an amazing experience to make such a huge change, with so many implications, without breaking a thing. Now, don't get me wrong: things could still be broken and we just haven't discovered it yet, but not finding anything for over a week, with hundreds of people using the app, says something about the quality of the Rust compiler and of the frameworks available for web development.
Finally, we intend to continue working to serve Bipa’s users with high-quality, high-integrity software that has the backing of the Rust compiler.
EDIT: If you like what we are doing and would like to work with us, DM me on Twitter or LinkedIn.