Developing High-Performance Cassandra Applications in Rust (Part 2)
Author: Piotr Kołaczkowski
Building a high-performance web application on Apache Cassandra doesn’t have to be hard. This post will show you how to build a simple web service for collecting data from remote sensors, using the DataStax driver, the Actix web framework, and Rust.
In a previous blog post on this topic, we explored various Apache Cassandra® drivers available for Rust developers. However, you can’t build an application with just a database driver. An application requires an interface for communicating with the outer world. The way this interface handles concurrency and how it interacts with the driver can affect its performance and scalability.
In this blog post, we’ll show you how to build a simple web service for collecting data from remote sensors. We’ll use the DataStax C++ Driver for Cassandra and the Actix web framework. The resulting application can easily handle hundreds of thousands of requests per second and even more if horizontally scaled.
Our development goal
The application we’re building will receive data from multiple remote sensors and save them into Cassandra. Each sensor has a unique identifier. Our sensor measures temperature and humidity repeatedly and sends the measured values as a JSON document to the REST endpoint. The example message looks like this:
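The original example payload is not reproduced here; a representative message (the field names are an assumption, chosen to match the Rust struct defined later) might look like this:

```json
{
  "sensor_id": 1,
  "temperature": 21.5,
  "humidity": 0.45
}
```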
The service must also record the timestamp of each event. The database structure must allow reading a bunch of measurements of a given sensor sorted by time.
Preparing the project
Let’s create a fresh binary Rust project:
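Assuming a hypothetical project name of sensor_service:

```shell
cargo new sensor_service
cd sensor_service
```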
This command created a complete Rust application for you! The source code has been placed in the src/main.rs file. You can compile it and run it, although it doesn’t do much yet:
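For example:

```shell
cargo run
# prints: Hello, world!
```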
Let’s add the required dependencies to Cargo.toml. We need actix-web to code the web interface, cassandra-cpp to use the C++ driver from Rust, chrono to get the current timestamp, and serde to serialize and deserialize data to/from JSON.
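A dependency section along these lines should work (the version numbers below are assumptions; pick the latest compatible releases):

```toml
[dependencies]
actix-web = "4"
cassandra-cpp = "1"
chrono = "0.4"
serde = { version = "1", features = ["derive"] }
```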
Building the project will fetch and compile all the dependencies. This happens only once; subsequent builds will be faster.
Getting started with Actix is very straightforward. You can convert the “hello world” application generated by cargo new in the previous section into a minimal web application by putting the following code into the automatically generated src/main.rs file:
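A minimal sketch, written against the actix-web 4 API:

```rust
use actix_web::{get, App, HttpServer, Responder};

// A single GET handler that returns a static string.
#[get("/")]
async fn hello() -> impl Responder {
    "Hello Actix"
}

#[actix_web::main]
async fn main() -> std::io::Result<()> {
    // Start the embedded HTTP server on port 8080.
    HttpServer::new(|| App::new().service(hello))
        .bind(("127.0.0.1", 8080))?
        .run()
        .await
}
```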
This app will start the embedded web server, bind it to port 8080, and wait for requests. Sending a request for / results in the string Hello Actix being returned:
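For example:

```shell
curl http://localhost:8080/
# Hello Actix
```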
Let’s modify this code to accept our sensor readout data. First, let’s define a Rust structure to hold the results of measurements:
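A sketch, assuming the field names from the JSON payload:

```rust
use serde::Deserialize;

// Deserialized from the JSON body posted by a sensor.
#[derive(Debug, Deserialize)]
struct SensorReadout {
    sensor_id: i32,
    temperature: f32,
    humidity: f32,
}
```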
We’ll use a signed integer type for sensor_id, because Cassandra doesn’t provide a type for unsigned integers.
We’ll receive a SensorReadout struct in a POST handler:
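A sketch of such a handler (the /readouts path is an assumption of this example):

```rust
use actix_web::{post, web, HttpResponse, Responder};

// Actix deserializes the JSON body into a SensorReadout for us.
#[post("/readouts")]
async fn post_readout(readout: web::Json<SensorReadout>) -> impl Responder {
    println!("Received: {:?}", readout);
    HttpResponse::Ok().finish()
}
```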
Don’t forget to register the handler in the App object:
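Roughly:

```rust
// In main: register the new handler in the application factory.
HttpServer::new(|| App::new().service(post_readout))
    .bind(("127.0.0.1", 8080))?
    .run()
    .await
```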
Now you can curl the data to the application:
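For example (endpoint path assumed as above):

```shell
curl -X POST http://localhost:8080/readouts \
     -H 'Content-Type: application/json' \
     -d '{"sensor_id": 1, "temperature": 21.5, "humidity": 0.45}'
```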
And the application logs:
Note that the Content-Type header is very important. If it’s missing, the server won’t understand the input and will respond with HTTP 400 Bad Request, and the handler won’t get called at all. This might be a bit confusing, because in such a case you won’t see any errors anywhere. If this ever happens, you can inspect the detailed server response by adding the -i parameter to the curl command.
To summarize, in this section we coded a REST endpoint capable of receiving JSON-encoded messages as fully-typed Rust values, and all that in just about 20 lines of code. Now let’s see how we can save the data to the database.
Saving data to Cassandra
Before we can save anything, we need a running Cassandra cluster with a proper keyspace and a table. Let’s define the following schema in Cassandra Query Language (CQL):
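A schema consistent with the description above (the keyspace and table names are assumptions of this sketch):

```sql
CREATE KEYSPACE IF NOT EXISTS sensors
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

CREATE TABLE IF NOT EXISTS sensors.readouts (
  sensor_id   int,
  day         int,        -- days since the Unix epoch
  time        timestamp,
  temperature float,
  humidity    float,
  PRIMARY KEY ((sensor_id, day), time)
);
```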
We’re going to make the day number part of the partition key in order to keep partition sizes sane. This way, each partition holds the readouts of one sensor from a single day. For a deeper discussion of the various ways of modeling time series data in Cassandra, refer to this blog post by Jon Haddad.
If you wanted to use this schema in production, you’d need to at least tweak the replication strategy settings to meet the typical durability and reliability requirements.
Let’s add the code to establish the Cassandra connection to the main function:
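A sketch using the cassandra-cpp crate (method names may differ slightly between crate versions):

```rust
use cassandra_cpp::{Cluster, Session};

// Connect to a Cassandra node running locally.
fn connect() -> cassandra_cpp::Result<Session> {
    let mut cluster = Cluster::default();
    cluster.set_contact_points("127.0.0.1")?;
    cluster.connect()
}
```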
We’ll also need to prepare a statement in advance, so Cassandra doesn’t waste time re-parsing the query on each request:
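For example, with column names matching the schema sketched earlier (again, the exact cassandra-cpp method signatures may vary by crate version):

```rust
use cassandra_cpp::{PreparedStatement, Session};

// Prepare the INSERT once; Cassandra parses it a single time and
// we reuse the prepared statement for every incoming request.
fn prepare_insert(session: &Session) -> cassandra_cpp::Result<PreparedStatement> {
    let cql = "INSERT INTO sensors.readouts \
               (sensor_id, day, time, temperature, humidity) \
               VALUES (?, ?, ?, ?, ?)";
    session.prepare(cql)?.wait()
}
```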
Now we can call them in main:
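Roughly:

```rust
// In main, before starting the HTTP server:
let session = connect().expect("failed to connect to Cassandra");
let insert = prepare_insert(&session).expect("failed to prepare statement");
```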
Now we only need to use the session and statement objects to issue a CQL insert statement in the handler and we’re done… except there is one problem: how do we pass the session object and the prepared statement to the handler?
In some languages like C, C++, or Python (and probably dozens of others) you could just use global variables. But safe Rust doesn’t allow mutable global variables. In some other languages like Java or C# you could use a static field in a class, but there are no static fields in Rust either, nor classes. What about global singleton objects like in Scala or Kotlin? No luck here, either.
Fortunately, Actix offers a global application state feature that allows you to share a value between handlers. It must be a single value, though, and we have two (the session and the statement) so let’s pack them into a structure:
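For example:

```rust
use cassandra_cpp::{PreparedStatement, Session};

// Shared application state: one session plus the prepared INSERT.
struct CassandraCtx {
    session: Session,
    insert: PreparedStatement,
}
```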
In a more complex application, you’d obviously store more prepared statements there. (See this documentation for more on managing state in Actix.)
The following code registers our CassandraCtx in the App and launches the web server:
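A sketch against the actix-web 4 API:

```rust
// web::Data wraps CassandraCtx in an Arc, so cloning it inside the
// factory closure only bumps a reference count.
let ctx = web::Data::new(CassandraCtx { session, insert });

HttpServer::new(move || {
    App::new()
        .app_data(ctx.clone())
        .service(post_readout)
})
.bind(("0.0.0.0", 8080))?
.run()
.await
```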
Note that the HttpServer constructor takes an application factory, not an application instance. A separate application instance is created for each Actix worker thread. Consequently, the application data object must be cloneable, so that each worker thread gets its own copy. But we don’t want to clone CassandraCtx, and we couldn’t anyway, because Session is not cloneable. We need a single CassandraCtx and multiple references to it. Hence, we wrap the context object in web::Data, which internally uses atomic reference counting (Arc). We clone only the wrapper, not the wrapped context.
You might wonder what would happen if we didn’t wrap the context in web::Data, or forgot to clone it. No worries: Rust is safe, so it simply wouldn’t compile.
Now let’s write the code to save the received row:
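A sketch (the binding methods follow the cassandra-cpp API; the indices match the VALUES placeholders of the prepared INSERT above):

```rust
use chrono::Utc;

fn save_readout(ctx: &CassandraCtx, r: &SensorReadout) -> cassandra_cpp::Result<()> {
    let millis = Utc::now().timestamp_millis();
    // Day number since the Unix epoch, part of the partition key.
    let day = (millis / (24 * 3600 * 1000)) as i32;

    let mut stmt = ctx.insert.bind();
    stmt.bind_int32(0, r.sensor_id)?;
    stmt.bind_int32(1, day)?;
    stmt.bind_int64(2, millis)?; // timestamp bound as epoch millis
    stmt.bind_float(3, r.temperature)?;
    stmt.bind_float(4, r.humidity)?;

    ctx.session.execute(&stmt).wait()?;
    Ok(())
}
```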
Note that a timestamp must be converted to a Unix epoch in milliseconds and bound as a 64-bit integer.
Finally, we can rewrite the handler to call it:
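Roughly:

```rust
#[post("/readouts")]
async fn post_readout(
    ctx: web::Data<CassandraCtx>,
    readout: web::Json<SensorReadout>,
) -> impl Responder {
    match save_readout(&ctx, &readout) {
        Ok(()) => HttpResponse::Ok().finish(),
        Err(e) => HttpResponse::InternalServerError().body(e.to_string()),
    }
}
```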
So, how fast is it?
Before we start benchmarking, we need to make a minor change to the code to keep the benchmarks fair. Unfortunately, most HTTP benchmarking tools don’t offer a way to post different content in each request. If we posted data for the same sensor all the time, the updates would overwrite each other, because the resolution of the timestamp is only one millisecond and we’d be sending many requests within that time. Therefore, let’s add the rand crate to the project…
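In Cargo.toml (the version is an assumption):

```toml
[dependencies]
rand = "0.8"
```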
… and temporarily replace sensor_id with a randomly generated number:
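For example, in the handler (the range is arbitrary):

```rust
use rand::Rng;

// Temporary, for benchmarking only: spread writes across many
// sensor ids so concurrent requests don't overwrite each other.
let sensor_id: i32 = rand::thread_rng().gen_range(0..1_000_000);
```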
The first law of benchmarking says: “never perform benchmarks on a laptop.” Laptops are relatively low-performance, unreliable, get warm pretty quickly, throttle, etc. Also, running the benchmarking client, the web service process, and the database all on a single machine is probably a very bad idea when one wants to beat performance records. Even so, I purposely made all of these mistakes just to see what the *worst* case could look like.
Benchmarking with ApacheBench
My first attempt was to use the Apache HTTP benchmarking tool ab with default settings:
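The invocation was along these lines (readout.json is a hypothetical file holding the JSON payload, and the endpoint path is assumed):

```shell
ab -p readout.json -T application/json -n 100000 \
   http://localhost:8080/readouts
```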
Results with ab using default settings
That’s only about 2400 requests per second. It’s definitely not the performance level I expected, even on a tiny four-core development laptop. But I quickly realized that the default concurrency level was one. Let’s bump it up to 512:
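For example:

```shell
ab -p readout.json -T application/json -c 512 -n 1000000 \
   http://localhost:8080/readouts
```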
And, here are the results I got:
That’s much better, but my gut was telling me this could go faster. The Apache benchmarking tool is quite old and doesn’t use the HTTP keep-alive feature by default, so each request opens a separate connection, which is in fact a big cost compared to serving the request itself. After enabling keep-alive with the -k flag, the numbers improved again:
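ab enables keep-alive with the -k flag:

```shell
ab -k -p readout.json -T application/json -c 512 -n 1000000 \
   http://localhost:8080/readouts
```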
Results in ab with higher concurrency and with the HTTP keep-alive feature enabled
Benchmarking with API Bench
In search of a better benchmarking tool, I also tried apib, a modern replacement for ab which, unlike ab, is not limited to a single thread.
Initial results using apib defaults
This didn’t improve the throughput number I got from ab, but it produced a nice histogram and allowed me to test performance at 20,000 concurrent connections:
Results at higher scale
During that test, the web service process took only about 63 MB of RAM!
So overall, 50,000 requests per second over 20,000 concurrent connections on a laptop… Not bad, eh?
How fast is it on big iron?
Let’s try some real hardware. The server I got for this test had the following specs:
- CPU: 2x Intel(R) Xeon(R) CPU E5-2650L v3 @ 1.80GHz (24 physical cores total)
- RAM: 128 GB
I installed a single instance of DataStax Enterprise (DSE) 6.8 together with the web application on this machine and ran the benchmark again.
The initial attempt was surprisingly bad. I got only about 32,000 requests per second. It was slower than my laptop!
A quick look at iostat revealed the problem: the machine load was very low, with 70% of the CPU idle. This typically suggests there is insufficient parallelism and a single-threaded bottleneck somewhere. It turned out to be in the driver. We hadn’t done any parameter tuning yet, and the default number of I/O threads used by the driver is one, so all the requests were being serialized on a single thread. (Note, however, that the driver is internally asynchronous, so the thread is not blocked waiting for responses from the server.) Argh!
Let’s bump up the number of io-threads:
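The C++ driver controls this with cass_cluster_set_num_threads_io; the cassandra-cpp crate exposes it on Cluster (the exact method name may vary by crate version). For example, in the connection setup:

```rust
// Raise the driver's I/O thread count from the default of 1
// (4 is an illustrative value; tune it to the machine).
cluster.set_num_threads_io(4)?;
```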
This, together with increasing the number of HTTP connections, improved performance a lot:
Interestingly, I haven’t found much performance impact from increasing the number of connections between the app and the database. The highest throughput I could get with connection number tuning was about 220,000–230,000 requests per second. This shows that DataStax Enterprise is very good at balancing load across many threads even if all the write queries are received from a single connection.
In this blog post, we showed you how you can build a simple but performant web service in less than 100 lines of Rust code. The task was easy because of the great libraries that are available to abstract away most of the complexity of dealing with JSON deserialization, HTTP request handling, and sending data to Cassandra. The DataStax Cassandra driver integrates painlessly with Actix.
It’s also important to note that we didn’t need to use unsafe Rust features or any other low-level hacks to get great performance out-of-the-box. The resulting binary was less than 10 MB in size and had very low memory requirements even under load.
Further reading
- Developing High-Performance Apache Cassandra Applications in Rust (Part 1)
- Apache Cassandra
- Apache Cassandra Documentation: Client Drivers
- DataStax Enterprise (DSE)
- DataStax C++ Driver for Apache Cassandra (DataStax Documentation)
- DataStax C++ Driver for Apache Cassandra (GitHub)
- Actix Web Framework
- Getting Started with Actix
- Rust Toolchain Installation
- Cassandra Query Language (CQL)
- Cassandra Time Series Data Modeling For Massive Scale
- The Rustonomicon: Meet Safe and Unsafe
- Actix Documentation: Writing an Application
- Apache HTTP Benchmarking Tool