Cicada Distributed: Major Improvements!

Published in Geek Culture · 5 min read · May 24, 2021

Rewriting the Cicada testing framework from scratch

Last year, I wrote Cicada-2, a low-code testing framework. Since its release, I’ve considered ways to improve upon it, particularly for running load tests. This is the story of how I created Cicada Distributed, a Python-based load testing framework, and why I believe it should be the go-to tool for testing your services.

Why it was time for a rewrite

Cicada-2 was based on lessons I learned trying to test complex applications. While it was great for integration tests, it did not have the features I needed to write effective load tests. The hard-coded load model of Cicada-2 was limiting when it came to writing a test beyond something like “hit my API a bunch of times”. I wanted a tool that could hit a service hard enough to find out what its true limits were.

With that in mind, I started adding more programmatic testing features into Cicada-2. However, I quickly realized that I would rather be able to write tests in Python than with some awkward recursive YAML mixed with Jinja2. So I started over, and began writing Cicada from scratch.

A not particularly good load test written in Cicada-2 YAML

Rewriting Cicada again… and again

The core feature I wanted in the new Cicada was complete user control over the load model of the test. A test should be able not only to call a service a certain number of times, but also to deliver ramping load, scale to a threshold, and handle a bunch of other situations I hadn’t considered.

To do this, Cicada uses a virtual user model. Essentially, the code to simulate the actions of a user is run in parallel to create load. I wrote the initial version of this to run the virtual users inside of threads. Unfortunately, this turned out not to be a great approach. Print statements would break the rest of the test. I’d get weird bugs about what could and could not be pickled to run inside a thread. Code written outside of a test wouldn’t always work. In addition, the process managing the user threads became a bottleneck. It was hard to control which users could start and stop without significantly affecting performance.

How the new Cicada works

After several revisions, I settled on a distributed user model that was loosely coupled to the scenario via an event broker (Kafka as of right now). Instead of running on a managing instance, virtual users run inside of containers. This greatly simplifies the virtual user code because it allows Cicada to take advantage of a container orchestrator in managing the user pool, instead of managing individual threads across a machine or multiple machines. In addition, the event model allows users to receive commands and send back results at their own pace, making the test less prone to performance bottlenecks.
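To make the event model a bit more concrete, here is a minimal sketch of the idea. It is not Cicada’s actual implementation: in-memory queues stand in for the Kafka broker, threads stand in for user containers, and `virtual_user`, `run_scenario`, and the "run"/"stop" commands are hypothetical names chosen for illustration.

```python
import queue
import threading


def virtual_user(user_id, inbox, results, user_fn):
    """Consume commands from this user's inbox until told to stop."""
    while True:
        command = inbox.get()  # blocks until the scenario sends something
        if command == "stop":
            break
        # Report back whenever the work finishes; nobody polls the user.
        results.put((user_id, user_fn()))


def run_scenario(n_users, n_work_items, user_fn):
    """Spread work over the user pool and collect whatever comes back."""
    results = queue.Queue()  # stand-in for a results topic (assumption)
    inboxes = [queue.Queue() for _ in range(n_users)]  # stand-in for command topics
    users = [
        threading.Thread(target=virtual_user, args=(uid, inboxes[uid], results, user_fn))
        for uid in range(n_users)
    ]
    for user in users:
        user.start()

    # Divvy up the load round-robin, then tell every user to shut down.
    for item in range(n_work_items):
        inboxes[item % n_users].put("run")
    for inbox in inboxes:
        inbox.put("stop")

    for user in users:
        user.join()

    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected
```

Because each user only ever talks to its own inbox and the shared results queue, the scenario never has to wait on a slow user before handing work to a fast one.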

Bird’s eye view of Cicada Distributed’s Architecture

Another major improvement is how much more configurable Cicada Distributed’s load model is than Cicada-2’s. You write the load model in plain Python and control the scenario via an API. This means you can scale users up and down programmatically, as well as divvy up load amongst the user pool. Finally, you have complete control over how results are gathered and analyzed via user-definable aggregation and error-filtering functions.
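As a rough illustration of what “load model in plain Python” can mean, here is a sketch of a ramping load model with a custom aggregation function and error filter. Everything in it is an assumption made for the example: `ScenarioControl` is a hypothetical stand-in that only records scaling calls, and none of these names are Cicada’s real API.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ScenarioControl:
    """Hypothetical stand-in for a scenario API; it only records scaling calls."""
    users: int = 0
    history: list = field(default_factory=list)

    def scale_users(self, n):
        self.users = n
        self.history.append(n)


def ramping_load_model(control, start, stop, step):
    """Grow the user pool in steps until it reaches the threshold."""
    for n in range(start, stop + 1, step):
        control.scale_users(n)
        # A real load model would wait here and collect results between steps.


def aggregate_latencies(latencies_ms):
    """User-defined aggregation: reduce raw latencies to a small summary."""
    return {
        "count": len(latencies_ms),
        "mean_ms": mean(latencies_ms),
        "max_ms": max(latencies_ms),
    }


def only_server_errors(status_codes):
    """User-defined error filter: treat only 5xx responses as failures."""
    return [code for code in status_codes if code >= 500]
```

Since the load model is an ordinary function, swapping the ramp for a spike or a step-down pattern is a matter of changing a loop rather than learning a new configuration syntax.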

A quick example

To demonstrate the improved load testing features of Cicada Distributed, we’ll walk through an example of a simple test. For this example, I’ve created an API with an endpoint for creating a user and storing it in a database:
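If you want something to point the test at, a minimal stand-in (not the API used in this post) could look like the sketch below. It assumes a hypothetical POST /users endpoint, a single name field, port 8080, and SQLite via Python’s standard library.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def init_db(path=":memory:"):
    """Open the database and make sure the users table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return conn


def create_user(conn, name):
    """Insert one user and return it as a JSON-serializable dict."""
    cursor = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    conn.commit()
    return {"id": cursor.lastrowid, "name": name}


def make_server(conn, port=8080):
    """Build a single-threaded HTTP server exposing POST /users (hypothetical route)."""

    class UserHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/users":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            user = create_user(conn, payload.get("name", "anonymous"))
            body = json.dumps(user).encode()
            self.send_response(201)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return HTTPServer(("0.0.0.0", port), UserHandler)
```

Calling `make_server(init_db()).serve_forever()` runs it in the foreground. A single-threaded server like this will saturate quickly, which is fine for trying out a load test but says little about a production service.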

For a basic load test, we can hit this endpoint with a limited number of users for a certain time. First, we’ll need to install Docker and Cicada Distributed and create a blank project:

pip install cicadad
mkdir load-test
cd load-test
cicada-distributed init .

In the load-test directory, you’ll see a couple of files:

Dockerfile
test.py

Because Cicada uses Docker to package the tests, you can add any dependencies to the image to use in a user or scenario. Add the requests package to the Dockerfile:

Next, update test.py with a basic load test:

In this example, Cicada will perform the post_user test to create a user for 180 seconds with 30 users. Additionally, each user is limited to 4 requests per second. To execute the test, you’ll need to start the cluster (an event broker and a service to create containers) and run the test:
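As a sanity check on those numbers: 30 users at up to 4 requests per second gives a ceiling of 120 requests per second, or 21,600 requests over 180 seconds. The sketch below works that out and shows one way a per-user rate limit can be enforced. `max_requests` and `rate_limited_loop` are hypothetical helpers written for illustration, not Cicada’s `iterations_per_second_limited` implementation.

```python
import time


def max_requests(users, per_second, seconds):
    """Upper bound on total requests if every user hits its limit the whole time."""
    return users * per_second * seconds


def rate_limited_loop(user_fn, per_second, duration, clock=time.monotonic, sleep=time.sleep):
    """Call user_fn repeatedly for `duration` seconds, capping the average rate.

    Iteration k is not allowed to start before started + k / per_second.
    `clock` and `sleep` are injectable so the loop can be tested without waiting.
    """
    interval = 1.0 / per_second
    started = clock()
    deadline = started + duration
    completed = 0
    while True:
        next_slot = started + completed * interval
        now = clock()
        if max(next_slot, now) >= deadline:
            break  # either the schedule or the wall clock has run out
        if now < next_slot:
            sleep(next_slot - now)  # ahead of schedule: wait for the next slot
        user_fn()
        completed += 1
    return completed
```

The real request rate will usually sit below the ceiling, since each request also takes time to complete and a slow API drags every user’s loop down with it.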

cicada-distributed start-cluster
cicada-distributed run

When this runs, we’ll end up with a load curve that looks like this (I used Prometheus + Grafana to monitor the API):

The API’s load with 30 users at 4 requests per second

What if we wanted to see how much load the API could take in 3 minutes? Replace the line @user_loop(iterations_per_second_limited(4)) with @user_loop(while_alive()) (import it with from cicadad.core.scenario import while_alive). This removes the constraint on requests per second so the virtual users can make as many requests as possible. On my machine, I was able to process approximately 200 requests per second (although Cicada’s virtual users are capable of putting out a much higher RPS against a more capable host). Let me know in the comments how much load your system was able to handle.

Load with iterations per second limit removed

Conclusion

I’m much more satisfied with Cicada Distributed’s flexibility than I was with Cicada-2’s. Please feel free to try Cicada Distributed out and let me know what you think!