Cicada Distributed: Major Improvements!
Rewriting the Cicada testing framework from scratch
Last year, I wrote Cicada-2, a low-code testing framework. Since its release, I’ve considered ways to improve upon it, particularly for running load tests. This is the story of how I created Cicada Distributed, a Python-based load testing framework, and why I believe it should be the go-to tool for testing your services.
Why it was time for a rewrite
Cicada-2 was based on lessons I learned trying to test complex applications. While it was great for integration tests, it did not have the features I needed to write effective load tests. The hard-coded load model of Cicada-2 was limiting when it came to writing a test beyond something like “hit my API a bunch of times”. I wanted a tool that could hit a service hard enough to find out what its true limits were.
With that in mind, I started adding more programmatic testing features into Cicada-2. However, I quickly realized that I would rather be able to write tests in Python than with some awkward recursive YAML mixed with Jinja2. So I started over, and began writing Cicada from scratch.
Rewriting Cicada again… and again
The core feature I wanted in the new Cicada was complete user control over the load model of the test. A test should be able not only to call a service a certain number of times, but also to deliver ramping load, scale to a threshold, and handle a bunch of other situations I hadn’t considered.
To do this, Cicada uses a virtual user model. Essentially, the code to simulate the actions of a user is run in parallel to create load. I wrote the initial version of this to run the virtual users inside of threads. Unfortunately, this turned out not to be a great approach. Print statements would break the rest of the test. I’d get weird bugs about what could and could not be pickled to run inside a thread. Code written outside of a test wouldn’t always work. In addition, the process managing the user threads became a bottleneck. It was hard to control which users could start and stop without significantly affecting performance.
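To make the virtual user model concrete, here is a minimal sketch of the thread-based approach described above. It is not Cicada's code: `run_virtual_users` and `fake_request` are hypothetical names, and the sleep stands in for a real HTTP call.

```python
import threading
import time
from typing import Callable, List


def run_virtual_users(
    user_fn: Callable[[int], None], n_users: int, duration: float
) -> List[int]:
    """Run n_users copies of user_fn in threads for `duration` seconds.

    Returns how many iterations each virtual user completed.
    """
    stop = threading.Event()
    counts = [0] * n_users  # each thread only writes to its own slot

    def loop(user_id: int) -> None:
        # A virtual user repeats the same action until told to stop.
        while not stop.is_set():
            user_fn(user_id)
            counts[user_id] += 1

    threads = [threading.Thread(target=loop, args=(i,)) for i in range(n_users)]
    for thread in threads:
        thread.start()
    time.sleep(duration)
    stop.set()
    for thread in threads:
        thread.join()
    return counts


def fake_request(user_id: int) -> None:
    """Hypothetical stand-in for one user action, such as an HTTP call."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(run_virtual_users(fake_request, n_users=3, duration=0.2))
```

Even in this toy version, a single managing process owns every thread and the only control it has is one shared stop flag, which hints at why starting and stopping individual users became awkward.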
How the new Cicada works
After several revisions, I settled on a distributed user model that was loosely coupled to the scenario via an event broker (Kafka as of right now). Instead of running on a managing instance, virtual users run inside of containers. This greatly simplifies the virtual user code because it allows Cicada to take advantage of a container orchestrator in managing the user pool, instead of managing individual threads across a machine or multiple machines. In addition, the event model allows users to receive commands and send back results at their own pace, making the test less prone to performance bottlenecks.
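The loose coupling can be sketched without Kafka or containers. In the illustration below, `queue.Queue` objects stand in for broker topics and threads stand in for user containers; the names `virtual_user`, `run_scenario`, `"START"`, and `"STOP"` are assumptions made for the example, not Cicada's actual protocol.

```python
import queue
import threading
from typing import Dict, List


def virtual_user(user_id: str, commands: queue.Queue, results: queue.Queue) -> None:
    """Consume commands at the user's own pace and publish results.

    The user never calls the scenario directly; it only reads and writes queues.
    """
    while True:
        command = commands.get()  # blocks until the scenario sends something
        if command == "STOP":
            break
        # "START" stands in for "run one iteration of the user's work".
        results.put({"user": user_id, "command": command, "ok": True})


def run_scenario(n_users: int, iterations: int) -> List[Dict]:
    """Send work to every user through its inbox, then collect the results."""
    results: queue.Queue = queue.Queue()
    inboxes = {f"user-{i}": queue.Queue() for i in range(n_users)}
    workers = [
        threading.Thread(target=virtual_user, args=(user_id, inbox, results))
        for user_id, inbox in inboxes.items()
    ]
    for worker in workers:
        worker.start()
    for inbox in inboxes.values():
        for _ in range(iterations):
            inbox.put("START")
        inbox.put("STOP")
    for worker in workers:
        worker.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


if __name__ == "__main__":
    print(len(run_scenario(n_users=3, iterations=4)))
```

The scenario side only publishes commands and reads results, so a slow user delays its own results rather than blocking the others.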
Another major improvement is how much more configurable Cicada Distributed’s load model is than Cicada-2’s. You write the load model in plain Python and control the scenario via an API. This means you can scale users up and down programmatically, as well as divvy up load amongst the user pool. Finally, you have complete control over how results are gathered and analyzed via user-definable aggregation and error filtering functions.
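As a rough illustration of what "plain Python" buys you, the sketch below shows a ramping user schedule and a running aggregation of results. Both functions are hypothetical helpers written for this post's argument; they do not use or mirror the Cicada Distributed API.

```python
from typing import Iterator, List, Optional


def ramp_users(start: int, end: int, step: int) -> Iterator[int]:
    """Yield target user counts from start up to end (inclusive)."""
    current = start
    while current < end:
        yield current
        current += step
    yield end


def aggregate_latencies(previous: Optional[dict], latencies_ms: List[float]) -> dict:
    """Fold a new batch of latencies into a running summary."""
    summary = previous or {"count": 0, "total_ms": 0.0, "max_ms": 0.0}
    count = summary["count"] + len(latencies_ms)
    total = summary["total_ms"] + sum(latencies_ms)
    return {
        "count": count,
        "total_ms": total,
        "max_ms": max([summary["max_ms"], *latencies_ms]),
        "mean_ms": total / count if count else 0.0,
    }


if __name__ == "__main__":
    # Scale the pool in steps of 10 users, summarizing results as they arrive.
    summary = None
    for target in ramp_users(10, 30, 10):
        batch = [float(target)] * 2  # placeholder latencies for the example
        summary = aggregate_latencies(summary, batch)
        print(target, summary["mean_ms"])
```

Because the schedule is an ordinary generator, swapping a linear ramp for a step function or a threshold search is a matter of writing a different loop.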
A quick example
To demonstrate the improved load testing features of Cicada Distributed, we’ll walk through an example of a simple test. For this example, I’ve created an API with an endpoint for creating a user and storing it in a database:
For a basic load test, we can hit this endpoint with a limited number of users for a certain time. First, we’ll need to install Docker and Cicada Distributed and create a blank project:
pip install cicadad
mkdir load-test
cd load-test
cicada-distributed init .
In the load-test directory, you’ll see a couple of files:
Dockerfile
test.py
Because Cicada uses Docker to package the tests, you can add any dependencies to the image to use in a user or scenario. Add the requests package to the Dockerfile:
Next, update test.py with a basic load test:
In this example, Cicada will perform the post_user test to create a user for 180 seconds with 30 users. Additionally, each user is limited to 4 requests per second. To execute the test, you’ll need to start the cluster (an event broker and a service to create containers) and run the test:
cicada-distributed start-cluster
cicada-distributed run
When this runs, we’ll end up with a load curve that looks like this (I used Prometheus + Grafana to monitor the API):
What if we wanted to see how much load the API could take in 3 minutes? Remove the line @user_loop(iterations_per_second_limited(4)) and replace it with @user_loop(while_alive()) (import it with from cicadad.core.scenario import while_alive). This will remove the constraint on requests per second so the virtual users can make as many requests as possible. On my machine, I was able to process approximately 200 requests per second (although Cicada’s virtual users are capable of putting out a much higher RPS against a more capable host). Let me know in the comments how much load your system was able to handle.
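For intuition about the difference between the throttled and unthrottled loops, here is a small standalone sketch. `run_user_loop` is a hypothetical function written for this example; it does not show how `iterations_per_second_limited` or `while_alive` are implemented inside Cicada, only the general idea of capping a loop's rate versus letting it run freely. The constants at the bottom work out the ceiling implied by the numbers used earlier (30 users, 4 requests per second each, 180 seconds).

```python
import time
from typing import Callable, Optional


def run_user_loop(
    user_fn: Callable[[], None],
    duration: float,
    limit_per_second: Optional[float] = None,
) -> int:
    """Call user_fn repeatedly for `duration` seconds.

    With limit_per_second set, each iteration is padded with a sleep so the
    loop runs at most that many times per second. Without it, the loop runs
    as fast as user_fn allows.
    """
    min_interval = 1.0 / limit_per_second if limit_per_second else 0.0
    deadline = time.monotonic() + duration
    iterations = 0
    while time.monotonic() < deadline:
        started = time.monotonic()
        user_fn()
        iterations += 1
        remaining = min_interval - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
    return iterations


# Ceiling for the throttled test described in this post.
USERS = 30
LIMIT_PER_USER = 4       # requests per second per user
DURATION_SECONDS = 180
MAX_RPS = USERS * LIMIT_PER_USER             # 120 requests per second
MAX_REQUESTS = MAX_RPS * DURATION_SECONDS    # 21,600 requests in total

if __name__ == "__main__":
    print(run_user_loop(lambda: None, duration=0.3, limit_per_second=20))
    print(run_user_loop(lambda: None, duration=0.3))
```

The throttled test can never exceed 120 requests per second across the pool, which is why removing the limit is needed to find the point where the API itself, rather than the test, becomes the constraint.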
Conclusion
I’m much more satisfied with Cicada Distributed’s flexibility than I was with Cicada-2’s. Please feel free to try Cicada Distributed out and let me know what you think!