Locust experiments — Feeding the locusts

Karol Brejna
Locust.io experiments
11 min read · Apr 8, 2019

There are times when you want your Locust scripts to use some specific input data in order to perform the tests.

“Closeup of eyes and antennae on a grasshopper” by Boris Smokrovic on Unsplash

For example:

  • you want to use a list of users and their credentials in order to log in to some service,
  • you have a set of form values that need to be submitted,
  • you need to upload specific files as a part of requests to your service,
  • you want to have repeatable/comparable test conditions (by providing the exact same input data),
  • you have some data that exploits the service’s corner cases (validation, special values, special characters, etc.).

In this “episode”, I’ll try to examine one of the ways of addressing such problems.

Along the way, the following “techniques” will be used, which can also come in handy in other Locust test development:

  • determining if the code runs on master or slave,
  • doing some preparations before the tests start,
  • sending some data from the master to workers.

The use-case

Let’s write down our requirements.

Assume that we are testing a web service’s endpoint that accepts POST requests with a JSON payload.

For this, we want to use a specific data set. Let’s say, we have a CSV file where every row represents a single message to be sent. We want to be able to read the content before the tests start.

Furthermore, we need to be able to run the test in a distributed manner, with multiple slave workers issuing the POST requests.

Moreover, we don’t want to have any duplicates (sending the same value more than once).

The plan

So, we want Locust to be able to read the file and its workers to use the data to test the service, but each record should be utilized just once.

It seems that in a naive approach — each worker reads the file on its own — it’s hard to make sure that one worker doesn’t use the record that some other worker had already processed.

When dealing with this in a standalone setup (a single Locust node, no additional workers), coordination among the tasks is not an issue — they all run in the same “place”. We’d probably build a queue containing the data upfront (before the tests start) and then have the tasks consume the records one by one.
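For a standalone run, such a queue-based approach could look roughly like the sketch below (the data.csv file, the /post endpoint and the class names are only illustrative, not the code used later in this article):

import csv
import queue

from locust import HttpLocust, TaskSet, task

# read the whole data set once, before the tests start, into a shared queue
data_queue = queue.Queue()
with open("data.csv", newline="") as f:
    for row in csv.DictReader(f):
        data_queue.put(dict(row))

class StandaloneTaskSet(TaskSet):
    @task
    def send_record(self):
        try:
            record = data_queue.get(block=False)  # each record is consumed exactly once
        except queue.Empty:
            return  # nothing left to send
        self.client.post("/post", json=record)

class StandaloneLocust(HttpLocust):
    task_set = StandaloneTaskSet
    min_wait = 1000
    max_wait = 1000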

How could we achieve that in a distributed setup?

In general, in Locust, it’s the master who is in charge of things like spawning new locusts (users), collecting and presenting results, running some preparatory tasks, etc. So, it appears like we need some code that runs on the master and will:

  • get the data from the disk
  • communicate with the workers, to exchange data

Preparations

Let’s see what we need to do, how to do it, and whether we have the “tools” for that.

Running pre-test actions

There are some points in a Locust cluster’s or test session’s lifecycle that you could use to run required actions — setting up resources before the tests or cleaning things up afterward. A few of these “moments” are:

  • when the cluster starts
  • when the tests start (for example, a user presses “Start swarming” button on the UI)
  • when all the locusts have hatched (the process of creating new test users has finished)

The topic itself probably deserves a dedicated article, as it is very interesting and a bit confusing at the same time. Let’s focus for now on what we need to do here.

So, we need to load the input data. As it is a static CSV file, doing this once is enough (there is no point in rereading the dataset every time the test starts — it’s a waste of resources).

Then, we want to begin distributing the data every time the tests are started.

Running code at startup

This case is pretty simple. It’s enough to put the code in the global scope of your locust file. The following will execute do_some_stuff() when the cluster is coming up.
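A minimal sketch of that idea (do_some_stuff is just a placeholder for your own preparation logic):

# locustfile.py (excerpt): module-level code runs once, when the file is loaded on a starting node
def do_some_stuff():
    # placeholder for one-off preparations, e.g. reading the input file
    print("The node is coming up, doing some stuff...")

do_some_stuff()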

Please note that, due to the GIL (and unless you take care of it explicitly), time-consuming code here will “stop the world” — the rest of the code has to wait for it to finish.

Running code before tests

There are at least a few options here. Without going into too much detail (as I said, the topic probably deserves an article of its own), I chose to use the master_start_hatching event hook for that.

from locust import events

def on_master_start_hatching(**kw):
    print("Do some work when the tests are started")

events.master_start_hatching += on_master_start_hatching

Using the preceding code in the locust file does the required job:

  • defines a handler (a function that will do the actual work) for the event
  • registers the handler

(It turns out that master_start_hatching is not fired in standalone mode, so the code will only run properly in distributed mode).

It looks like we have the topic covered.

Detecting the master

According to the documentation (at the time of writing, the latest version is 0.9), as I understand it, the master and the workers use the same test script file:

Both the master and each slave machine, must have a copy of the locust test scripts when running Locust distributed.

Because we are going to run some code on the master exclusively (not on the slaves), we need to find a way to detect the current running context.

The context of running tests

If you take a look into the Locust code, you’ll find the runners module with classes responsible for managing the locusts’ lifecycle. There are subclasses dedicated to master, slave and standalone nodes that can be used to check the current node type.
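For Locust 0.9, such a check could look more or less like this (a sketch; the helper function names are mine):

from locust import runners

def is_master():
    # runners.locust_runner holds the runner instance created for this node
    return isinstance(runners.locust_runner, runners.MasterLocustRunner)

def is_slave():
    return isinstance(runners.locust_runner, runners.SlaveLocustRunner)

def is_standalone():
    return isinstance(runners.locust_runner, runners.LocalLocustRunner)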

Please remember that, when running in standalone mode, the node plays the role of both master and slave, so you may want to take this into account if you want your code to be portable between standalone and distributed mode.

The context of locust file code

If there is some code running in the global scope of your locust file, you can detect where it runs there as well.
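One simple way (my choice here; other approaches are possible) is to look at the command-line flags the node was started with; in Locust 0.9, distributed nodes are launched with --master or --slave:

import sys

def runs_on_master():
    # the master node is started with "locust --master ..."
    return "--master" in sys.argv

def runs_on_slave():
    # slave nodes are started with "locust --slave ..."
    return "--slave" in sys.argv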

Reading CSV files

I used the csv package here to create a simple object whose sole responsibility is to read the file and return an array of dictionary objects (so attributes can be accessed by CSV column name).
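A sketch of such a reader (the repository calls it SourceDataReader; the implementation there may differ in details):

import csv

class SourceDataReader:
    """Reads a CSV file (with a header row) and returns its rows as a list of dicts."""
    def __init__(self, file_path):
        self.file_path = file_path

    def read(self):
        with open(self.file_path, newline="") as f:
            return [dict(row) for row in csv.DictReader(f)]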

Master-slave communication

First of all, let’s figure out how the master could distribute some information among its slaves.

There could be many approaches here, but for this experiment, let’s choose one that:

  • is simple enough to illustrate the principle,
  • doesn’t introduce new dependencies (new external components, frameworks).

Well, Locust comes with ZeroMQ as its communication “backend”. A quick browse through the library’s documentation shows that it covers several messaging patterns, including one that looks particularly interesting for us:

Take a look at the following diagram (taken from the 0mq docs):

source: https://github.com/imatix/zguide/raw/master/images/fig5.png

In short: The ventilator pushes tasks to the workers. Workers pull the data and do their job. The workers send their results to a sink.

In our case:

  • The master would play the role of the ventilator
  • Tasks running on slaves would act as workers
  • We are not interested in sending any results from the workers (no pushing to the sink is required)

These steps cover the input data distribution scenario we’ve planned, so ZeroMQ should be more than enough for this experiment.

The execution

OK, here comes the spoiler: the ventilator-workers pattern looked really great, but it didn’t work so great for me. (Of course, I only found that out after writing all the code.)

Disappointment

After running some tests (with different numbers of workers and different hatch rates) I saw that the data distribution behaved really strangely.

The messages’ order was greatly disturbed and the timing was, let’s call it, less than optimal.

I don’t have a definite explanation for this yet, only a working theory.

The docs say:

A socket of type ZMQ_PUSH is used by a pipeline node to send messages to downstream pipeline nodes. Messages are round-robined to all connected downstream nodes.

If the receivers are chosen with a simple round-robin, without knowing if they are able to consume the message, there is no real load balancing going on there. I think the following happens:

Push-pull based distribution

On the left, we have an outline of an ideal situation. On the right, we have a worker that is about to receive a message, but it’s busy. I suspect that sending blocks (and stays blocked until the worker is finally ready for the message), messing up the entire flow.

New hope

I changed the data distribution approach to one that lets a worker “volunteer” to receive a message when it is ready for it. If a worker is busy, it simply doesn’t take part in the data exchange and doesn’t prevent other workers from consuming the messages:

Request-response based distribution

I’ve kept the problematic code in locust-scripts/locustfile_pushpull.py for further investigation (or for the reader’s inquiring mind) and will focus on the new solution.

The key elements here are:

  • code for sending the data to workers
  • code requesting new data
  • code initializing the process

Let’s quickly go through them…

The sending logic is quite simple.
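Here is a sketch of the feeder class (the name ZMQFeeder matches the initialization snippet below; treat this as an approximation of the code in the repository, not a verbatim copy):

import json
import queue

import zmq.green as zmq  # gevent-friendly ZeroMQ bindings shipped with pyzmq

class ZMQFeeder:
    """Hands out input records over a REQ/REP socket, one per request."""
    def __init__(self, data, bind_address):
        # store the input data in an internal queue
        self.data_queue = queue.Queue()
        for row in data:
            self.data_queue.put(row)
        context = zmq.Context()
        self.socket = context.socket(zmq.REP)
        self.socket.bind(bind_address)

    def run(self):
        while True:
            message = self.socket.recv_string()  # block until a worker reports in
            reply = "{}"  # an empty JSON object means "nothing more to do"
            if message == "available":
                try:
                    reply = json.dumps(self.data_queue.get(block=False))
                except queue.Empty:
                    pass
            self.socket.send_string(reply)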

The class uses an internal queue to store the input data (the data to be sent); the queue is populated on init.

The run method loops forever and blocks waiting for a new message. If the message text is “available”, it tries to take a portion of data from the queue. If the queue is empty, it sends a special message back (an empty JSON object) to inform the client that there is nothing more to do; otherwise, it sends the data to the worker.

The feeder object can be initialized like this:

import gevent
from locust import events

# INPUT_DATA, FEEDER_BIND_PORT and the ZMQFeeder class are defined elsewhere in the locust file
def init_feeder():
    sender = ZMQFeeder(INPUT_DATA, f"tcp://0.0.0.0:{FEEDER_BIND_PORT}")
    sender.run()

def on_master_start_hatching():
    gevent.spawn(init_feeder)

events.master_start_hatching += on_master_start_hatching

It will be executed on the master when the tests are started. An important detail here is that the feeder works in its own greenlet, so it does not block other code running on the master.

The receiving part is even simpler.
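A sketch of the requester used on the slave side (again an approximation; the await_data method name matches the description below):

import json

import zmq.green as zmq

class ZMQRequester:
    """Asks the feeder for a new portion of data and returns it."""
    def __init__(self, connect_address):
        context = zmq.Context()
        self.socket = context.socket(zmq.REQ)
        self.socket.connect(connect_address)

    def await_data(self):
        self.socket.send_string("available")  # volunteer for the next record
        return json.loads(self.socket.recv_string())  # block until the master replies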

The await_data method’s job is to ask the master for new data, then receive and return it when available. The test code that utilizes the class runs on the slaves.
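It could look more or less like the following sketch (reusing the ZMQRequester from above; the environment variable names, the default addresses and the /post endpoint are illustrative):

import os

from locust import HttpLocust, TaskSet, task

FEEDER_HOST = os.getenv("FEEDER_HOST", "locust-master")
FEEDER_PORT = os.getenv("FEEDER_PORT", "5555")

class FeederTaskSet(TaskSet):
    def on_start(self):
        # every simulated user opens its own connection to the master's feeder
        self.requester = ZMQRequester(f"tcp://{FEEDER_HOST}:{FEEDER_PORT}")

    @task
    def task1(self):
        data = self.requester.await_data()  # blocks until the master hands out a record
        if data:  # an empty dict means there is nothing left to send
            self.client.post("/post", json=data)

class FeederLocust(HttpLocust):
    task_set = FeederTaskSet
    min_wait = 1000
    max_wait = 1000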

On start, it initializes the requester. The actual work happens in the task1 method: it waits for new data to be delivered and, if something meaningful was sent, uses it to make the request.

Now that we have all the code in place (see locust-scripts/locustfile.py), let’s try it out.

Testing

To test the solution we need to run Locust in distributed mode. You could do this by manually running the master and the slaves yourself from the command line. Another option is doing it with Kubernetes.

This time, however, I’ll use Docker Compose (my Kubernetes doesn’t want to come up after the last Windows update…).

Setting up the cluster

To set up the cluster, two versions of the Docker Compose configuration were created: one for running Locust headless (without the web UI) and one for running the UI, too.

The Docker Compose files include the service definitions for the Locust master and slave, with all required environment variables set and a volume containing the test code configured. Take a look at docker-compose-headless.yml, for example:

Locust master and slave docker compose definition

You will notice that the test will run for 10 seconds (-t 10s) with no GUI (--no-web). You will also notice the settings for the feeder (host, port), the delay between the tasks (TASK_DELAY) and the address of the web service we are testing (ATTACKED_HOST).

Tested service mock

The Docker Compose file also contains a small section for simulating the system under test.

The whole use case (to remind you) is about reading some input data and using it to form POST requests. In order not to write anything myself, I quickly googled a Docker image that would play the role of the tested system.

The service is able to accept different requests, including POST, and it will simply log the request (with the body) so we can confirm which data was used.

Maybe mocking the system under test could be done more easily, but for me, using the “kennethreitz/httpbin” and “lucascimon/nginx-logging-proxy” Docker images was the quickest way.

Test data

This experiment assumes there is a SourceDataReader class that reads the data.csv file. The code “doesn’t care” about the format of the data it uses. It is written in such a way that after reading the CSV file, it returns an array of dictionary objects (so attributes can be accessed by CSV column name).

The test data I used is a list of top-ranked chess players containing 1000 records. I obtained it with data/obtain_elo_ranks.py. See data/README.md for the details.

If you want to play with your own data, make sure it is a comma-separated CSV file (with a header).

Results

Let’s start the tests:

docker-compose -f docker-compose-headless.yml up

This will bring up all the containers — Locust master, slave, tested service — and start logging in the console.

The logs show the order of events: starting the nodes, the master reading the input data, starting the “feeder”, and then the slave requesting new data and using it to issue POST requests.

It will probably take some time for you to find some order in this.

In the real code (see the associated GitHub project), I also made the slaves append the data they receive, with timestamps, to a text file (received.txt). This way it is easier to track the data distribution.

Conclusions

It looks like the presented way of distributing data to Locust slaves works quite well. Writing the communication code with ZeroMQ was quite easy, although it did require a fairly thorough understanding of ZeroMQ’s internals.

The implementation is probably enough for most of the simple use-cases.

There are topics here, though, that would require more thought. For example, should we stop the tests when we run out of input data? If yes, then how? Expanding on this would open up some possibilities for more sophisticated testing. Probably a topic for a future experiment…

Anyhow, I hope you can find the article helpful and informative.

Thank You!

A few last formalities:

As usual, the sources mentioned in this article are stored in https://github.com/karol-brejna-i/locust-experiments. The feeding-locusts folder holds the locust files, Docker Compose files, etc. (see the README for details).

A Docker image for running Locust 0.9.0 on Python 3.6 was used (source: https://github.com/karol-brejna-i/docker-locust).
