Distributed Load Testing on K8s Using Locust

Varun Mallya
Dec 3, 2021


One of the main objectives of load testing is to identify the maximum operating capacity of the application. We can simulate real load scenarios and understand how our system performs under such loads, test out if our application can scale and if the current setup has enough resources to support the load.

This eventually helps us resolve issues we may encounter in the future. For example, if we have a web app, we may want to identify the request latency at peak load, or find out what percentage of requests fail and whether that is within our SLO.

Github Repo: https://github.com/vmallya-123/learn-locust


Locust is one of many open-source tools which aid in load testing. There are many other tools in the market like JMeter, Gatling, etc. However, Locust has some advantages over them, the biggest one being that the locustfile is just a regular Python file.

A simple locustfile is a list of tasks that simulate the activities a user performs when they interact with your app. There is no need to work with a separate UI to set up a load testing job. Another big plus is that we can perform load testing in a distributed manner with just a few additional lines of code. Locust also comes with features such as event hooks and messaging, which help with setup and with communication between the master and workers (more on this later!).

Load Testing Example

Let us first take a small web app against which we can run our load test. This is a FastAPI app with two main endpoints:

from fastapi import FastAPI, Request

app = FastAPI()

@app.get("/")
async def hello(info: Request):
    return {"Hello": "World"}

@app.post("/getInfo")
async def getInformation(info: Request):
    req_info = await info.json()
    return {
        "status": "SUCCESS",
        "data": req_info,
    }

Now, let’s write a simple load test scenario. This small script can run a load test:

from locust import HttpUser, task, between
import json
URL = "http://localhost:8000/"
sample_data = {"id": 123, "name": "Jay", "city": "Pune"}

class APIUser(HttpUser):
    wait_time = between(1, 2)

    @task(1)
    def hello_world(self):
        self.client.get(URL)

    @task(2)
    def get_information(self):
        self.client.post(URL + "getInfo", json.dumps(sample_data))

We can run this by simply invoking the locust command in the CLI, from the directory containing the locustfile:

locust -f locustfile.py

This points us to a web UI (on port 8089 by default) where we can specify the test parameters.

Click on Start swarming to initiate your test!

Let us take a closer look at the locustfile, mainly at the task decorator and the between function. The task decorator lists the tasks, i.e. the endpoints a user will hit. Each task can be assigned a weight: in the example above, the getInfo task has twice the weight of "hello world", which makes Locust twice as likely to pick the getInfo task. To make a user sleep between consecutive task executions, we use between, which here makes the user wait from 1 to 2 seconds before picking up the next task.
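As a rough illustration of how weights behave (this is not Locust's internal scheduler, just a hypothetical simulation of weighted picking):

```python
import random

# Hypothetical simulation of Locust's weighted task selection:
# a weight-2 task should be picked roughly twice as often as a weight-1 task.
weighted_tasks = ["hello_world"] + ["get_information"] * 2  # weights 1 and 2
counts = {"hello_world": 0, "get_information": 0}
random.seed(42)  # fixed seed so the simulation is repeatable
for _ in range(30_000):
    counts[random.choice(weighted_tasks)] += 1

ratio = counts["get_information"] / counts["hello_world"]
```

Over many iterations the ratio converges to roughly 2, matching the assigned weights.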

Distributed Load Testing

The above example can run on a single machine but sometimes we may need to run the load test in a distributed fashion. This may be when we have a complicated test plan or we want to generate even more load. Locust supports this.

Let us take another web app with a single endpoint that returns the size of a file. We will generate distributed load for this app. The setup involves one master and 5 workers. The role of the master is to divide the load between the workers and keep track of their progress. In this case, the master distributes the files equally between the workers, and the workers hit the app endpoint with those files after downloading them from a GCS bucket.

The setup would look like this:

The locustfile for distributed load testing uses Locust's built-in event hooks:

from locust import events
from locust.runners import MasterRunner, WorkerRunner
import logging

@events.init.add_listener
def on_locust_init(environment, **_kwargs):
    """This runs on initialization of Locust, before any test starts"""
    logging.info("initializing locust message types and listeners")
    # everything that is not the master (workers and local runners)
    # listens for file assignments
    if not isinstance(environment.runner, MasterRunner):
        environment.runner.register_message("handle_files", setup_workers)
    # everything that is not a worker (the master and local runners)
    # listens for acknowledgements
    if not isinstance(environment.runner, WorkerRunner):
        environment.runner.register_message("acknowledge_file_recv", on_acknowledge)

Event hooks allow us to run code at specific points in a test's lifecycle. The above code, for example, runs when Locust is initialised. When Locust runs in a distributed environment, the same locustfile is shared between the workers and the master. However, we may still want to run some logic on the master but not on the workers. Locust allows us to differentiate between the two via the environment.runner attribute: the runner can be a MasterRunner, WorkerRunner or LocalRunner.

Locust also allows us to define messaging hooks via register_message. These help the master and the workers communicate. In the function above, we register two message types: handle_files and acknowledge_file_recv. handle_files has a listener (setup_workers) associated with it, which runs every time a message of that type is received.

Therefore, every time a worker receives a message of type handle_files, setup_workers runs and downloads the files from GCS to a local path. After a successful download, it sends a message back to the master informing that the file has been processed. This is done by sending a message to acknowledge_file_recv.
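The repository contains the actual implementations; as a rough, hypothetical sketch of what these two listeners could look like (the GCS download is stubbed out here, and only the function names are taken from the registration code above):

```python
import logging

def download_from_gcs(file_name, local_dir="/tmp/locust-files"):
    # Stand-in for a real GCS download (e.g. blob.download_to_filename);
    # here we only compute the local path the file would land in.
    return f"{local_dir}/{file_name}"

def setup_workers(environment, msg, **_kwargs):
    """Runs on a worker whenever a 'handle_files' message arrives."""
    local_paths = [download_from_gcs(name) for name in msg.data]
    # hypothetical attribute the worker's tasks could read from
    environment.assigned_files = local_paths
    # tell the master this batch of files has been processed
    environment.runner.send_message("acknowledge_file_recv", len(local_paths))

def on_acknowledge(environment, msg, **_kwargs):
    """Runs on the master when a worker acknowledges its file batch."""
    logging.info(f"a worker finished downloading {msg.data} files")
```

Message listeners in Locust receive the environment and the incoming message object, whose data attribute carries the payload sent via send_message.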

Think of the message types as topics. Internally, Locust uses ZeroMQ for communication.

@events.test_start.add_listener
def on_test_start(environment, **_kwargs):
    """When the test is started, evenly divides the file list between
    worker nodes to ensure unique data across threads"""
    if not isinstance(environment.runner, WorkerRunner):
        files = []
        for blob in storage_client.list_blobs(BUCKET_NAME):
            files.append(blob.name)
        worker_count = environment.runner.worker_count
        chunk_size = int(len(files) / worker_count)
        logging.info(f"number of files to be divided between each worker is {chunk_size}")
        for i, worker in enumerate(environment.runner.clients):
            start_index = i * chunk_size
            if i + 1 < worker_count:
                end_index = start_index + chunk_size
            else:
                end_index = len(files)
            data = files[start_index:end_index]
            logging.debug(data)
            logging.debug(f"sending this data {data} to this {worker}")
            environment.runner.send_message("handle_files", data, worker)

The above event handler, as the name suggests, runs when we start a test. The code runs only on the master, where it divides the files equally amongst the workers: it lists the files present in the GCS bucket and sends a subset of the list to each worker over the handle_files topic.
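The chunking arithmetic can be sketched in isolation (split_files is a hypothetical helper mirroring the logic above; the last worker absorbs any remainder):

```python
def split_files(files, worker_count):
    # Mirrors the master's division logic: equal chunks of
    # len(files) // worker_count, with the last worker taking the remainder.
    chunk_size = len(files) // worker_count
    chunks = []
    for i in range(worker_count):
        start_index = i * chunk_size
        if i + 1 < worker_count:
            end_index = start_index + chunk_size
        else:
            end_index = len(files)
        chunks.append(files[start_index:end_index])
    return chunks
```

For example, 13 files across 5 workers gives four workers 2 files each and the last worker 5, since int(13 / 5) is 2.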

After the workers have downloaded the files locally, they start running the task, i.e. calling the API.

Deploying On K8s

First, we need to build our Locust Docker image. In the example repo, you will notice a Dockerfile under the distributed folder. You can build an image by running:

docker build . -t <your_dockerhub_repo>:<tag>

To deploy on k8s we need 2 deployments and 1 service: the master and worker deployments, and a service to expose the master web UI.

The manifests are placed under the k8s folder. There is a kustomization file that can be used to modify the image name. Currently, it uses a pre-built image, but if you change the code, make sure to build a new image. Once you have replaced the image name in kustomization.yaml, you can run:

kustomize build distributed/k8s | kubectl apply -f -

You can then access the master UI by retrieving the external IP of the master service:

kubectl get svc locust-master

Access the master UI on port 8089. As we did for the non-distributed example, we specify the number of users, the spawn rate and the host to initiate the load test.

Running Without UI

In some cases, we may want to run Locust without the UI, for example when we want Locust to be part of a CI pipeline. This can be achieved with the following command:

locust -f locustfile.py --headless -u 1000 -r 100 -H http://host-to-test

Here -u is the number of users to spawn and -r is the rate at which users are spawned per second. To run headless mode in a distributed setting, we need to pass --expect-workers when we start the master node. This tells the master to wait for the given number of workers before initiating the test:

locust -f locustfile.py --headless -u 1000 -r 100 -H http://host-to-test --expect-workers 5

Locust Test Results

Interpreting Locust test results varies from use case to use case. In our case, we mainly wanted to use Locust to identify how our app would respond to high load.

The app in question is an ML app that accepts image files and returns annotations. We had arrived at peak load estimates and wanted to test our system against that load. While iterating, we identified and rectified some scaling issues, and got an estimate of the request latency and the expected number of failures. This helped us understand the number of replicas we need if we wish to keep request latency and failures under a certain threshold.

Exporting Metrics

The Locust web UI allows us to download a report of the test, and also to retrieve the test metrics as a CSV file via the Download Data tab. However, if you use Prometheus as your metrics DB, you can easily export the Locust metrics to Prometheus and use Grafana to visualise the load test metrics.

This can be done using LocustExporter. The master deployment needs an additional container that runs the metrics exporter alongside the master. To enable this, deploy master-controller-with-metrics.yaml instead of master-controller.yaml.

There is also a Grafana dashboard that comes along with the exporter.


Hopefully, this article can help you in setting up a load testing environment on K8s. Please let me know if you face any issues in the setup or if you have any burning questions!

If you like working with Kubernetes, crafting your own solutions, and wrangling Machine Learning Pipelines, then maybe you would make an excellent fit for the team!