Globally distributed load tests in Azure with Locust

Heyko Oelrichs
Published in Microsoft Azure · Mar 30, 2021

I recently came across the challenge of conducting massive distributed load tests against a scalable application hosted in Azure. After doing some research on the web I found a couple of viable tools, including Locust, which I had already heard of in one of my recent customer projects. I therefore dug a bit deeper into Locust to learn more about it and its capabilities, and here are my results.

Locust is an easy to use, scriptable and scalable open source load and performance testing tool. You define the behaviour of your users in regular Python code, instead of using a clunky UI or domain specific language. This makes Locust infinitely expandable and very developer friendly.

One of the first blog posts I found related to Locust on Azure was Davide Mauri’s “Running Locust on Azure” here on medium.com. He describes pretty well how to run Locust on Azure Container Instances (ACI) using Azure Resource Manager (ARM) templates to deploy the required components, and his post gave me a good starting point. Unlike Davide, I decided to use Terraform instead of ARM to deploy my infrastructure, as this fit better into my existing setup, and I added a couple of improvements you will see further down.

Locust consists of master and worker nodes. The master node does the test orchestration and hosts a web interface (it can also be used in a headless mode, for example as part of a CI/CD pipeline, more about that further down) where we can start tests and see live statistics. The worker nodes run the tests themselves, simulating our users. Locust can be used on a single machine as well as distributed using Docker containers. We are going for the container option and therefore need a solution to host our containers; in my case I have chosen Azure Container Instances (ACI).

The reason why I’ve decided to use ACI instead of Azure Kubernetes Service (AKS) or other platforms is that ACI is a “serverless” Platform-as-a-Service (PaaS) offering in Microsoft Azure. That means it comes completely without any infrastructure we have to set up and maintain, and therefore without any overhead: we can spin up our master and worker nodes on demand and scale them down to zero when we do not need them. This gives us full flexibility and also saves money.

In addition to that we need a place to store our locustfile.py, which contains our test definition. I’m using an Azure Storage Account file share for that; all our container instances will mount it and load the test definition from there. I’ve also added an Azure Key Vault to store the credentials used for the Locust web interface authentication in a secure way, instead of writing them in plain text into my Terraform definition.
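To give you an idea, a simplified Terraform sketch of these supporting resources could look roughly like this. The resource names, the password generation and the Key Vault reference are simplified assumptions; the full definitions are in the repository linked at the end:

```hcl
# Sketch only — names and structure are simplified assumptions.
resource "azurerm_resource_group" "locust" {
  name     = "rg-locust"
  location = "westeurope"
}

resource "azurerm_storage_account" "locust" {
  name                     = "stlocustshared" # must be globally unique
  resource_group_name      = azurerm_resource_group.locust.name
  location                 = azurerm_resource_group.locust.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}

# File share that holds locustfile.py; all container instances mount it.
resource "azurerm_storage_share" "locust" {
  name                 = "locust"
  storage_account_name = azurerm_storage_account.locust.name
  quota                = 1
}

# Web interface password, generated once and kept out of the Terraform code.
resource "random_password" "webauth" {
  length  = 24
  special = false
}

resource "azurerm_key_vault_secret" "locust_webauth" {
  name         = "locust-webauth-password"
  value        = random_password.webauth.result
  key_vault_id = azurerm_key_vault.locust.id # Key Vault defined elsewhere
}
```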

The overall architecture in the end will look like this:

Locust on ACI — Architecture Overview

Before we dive deeper into the deployment itself, why does it make sense to run tests distributed, using multiple instances in different geographical regions?

  • First of all, we want a realistic usage pattern. Typical applications, especially the ones with global coverage and usage, are accessed by users in various locations in different geographical regions. This is even more important when the application itself is deployed across multiple regions using global load balancing technologies and services like Azure Traffic Manager, Azure Front Door, AWS Route 53 or others.
  • The other, more technical reason is the limitations of a single instance. A single virtual machine or a single container is limited in multiple ways: it has only a fixed number of ports and it is constrained by its CPU, memory and bandwidth. Other limitations might be network latency and throughput, and the fact that all requests come from a single IP or IP range.

Using a larger number of instances, spread across multiple data centers and geographical regions, addresses the various limitations listed above and helps us get a more realistic usage pattern for our load tests.

Deploy Locust on Azure

Let us now take a look at how to deploy Locust. Starting with the master node, our Terraform definition (below) will create a single master instance (if var.workers is ≥ 1). It also allows us to scale to zero by setting var.workers to 0. It uses a container instance with a public DNS name, accessible on port 8089 (the default port for the Locust web interface), and it is protected using basic web-auth with its password stored in Azure Key Vault (more details further down).
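A simplified sketch of such a master definition, reusing the hypothetical resource names from the sketch above (the complete locust-master definition is in the repository):

```hcl
# Sketch of the master container group — simplified, not the full definition.
resource "azurerm_container_group" "locust_master" {
  count               = var.workers >= 1 ? 1 : 0 # scales to zero together with the workers
  name                = "locust-master"
  resource_group_name = azurerm_resource_group.locust.name
  location            = azurerm_resource_group.locust.location
  os_type             = "Linux"
  ip_address_type     = "Public"
  dns_name_label      = "locust-${random_pet.deployment.id}" # public DNS name (random_pet: see end of article)

  container {
    name   = "locust-master"
    image  = "locustio/locust"
    cpu    = 1
    memory = 2

    ports {
      port     = 8089 # web interface
      protocol = "TCP"
    }
    ports {
      port     = 5557 # master <-> worker communication
      protocol = "TCP"
    }

    # --web-auth is the basic-auth flag of the Locust versions current at the time of writing
    commands = [
      "locust", "-f", "/home/locust/locustfile.py", "--master",
      "--web-auth", "locust:${random_password.webauth.result}",
    ]

    # Mount the file share that contains locustfile.py
    volume {
      name                 = "locust"
      mount_path           = "/home/locust"
      share_name           = azurerm_storage_share.locust.name
      storage_account_name = azurerm_storage_account.locust.name
      storage_account_key  = azurerm_storage_account.locust.primary_access_key
    }
  }
}
```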

Our worker nodes are defined in a pretty similar way, with some small differences: they can scale between 0 and n (only limited by the available container groups in your subscription), they do not have a web interface and they communicate with the master node on port 5557/TCP. Our workers also do not have a public DNS name.

The port definition below is not really needed; the only reason it is in here is that Terraform requires it.
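A simplified version of the worker definition (again a sketch, reusing the hypothetical names from above; the complete locust-worker-terraform.tf is in the repository):

```hcl
# Sketch of the worker container groups — one container group per worker.
resource "azurerm_container_group" "locust_worker" {
  count               = var.workers
  name                = "locust-worker-${count.index}"
  resource_group_name = azurerm_resource_group.locust.name
  location            = element(var.worker_locations, count.index) # spread workers across the configured regions
  os_type             = "Linux"
  ip_address_type     = "Public" # public IP, but no dns_name_label and therefore no public DNS name
  restart_policy      = "Never"

  container {
    name   = "locust-worker"
    image  = "locustio/locust"
    cpu    = 1
    memory = 2

    # Not actually used — the worker opens an outgoing connection to the
    # master on 5557/TCP — but the provider requires a port definition here.
    ports {
      port     = 5557
      protocol = "TCP"
    }

    commands = [
      "locust", "-f", "/home/locust/locustfile.py", "--worker",
      "--master-host", azurerm_container_group.locust_master[0].fqdn,
    ]

    volume {
      name                 = "locust"
      mount_path           = "/home/locust"
      share_name           = azurerm_storage_share.locust.name
      storage_account_name = azurerm_storage_account.locust.name
      storage_account_key  = azurerm_storage_account.locust.primary_access_key
    }
  }
}
```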

Interesting to note in the definition above (locust-worker-terraform.tf) is the use of a list of worker_locations. This list contains all the Azure regions we want to use for our load tests (see locust-variables-terraform.tf below). Our workers will be spread across those regions based on the number of workers we are going to deploy.
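A sketch of what those variables could look like; the regions listed here are just examples:

```hcl
# Sketch of locust-variables-terraform.tf — values are examples.
variable "workers" {
  description = "Number of Locust worker nodes (0 scales the environment down)"
  type        = number
  default     = 8
}

variable "worker_locations" {
  description = "Azure regions the worker container groups are spread across"
  type        = list(string)
  default = [
    "westeurope",
    "eastus",
    "southeastasia",
    "australiaeast",
    "brazilsouth",
  ]
}
```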

The end result of our deployment looks like this, with eight worker nodes, one master node as well as an Azure Storage Account and Azure Key Vault:

Locust ACI deployment with 8 worker nodes

Our Locust web interface is accessible on the locust-master container instance FQDN on port 8089 (Terraform will also print the full FQDN in its outputs). It is protected using basic web-auth; the username is set to “locust” and the password is stored in Azure Key Vault:

Locust web-auth password stored in Azure Key Vault
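The FQDN output mentioned above could be defined roughly like this (a sketch based on the hypothetical resource names used earlier):

```hcl
# Sketch: expose the master's public DNS name as a Terraform output.
output "locust_master_fqdn" {
  description = "Public DNS name of the Locust web interface (port 8089)"
  value       = var.workers >= 1 ? azurerm_container_group.locust_master[0].fqdn : null
}
```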

Accessing the Locust web-interface will bring up the following dialog that lets you configure the total number of users to simulate, the spawn rate as well as the target host:

Locust “Start new load test” dialog

After starting a new test we see our load test results coming in in near real time:

As well as details about our setup, including the number of worker nodes, requests per second and failure rate:

How to further automate Locust?

Now that we have seen how to deploy Locust via Terraform on ACI to conduct tests via the web interface, how can we further automate the use of Locust? One option, which you can see in my repository, is to deploy Locust via GitHub Actions: you can scale Locust up when needed, conduct load tests via the web interface and scale it down afterwards. That’s a viable way to spin up your test infrastructure on demand.

Deploy Locust via a GitHub workflow

The master’s public DNS name will be shared at the end of a successful workflow run as an output of Terraform; the password to access the web interface is stored in Azure Key Vault, as described above:

Locust master public DNS name

You can use the same GitHub workflow to scale down the environment by running the same pipeline again and setting the “Number of Locust worker nodes” to 0, which will remove all infrastructure except the Storage Account and Key Vault.

But what about even more automation? As mentioned above, Locust also supports a so-called “headless” mode. In headless mode we do not have a web interface and instead specify things like the number of workers and users, the spawn rate, the run time etc. upfront, before we deploy our test infrastructure.
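In Terraform terms this mainly changes the command the master container is started with. A sketch based on the hypothetical definitions above; var.users, var.spawn_rate, var.run_time and var.target_host are placeholder variables standing in for the workflow inputs:

```hcl
# Sketch: master command for a headless run (replaces the web UI command
# in the master container definition shown earlier).
commands = [
  "locust", "-f", "/home/locust/locustfile.py", "--master", "--headless",
  "--expect-workers", tostring(var.workers), # wait until all workers are connected
  "--users", tostring(var.users),            # total number of simulated users
  "--spawn-rate", tostring(var.spawn_rate),  # users started per second
  "--run-time", var.run_time,                # e.g. "15m"
  "--host", var.target_host,                 # system under test
  "--csv", "/home/locust/results",           # CSV prefix inside the mounted share
]
```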

This workflow will spin up the required infrastructure in Azure, conduct the load test as defined when starting the workflow, write the test results into the Storage Account and tear down the infrastructure afterwards (except for the Storage Account that contains the test results).

Locust stats stored in Azure Storage

As you can see in the previous screenshot, our test results are written to Azure Storage as CSV files. In case you want to convert these test results into an easier-to-read HTML report, I recommend taking a look at the Locust HTML Report Converter. It’s a simple Go command line application to convert the results into an HTML report.

Locust HTML Report Converter — Example Report

If you want to see more of my Terraform code, including the code for a headless Locust deployment, the discussed GitHub workflows etc., please visit my locust-on-aci repository on GitHub. You’ll find all the components you need to set up the exact same deployment that I’ve described here.

In case you’re wondering where the fancy names like ‘guidedchimp’ and others in my previous screenshots came from: for deployments that require flexibility and to avoid naming conflicts, I usually use the random_pet resource of the Terraform random provider.
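A sketch of how that looks in Terraform:

```hcl
# random_pet generates friendly, reasonably unique names that can be reused
# across resources, for example for DNS labels.
resource "random_pet" "deployment" {
  length    = 2
  separator = "" # produces names like "guidedchimp"
}

# Example use:
#   dns_name_label = "locust-${random_pet.deployment.id}"
```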
