Locust.io experiments — Emitting results to external DB

Karol Brejna
Locust.io experiments
16 min read · Oct 17, 2019

One of the things that Locust does for you when used as your testing tool is collecting request/response data.

By default, Locust periodically dumps test results and computed aggregated values such as the median, average response time, or the number of requests per second.


They are presented in the UI, printed in the logs, or persisted to a file (using the --csv option).

Basically, you have access to pre-aggregated, per-request-type distribution data, as illustrated in the following picture:

Statistics screen from Locust 0.11.0

Most of the time, the information presented in such a way is good enough.

In this article, I’ll try to show how to deal with cases where you would like to do more with the results — correlating with some CPU/mem/IO metrics, archiving for future analysis, drawing charts, etc. — which would require storing them in a database.

Moreover, let’s assume that having only aggregated data is not enough (for example you want to be able to analyze/drill the results down to a single request).

In one of the previous installments of this series, Locust.io experiments — enriching results (https://medium.com/locust-io-experiments/locust-io-experiments-enriching-results-183d2ae4a4c2), I’ve touched on the topic of capturing atomic results.

Now, building upon the information on how to access individual requests data, I will introduce the following elements:

  • programmatically accessing aggregated request data,
  • sending data to an external database,
  • making sure that communication doesn’t affect test performance.

Overall solution description

The logical architecture of the solution could look like this:

Logical architecture

Assuming there is a “System Under Test” (SUT), we use a Locust cluster (master and slaves) to stress the system.

Internal mechanics of Locust determine at which points we are able to collect (and then emit to an external database) test results:

  • atomic results (request by request) are handled by the workers (success handler and failure handler)
  • aggregated data are collected by the master from the workers

To persist atomic data, we’ll need to write some code for the workers. To persist pre-aggregated data, some code running on the master is required.

In other words:

  • SUT is our system under test
  • Locust will do the stress/load testing
  • Locust will send the results to Elasticsearch

Elasticsearch

For this experiment, I am going to use Elasticsearch as the backing database.

Depending on the technology stack you have in your production or development environment, experience and preferences, you could probably go for some time-series database (InfluxDB, Graphite — you name it) or monitoring solution (Prometheus, Amazon CloudWatch, etc.).

My choice was purely subjective: I had good experiences using the ELK stack before, it’s quite popular, it has a simple API, and a new version has just been released, so I will have a chance to play with it!

The code I am showing here can be easily adjusted to support the backend of your choice if needed.

Elasticsearch basics

This article is not about Elasticsearch (ES) itself — it focuses on the mechanics of collecting data by Locust and persisting them in a DB — but let’s spend a few paragraphs on the product so the terminology and ideas used here are clear.

ES is an open-source tool that started as a full-text search engine based on Apache Lucene. (I am saying “started” because right now it includes many features useful for data analysis, graph traversals or machine learning.) Since its release (in 2010), it has quickly become the most popular search engine (for example, number 1 in the search engine category, and in the top 10 among database engines in general, according to https://db-engines.com/en/ranking as of May 2019).

Let’s take a look at some concepts.

Index, document

Elasticsearch stores data (documents) in indices.

An index is a named collection of documents. You could look at an index as something similar to a SQL table. You could have, for example, an index for users, another for products, another for orders, and so on.

In a single cluster, there can be as many indices as needed.

Documents in Elasticsearch are represented in JSON format. Hence, to store a document you send a JSON object to ES, and when retrieving a document, you’ll get a JSON object as a response.

When a document is stored in an index, it is given a unique ID. Within an index, you can store as many documents as you want.

The following API call stores a document in myindex under id = 1:

curl -X PUT "localhost:9200/myindex/_doc/1" \
-H 'Content-Type: application/json' -d'
{
"metric": "average",
"value": 7
}'

Scalability and high availability

ES gives you the ability to manipulate large datasets. It does that by partitioning an index into multiple smaller pieces called shards. Each shard is, in fact, a fully functional index that can be hosted on any node in the cluster.

Elasticsearch handles shards smartly, so you can horizontally scale the cluster and have better performance when searching — searches can be distributed and parallelized across shards.

Another positive aspect of Elasticsearch is high availability. It allows you to make one or more copies of your index’s shards (called replicas) and store them on different nodes. When a node fails, a replica will take over the role of the primary shard.

Indexing, mapping

When you store a document in Elasticsearch, it processes it using a mapping to be able to search or analyze it more effectively. As a user, you can decide which fields of the document need to be stored and which don’t, what type they will have (string, date, number, …), which will be subject to full-text search, etc.

Moreover, the text undergoes a process called analysis. This includes breaking the text down into smaller portions (tokens, terms), removing stopwords (frequent words such as “and”, “the”, etc.) and reducing words to their root form (removing the difference between singular and plural forms, between lowercase and uppercase, between tenses, etc.).

This way Elasticsearch prepares some look-up indexes for itself for fast and accurate searches.
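As a toy illustration of these steps (this is not Elasticsearch’s actual analyzer, just a sketch of the idea; the stopword list and the “stemmer” here are deliberately naive):

```python
# A toy text analysis pipeline: lowercase, tokenize, drop stopwords, and
# crudely "stem" by stripping a plural 's'. Elasticsearch's real analyzers
# are far more sophisticated (configurable tokenizers, stemmers, filters).
import re

STOPWORDS = {"and", "the", "a", "of"}


def analyze(text):
    # tokenize + lowercase
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    # remove stopwords
    tokens = [t for t in tokens if t not in STOPWORDS]
    # naive stemming: strip a trailing 's' from longer words
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


print(analyze("The Averages and the Median"))  # → ['average', 'median']
```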

Querying

Retrieving a document, e.g. the one we’ve indexed a few paragraphs before — when you know in which index it resides and the id of the document — can be done with a call like this:

curl -X GET "localhost:9200/myindex/_doc/1?pretty" \
-H 'Content-Type: application/json'
{
"_index" : "myindex",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"_seq_no" : 0,
"_primary_term" : 1,
"found" : true,
"_source" : {
"metric" : "average",
"value" : 7
}
}

But more interesting usage patterns can be delivered with the Query DSL. It allows for combining multiple search/filter conditions, including:

  • match the exact value, match value greater/smaller than a given one,
  • test existence of a value (match documents that have a given attribute set/non-empty),
  • text search related (match phrase, match the exact text, regex match, match word similar to a given one, etc.),
  • geo match (find objects that are located in a given area, find an object within a given distance).

What is more, the Elasticsearch API allows for different kinds of aggregations. I won’t spend time describing them here; we’ll use some of them to get insights into the results later on.

More info

I’ve only mentioned some of the terms relevant to an Elasticsearch discussion. They are probably obvious to some of you, but if you are new to Elasticsearch, I’d suggest the official documentation from the producer as a good starting point.

Project code walkthrough

After this short interlude, let’s go back straight to business.

The git project accompanying this article contains code that addresses these areas:

  • test logic — example test code that is producing some results (so we can collect them and send to Elasticsearch, obviously),
  • intercepting test results — code for collecting the results (both pre-aggregated and individual),
  • storing data in Elasticsearch — code for sending the results to Elasticsearch,
  • dealing with communication costs — recipes for making database communication a bit more efficient

The following sections try to explain some implementation details.

Test logic

To make the results a bit more interesting, I decided to use httpbin server as the system under test (SUT). It is a very useful development tool that is able, among other things, to:

  • Receive different HTTP requests (GET, POST, etc.),
  • Respond with different payloads (JSON, XML, etc.),
  • Simulate different response lengths and times.

The test code (available in the repository) will produce numbers that allow for a demonstration of some analysis (for example: finding the slowest requests, which wouldn’t make much sense if all the responses looked almost identical).

You can see a few different HTTP methods being used (again, to have more diverse results) and some error responses produced intentionally (the /status/404 endpoint). One task is worth noting in particular.

First of all, it uses the /delay/ endpoint with a random wait value. This will cause httpbin to “process” the request for the given length of time.

Secondly, gevent.spawn is used, so Locust doesn’t block on the call and goes on with other requests instead (so we can fit many more requests into the same time).

Intercepting test results

As we planned at the beginning, we’ll try to collect both individual and aggregated results from the tests.

In Locust, individual results are produced every time some request is issued. You can intercept them using request_success and request_failure event hooks. The mechanics were already described in Locust.io experiments — Enriching results, so I won’t repeat myself.

Take a quick look at the handler code in the repository.

This method will be added as an additional success_handler. Its job will be to prepare a message to be stored in the external database and hand it to a forwarder object, which will take care of the actual sending (soon, I’ll tell you more about the forwarder).
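A minimal sketch of such a handler follows. The message layout mirrors the documents we’ll query later; the forwarder here is a stand-in that only exposes the same add() interface as the real DBForwarder:

```python
# Sketch of a success handler. _ListForwarder is a stand-in for the real
# forwarder: it only remembers messages instead of sending them anywhere.
class _ListForwarder:
    def __init__(self):
        self.messages = []

    def add(self, message):
        self.messages.append(message)


forwarder = _ListForwarder()


def success_handler(request_type, name, response_time, response_length, **kwargs):
    """Prepare a message describing a single successful request and queue it."""
    forwarder.add({
        "type": "success",
        "payload": {
            "request_type": request_type,
            "name": name,
            "result": "OK",
            "response_time": response_time,
            "response_length": response_length,
            "other": kwargs,
        },
    })

# Hooking it up (Locust 0.x event API):
# import locust.events
# locust.events.request_success += success_handler
```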

As for aggregated results, Locust slaves collect atomic results and combine them to compute aggregations (the relevant Locust code is located in stats.py, if you are interested), which are then sent to the master.

You can leverage the slave_report event hook to get access to the data. In our case, we have the report_data_producer method added as a slave_report event listener:

locust.events.slave_report += report_data_producer

For the sake of completeness, the code sends some portion of the pre-aggregated results to Elasticsearch. On the other hand, this is not that interesting from the point of view of storing request data in an external database and analyzing it there.

Some details on slave_report payload are explained in the companion git repository. Let’s proceed with the more important stuff.
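A sketch of what report_data_producer might look like (the keys read from the data dict are assumptions; the exact payload fields are described in the repository; forwarder is again a stand-in with the DBForwarder’s add() interface):

```python
# Sketch of a slave_report listener. The data dict arrives from a worker;
# which keys it holds ("stats", "user_count", ...) is documented in the
# companion repository -- the names used below are assumptions.
class _ListForwarder:
    def __init__(self):
        self.messages = []

    def add(self, message):
        self.messages.append(message)


forwarder = _ListForwarder()


def report_data_producer(client_id, data):
    """Wrap a worker's pre-aggregated report in a message and queue it."""
    forwarder.add({
        "type": "aggregated",
        "payload": {
            "client_id": client_id,
            "user_count": data.get("user_count"),
            "stats": data.get("stats"),
        },
    })

# Hooking it up:
# locust.events.slave_report += report_data_producer
```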

Storing data in Elasticsearch

I decided to use the official Python client. Please, take a look at the example from the documentation of that package.

In order to save a document in Elasticsearch, you need to connect to the database first (es = Elasticsearch()) and then simply index the document (es.index(index="my-index", id=42, body={"any": "data", "timestamp": datetime.now()})). You can find the relevant code in https://github.com/karol-brejna-i/locust-experiments/blob/emitting-results/sending-results/locust-scripts/tools/elastic.py.

Note that the indexing operation (see the “Index, document” section) is actually translated to an HTTP call. Such a call takes time…
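To make that concrete: the es.index() call boils down to the same HTTP PUT we issued with curl in the “Index, document” section. A sketch using only the standard library (the request is built here, but not sent):

```python
import json
import urllib.request

# The same document we stored with curl earlier
doc = {"metric": "average", "value": 7}

req = urllib.request.Request(
    url="http://localhost:9200/myindex/_doc/1",
    data=json.dumps(doc).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="PUT",
)

# urllib.request.urlopen(req) would perform the call -- and block until
# Elasticsearch responds. That round trip is the cost we need to hide.
print(req.get_method(), req.full_url)
```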

Let’s see what we can do about smoothing out the costs.

Dealing with communication costs

As communication with the external database is expensive, we’ll try to minimize the impact of sending the information about every single request.

Here is the proposed solution to this problem.

First, we’ll use an internal buffer to store the information. While collecting the results we will only do a write to this buffer. This will be fast.

Then we can have a dedicated routine that will read data from that buffer and take care of the actual sending. The code can run in the background (namely in its own greenlet) — at its own pace. This way the communication won’t disturb the execution of the test scripts.

The piece of code responsible for that is a DBForwarder class.

Dealing with different backend systems

In this article, we’re storing the results in Elasticsearch database. Underneath, we are sending JSON objects over HTTP. In the future maybe we’ll want to switch to some other DB (or simply add yet another). Then the format of the exchanged data and/or the communication channel can be completely different.

Please, take a look at the code in the repository that implements the features discussed above.

Lines 9 to 17 hold the definition of a base backend adapter. The class declares the methods that a concrete database adapter (the object that is responsible for sending data to a given database) should implement.

Actually, only one method is responsible for the actual work: send is supposed to do just what the name suggests — send some data to the database.

Then goes the DBForwarder class definition (lines 23–48):

  • add_backend and remove_backend are used to modify the backend list. For now, the forwarder will send data only to the Elasticsearch backend, but in the future we may want to forward results to other databases as well. Then we’ll just add another backend to the forwarder.
  • add is used to queue new data to be sent.
  • run does the heavy lifting: in an infinite loop, it takes the data stored in the internal queue (line 45) and pushes them to every backend defined in the forwarder (line 48).
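The classes could be sketched like this (a simplification: the real code runs the forwarder in a gevent greenlet and ships data to Elasticsearch, while this sketch uses the standard library’s queue so it stays self-contained):

```python
import queue


class Backend:
    """Base backend adapter: a concrete adapter must implement send()."""

    def send(self, message):
        raise NotImplementedError


class DBForwarder:
    def __init__(self):
        self.queue = queue.Queue()
        self.backends = []

    def add_backend(self, backend):
        self.backends.append(backend)

    def remove_backend(self, backend):
        self.backends.remove(backend)

    def add(self, message):
        # called from the event handlers: a fast, local write to the buffer
        self.queue.put(message)

    def run(self):
        # runs "in the background" at its own pace: drain the queue and
        # push every message to every registered backend
        while True:
            message = self.queue.get()
            for backend in self.backends:
                backend.send(message)
```

An Elasticsearch adapter would then subclass Backend and implement send() with a call like es.index(index="locust", body=message) using the official client.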

That’s pretty much it.

What is left is initializing and starting the forwarder “in the background” on the master and the workers; the code that does that is in the repository.

Running the experiment

The presented experiments can be run on any platform of choice: bare metal, VM, cloud, Kubernetes, etc. I’ve prepared a set of docker and docker compose files to take advantage of the simplicity of development with Docker.

The next few sections explain setting up different components of the experiment.

Starting Elasticsearch and Kibana

Elasticsearch will collect the test results and Kibana can be used for their visualization.

There are two “flavors” of these components prepared here.

One defines a full 3-node Elasticsearch cluster. You can start it using docker-compose-elastic.yml:

docker-compose -f docker-compose-elastic.yml up -d

The other one is defined in docker-compose-mini.yml. The following command:

docker-compose -f docker-compose-mini.yml up -d

will start Elasticsearch in a single-node “development” mode (to simplify things). If you want to have everything “clean” there, you should update the default replication factor for the indices:

curl -H "Content-Type: application/json" -XPUT "http://localhost:9200/_template/dev" -d '
{
"order": 0,
"index_patterns": "locust*",
"settings": {
"number_of_replicas": 0
},
"mappings": {
"numeric_detection": true
}
}'

Otherwise, the indices you create will be under-replicated (the default replication factor is 1 — meaning Elasticsearch tries to create one “copy” of each index — and since we have only one node, there is no way to create the copy) and the cluster will appear unhealthy.

Validating Elasticsearch deployment

If Elasticsearch is up and running, it provides an HTTP-based API for controlling the cluster and accessing the data.

The simplest way to validate that Elasticsearch is running is by issuing the following request:

curl -sS http://localhost:9200

After this, you should see something similar to:

{
"name" : "elasticsearch1",
"cluster_name" : "docker-cluster",
"cluster_uuid" : "SzrjlU8PQ6aS3YK5nD856g",
"version" : {
"number" : "7.0.1",
"build_flavor" : "default",
"build_type" : "docker",
"build_hash" : "e4efcb5",
"build_date" : "2019-04-29T12:56:03.145736Z",
"build_snapshot" : false,
"lucene_version" : "8.0.0",
"minimum_wire_compatibility_version" : "6.7.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}

Docker compose scripts should also bring up Kibana. To verify if it’s running, you could open http://localhost:5601/app/kibana in a browser.

This should display the Kibana UI.

Starting the tests

To start the tests, you could use either docker-compose.yml or docker-compose-headless.yml.

These docker compose files include the following services:

  • master, slave — for creating a cluster
  • standalone — single node Locust setup
  • httpbin, sut — for emulation of a system under test and request logging

The first option kicks off Locust with the UI. You’ll be able to see the statistics charts there and start and stop the test manually.

The second option starts Locust without the UI and runs the test automatically for a given period.

To start the test, you need to bring up the test endpoint (sut, httpbin) and the Locust setup of your choice.

For a headless setup you could do:

docker-compose -f docker-compose-headless.yml up httpbin sut master slave

where you control starting individual services, or start them all together:

docker-compose -f docker-compose-headless.yml up

Inspecting the results

After running the test for some time, Elasticsearch will hold the results.

From this moment, you could do whatever analysis you want. Let’s examine a few examples.

There are several ways of accessing and analyzing the results.

Let’s start with a few simple queries (using curl) then turn to Kibana (which is a tool from Elastic that lets you “visualize your Elasticsearch data and navigate the Elastic Stack so you can do anything from tracking query load to understanding the way requests flow through your apps”).

Basic statistics

Let’s get a few results to see the format they are stored in:

curl -Ss -H "Content-Type: application/json" localhost:9200/locust/_search?pretty -d '
{
"size":5,
"query": {
"match_all": {}
}
}'

It will return a JSON object, something like:

{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1809,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "locust",
"_type" : "_doc",
"_id" : "Qz4IZ20BfogUoBnF3JQf",
"_score" : 1.0,
"_source" : {
"type" : "success",
"payload" : {
"request_type" : "GET",
"name" : "/get",
"result" : "OK",
"response_time" : 5.785226821899414,
"response_length" : 255,
"other" : { }
}
}
},
...
]
}

We see there is information on whether the request was successful or not (the type field), and the request details (stored in the payload object): the request type, the name of the endpoint, the response time and the response length.

Let’s get the five slowest requests:

curl -Ss -H "Content-Type: application/json" localhost:9200/locust/_search -d '
{
"query": {
"match": {
"type": "success"
}
},
"sort": {
"payload.response_time": "desc"
},
"size": 5
}' | jq '.hits.hits[]._source.payload.response_time'

(Please, note, I am using jq to manipulate/shorten the presented results. You can skip it to see the whole response.)

It’s time for something more complicated. Let’s inspect the different types of results and endpoints, and how many results there are for each:

curl -Ss -H "Content-Type: application/json" localhost:9200/locust/_search -d '
{
"query":{
"match_all": {}
},
"size":0,
"aggs":{
"grouping_by_type":{
"terms":{
"field":"type.keyword"
},
"aggs":{
"grouping_by_name":{
"terms":{
"field":"payload.name.keyword"
}
}
}
}
}
}' | jq '.aggregations'

Let me pick a piece of the whole response to explain how to read the results:

{
"key": "error",
"doc_count": 441,
"grouping_by_name": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "/status/404",
"doc_count": 441
}
]
}
}

The type of the result is “error” (recorded for requests that finished in error), there are 441 docs of that type (doc_count), and there are 441 results for the URL /status/404.

Now, check the average response times by URL:

curl -Ss -H "Content-Type: application/json" localhost:9200/locust/_search -d '
{
"query":{
"match":{
"type":"success"
}
},
"size":0,
"aggs":{
"grouping_by_name":{
"terms":{
"field":"payload.name.keyword"
},
"aggs":{
"avg_response_time":{
"avg":{
"field":"payload.response_time"
}
}
}
}
}
}' | jq '.aggregations'

With the following results:

{
"grouping_by_name": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "/post",
"doc_count": 476,
"avg_response_time": {
"value": 6.643738065447126
}
},
{
"key": "/get",
"doc_count": 456,
"avg_response_time": {
"value": 6.635955547031603
}
},
{
"key": "/delayed",
"doc_count": 400,
"avg_response_time": {
"value": 4090.4237358283995
}
}
]
}
}

Let’s see some percentiles. Send the following request:

curl -Ss -H "Content-Type: application/json" localhost:9200/locust/_search -d '
{
"query":{
"match":{
"type":"success"
}
},
"size":0,
"aggs":{
"load_time_outlier":{
"percentiles":{
"field":"payload.response_time"
}
}
}
}' | jq '.aggregations'

which produces:

{
"load_time_outlier": {
"values": {
"1.0": 4.651503562927246,
"5.0": 4.971170425415039,
"25.0": 5.747049384646946,
"50.0": 6.653642654418945,
"75.0": 1006.5738342285156,
"95.0": 8006.144401041666,
"99.0": 9007.9398046875
}
}
}

And some more stats for successful requests to the /delayed endpoint:

curl -Ss -H "Content-Type: application/json" localhost:9200/locust/_search -d '
{
"query": {
"bool": {
"filter": [
{ "term": { "payload.name.keyword": "/delayed" } },
{ "term": { "type.keyword": "success" } }
]
}
},
"size": 0,
"aggs": {
"time_stats": {
"extended_stats": {
"field": "payload.response_time"
}
}
}
}' | jq '.aggregations.time_stats'
which returns:

{
"count": 400,
"min": 4.570722579956055,
"max": 9012.703125,
"avg": 4090.4237358283995,
"sum": 1636169.4943313599,
"sum_of_squares": 10237234378.891014,
"variance": 8861519.608599177,
"std_deviation": 2976.8304635298223,
"std_deviation_bounds": {
"upper": 10044.084662888044,
"lower": -1863.237191231245
}
}


Summary

In this article, I wanted to demonstrate that sending results to an external database can be easy. Storing data in the DB gives us some additional options. These include:

  • being able to search or drill down into the results,
  • filtering or aggregating the data in the way we need,
  • comparing results among different test runs,
  • having some alerting in place (for example, publishing warnings on a Slack channel if there are errors),
  • and so on.

In the implementation, I tried to make it easy to implement an alternative backend (so data are sent to some database other than Elasticsearch) or additional backends (so data are sent to multiple databases).

The code itself is “demonstrate-the-problem quality”. I wanted it to be simple, and there is certainly room for improvement. For example, Elasticsearch has a bulk capability: you can send a whole batch of documents in one request. This could improve performance greatly.

Thank you for your patience.

As usual, the sources mentioned in this article are stored in https://github.com/karol-brejna-i/locust-experiments. The sending-results folder holds locust files, docker compose files, etc. (see the readme for details).
