HAProxy — PART II — Caching API with HAProxy

Nexsol Technologies
4 min read · Jun 14, 2024


HAProxy — how to add cache to an API

At Nexsol Technologies, we love PostgreSQL databases and PostgREST to expose APIs.

Before getting started, it’s recommended to read PART I of this series if you haven’t done so already.

Now, let us explore how to improve PostgREST performance with the HAProxy cache, while keeping the security measures for the PostgREST API demonstrated in PART I.

While the process is straightforward, it does require some expertise :-)

Find all the project files here in our repo on GitHub

HAProxy cache

For security reasons, the HAProxy cache will never be used when the request contains an Authorization header.

All the HAProxy cache limitations are listed here: https://docs.haproxy.org/2.9/configuration.html#6.1

The HAProxy cache will not be used if PostgREST doesn’t return a Cache-Control header. To make PostgREST send one, we need:

- A PostgreSQL function: custom_headers (see header.sql)

- The PGRST_DB_PRE_REQUEST parameter, which names the function PostgREST runs before each request (see the demo-api service in docker-compose)

In this example, the HAProxy cache is configured like this:

cache mycache
    total-max-size 500       # Total size of the cache, in MB
    max-object-size 1000000  # Max size of one object, in bytes
    max-age 3600             # Objects stay in the cache for at most 1 hour (value in seconds)
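
Declaring the cache section is not enough on its own: a proxy has to look objects up in it and store responses into it. The fragment below is a minimal sketch of that wiring, not the project's actual configuration. The names api_front, api_back and postgrest, and the address demo-api:3000, are assumptions made for illustration; only the cache name mycache and the ports 8091 and 3000 come from this article.

```haproxy
# Sketch only: proxy names and the server address are assumptions.
frontend api_front
    mode http
    bind *:8091                        # port served by HAProxy in this article
    default_backend api_back

backend api_back
    mode http
    http-request cache-use mycache     # answer from the cache when a matching object exists
    http-response cache-store mycache  # store cacheable responses coming back from PostgREST
    server postgrest demo-api:3000     # assumed address of the PostgREST container
```

Both rules reference the cache by the name given in the cache section, so several backends could share the same cache or use separate ones.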

HAProxy is also configured to add an HTTP response header, X-Cache-Status.

The possible values are:

- MISS if data is not present in the cache

- HIT if data is present in the cache
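
One common way to produce such a header is to test whether a backend server handled the request, since a response served from the cache never reaches a server. The lines below are a sketch of that approach and may differ from the configuration in the project repository; they are assumed to sit in the backend that uses the cache.

```haproxy
# Assumed placement: inside the backend that has the cache-use / cache-store rules.
# srv_id is only set when a server actually processed the request.
http-response set-header X-Cache-Status HIT if !{ srv_id -m found }
http-response set-header X-Cache-Status MISS if { srv_id -m found }
```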

To check that the data is served from the cache, run this request twice (the URL is quoted so that the shell does not interpret the & character):

curl -XGET "http://localhost:8091/postal?postalcode=eq.1227&order=common.asc" --verbose

Then look at the headers returned by HAProxy on the second call (x-cache-status: HIT):

< HTTP/1.1 200 OK
< transfer-encoding: chunked
< date: Fri, 14 Jun 2024 09:14:08 GMT
< content-range: 0-4/*
< content-location: /postal?order=common.asc&postalcode=eq.1227
< content-type: application/json; charset=utf-8
< x-cache-status: HIT
< content-security-policy: frame-ancestors 'none'
< x-content-type-options: nosniff
< x-frame-options: DENY
< cache-control: no-store
< access-control-allow-origin: *
< strict-transport-security: max-age=16000000; includeSubDomains; preload;

Performance benchmark

To measure performance, we used bombardier on an Apple Mac M1 with 8 GB of memory.

Each test sends 100,000 HTTP requests over 125 concurrent connections.

The database contains all the Swiss postal codes.

  • HTTP port 3000 is served by PostgREST
  • HTTP port 8091 is served by HAProxy

Performance comparison: with and without HAProxy

Test 1 / On a small query returning 5 rows

# Through HAProxy (port 8091), then directly against PostgREST (port 3000)
./bombardier -c 125 -n 100000 "http://localhost:8091/postal?postalcode=eq.1227&order=common.asc"
./bombardier -c 125 -n 100000 "http://localhost:3000/postal?postalcode=eq.1227&order=common.asc"

In this scenario, the HAProxy cache makes the API about 7x faster and uses about 4x less CPU during the test.

Test 2 / On a medium query returning all the cities of the canton of Geneva

# Through HAProxy (port 8091), then directly against PostgREST (port 3000)
./bombardier -c 125 -n 100000 "http://localhost:8091/postal?canton=eq.GE&order=common.asc"
./bombardier -c 125 -n 100000 "http://localhost:3000/postal?canton=eq.GE&order=common.asc"

In this case, the HAProxy cache makes the API about 14x faster and uses about 6x less CPU during the test.

By default, PostgREST doesn't compress the responses it returns. This is easy to add with HAProxy: we simply put a compression directive in the backend of the API. In our tests this shrinks the responses by about 7 to 10x, which reduces latency from the client’s perspective:

    # gzip compression
    compression algo gzip
    compression type application/json

In conclusion, using HAProxy as an API cache can be extremely beneficial in two main ways: requests to your API are much faster, and they require significantly fewer system resources and less bandwidth (thanks to gzip compression).

This approach stands in contrast to having developers implement the cache directly in the API code.

The definition of the cache and its behavior can be easily configured and modified in HAProxy, which is generally less complicated and less costly than coding it directly into the API.

Ultimately, this is a win-win for both DEV and OPS teams:

  • DEV: Developers don’t have to manage the complexity of implementing an API cache within their code.
  • OPS: Operations teams experience less pressure on their infrastructure.

Both DEV and OPS save time and money.

In our third and final article on HAProxy, we will explore how to set up and configure HAProxy without requiring any technical skills, simply by describing what you want to achieve:

  • DEV: Developers will describe in a simplified manner what they want to implement.
  • OPS: Operations teams will set up the load balancer as an infrastructure service and configure it as code.

All of this will be achieved with a single, powerful open-source tool.


Nexsol Technologies

Based in Switzerland, we are driven by a commitment to excellence, precision, and environmental responsibility.