How to handle HAProxy log format

Luca Mattivi
5 min readApr 24, 2024

--

During our migration from Rancher1 to Rancher2 and K8S, we also moved our reverse proxies from nginx to HAProxy.

Nobody of us is an expert of HAProxy, so we studied a lot how it works and how to fix issues using it.
There is an important thing that we would like to share, and it’s about logs generated by HAProxy.

For us it’s very important to handle properly the log format, because we have external monitoring tools that ingest logs and allow us to make use of them (dashboards, alerts, error analytics, …).

Custom log format

The default log format could be not enough for you, and you would like a custom log format that can satisfy your needs.

In our case, we have two different log formats:

  • The first one is a basic log format, with only the request url, the response status, time taken and few other things
  • The second one, is a specific custom log format, with lots of information very useful for critical services and for debugging

The second one is of course the most interesting, and it looks like this:

<30>Mar 13 14:06:24 haproxy[6]: 13/Mar/2020:14:06:24 +0000 client_ip="xxx.xxx.xxx.xxx" lb=10.42.1.0 container=10.42.3.158 response_status=200 path=HEAD /api/v1/endpoint HTTP/1.1 auth_token="" client_key="" response_time=8 total_time=8 bytes=391 user_agent="python-requests/2.22.0" host="apiurl.domain" body="" accept_language=""

Here we have lots of useful information:

  • client_ip: The ip address of the client
  • lb: The Load Balancer where traffic is coming
  • container: The container that elaborated the request
  • response_status: The status code of the response
  • path: The requested URL
  • auth_token: If the user is authenticated, we can identify it and follow its requests
  • client_key: If the client application is one of ours, here we have a key to identify it
  • response_time: The time taken by the service to elaborate the request
  • total_time: The time taken by all the infrastructure to answer to the request
  • bytes: The size of the response
  • user_agent: The user agent of the client
  • host: The hostname requested
  • body: The body of the request sent
  • accept_language: the locale of the client

How can we create a custom log like this?

Create the custom log format

HAProxy has lots of attributes that can be used in your custom log format, but some of them are not enabled by default, so you have to do a bit of work to have it in your logs.

Let’s see what we have to set in haproxy.cfg in order to have the log format above:

capture request header X-Forwarded-For len 20
capture request header authorization len 80
capture request header X-Client-Auth len 80
capture request header User-Agent len 400
capture request header Host len 150
http-request capture req.body len 10000
capture request header Accept-Language len 10

log-format "%trg client_ip=%{+Q}[capture.req.hdr(0)] lb=%ci container=%si response_status=%ST path=%r auth_token=%{+Q}[capture.req.hdr(1)] client_key=%{+Q}[capture.req.hdr(2)] response=%Tr total_time=%Tt bytes=%U user_agent=%{+Q}[capture.req.hdr(3)] host=%{+Q}[capture.req.hdr(4)] body=%{+Q}[capture.req.hdr(5)] accept_language=%{+Q}[capture.req.hdr(6)]"

This could be a bit difficult to understand 🙂

The important thing is the log-format instruction, here we have our string with the format that we want.

All variables like %… are standard, you can find a list here (always read carefully the documentation, most of times what you need is there!)

The log format doesn’t have a “standard pattern”, you can have it as are most comfortable with, for us it’s better to have a string that identify every printed value.
The capture requests above the log-format instruction are used to grab custom data, and you have to specify if they are headers or something else (like the body).
Every capture request uses a slot in a capture.req.hdr array, so the order of your definitions is the order where you’ll find the value in the array.
The command %{+Q} is used to put the value (declared with square brackets) between double quotes.

Body logging

One of the most useful thing when debugging applications is to have the body request in logs.
In the section above we already added the body in the log format, but we found two different issues with this.

The first issue is about the length of the body: We spend couple of hours trying to understand why we have empty body in logs or truncated body, and finally we found the reason, in the documentation:

<length> is an optional maximum line length. Log lines larger than this value
will be truncated before being sent. The reason is that syslog
servers act differently on log line length. All servers support the
default value of 1024, but some servers simply drop larger lines
while others do log them. If a server supports long lines, it may
make sense to set this value here in order to avoid truncating long
lines. Similarly, if a server drops long lines, it is preferable to
truncate them before sending them. Accepted values are 80 to 65535
inclusive. The default value of 1024 is generally fine for all
standard usages. Some specific cases of long captures or
JSON-formatted logs may require larger values.

And here we are, the max line length is 1024, also if you set len 10000 in the capture request.

In order to increase this value, you have to set the desired length in the log instruction:

log stdout len 10000 daemon debug

The second issue, is about carriage returns in the body request.
If a client sends a request with carriage returns in the body, you will have something like this in log:

As you can see, you don’t have a single log for every request, instead it’s split into multiple lines, causing most of analyzing tools to not properly decode this log.

Fortunately, HAProxy has a VERY NICE functionality that allows to make operations on all variables and attributes that are between square brackets.
So in our case, we used the regsub function on the body value:

body=%{+Q}[capture.req.hdr(5),regsub(\n,\\n,g)]

And now the log will be on a single line.
Keep in mind that this operation impacts only the attribute in log, it will not alter the original request.

Conclusion

HAProxy is a really good Load Balancer and Reverse Proxy.
We are really happy to moved away from nginx.
As always, we write this post to share issues we found and how we fixed it.

If you need it we also created a Stats Visualizer for HAProxy.
We hope this can be helpful.

--

--