Active Health Check in OpenResty using Lua

Mohammad Varmazyar
Feb 20, 2019

This post attempts to demystify the active health checks offered by NGINX Plus and to reproduce them with Lua in OpenResty. As we know, NGINX and NGINX Plus monitor transactions as they happen and try to resume failed connections. If a transaction still cannot be resumed, NGINX and NGINX Plus mark the server as unavailable and temporarily stop sending requests to it until it is marked active again. For passive health checks there are two arguments, max_fails and fail_timeout. For example:

upstream MyApp {
    server worker1.domain.local:8080 max_fails=3 fail_timeout=30s;
    server worker2.domain.local:8080 max_fails=3 fail_timeout=30s;
}

Active health checks are only supported in NGINX Plus, so in the open-source world we need to get our hands dirty with Lua in OpenResty (or NGINX built with the Lua/LuaJIT module). Imagine that one upstream server starts returning 5xx or 4xx errors. Without the solution explained in this post, client requests keep being routed to that failing server and are the ones that run into the errors, because passive checks only react after requests have already failed.

In this article we use lua-resty-upstream-healthcheck for active mode. This library performs health checks on the server peers defined in NGINX upstream groups, which are referenced by name.

Here is my configuration for setting up active health checks in OpenResty:

Note: Different upstreams’ health checkers use different keys (the keys are always prefixed with the upstream name), so sharing a single lua_shared_dict among multiple checkers should not cause any issues at all. You do, however, need to size the shared dict for its multiple users (i.e., multiple checkers). If you have many upstreams (thousands or even more), it is more efficient to use separate shm zones for each (group of) upstreams.
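
A minimal sketch of the setup, reusing the MyApp upstream from above: the shared dict name, the /health probe path, the lua_package_path, and the interval/fall/rise values are illustrative and should be adapted to your own environment.

http {
    # lua-resty-upstream-healthcheck must be resolvable on the Lua module path
    lua_package_path "/path/to/lua-resty-upstream-healthcheck/lib/?.lua;;";

    # shared memory zone where the checkers record peer state
    lua_shared_dict healthcheck 1m;

    # keep failed probe attempts out of the error log
    lua_socket_log_errors off;

    upstream MyApp {
        server worker1.domain.local:8080;
        server worker2.domain.local:8080;
    }

    init_worker_by_lua_block {
        local hc = require "resty.upstream.healthcheck"

        local ok, err = hc.spawn_checker{
            shm = "healthcheck",   -- defined by "lua_shared_dict" above
            upstream = "MyApp",    -- the upstream group to monitor
            type = "http",

            -- raw HTTP request sent to every peer as the probe
            http_req = "GET /health HTTP/1.0\r\nHost: MyApp\r\n\r\n",

            interval = 2000,       -- run the check cycle every 2 seconds
            timeout = 1000,        -- 1 second timeout for network operations
            fall = 3,              -- failures before a peer is marked down
            rise = 2,              -- successes before a peer is marked up again
            valid_statuses = {200, 302},
            concurrency = 10,      -- concurrency level for the probe requests
        }
        if not ok then
            ngx.log(ngx.ERR, "failed to spawn health checker: ", err)
            return
        end

        -- call hc.spawn_checker{} again here for any additional upstream groups
    }
}

spawn_checker is called in the init_worker_by_lua phase; the checker then runs on background timers inside the NGINX workers and records peer state in the shared dict.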

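The library also provides a status_page() helper for inspecting what the checkers currently think about the peers. Here is a sketch of exposing it on an internal location; the listen port, the /status path, and the allow/deny rules are again just example choices.

server {
    listen 8090;

    # restrict the status page to local access
    location = /status {
        access_log off;
        allow 127.0.0.1;
        deny all;

        default_type text/plain;
        content_by_lua_block {
            local hc = require "resty.upstream.healthcheck"
            ngx.say("Nginx Worker PID: ", ngx.worker.pid())
            ngx.print(hc.status_page())
        }
    }
}

Each upstream's primary and backup peers are listed on this page together with their current up/DOWN state.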