Testing your code against production using Nginx Mirroring

Gaurav Shukla
Published in FarziEngineer · 3 min read · Apr 27, 2019

How often have you wished there were a way to simply deploy to production and fix the errors as you encounter them?

In startups and very early products that is still acceptable, but if you happen to be in the league of developers whose code affects millions of customers, you end up reviewing your code a thousand times and asking QA to write test cases — and even after all this, if there's a miss, there's a retro for it.

Hmm, for most applications with an HTTP interface it's possible using the mirror directive introduced in Nginx 1.13.4. This directive does exactly what its name says: it "mirrors" your request to another upstream or location block without affecting the original request.
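A minimal sketch of the directive in action (the upstream names main_backend and test_backend are placeholders of mine, not from a real deployment):

```nginx
location / {
    # Serve the request normally, but also fire a copy at /mirror.
    mirror /mirror;
    proxy_pass http://main_backend;
}

location = /mirror {
    # Only reachable via the mirror directive, never from outside.
    internal;
    proxy_pass http://test_backend$request_uri;
}
```

Every request to / is served by main_backend as usual, while a copy is sent to test_backend and its response is silently discarded.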

How does it do that?

Nginx has the ability to fire another request from the main request and wait for its response to complete before returning the response of the main request.
A sample use case for a sub-request would be aggregating responses from multiple sources before returning a response to the user.

This secondary request sent from the main request is called a sub-request.
A sub-request is asynchronous and can be processed in parallel with the main request, provided there's a free worker.
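To illustrate the aggregation use case above, OpenResty exposes sub-requests directly through ngx.location.capture_multi (the location names here are made up for the sketch):

```nginx
location = /aggregate {
    content_by_lua_block {
        -- Fire two internal sub-requests in parallel and wait for both.
        local res_a, res_b = ngx.location.capture_multi{
            { "/source_a" }, { "/source_b" }
        }
        -- Combine both bodies into a single response for the user.
        ngx.say(res_a.body, res_b.body)
    }
}
```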

Nginx exploits this ability to send sub-requests in order to replicate traffic onto another server. The difference is that it does not wait for the responses of these sub-requests.

How to use it for testing your code?

I often find myself working on a very critical piece of code, and often people use observable but undocumented behaviour of the system to their benefit, so it gets very difficult to write test cases for all possible inputs.

In those scenarios I use the following setup to test my code against production traffic:

I replace the sink (the server that receives the mirrored traffic) with a replica of the production DB, depending on whether or not I want to test what gets written to storage. So far I have not had to verify at that level.

One use case that comes to mind: if you have changed a connection setting to your database and it now supports a different encoding, you might want to test what actually gets stored in the DB.

I mostly replace the sink with a service that prints whatever is passed to it, so that I can verify inputs while debugging.
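Such a throwaway "printing sink" can be as small as an OpenResty server that logs whatever it receives (the port and log level are my own choices, not from the original setup):

```nginx
server {
    listen 8081;

    location / {
        content_by_lua_block {
            -- Read and log the mirrored request so inputs can be inspected.
            ngx.req.read_body()
            ngx.log(ngx.NOTICE, "mirrored: ", ngx.var.request_method, " ",
                    ngx.var.request_uri, " body: ",
                    ngx.req.get_body_data() or "<none>")
            ngx.say("ok")
        }
    }
}
```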

Sample configuration with OpenResty (Nginx + Lua)

upstream staging_backend {
    server <staging server ip>:80;
    keepalive 32;
}

location = /mirror {
    internal;
    rewrite_by_lua_file /mirror.lua;
    proxy_pass http://staging_backend$request_uri;
    proxy_pass_request_body off;
    proxy_set_header Content-Length "";
    proxy_set_header mirrored 1;
}

location = /test {
    mirror /mirror;
    mirror_request_body off;
    content_by_lua_file /var/www/lua/test.lua;
}

Explaining the various parameters:

Why HTTP in proxy_pass, and why the additional headers?

Nginx does not support HTTPS or HTTP/2 in sub-requests, so even if your original request is an HTTPS request, you need to make a plain HTTP request to your staging server.
I use the mirrored header on my staging server to detect that a request is mirrored; what you do with this information is up to you.
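On the staging side, one way to surface that flag — my sketch, not from the original setup — is a map plus a dedicated log format:

```nginx
# $http_mirrored is Nginx's automatic variable for the "mirrored" header.
map $http_mirrored $is_mirrored {
    default 0;
    "1"     1;
}

# Tag every access-log line so mirrored traffic is easy to separate.
log_format mirror_aware '$remote_addr "$request" mirrored=$is_mirrored';
access_log /var/log/nginx/staging.log mirror_aware;
```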

Request body off?

My request is a GET request, so it has no request body; I simply ask Nginx to ignore the request body when replicating the request. It saves a ton of resources.

Why is the endpoint internal?

We don't want hackers to exploit this endpoint, hence we make it internal. An internal endpoint can only be called from inside the server block (via a rewrite or a sub-request).

What about rewrite_by_lua?

I have multiple production servers, and to be sure that all possible inputs are covered, I take a small percentage of traffic on all production nodes and mirror it.

The logic for that is contained inside mirror.lua. It uses a random number to determine whether or not a request should be mirrored.
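A sketch of what such a mirror.lua could look like (the 5% sample rate is an assumed knob; the original file isn't shown in the post):

```lua
-- mirror.lua: runs via rewrite_by_lua_file on the /mirror sub-request only.
-- Forward roughly SAMPLE_PERCENT of mirrored traffic; drop the rest.
local SAMPLE_PERCENT = 5

if math.random(100) > SAMPLE_PERCENT then
    -- Finish this sub-request here; the main request is unaffected.
    return ngx.exit(ngx.HTTP_OK)
end
-- Fall through: proxy_pass sends this copy on to the staging upstream.
```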

Benefits of this Setup

  • The obvious benefit is less mental strain and faster, more confident deployments
  • No more scapegoating QA for missed cases
  • No more time spent covering cases you never encounter in production
  • Confidence that you won't break any third-party tool relying on some observable behaviour of your service

So that's all for this post. I'll continue posting on Nginx and OpenResty related topics. Follow me on Twitter for updates.
