How to ignore large requests with Nginx and uWSGI (and several ways not to)

Cronitor.io · Published in Crafting Cronitor · Nov 4, 2015

At Cronitor, our primary engineering objective is to never miss a tracking ping. We process ping requests into telemetry data and a missed ping can lead to errant reporting and, even worse, false-positive alerts. With this focus we’ve grown from a Django app on a single Linode box to a stack that today handles our traffic bursts of 100 requests/sec.

Last week our LogWatcher daemon alerted us to failed pings in our Nginx error log. The daemon’s job is to rescue failures by re-processing them, but it was puzzled by these requests and flagged them for manual review. I was on the train when my phone buzzed in my pocket.

Client intended to send too large body: 14601004 bytes, client: xx.xx.xx.xx, server: cronitor.link.

Hmm. 14 MB is a big POST.

Occasionally a customer will push a bug into production that tries to DoS us in some way. Their traffic gets blocked at the firewall while we reach out. These large requests, however, were no accident: A new customer is monitoring an external SaaS service that happens to send along these large payloads. The payloads were out of the customer’s control, so we had a clear choice: Figure out a way to handle these requests or lose their business.

Take One.

There are many ways to solve a problem like this, but leaning on our engineering instincts we dismissed any solution that introduced new complexity. We had to solve it with the tools we already had.

There is obvious peril in simply increasing the client_max_body_size limit to 20 megabytes. Our traffic comes from machines, and machines operate on a clock that is usually synchronized with NTP. Over a third of our pings are received in the first 10 seconds of each minute. Large request bodies passed down from kernel to web server to application server to application would certainly impact our throughput during those 10 seconds.
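For reference, the change amounts to a single Nginx directive (a sketch only; the server block shown here is illustrative, not our actual config):

server {
    server_name cronitor.link;
    # Raise the body limit from the 1m default to cover the ~14 MB payloads
    client_max_body_size 20m;
    # ...rest of the server block unchanged
}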

We tried it anyway. Our ping collectors average below 25% CPU utilization. The notion that tweaking a setting might buy us some time was too attractive to resist. I rolled it out to a hot-spare, then to a single production ping collector. I crossed my fingers and held my breath. I wasn’t holding it very long.

Take Two.

Later that day, with a little distance from the problem, I thought about how these large requests were wending thru our stack. Cronitor doesn’t use a single load balancer. Instead, we return a virtual IP from our equally-weighted Route53 DNS entries. The request is passed to an Nginx proxy server. Nginx routes ping requests to a tracking server built on the Falcon micro-framework, and all other requests to a Django application.
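In Nginx terms, that split looks roughly like the following (a simplified sketch; the location paths and socket paths are illustrative, not our actual config):

# Route ping requests to the Falcon tracker, everything else to Django
location /ping {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/tracker.sock;
}

location / {
    include uwsgi_params;
    uwsgi_pass unix:/tmp/django.sock;
}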

After the TCP handshake completes, a client begins a POST request by sending the headers first and then the body. But HTTP is half-duplex, and clients are going to send the entire body even if the CONTENT_LENGTH header exceeds the client_max_body_size. Knowing this, we could be confident that our performance issues arose after Nginx: in the uWSGI server or our application. Reading the uWSGI docs helped connect the dots, and before long I was deploying a test config to a hot-spare that included uwsgi_pass_request_body off (sketched after the test commands below). If we could drop the request body before the hand-off to the uWSGI upstream, it could work. After rolling it out I tested a few simple POSTs and received 200s. To simulate large requests I generated a 30,000 KB file of fake data and started posting it with Siege:

$ dd if=/dev/zero of=/tmp/large.dat bs=1024 count=30000
$ siege -c 2 -t 30S -b "http://test.cronitor/d3x0 POST < /tmp/large.dat"
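The test config change itself was tiny: one directive inside the ping location (again a sketch; the surrounding block is illustrative):

location /ping {
    # ...existing uwsgi_pass configuration...
    # Take Two: forward the request headers but not the body
    uwsgi_pass_request_body off;
}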

Watching CPU and IO performance, I was stoked. Success! Then I checked the error log:

[error] 23073#0: *60177384 upstream prematurely closed connection while reading response header from upstream.

Shit.

Take Three.

WSGI, like the PEP that defines it, is a complex beast. I once fantasized about building our tracker on WSGI “bare metal”, but breaking trail proved painful and I played with Werkzeug before settling on Falcon. Stepping thru the Falcon code I came to the source of the problem:

if self.content_length is not None:
    self.stream = helpers.Body(self.stream, self.content_length)

The naive POST requests I sent worked because I hadn’t taken the time to pass a body. The CONTENT_LENGTH request header was 0, and this code path in Falcon was never hit. The problem arose when Falcon received a non-zero CONTENT_LENGTH with no body behind it: by dropping the body I had introduced a mismatch between the CONTENT_LENGTH header and the actual content length. I needed to finish the job:

uwsgi_pass_request_body off;
uwsgi_param CONTENT_LENGTH 0;
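In context, the ping location ends up looking something like this (again a sketch; only the two uwsgi_* lines above are the actual change):

location /ping {
    # ...existing uwsgi_pass configuration...
    uwsgi_pass_request_body off;   # don't stream the body to the app server
    uwsgi_param CONTENT_LENGTH 0;  # keep the header consistent with the empty body
}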

Success.

After testing the build locally and rolling it out blue/green to our production environment, we kept a close eye on performance. We’ve seen no measurable load impact, and we’ve passed the good news on to our customer.

If you find yourself needing to handle large POST bodies at scale, our advice is to avoid passing the request body to your application server. If you can’t discard the body like we did, consider parsing it locally as an ngx.req.socket stream using the HttpLuaModule.

Cronitor is a simple monitoring tool for scheduled jobs, periodic tasks, external SaaS tools, and almost anything else.

Try Cronitor free for 2 weeks.
