[PoC4] Using NGINX as a proxy for a Netty-based REST web service under heavy load

Mert Çalışkan
Jul 20, 2015


I’ve already implemented a RESTEasy-based REST web service that runs on Netty, and it performed well under heavy load. I thought it would be better to put it behind an NGINX proxy so it can serve under stress without any compromises; that’s also close to what I’ll have in the final product, since there will probably be more than one server hosting the REST web services. I had previously chained two NGINX instances together, one acting as the main application and the other as a proxy in front of it. That didn’t turn out too well, but let’s see how the REST web service fares when hooked up behind NGINX.

So with this new PoC, let’s see how our REST web service performs when proxied through NGINX. For the EC2 stack, I used the AMI ami-a8221fb5 and the m4.xlarge instance type to create all 3 instances: instance A hosting the netty-rest-simple REST web service, instance B hosting the NGINX proxy, and instance C running the http-requester client.
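For reproducibility, spinning up the three instances with the AWS CLI would look roughly like this (a sketch only; the key pair, security group and subnet IDs below are placeholders, not the ones I actually used):

aws ec2 run-instances \
    --image-id ami-a8221fb5 \
    --instance-type m4.xlarge \
    --count 3 \
    --key-name my-key-pair \
    --security-group-ids sg-xxxxxxxx \
    --subnet-id subnet-xxxxxxxx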

To configure the NGINX on instance B to delegate requests to instance A, I edited nginx.conf and added an upstream block plus a proxy_pass directive inside the default server block. The relevant parts end up looking like this:

http {
    upstream backend {
        server <instance-a-internal-ip-address>:8080;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;
        }
    }
}
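Once the file is edited, the configuration can be validated and reloaded without restarting NGINX (assuming the distribution’s standard nginx binary and service setup):

sudo nginx -t        # validate the configuration syntax
sudo nginx -s reload # reload the workers with the new configuration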

My client http-requester was invoked with the parameters below (exactly the same as in the rest of the PoCs):

java -jar http-requester-0.0.1.jar 10.0.0.190 80 1000000 1000000 200

The output numbers were pretty low and made me very sad at first sight :(

1068 req/sec
1753 req/sec
2155 req/sec
2201 req/sec
2477 req/sec
3148 req/sec
3712 req/sec
4393 req/sec
4455 req/sec
4740 req/sec
5090 req/sec
6207 req/sec
6039 req/sec
6323 req/sec
8428 req/sec
8736 req/sec
502
502
502

After some point, NGINX started returning 502, the Bad Gateway HTTP status code. The gateway in the error description is our NGINX server itself, since it sits as the proxy point between the two communication channels: the requester client and the REST web service.

These results suggested that some resource was being drained, so I started to investigate. At the very least, I should be seeing 10k req/sec, as in my previous NGINX → NGINX example.

First, I boosted the worker_connections value in nginx.conf to 8092. The output numbers were acceptable, but the test still ended with 502s; the worker_connections change just sped up the ramp-up, it didn’t resolve the problem. Yuck!
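For reference, worker_connections lives in the events block of nginx.conf; the change amounts to this (a minimal sketch, assuming an otherwise default configuration):

events {
    worker_connections 8092;
}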

5366 req/sec
9394 req/sec
9418 req/sec
9458 req/sec
9515 req/sec
9475 req/sec
9295 req/sec
9285 req/sec
9143 req/sec
9481 req/sec
9218 req/sec
9250 req/sec
9323 req/sec
9332 req/sec
502
502
502

Since our NGINX handles two HTTP connections per REST web service request (one with the http-requester client and one with the upstream, the netty-rest-simple server), I suspected a resource shortage caused by the server configuration. So I checked /var/log/nginx/error.log and saw this error:

2015/07/20 13:53:32 [crit] 2618#0: *127872 connect() to 10.0.0.190:8080 failed (99: Cannot assign requested address) while connecting to upstream, client: 10.0.0.192, server: localhost, request: "GET / HTTP/1.1", upstream: "http://10.0.0.190:8080/", host: "10.0.0.191:80"

While searching for this error, I came across posts stating that I was running out of local ports due to sockets stuck in the TIME-WAIT state: every proxied request opens a new connection to the upstream, and closed connections linger in TIME-WAIT, so the ephemeral port range eventually gets exhausted. There are alternative solutions, but a simple one is to enable reuse of TIME-WAIT sockets for new outgoing connections. Editing /etc/sysctl.conf and adding the following key-value pair will suffice:

net.ipv4.tcp_tw_reuse=1

To reload the configuration without a restart, simply execute:

sudo sysctl -p
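To confirm that ephemeral ports were really the bottleneck and that the new setting is active, a couple of quick checks on the proxy instance help (the exact numbers will differ on your box, and ss assumes iproute2 is installed):

cat /proc/sys/net/ipv4/ip_local_port_range   # available ephemeral port range
ss -tan state time-wait | wc -l              # sockets currently stuck in TIME-WAIT
sysctl net.ipv4.tcp_tw_reuse                 # should now print 1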

After applying the sysctl configuration and rolling the worker_connections value back to 1024, I got the following numbers:

661 req/sec
1616 req/sec
2034 req/sec
2296 req/sec
2488 req/sec
3099 req/sec
3054 req/sec
3510 req/sec
3964 req/sec
5104 req/sec
5031 req/sec
6198 req/sec
9327 req/sec
9263 req/sec
9196 req/sec
9149 req/sec
9398 req/sec
9528 req/sec
9564 req/sec
9320 req/sec
9366 req/sec
8866 req/sec
9288 req/sec
9390 req/sec

The numbers saturate at ~9k req/sec, which is OK for a single server on a 1 Gbit network. Amazon only provides 10 Gbit networking for VMs with at least 16 CPU cores, which cost $$$/hour. See my iperf comparison on that.
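For the curious, the raw network throughput between two instances can be measured with iperf along these lines (a minimal sketch; run the server side on one instance and point the client at its internal IP):

iperf -s                    # on the first instance
iperf -c 10.0.0.190 -t 30   # on the second instance, 30-second run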
