HTTP load-balancing on gRPC services

Published in Applied engineering reports

Aug 27, 2016

This post explains how to load-balance gRPC services using classic HTTP1.x health-checks. gRPC does not use HTTP status codes to signal problems, so a multiplexing approach is used to serve two types of traffic:

gRPC (over HTTP2)
HTTP1.x traffic, including the /status path for the load balancer health-check and possibly human-friendly extras like documentation, etc.

Case at hand for this post: AWS Elastic Load Balancing. This post comes with source code published on GitHub and embedded cheap jokes.

gRPC whereabouts

You may (or may not) have already heard of gRPC, which is a solid service communication framework created by Google; if you would like to know more then I suggest reading the motivations behind its creation.

In case you missed it so far, gRPC requires HTTP2, which is the leading clue for the rest of this post.

go-grpc

Google also developed and released Go; however, the Go version of gRPC is not yet tagged, nor are its branches officially maintained, since it is in high flux. Nonetheless, it has been possible to crawl through the breaking changes and use it professionally and reliably at work.

<off-topic rant> client-side balancers

I still have some bullets to dodge there, like an internal client-side balancer, a “feature” which I have seen already in another client library; these are apparently becoming a trend, but they always make me cringe nonetheless.

Scalable services shouldn’t balance themselves; they know too little about their neighborhood.

Bottom line: load-balancing is not a job for the client library of any service. If you see any very ambitiously omniscient client-side balancing, go disable it (disabled should be a sane default for these). It will simply keep your proper load-balancing from working. When designing reliable back-end services, we do not trust clients to do the right thing, but rather trust the server-side to always put clients in a position where they can’t possibly ever create a denial-of-service situation.</off-topic rant>

The problem

Amazon ECS (a.k.a. EC2 Container Service) is the Docker-based container solution sold by Amazon*; however, its ELBs have no support for HTTP/2 or for more complex gRPC health checks.

More generally speaking, one might want to be able to perform health checks on a service without the need of a gRPC client, mainly for two reasons:

a health check does not need the complexity of protocol buffers to report the health status: its most important piece of information is a single bit (boolean, healthy or not healthy), plus possibly some error messages with context
HTTP1.x clients (think of curl, wget and all those embedded in network hardware/software) are everywhere and can be easily leveraged
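To make the two points above concrete, here is a minimal sketch of such a plain HTTP1.x health endpoint in Go, using only the standard library. The `healthy` flag, the `healthResponse` helper and the "dependency unavailable" message are hypothetical names invented for this illustration; only the /status path comes from this post. It is not the code from the linked repository.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
	"sync/atomic"
)

// healthy is a hypothetical flag; a real service would flip it
// whenever it becomes unable to serve traffic.
var healthy atomic.Bool

// healthResponse maps the one-bit health state to an HTTP status code
// and a short body carrying optional context.
func healthResponse(ok bool, detail string) (int, string) {
	if ok {
		return http.StatusOK, "ok\n"
	}
	return http.StatusServiceUnavailable, "unhealthy: " + detail + "\n"
}

// statusHandler serves the /status path polled by the load balancer.
func statusHandler(w http.ResponseWriter, r *http.Request) {
	code, body := healthResponse(healthy.Load(), "dependency unavailable")
	w.WriteHeader(code)
	io.WriteString(w, body)
}

func main() {
	healthy.Store(true)
	mux := http.NewServeMux()
	mux.HandleFunc("/status", statusHandler)

	// Listen on an ephemeral local port so the sketch can probe itself and exit.
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	go http.Serve(l, mux)

	// Any HTTP1.x client (curl, wget, a load balancer) could do the same.
	resp, err := http.Get("http://" + l.Addr().String() + "/status")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("health check status:", resp.StatusCode)
}
```

The status code alone carries the healthy/unhealthy bit, which is all a load balancer looks at; the body is there for humans reading logs or running curl.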

But HTTP2 is backward-compatible with HTTP1, so surely it must be possible to offer a health status endpoint on a gRPC service, right? Wrong. Have a look at the hoops one has to jump through to simulate a gRPC conversation in this post by Juho Mäkinen.

*= I do not advise using/buying ECS at the moment, unless you really have to, because my experience has been mediocre so far (for various reasons)

The solution

Some progress has been made upstream at grpc-go regarding the capability of multiplexing regular HTTP1 and HTTP2 traffic on the same port:

https://github.com/grpc/grpc-go/issues/75 (I argue that this is really not fixed)
https://github.com/grpc/grpc-go/issues/549 (still open at the time of writing)

NOTE: just ignore the many comments there that hijack the issues to ask for “supporting gRPC on HTTP1.x”; that is an off-topic chimera for me and for this article (zero interest in seeing it happen).

So, grpc-go finally has support for non-gRPC requests via ServeHTTP, right? Wrong. Apparently it works only when using TLS (!?); I gave up on my attempts earlier today and ended up using cmux instead:

gdm85/grpc-go-multiplex (github.com): gRPC/HTTP multiplexing example (AWS ELB etc)

Feel free to use this working example to implement your own gRPC service with HTTP1.x health-checks; the trick there is that all connections identified as gRPC are served via the gRPC protocol, while the rest are served as regular HTTP.

Performance impact is negligible, as this recognition happens only right after the handshake, and thus once per opened connection (because you are re-using your connections, aren't you?).

Conclusion

I am looking forward to a more mature grpc-go that allows what is currently possible via cmux; however, I currently see absolutely no downside in using this cmux approach for the sake of stability while waiting for upstream improvements.

Update 29 August 2016: improved readability, added preface