Amazon Elastic Load Balancer auto-registration

Martin Raison
Published in Keep It Up, Feb 10, 2014

At Kifi we operate in continuous deployment mode, so we need robust mechanisms to support both fast deployment and service continuity. Since our Kifi services sit behind Amazon Elastic Load Balancers (ELBs), an important concern is making sure load balancers don’t route traffic to inactive instances. In a typical setup, ELBs are configured to perform regular health checks on their assigned instances and to update their routing accordingly.

Using the strictest possible settings, health checks run every 6 seconds, a check fails if no response arrives within 2 seconds, and an instance is taken out of service by its ELB after 2 consecutive failed checks. Even with such a configuration, an ELB keeps sending traffic to a stopped instance for 8 to 14 seconds, depending on when the first health check occurs after shutdown. This means that each deployment may cause some requests to be lost. A quick fix is to make the instance stop responding to health checks long enough for the ELB to take it out of service, while it still processes other requests. However, this does not adapt well to changes in the health check configuration, and it also makes restarting services slower.
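The 8 to 14 second window follows directly from those settings. Here is a small sketch of the arithmetic, assuming the ELB checks at a fixed interval and only counts a check as failed once the timeout has elapsed (the object and method names are made up for illustration):

```scala
object DeregistrationWindow {
  // Strictest settings quoted in the text.
  val intervalSec = 6         // time between two health checks
  val timeoutSec = 2          // a check fails after this long without a response
  val unhealthyThreshold = 2  // consecutive failures before the ELB stops routing

  /** Seconds between shutdown and the moment the ELB stops routing traffic,
    * given how long after shutdown the first (failing) health check starts. */
  def secondsUntilDeregistered(firstCheckDelaySec: Int): Int =
    firstCheckDelaySec + (unhealthyThreshold - 1) * intervalSec + timeoutSec

  def main(args: Array[String]): Unit = {
    // Best case: a health check starts right at shutdown.
    println(secondsUntilDeregistered(0))           // prints 8
    // Worst case: a health check had just passed, so the next one is a full interval away.
    println(secondsUntilDeregistered(intervalSec)) // prints 14
  }
}
```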

An additional issue is that the ELB keeps health checking inactive instances and puts them back into service as soon as they restart. If an instance needs some warm-up time (to health check itself, warm up indexes, and so on), it is better to control exactly when it starts receiving external traffic.

A better solution is to use the AWS Java API to make instances automatically register themselves with their ELB on startup and deregister on shutdown. So instead of directly configuring the ELB to serve a predefined set of instances, we tag instances with their ELB name and have them register and deregister themselves. Our code is written in Scala and uses the Java API.
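A minimal sketch of what such self-registration might look like, assuming the AWS SDK for Java v1 (Classic ELB API) is on the classpath. The `ElbRegistration` object and the `"ELB"` tag key are assumptions made for illustration, not Kifi's actual names:

```scala
import java.util.Collections

import com.amazonaws.services.ec2.AmazonEC2ClientBuilder
import com.amazonaws.services.ec2.model.{DescribeTagsRequest, Filter}
import com.amazonaws.services.elasticloadbalancing.AmazonElasticLoadBalancingClientBuilder
import com.amazonaws.services.elasticloadbalancing.model.{
  DeregisterInstancesFromLoadBalancerRequest,
  Instance,
  RegisterInstancesWithLoadBalancerRequest
}
import com.amazonaws.util.EC2MetadataUtils

object ElbRegistration {
  // Assumption: each instance carries a tag with this key whose value is its ELB name.
  private val ElbTagKey = "ELB"

  // Default clients pick up credentials and region from the SDK's provider chains.
  private lazy val ec2 = AmazonEC2ClientBuilder.defaultClient()
  private lazy val elb = AmazonElasticLoadBalancingClientBuilder.defaultClient()

  // Instance ID, read from the EC2 instance metadata service.
  private lazy val instanceId: String = EC2MetadataUtils.getInstanceId()

  // Look up the ELB name from this instance's own tags; None if the tag is missing.
  private def elbName: Option[String] = {
    val request = new DescribeTagsRequest().withFilters(
      new Filter("resource-id", Collections.singletonList(instanceId)),
      new Filter("key", Collections.singletonList(ElbTagKey))
    )
    val tags = ec2.describeTags(request).getTags()
    if (tags.isEmpty()) None else Some(tags.get(0).getValue())
  }

  private def thisInstance = Collections.singletonList(new Instance(instanceId))

  // Ask the ELB to start routing traffic to this instance.
  def register(): Unit = elbName.foreach { name =>
    elb.registerInstancesWithLoadBalancer(
      new RegisterInstancesWithLoadBalancerRequest(name, thisInstance))
  }

  // Ask the ELB to stop routing traffic to this instance.
  def deregister(): Unit = elbName.foreach { name =>
    elb.deregisterInstancesFromLoadBalancer(
      new DeregisterInstancesFromLoadBalancerRequest(name, thisInstance))
  }
}
```

For this to work, the instance's IAM role would need permission to describe EC2 tags and to register and deregister instances with the load balancer.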

This can easily be hooked into a Play! application’s start and stop events, and restarting instances becomes much more graceful.
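A sketch of hooking registration into a Play! application's start and stop events, assuming the Play 2.x `GlobalSettings` API of that era (it was deprecated and removed in later Play releases). `ElbRegistrar` and `LoggingRegistrar` are hypothetical stand-ins so the example is self-contained; a real implementation would issue the AWS register and deregister calls:

```scala
import play.api.{Application, GlobalSettings, Logger}

// Hypothetical abstraction over the ELB calls.
trait ElbRegistrar {
  def register(): Unit
  def deregister(): Unit
}

// Placeholder that only logs; swap in an AWS-backed implementation.
object LoggingRegistrar extends ElbRegistrar {
  def register(): Unit = Logger.info("Registering instance with its ELB")
  def deregister(): Unit = Logger.info("Deregistering instance from its ELB")
}

object Global extends GlobalSettings {
  private val registrar: ElbRegistrar = LoggingRegistrar

  override def onStart(app: Application): Unit = {
    // Do any warm-up (self health check, index warming, ...) first,
    // so the instance only receives external traffic once it is ready.
    registrar.register()
  }

  override def onStop(app: Application): Unit = {
    // Leave the ELB before the application shuts down.
    registrar.deregister()
  }
}
```

Registering only at the end of `onStart` is what gives the instance control over when it starts receiving traffic.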

We wrote this post while working on Kifi: Connecting people with knowledge.

Originally published at eng.kifi.com on February 10, 2014.
