Scaling with HAProxy and EC2 Autoscaling groups

Léonard Hetsch
Feb 23, 2017 · 6 min read

AWS EC2 Autoscaling groups offer a very powerful solution for getting a pool of machines automatically scaled up or down depending on the resources needed. Whenever your application needs to handle an increasing number of requests, the autoscaling group will fire up new instances to spread the load across more nodes. As traffic decreases, machines are terminated to avoid having too many unused resources, thereby optimizing your infrastructure costs.

A load balancer is then needed to dispatch the incoming HTTP requests across all the available nodes. The EC2 Elastic Load Balancer (ELB) is the one used with Autoscaling almost every time, because:

  • You don’t have an additional infrastructure layer to manage, as the ELB is working out of the box.
  • It integrates with autoscaling groups without any further configuration. You just have to target the group, and the ELB will automatically spread the load on the group instances.
  • It is cheap (about $0.008/GB of bandwidth used)

The ELB works great for most cases. But in some situations you may want to use another load-balancing solution. It might be because:

  • You want more control over the way you manage traffic and your load-balancing configuration (the ELB being essentially a black box that you cannot really tweak).
  • Depending on the kind of business you operate, your application may receive large or sudden spikes of traffic at certain times. Much like autoscaling groups, the ELB scales internally to accommodate the load. However, if the traffic ramp-up is too steep or too fast, you will basically be DDoS-ing your own ELB (which has likely not yet scaled and is not ready for such an amount of traffic).

Enter HAProxy. According to the homepage of its website:

HAProxy is a free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications.

It has been a very popular solution for years thanks to its simplicity (it does one thing and does it very well), its performance and its robustness. Virtually every big tech company out there uses it to balance load across their servers (Twitter, Airbnb and Reddit, to name a few).

In this post, we’ll build a HAProxy load balancer machine on a classic EC2 Ubuntu instance, then use the AWS SDK with some scripting to automatically configure HAProxy with the instances present in our autoscaling group.

Setting up HAProxy on an EC2 instance

Since we’ll need to call the EC2 API from the load balancer instance to retrieve the nodes we balance traffic on, we would normally have to store access keys and secrets on the machine to authenticate the API calls. And if you’re using a configuration management tool like Ansible or Chef, you would also have to deal with encryption to keep those keys from being readable in your configuration code.

Luckily for us, the AWS API offers a handy alternative to access keys: by attaching an IAM role to an instance, the instance is automatically authenticated under that role when calling the API, using instance profile credentials, without us having to do anything. Sounds better, no?

So, let’s start by creating a role in the IAM dashboard and configuring it with EC2 read access:

  • Enter a name for your role, for example HAProxyRole
  • On the next page, select the Amazon EC2 role type in the AWS Service Role list
  • Choose to attach the AmazonEC2ReadOnlyAccess policy to the role, and finally confirm the role creation on the next page.

Now, any instance launched with this role attached will be able to authenticate API calls with the EC2 reading permissions.

The next step is to create and launch the instance. We’ll launch from the Ubuntu Server AMI; you can select any instance type for this example. Then, in the Configure Instance Details section, attach the HAProxyRole we just created to the new instance. The rest of the options are up to you, and you can launch the instance.

HAProxy has a very low memory footprint but can become CPU-bound at scale, so for production use you’ll probably want to go for a C4 (compute-optimized) instance type.

Tweaking system limits

This is not a required step, but you’ll probably want your HAProxy load balancer to handle as many connections as the instance resources allow.

Linux limits, by default, the number of file descriptors that can be open at the same time. Since TCP sockets are treated the same way as regular files, this limit can throttle the number of simultaneous connections handled by HAProxy. Let’s bump it up a bit.

SSH into the freshly created instance and edit the file /etc/sysctl.conf (then apply the new values with sudo sysctl -p):

fs.file-max = 10000000
fs.nr_open = 10000000

We also have to update the file /etc/security/limits.conf:

* soft nofile 10000000
* hard nofile 10000000
root soft nofile 10000000
root hard nofile 10000000

If you’re interested in further tweaking of the system to maximize the load that your LB can handle, this post gives more details about how to achieve this.

HAProxy Setup

Add the apt PPA repository to install HAProxy 1.6:

$ sudo add-apt-repository ppa:vbernat/haproxy-1.6
$ sudo apt-get update
$ sudo apt-get install -y haproxy

Edit the file /etc/default/haproxy to enable the HAProxy service daemon:

ENABLED=1

Then start HAProxy with sudo service haproxy start

While not required, you might want to install the hatop utility to monitor your HAProxy load balancer.

$ sudo apt-get install hatop

We will keep a default “template” configuration file for HAProxy at /etc/haproxy/haproxy.cfg.template; the only thing missing from it is the list of backend nodes. Since we’ll dynamically retrieve the current autoscaled instances from the EC2 API, we’ll generate the final configuration file /etc/haproxy/haproxy.cfg from the template, appending all the node IP addresses.
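As a reference, such a template could look like the minimal sketch below. The frontend and backend names, ports and timeouts are assumptions to adapt to your setup; the stats socket is placed at /tmp/haproxy so hatop can connect to it.

```
global
    maxconn 100000
    # stats socket used later by hatop
    stats socket /tmp/haproxy level admin

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend http-in
    bind *:80
    default_backend api

backend api
    balance roundrobin
    # "server" lines for the autoscaled instances are appended
    # below this section by the update script
```

Keeping the backend section last is what makes the “append server lines” approach work.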

Retrieving Autoscaling group instances

We’re now going to do some scripting to fetch the autoscaling group instances from the AWS API. The script will run, let’s say, every 3 minutes to refresh the list of backend servers. You can use any language for this, but for the rest of this tutorial I’ll go with Ruby.

We’ll need to install Ruby along with some other libraries and the aws-sdk gem to interact with the AWS API:

$ sudo add-apt-repository ppa:brightbox/ruby-ng
$ sudo apt-get update
$ sudo apt-get install -y software-properties-common ruby2.3 ruby2.3-dev zlib1g-dev libxml2-dev build-essential libpcre3 libpcre3-dev
$ sudo gem install aws-sdk

Begin the script by setting up the AWS SDK, using instance profile credentials to authenticate the load balancer instance using the attached HAProxyRole.

Using the Autoscaling SDK, we retrieve current instances of the group we want to target (here, api).

Then, get the instance details from the EC2 SDK and register each node’s private IP address and public DNS name. We don’t actually need the public DNS name for routing, but we’ll use it as a label in the HAProxy configuration, so we can easily identify the attached nodes.

Finally, we can copy the template file to the location of the “real” configuration file and append the backends configuration. We need to reload the HAProxy service daemon to use the new configuration.
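Putting those steps together, the whole script could look like the sketch below. The region, group name, file paths and backend port are assumptions to adapt; it authenticates through the attached role via Aws::InstanceProfileCredentials, so no keys are stored on the machine.

```ruby
#!/usr/bin/env ruby
# Hypothetical sketch of /usr/bin/haproxy-autoscaling-update.rb.
# Region, group name, paths and backend port are assumptions to adapt.

ASG_NAME      = 'api'.freeze
REGION        = 'eu-west-1'.freeze
TEMPLATE_PATH = '/etc/haproxy/haproxy.cfg.template'.freeze
CONFIG_PATH   = '/etc/haproxy/haproxy.cfg'.freeze

# Fetch the instances of the autoscaling group, returning an array of
# { label:, ip: } hashes built from each instance's public DNS name
# (used only as a display label) and private IP address.
def fetch_backends
  require 'aws-sdk' # authenticates via the instance profile, no stored keys
  credentials = Aws::InstanceProfileCredentials.new
  asg = Aws::AutoScaling::Client.new(region: REGION, credentials: credentials)
  ec2 = Aws::EC2::Client.new(region: REGION, credentials: credentials)

  group = asg.describe_auto_scaling_groups(auto_scaling_group_names: [ASG_NAME])
             .auto_scaling_groups.first
  ids = group ? group.instances.map(&:instance_id) : []
  return [] if ids.empty?

  ec2.describe_instances(instance_ids: ids)
     .reservations.flat_map(&:instances)
     .map { |i| { label: i.public_dns_name, ip: i.private_ip_address } }
end

# Pure helper: render the "server" lines to append to the backend section.
def render_backends(backends)
  backends.map.with_index(1) do |b, n|
    label = b[:label].to_s.empty? ? "node#{n}" : b[:label]
    "    server #{label} #{b[:ip]}:80 check"
  end.join("\n")
end

if __FILE__ == $PROGRAM_NAME
  begin
    backends = fetch_backends
    unless backends.empty?
      File.write(CONFIG_PATH,
                 File.read(TEMPLATE_PATH) + "\n" + render_backends(backends) + "\n")
      system('service haproxy reload')
    end
  rescue StandardError, LoadError => e
    # On any failure, keep the previous configuration instead of
    # writing a broken one.
    warn "haproxy-autoscaling-update failed: #{e.message}"
  end
end
```

Note that on any API failure the script keeps the previous configuration, so a transient error doesn’t take the load balancer down.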

Add the script to your crontab so it will run every 3 minutes (you might want to use a different strategy, depending on your application or infrastructure needs):

*/3 * * * * sudo ruby /usr/bin/haproxy-autoscaling-update.rb

After the script has run, run sudo hatop -s /tmp/haproxy to monitor the HAProxy stats. If you have instances running in your autoscaling group, you should see them listed as backend servers.

Conclusion

We’re done! This is a working solution, but it could be improved in many ways:

  • We could keep a record of the latest loaded list of instances, so we can update the HAProxy configuration and reload it only when the list of instances has changed since the last reload.
  • If your LB continuously receives a significant amount of traffic, you’re probably dropping some connections during the short window HAProxy needs to reload its configuration. There are ways around this; Yelp engineers wrote a very detailed post on the matter, as did the GitHub engineering blog.
  • The basic HAProxy configuration in this post only handles HTTP traffic on port 80, but you probably want your application traffic served over HTTPS. You can implement SSL termination on the HAProxy LB instead of the web servers, then dispatch requests to the backends over port 80.
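As a sketch of that last point, SSL termination amounts to one extra frontend. The certificate path and backend name below are hypothetical, and this assumes an HAProxy build with SSL support (the 1.6 PPA package qualifies):

```
frontend https-in
    # Terminate TLS here; the .pem file must contain the certificate
    # chain and the private key concatenated (path is hypothetical)
    bind *:443 ssl crt /etc/haproxy/certs/site.pem
    # Let backends know the original scheme
    http-request set-header X-Forwarded-Proto https
    default_backend api
```

The backends then keep serving plain HTTP on port 80, with encryption handled entirely at the LB.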

