Dynamically provisioning Jenkins nodes in AWS

The saying “Treat your servers like cattle, not like pets” has become increasingly popular among developers in recent years, largely thanks to the proliferation of virtual machines and containers. It encourages us to design machines that are ephemeral and replaceable, instead of ones that are long-lasting and manually managed, the latter of which tend to become Snowflake Servers.

On top of being cattle-like, servers should ideally be defined as code (using a tool like Terraform) and checked into a version control system like Git.

Continuous Integration infrastructure is no different. We should try to avoid manually managed Jenkins nodes that have who-knows-what installed who-knows-where. Besides, using ephemeral Jenkins nodes has a few key advantages, mainly on the cost-saving front:

  1. Build nodes are perfect for taking advantage of AWS spot instances and their reduced pricing.
  2. Ephemeral nodes are easy to scale horizontally, meaning more nodes can be provisioned if there are jobs waiting to be executed.
  3. Nodes can be shut down after a period of inactivity. Since the majority of builds are done while developers commit code during the work day, this will also help reduce costs by not having machines running when they are no longer being used.

Preparing the image

“Jesse, we have to cook.” — Walter H. White

The first thing we’re going to do is bake an AMI with the tools used in CI. This image will be used later to provision the nodes. In order to automate things and define them as code we’re going to use Packer.

This is a minimal packer.json that creates an AMI based on the most recent Ubuntu 16.04 image, adds an EBS volume to it and runs a user-data script:

packer.json
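As a rough sketch of what such a template could look like, here is a packer.json using Packer's amazon-ebs builder. The region, instance type, AMI name, volume size and device name are assumptions made for illustration, and the block device mapping here sizes the root EBS volume, which is only one reading of "adds an EBS volume". The owner ID is Canonical's, and the shell provisioner simply waits for cloud-init to finish the user-data script before Packer snapshots the instance.

```json
{
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "eu-west-1",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "jenkins-node-{{timestamp}}",
      "source_ami_filter": {
        "filters": {
          "name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*",
          "virtualization-type": "hvm",
          "root-device-type": "ebs"
        },
        "owners": ["099720109477"],
        "most_recent": true
      },
      "launch_block_device_mappings": [
        {
          "device_name": "/dev/sda1",
          "volume_size": 40,
          "volume_type": "gp2",
          "delete_on_termination": true
        }
      ],
      "user_data_file": "user-data"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": [
        "while [ ! -f /var/lib/cloud/instance/boot-finished ]; do sleep 1; done"
      ]
    }
  ]
}
```

Running packer validate packer.json before building is a cheap way to catch typos in a template like this.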

And here is an example user-data script using cloud config syntax that can be used to install the necessary tools used in CI:

user-data
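A minimal cloud-config sketch might look like the following. The package list is hypothetical (a JDK, Maven, Git and Docker from the Ubuntu 16.04 repositories); swap in whatever your builds actually need.

```yaml
#cloud-config
# Hypothetical tool list: adjust to whatever your CI jobs require.
package_update: true
packages:
  - openjdk-8-jdk   # Java for the Jenkins agent and Maven builds
  - maven
  - git
  - docker.io
runcmd:
  # Let the default "ubuntu" user run Docker without sudo
  - usermod -aG docker ubuntu
```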

Now, if everything goes well, running packer build packer.json should create the AMI and make it available to use within a couple of minutes.

And of course, this should also be version controlled and eventually executed by Jenkins itself.

Amazon EC2 plugin for Jenkins

“Let’s build us a happy little cloud.” — Bob Ross

After installing the Amazon EC2 plugin from Jenkins’ plugin manager, there will be a new option to add an Amazon EC2 Cloud in Jenkins’ global configuration.

The plugin offers a lot of configuration options which depend on a given AWS infrastructure (region, security groups, roles, subnets, etc.). Be sure to select spot instances, an instance type according to your needs, and an appropriate idle termination time. Also set an instance cap to avoid problems if everything decides to go rogue.

Keep in mind that a low idle termination time can also be detrimental. Every time a node is terminated its local caches, such as the Maven and Docker ones, are lost, so builds take longer because they have to start from scratch on a fresh node.

Also, the recently released JCasC (Jenkins Configuration as Code) does exactly what it says on the tin and supports defining all of these configurations as code.
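To give an idea of the shape, here is a hedged JCasC sketch for an EC2 cloud with a spot-backed template. Every value (cloud name, region, credential ID, AMI ID, label, sizes) is a placeholder, and the key names and instance type notation follow the plugin's documented examples as far as I know but have changed between plugin versions, so treat it as an assumption and compare it with the configuration-as-code reference exported by your own Jenkins instance.

```yaml
jenkins:
  clouds:
    - amazonEC2:
        cloudName: "ec2-build-nodes"              # hypothetical name
        region: "eu-west-1"                       # assumed region
        useInstanceProfileForCredentials: true    # assumes Jenkins runs with an IAM role
        sshKeysCredentialsId: "jenkins-ssh-key"   # hypothetical credential ID
        instanceCapStr: "10"                      # instance cap, in case things go rogue
        templates:
          - description: "Ephemeral build node"
            ami: "ami-xxxxxxxx"                   # placeholder: the AMI baked with Packer
            type: "T2Medium"                      # notation differs across plugin versions
            labelString: "build"
            remoteAdmin: "ubuntu"
            remoteFS: "/home/ubuntu"
            numExecutors: 2
            idleTerminationMinutes: "30"
            spotConfig:
              useBidPrice: false
              fallbackToOndemand: true
```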

And finally, some tips

“Your scientists were so preoccupied with whether or not they could that they didn’t stop to think if they should.” — Dr. Ian Malcolm

Sometimes, when using tools that put a heavy load on the machine, concurrent executions can render a node unresponsive. A neat little trick to avoid this if you are using Jenkins Pipelines is to leverage the Lockable Resources plugin and use it in a step like this:

lock(resource: "mvn_${env.NODE_NAME}") {
    sh 'mvn clean package'
}

This will dynamically create a lock that affects only the node in which the build is running, limiting concurrent heavy computational tasks without having to reduce the number of executors in a node.

Then, using Jenkins Shared Libraries, it’s easy to create a repository with a wrapper step like so:

vars/mavenPackage.groovy
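A minimal sketch of that wrapper, assuming it does nothing more than wrap the node-scoped lock from the snippet earlier in this section:

```groovy
// vars/mavenPackage.groovy
// Sketch of a shared library step (assumed shape, not a definitive implementation).
// Defining call() makes the file usable as a step named after the file.
def call() {
    // One lock per node: builds running on other nodes are not blocked.
    lock(resource: "mvn_${env.NODE_NAME}") {
        sh 'mvn clean package'
    }
}
```

Keeping the lock name inside the library means a change to the naming scheme only has to be made in one place.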

After putting the code above inside vars/mavenPackage.groovy in the shared library, it can then be used from a pipeline as a mavenPackage() step.

This way pipelines can reuse code, which in turn makes it easier to make changes (and also to break things).

One last thing to note regarding locks is that they show up on the Jenkins management page. With the current version of the plugin, there’s no way to mark them as transient, so they will end up bloating the UI. Since removing them manually is a chore, the following script can take care of them for you:

Remove Locks
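Something along these lines, run from the Jenkins script console, should do the job. It is a sketch that assumes the locks use the mvn_ prefix from the step shown earlier and that the installed version of the Lockable Resources plugin still exposes a modifiable list through getResources(); newer releases may need a different removal call, so try it on a test instance first.

```groovy
import org.jenkins.plugins.lockableresources.LockableResourcesManager

def manager = LockableResourcesManager.get()

// Only touch the dynamically created Maven locks (assumed "mvn_" prefix)
// and skip anything that is still locked, reserved or queued.
def stale = manager.getResources().findAll { resource ->
    resource.name.startsWith('mvn_') &&
        !resource.isLocked() &&
        !resource.isReserved() &&
        !resource.isQueued()
}

stale.each { resource ->
    println "Removing lock ${resource.name}"
    manager.getResources().remove(resource)
}

// Persist the cleaned-up resource list
manager.save()
```

Filtering by prefix keeps any resources that were defined by hand in the Jenkins configuration out of harm's way.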

GIFs in this post belong to AMC’s Breaking Bad and were retrieved from Giphy.