Should we start baking immutable infrastructure instead of frying on-demand?

Ramnath Nayak
Published in cloudnativeinfra
7 min read · May 5, 2019

It is an age-old dilemma that awaits me every time I walk into the chip shop. Should I go for the fried-to-order fish? Or the baked and ready-to-eat meat pie? A similar question can arise in the context of how to build infrastructure.

Photo by Laurenz Kleinheider on Unsplash

The fried-to-order approach

‘Frying’ infrastructure on-demand is the default approach to building infrastructure. Pick up an IaC tool or two and program it to carry out the same steps as you would follow while doing a manual installation.

The change management process in this case is well known. When there is a new version of the code, a patch or a config change to deploy, you get the configuration management tool to apply the change to your live servers.

The pre-baked approach

‘Baking’ your infrastructure, in other words building your server images with the application pre-deployed, is a radically different idea with some unique benefits. There is a more precise name for this approach: immutable infrastructure. The name comes from the fact that you do not change existing infrastructure, but deploy completely new servers with the changes pre-installed.

If adopting Infrastructure-as-code was about moving from snowflake servers/pets to cattle, immutable infrastructure pushes the envelope further and turns the cattle into phoenixes that die every time you deploy and rise up renewed from their ashes with the new version pre-installed.

Immutable infrastructure is achieved by getting your CI/CD pipeline to produce a fully ready OS image with the newer version of the application/code/config already deployed and ready to start when provisioned on a compute instance.

Many High Performance Computing (HPC) applications that need to process large amounts of data run on a cluster of servers. Constantly patching and updating these individual servers can be a challenge, and adopting an immutable infrastructure architecture can make it easier to roll out changes.

Benefits of building immutable infrastructure

The most obvious benefit is that deployment to production becomes easy. Spin up a new set of server(s) with the new version of the application image to replace the old ones, and that is all there is to it (of course there may be some housekeeping required like rewiring load balancer backends or DNS updates to point to the new server, but that can be automated to be part of the deployment process).
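To make the replace-and-rewire idea concrete, here is a minimal in-memory sketch. The `LoadBalancer` class and the `provision` function are hypothetical stand-ins for your cloud's API, not real SDK calls, and the image names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class LoadBalancer:
    """Hypothetical stand-in for a cloud load balancer's backend set."""
    backends: list = field(default_factory=list)


def provision(image: str, count: int) -> list:
    """Pretend to launch `count` servers from `image`; returns their names."""
    return [f"{image}-server-{i}" for i in range(count)]


def deploy(lb: LoadBalancer, new_image: str, count: int) -> list:
    """Launch new servers, point the load balancer at them, return the old ones."""
    new_servers = provision(new_image, count)
    old_servers = lb.backends
    lb.backends = new_servers  # the 'rewiring' housekeeping step
    return old_servers         # the caller terminates these once traffic has drained


lb = LoadBalancer(backends=provision("app-v1", 2))
retired = deploy(lb, "app-v2", 2)
print(lb.backends)
print(retired)
```

Nothing is ever modified on a running server here: the old servers are simply handed back for termination.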

This is immensely useful in an environment where application uptime is crucial, as all the Ops/SRE team needs to do is provision a server/cluster from the image to start/scale-up.

There is no longer a need to worry about things going wrong while deploying a patch to multiple live servers, as all your instances are spun up from the same source image and already contain the patch without you having to distribute it. Configuration drift cannot happen as there are no ‘deltas’ to apply. Going back to a previous version is also easy, as you just deploy the previous image.

The immutable infrastructure approach lets you manage your VMs like containers in a Kubernetes cluster, only with OS images instead of container images. You can destroy, spin up and deploy using your favourite deployment strategy (Red/Black, Canary, Rolling updates, etc.).
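A rolling update over a pool of image-based instances might look roughly like the sketch below. It models each instance as just the name of the image it was launched from, which is a simplification made for illustration; `rolling_update` is a hypothetical helper, not part of any tool mentioned here.

```python
def rolling_update(pool: list, target_image: str, batch_size: int = 1) -> list:
    """Replace instances batch by batch; each entry in `pool` is an image name.

    Returns a snapshot of the pool after every batch so progress is visible.
    """
    snapshots = []
    for start in range(0, len(pool), batch_size):
        for i in range(start, min(start + batch_size, len(pool))):
            pool[i] = target_image  # destroy the old instance, launch from the image
        snapshots.append(list(pool))
    return snapshots


pool = ["app-v1"] * 4
steps = rolling_update(pool, "app-v2", batch_size=2)
print(steps)

# Rolling back is the same operation aimed at the previous image.
rolling_update(pool, "app-v1", batch_size=2)
print(pool)
```

The rollback at the end is the point worth noticing: there is no separate undo procedure, only another deployment of an image you already have.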

Building immutable infrastructure also creates new challenges that you did not have to deal with before. To build immutable infrastructure or not — that really is the question.

Challenges in adopting immutable infrastructure

The first challenge is that you will need to shift left and there will be much more to do in earlier stages of your CD pipeline as baking the image becomes a step in the delivery pipeline. Instead of just building code, now you need the entire server stack to be built.

Automated test coverage will need to be thorough, because an issue found after the image is built means a rebuild rather than a patch. Issues need to be found and addressed before a server image is committed.

The build will also need to be triggered for a multitude of different reasons, such as newer versions or patches of the OS, database or any third-party libraries your application needs.
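One way to decide when a rebuild is due is to fingerprint everything that goes into the image and rebake whenever the fingerprint changes. The sketch below assumes you can list those inputs as a dictionary; the keys and version strings are invented for illustration.

```python
import hashlib
import json


def bake_fingerprint(inputs: dict) -> str:
    """Hash every input that ends up inside the image."""
    canonical = json.dumps(inputs, sort_keys=True)  # stable key order gives a stable hash
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_rebake(inputs: dict, last_fingerprint: str) -> bool:
    """True when any input differs from what the last image was built from."""
    return bake_fingerprint(inputs) != last_fingerprint


# Hypothetical inputs: base OS image, application commit, third-party libraries.
inputs = {
    "base_os_image": "linux-2019.04",
    "app_commit": "3f2a9c1",
    "libs": {"openssl": "1.1.1b"},
}
baked = bake_fingerprint(inputs)

patched = {**inputs, "libs": {"openssl": "1.1.1c"}}
print(needs_rebake(inputs, baked))
print(needs_rebake(patched, baked))
```

A library patch changes the fingerprint just as an application commit does, so both trigger the same bake step.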

Service discovery can be a challenge in some cases, depending on the architecture of the application. If a rising phoenix server cannot find the other services through a service discovery mechanism, the application’s architecture may not be conducive to immutable infrastructure. Legacy applications that depend on fixed IP addresses tend to have this problem and do not make good candidates for this approach.
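The difference between looking a service up by name and pinning its IP address can be shown with a toy registry. `ServiceRegistry` is a hypothetical in-memory class written for this illustration; in practice the same role is played by DNS or a dedicated discovery service.

```python
class ServiceRegistry:
    """Hypothetical in-memory registry mapping service names to addresses."""

    def __init__(self):
        self._addresses = {}

    def register(self, name: str, address: str) -> None:
        self._addresses[name] = address  # the latest registration wins

    def lookup(self, name: str) -> str:
        return self._addresses[name]


registry = ServiceRegistry()
registry.register("orders-db", "10.0.0.5")

# The server is replaced; the new one comes up with a different IP
# and registers itself under the same name.
registry.register("orders-db", "10.0.0.9")

print(registry.lookup("orders-db"))
```

A client that calls `lookup` on every connection follows the replacement automatically, while a client with `10.0.0.5` hard-coded in its config does not.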

Stateful data needs to be carefully managed when you destroy and rebuild servers. Typically, in cloud environments, this can be achieved by retaining the block and/or boot volumes and reattaching to the newly spun up server. If the server you are replacing is a database of any kind, the data needs to be written to block volumes that then need to be reattached to the new servers.
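The detach-and-reattach flow can be sketched as follows. `Volume` and `replace_server` are hypothetical names used only to show the order of operations; a real implementation would call your cloud provider's volume attachment API at each step.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Volume:
    """Hypothetical block volume that outlives the server it is attached to."""
    name: str
    data: dict = field(default_factory=dict)
    attached_to: Optional[str] = None


def replace_server(volume: Volume, old_server: str, new_server: str) -> None:
    """Move a data volume from the server being destroyed to its replacement."""
    if volume.attached_to != old_server:
        raise ValueError(f"{volume.name} is not attached to {old_server}")
    volume.attached_to = None        # detach first, so terminating the server cannot touch it
    # ... the old server is terminated here ...
    volume.attached_to = new_server  # reattach to the server built from the new image


db_volume = Volume(name="db-data", data={"orders": 1042}, attached_to="db-v1")
replace_server(db_volume, "db-v1", "db-v2")
print(db_volume.attached_to, db_volume.data)
```

The server is disposable; the volume is not, which is why it is detached before anything is destroyed.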

Tools

Your cloud’s imaging and clustering solution

The simplest tooling may be what your cloud provider already offers. For example, in Oracle Cloud Infrastructure (OCI), you can create custom images with the platform's own tooling. Consider instance configurations and instance pools if you need a cluster of servers instead of just one. Other clouds have equivalent mechanisms for generating images and instance pools.

Packer

Packer is a one-of-a-kind tool that can spin up a base image, deploy your application and configurations on top of it (with the help of other IaC and configuration management tools) and create a final image that you can simply spin up your servers from. Packer has plugins for the leading CI/CD tools, so this can be triggered in your pipeline as a build step.

Step 1. The Packer builder spins up an instance on your favourite cloud, for example OCI, AWS, Azure or GCP.

Step 2. The Packer provisioner deploys and configures the application, configurations and libraries using a configuration management tool like Chef/Puppet/Ansible/Salt.

Step 3. Packer saves the resulting image in your account, which you can then use to spin up your server/cluster.
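The three steps map onto the two main sections of a Packer template. As a rough sketch, the script below generates a template in Packer's legacy JSON format, using the `amazon-ebs` builder and the `shell` provisioner as examples. The region, source AMI, instance type, SSH user and package name are placeholders I have assumed, not values from a real setup.

```python
import json


def packer_template(image_name: str) -> dict:
    """Build a minimal legacy-JSON Packer template as a Python dict."""
    return {
        # Step 1: the builder launches a temporary instance from a base image.
        "builders": [{
            "type": "amazon-ebs",
            "region": "us-east-1",          # placeholder
            "source_ami": "ami-00000000",   # placeholder base image id
            "instance_type": "t2.micro",    # placeholder
            "ssh_username": "ec2-user",     # placeholder
            # Step 3: the builder saves the result under this name.
            "ami_name": image_name,
        }],
        # Step 2: provisioners install and configure the application.
        "provisioners": [{
            "type": "shell",
            "inline": ["sudo yum install -y my-app"],  # hypothetical package
        }],
    }


template_json = json.dumps(packer_template("my-app-build-42"), indent=2)
print(template_json)
```

Written to a file, the output is what you would pass to `packer build`. Swapping the shell provisioner for a configuration management one changes only the `provisioners` section.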

Adding the Git commit id and the Jenkins build number to the image name provides traceability from a running server back to the version of the code it was built from.
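A naming scheme along these lines could be as simple as the sketch below. The `app-commit-bN` format is just one possible convention, and both function names are made up for the example.

```python
import re


def image_name(app: str, commit: str, build_number: int) -> str:
    """Compose an image name from the app name, short commit id and build number."""
    return f"{app}-{commit[:7]}-b{build_number}"


def parse_image_name(name: str) -> dict:
    """Recover the commit id and build number from an image name."""
    match = re.fullmatch(r"(?P<app>.+)-(?P<commit>[0-9a-f]{7})-b(?P<build>\d+)", name)
    if match is None:
        raise ValueError(f"unrecognised image name: {name}")
    return {
        "app": match["app"],
        "commit": match["commit"],
        "build": int(match["build"]),
    }


name = image_name("my-app", "3f2a9c1d8e0b4a7c", 128)
print(name)
print(parse_image_name(name))
```

In a Jenkins job, the commit id and build number would typically come from the `GIT_COMMIT` and `BUILD_NUMBER` environment variables, assuming the Git plugin is in use.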

Spinnaker

Netflix was an early pioneer of immutable infrastructure, and Spinnaker is a tool it built internally to make it happen. Spinnaker is a comprehensive end-to-end build and deploy solution to consider if you want to embrace immutable infrastructure.

There are three main reasons why Spinnaker helps with immutable infrastructure:

Reason #1. It is a batteries-included, opinionated CI/CD pipeline tool that produces a machine image (just as it does with container images). Internally, Spinnaker uses Packer to generate the images.

Spinnaker pipeline showing a deployment on OCI

Reason #2. It has deep integration with clouds that allows it to manage other resources like load balancers and also provides a single pane of glass to manage your resources that can span zones, regions, accounts and cloud providers.

Creating an OCI Load Balancer from Spinnaker

Reason #3. It supports versioning, rollout strategies like Red/Black, Canary and Rolling update, and, importantly, an easy way to roll back to a previous version of your infrastructure with two clicks.

Spinnaker GUI showing a versioned application scaling

Summary

As with any new technology, the key to successful adoption lies in knowing when to use it and how well suited your application is to benefit from it.

If your application’s architecture supports spinning up new instances that can still discover the other services, and you want a simple Ops setup where deploying to production is made easy by ensuring quality earlier in the release cycle, immutable infrastructure is a good fit for you.


Ramnath Nayak is an Outbound Product Manager at Oracle Cloud Infrastructure.