What’s that Baking? It’s VM Images!

Backstory

Mark Nash
Contino Engineering
15 min read · Oct 9, 2023

You stand up and stretch. You’ve been building and configuring Virtual Machines (VMs) all morning and the compliance dashboard is still not happy: more systems need to be updated and configured to the new security standard. But it’s lunchtime, so you decide to take a walk down to your local bakery to pick up something to eat.

As you’re waiting in the queue you look at the array of baked goods prepared in batches, baked, and now all available to be picked up on demand by customers. It would be very painful if each cake had to be created from scratch every time it was ordered!

You think back to the task at hand: it would make your life so much easier if you could create the VMs in a similar way. What if an automated process used Infrastructure as Code (IaC) and Configuration as Code (CaC) to create VM and Container images in a standardised and repeatable way, configuring and testing them so that they are ready for use?

Your mind jumps back to a Medium article you skimmed last week…

“An image bakery is a codified, automated and repeatable process where you create custom images used for creating VMs, Containers or Disks. They can help speed up the provisioning time for a VM by baking the required configuration of the systems into the VM image rather than configuring it after the instance is built.”

You stroll back to the office and decide to do some more investigations into Image Bakeries…

Introduction

Many organisations and platform teams find themselves in situations similar to the scenario above: ageing VM infrastructure in need of updates, configuration that is not being updated in a standardised way across the estate, and new compliance requirements that need to be applied across all resources.

By putting an image bakery solution in place, platform teams can create company-specific base images that align with company compliance and security policies, using a repeatable process that makes the images easy to update. Application teams can then build upon these base images, installing the applications they need to create their required resources.

In this blog we will explore the idea of Image Bakeries, discussing why you might build one into your cloud estate. We’ll look at how an image bakery solution can help you achieve your compliance and security goals whilst also speeding up the provisioning of resources.

Finally we will look at some of the issues I’ve personally hit whilst creating an Azure Image Bakery, in the hopes that you won’t get burnt by your experience in future!

Pat-a-cake, pat-a-cake, Technology man, Bake me an Image, as fast as you can!

Let’s start by breaking down some of these terms that I’m using.

What is an Image Bakery?

The process of creating a custom image is called baking and so the process that creates multiple custom images tends to be called a bakery.

An image bakery is a codified, automated and repeatable process where you create templated/custom images used for creating VMs, Containers or Disks.

Image bakeries tend to be associated with the practice of immutable infrastructure, whereby you want your VMs/Containers to wholly contain all the applications and configuration that they need to be successful, not needing any additional changes following deployment. When application changes or updates are needed, a new image and instance are created and the older VMs/Containers are replaced.

By baking images you are altering the creation dependency chain for getting VMs and Containers stood up, moving the configuration to earlier in the process and therefore reducing required changes following resource creation. By having the images available earlier you allow time to do thorough testing, providing confidence in the resources being deployed using the company specific images.

This means that having an image bakery to create these images via a standardised process can be an exciting prospect, reducing overall deployment time and helping to keep resources in line with the organisation’s policies.

The Baking Process

The process of a bakery tends to follow a similar pattern:

  1. A base image is selected
  2. A VM or container instance of the selected base image is created
  3. The instance is configured to the required level
  4. The instance is validated against the configuration
  5. The Instance state is saved and turned into an image
  6. The image is then made available to be used on VMs or containers in future
  7. Clean up resources used for process

Configuring the instance can consist of a few different tasks:

  • Updating the Operating system
  • Installing required Software / Applications
  • Locking down system settings
  • Creating required Users / files
  • Testing to ensure applications are functioning and that files are where expected

Once an image is created it can either be utilised to create further images, creating an inheritance structure of image dependencies, or it can be used to deploy a resource for its intended use.

Base and Child Images

Images inherit characteristics from their parent images, most obviously the core operating system you are building on top of, but this can be expanded to contain security configuration, applications and anything the parent image is configured with.

By stacking images into an inheritance structure you are able to make use of this inheritance capability and standardise the base configurations across all of your custom images. You can then create child images when requirements between images differ.

The following example image shows a small structure of VM/Container images: platform images are pulled and used to create base images and then child images. The platform images are configured using example configuration files to create the custom (base and child) images.

As in the example image, we can utilise platform images to create our standardised base image, baking the company’s base security requirements and settings into the image.

These base images can then be used to create Domain Controller, Database server or Web App server images. Those images can either be used to create servers directly or be built upon again, creating application-specific images in a third layer that come pre-installed with the latest application, ready for use in scale sets or similar resources.

Because the images are created within an inheritance structure, they gain the settings and applications of their parents. This helps to ensure that all Containers and VMs created by the application teams abide by company policies.

When things go Stale

Images will get old and stale just like cakes sat in our real bakery, and so we need to put policies and processes in place to keep our images fresh — up to date with OS updates and application updates, and we’ll need to remove older images when they should no longer be used.

By setting up an Image Bakery, instead of baking images one by one, we put the automation in place to help us create these images. Combining this with lifecycle policies and versioning gives us a neat solution that empowers users to select only the newest and freshest images for deployment into the organisation’s platform.

Now that we’ve discussed how a Bakery solution can help you ensure you are using the newest images, that they follow organisational policies and that they are tested and proven to work, let’s discuss what tools you need to start looking into.

We don’t need Cake Tins and Whisks where we’re going

Baking images is obviously different to baking cakes, but we still need a set of tools that are going to help us automate and manage the image creation process.

We’ll be looking for tools that cover:

  • Creating the initial image instance to configure and turn into the image
  • Configuration tools to actually install the configuration on the instance
  • Pipeline and scripting tools to create the overall Bakery process

Now let’s browse the utensil shelves and see what we find…

Image Creation tool

HashiCorp’s Packer has been a leader in the image creation tooling field for many years. Packer lets you create identical machine images for multiple platforms from a single source configuration. A common use case is creating golden images for organisations to use in cloud infrastructure.

Coming from HashiCorp, the same company that has provided the industry with tools such as Terraform, Vault and Nomad, it is understandable that Packer has gained quite a following and has become a leader in this field.

Packer uses a fairly simple set of configuration files to define the source image, the image destination, the build instance settings and the required system configuration. Packer then uses these files to create all the infrastructure resources needed in the specified cloud provider, spinning up a temporary instance to configure and turn into a nicely baked image (golden, some might say).
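To make the shape of those configuration files a little more concrete, here is a minimal sketch of a Packer HCL template for Azure. The resource group, image name, region and VM size are placeholder assumptions rather than values from our bakery, and Azure CLI authentication is just one of the available options.

```hcl
# Minimal Packer template sketch. All names below are illustrative placeholders.
packer {
  required_plugins {
    azure = {
      source  = "github.com/hashicorp/azure"
      version = "~> 2"
    }
  }
}

# Source block: where the image comes from and what the temporary build VM looks like.
source "azure-arm" "ubuntu_base" {
  use_azure_cli_auth = true # assumes `az login` has already been run

  # Platform (marketplace) image to start from
  os_type         = "Linux"
  image_publisher = "Canonical"
  image_offer     = "UbuntuServer"
  image_sku       = "18.04-LTS"

  # Temporary build instance settings
  location = "UK South"
  vm_size  = "Standard_DS2_v2"

  # Image destination: a managed image in a resource group that already exists
  managed_image_resource_group_name = "rg-bakery-example"
  managed_image_name                = "ubuntu-1804-base"
}

# Build block: which sources to build and how to configure them.
build {
  sources = ["source.azure-arm.ubuntu_base"]

  # Placeholder for the real system configuration
  provisioner "shell" {
    inline = ["echo 'configuration steps go here'"]
  }
}
```

Running `packer init .` followed by `packer build .` in the folder holding this file would be the usual way to try it. A real Linux build would normally also generalise the VM before capture, which is left out here to keep the sketch short.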

All the main cloud providers (AWS, Azure, GCP) have their own cloud-specific image building tools and processes to help you create personalised images for their platform. The cloud-specific tools have the benefit of being able to make the most of features particular to their hosting provider, whereas Packer is cloud agnostic and can be used across all three.

For me, being cloud agnostic outweighs the small benefits the specialised tools get you, but of course the choice of image builder will be down to your own organisation’s requirements. Packer is the tool we’ll be using and talking about for the rest of the blog.

Now we have our tool for baking the solution in a manageable way, but we need to get our configuration in the image otherwise what’s the point in creating our own image at all?

Gently fold your configuration into the mixture

There are many tools out there for managing the configuration of IT resources, and so we won’t focus too much on which you should choose. The choice of configuration tool will be down to organisational requirements and choices, but for my scenario we are trying to keep with the agnostic and low cost approach and so we’ve picked Ansible.

“Ansible — the simple, yet powerful IT automation engine that thousands of companies are using to drive complexity out of their environments and accelerate DevOps initiatives.”

Ansible works on the idea of configuration ‘playbooks’ and the hosts that they are applied to. Each ‘playbook’ consists of a collection of ‘plays’, and each ‘play’ contains variables, roles and an ordered list of tasks. The machine running the packer command is treated as the control node and configures the instance created by Packer.

In a more complex Ansible-controlled environment you may see inventory files containing multiple hosts, but as we’re only building a single image at a time we don’t need to be too concerned with these. Just know that the host created by Packer will be placed in an inventory file and configured using the Ansible playbook provided.

So in our bakery we want these playbooks to contain all the tasks needed to get the image configured to the agreed level: installing applications, adjusting registry settings and creating files. We can even include testing within our Ansible playbook so that we can be sure that everything is in the right place and that our image is ready for use before we package it back up.

We call our configuration tool (Ansible) or any configuration scripts from within the Packer configuration files via a provisioner, so in this case we trigger an Ansible playbook, and get it to configure our image, installing all necessary applications and settings and then testing to ensure all is well.

Once the instance is configured Packer converts the instance into an image and deletes the remaining resources.
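As a rough sketch of what that provisioner call can look like, the fragment below runs a playbook and then generalises a Linux VM ready for capture. The source name and playbook path are hypothetical, and it assumes Ansible is installed on the machine running Packer and that the Ansible plugin is listed under required_plugins.

```hcl
# Fragment of a Packer template. Assumes a source named "azure-arm.ubuntu_base"
# is defined elsewhere and that ./playbooks/base.yml exists (both hypothetical).
build {
  sources = ["source.azure-arm.ubuntu_base"]

  # Packer connects to the temporary instance and runs the playbook against it
  # from the machine running `packer build` (the Ansible control node).
  provisioner "ansible" {
    playbook_file = "./playbooks/base.yml"
  }

  # Generalise the Linux VM before Packer captures it, following the
  # deprovision step shown in the Packer Azure builder documentation.
  provisioner "shell" {
    execute_command = "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'"
    inline          = ["/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"]
    inline_shebang  = "/bin/sh -x"
  }
}
```

Provisioners run in the order they are written, so any testing tasks belong in the playbook (or in a provisioner placed before the deprovision step), never after it.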

There you have it, our first Golden Image all ready for use! However, this process has been very hands-on up to this point. Even though Packer has automated the creation of our image and Ansible has managed the configuration of that image, we’ve still only managed a single image. So let’s get this whole process orchestrated in a pipeline so that we can kick back and let the system handle the creation of images itself.

Now let’s make a baker’s dozen!

Most organisations will already have chosen the pipeline and repository tools they are aligning to, and a lot of these tools are tied together these days, hence lumping them into the same pan. It’d be foolish of me to assume my mere article is going to change those strategic decisions, so again I’ll not spend too much time on which specific tool is best and will focus instead on why they are important.

Starting with a repository for our bakery: we’ll need somewhere to store our Packer code, pipeline configuration, Ansible configuration and any other required IaC, so having a central place to put it all is a good idea. Having version control keep track of our configuration changes is also helpful over the lifecycle of an image.

Some decent code repository tools are GitHub, GitLab, Bitbucket and Azure DevOps, but some orgs will be aligned to older technologies such as Team Foundation Server (TFS) or Subversion (SVN), or one of so many other options! What you choose doesn’t really matter as long as we have somewhere to put the bakery code.

These days a lot of repositories come with pipeline tools included (GitHub Actions, Azure DevOps Pipelines and GitLab CI/CD, to name a few), but there are also specialised tools such as Octopus Deploy and Jenkins. Again, the choice of tool is fairly inconsequential here: they are all capable of supporting a set of bakery pipelines.

Now for the actual process that we will be putting into the pipeline tool, we’ll be setting up a process that aligns to ‘The Baking Process’ as outlined earlier.

  1. A base image is selected — Parameters / Triggered selection of images
  2. A VM or container instance of the selected base image is created — Packer Build
  3. The instance is configured to the required level — Ansible run from Packer
  4. The instance is validated against the configuration — Ansible run from Packer
  5. The Instance state is saved and turned into an image — Packer
  6. The image is then made available to be used on VMs or containers in future — Packer
  7. Clean up resources used for process — Packer

What this process boils down to is:

  • Image Identification — which images are you going to build in this specific run
  • Run Packer build — Run Packer build against all selected images

It’s really that simple. Packer does an amazing job of managing the temporary build instance, launching our configuration scripts and tidying up after itself.

Now the complexity around the bakery comes from trying to do the following:

  • Managing Child / Parent image dependencies, especially when trying to build in parallel
  • Checking whether the version about to be built is already available
  • Adding triggers / automating builds when patches or changes are deployed
  • Actually getting all the configuration you require into your images so they can be successfully deployed

It wouldn’t be much fun if I solved all these problems for you now, would it 😉

But that’s our basic pipeline! All the ingredients have come together and our bakery has taken shape.

A little burnt around the edges

So, on to my experience of creating an image bakery in Azure for the first time, and the lessons I learnt along the way, in the hope that you can avoid the same pitfalls.

We were aiming to build a Proof of Concept (PoC) Azure bakery to get a better understanding of the specific cloud provider eccentricities and pull together a repository that could be used as an example to build upon in future.

For our bakery we wanted the Azure resources to be more long-lived and reused multiple times, as if they were deployed into a customer estate. So we paired our Packer configuration with resources built using Terraform, referencing the resource IDs of the network and resource groups to tell Packer where its resources should be deployed.

Unsurprisingly, given that this was my first image bakery, some of the challenges came from a lack of initial understanding. But I also found that the usage of Packer in Azure is not widely documented on the web, so finding help was a challenge at times. This is one of the main reasons I decided to write this blog: to help others avoid the confusion I hit in building this bakery.

When building an image bakery in Azure we needed to set up a few resources first:

  • Azure Compute Gallery (previously known as Shared Image Gallery)
  • Image definitions

Azure Compute Gallery and Azure Image Definitions

The Azure Compute Gallery is a resource that helps you build structure and organisation around your images and applications. It helps you manage security and access around the images, global replication of images and versioning and grouping of images.

In our bakery we set up the gallery as the storage for our custom images. There are other options for image storage, but the gallery’s features around access control suited it towards the organisational design our PoC was aligning with.
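For illustration, a gallery like this can be declared in Terraform along the following lines. This is a sketch rather than our actual PoC code: the resource names, region and provider version constraint are all assumptions.

```hcl
# Terraform sketch: long-lived bakery resources. Names and region are placeholders.
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Resource group that holds the gallery and its images
resource "azurerm_resource_group" "bakery" {
  name     = "rg-bakery-example"
  location = "UK South"
}

# The Azure Compute Gallery itself (gallery names cannot contain hyphens)
resource "azurerm_shared_image_gallery" "bakery" {
  name                = "bakery_gallery_example"
  resource_group_name = azurerm_resource_group.bakery.name
  location            = azurerm_resource_group.bakery.location
  description         = "Custom images produced by the image bakery"
}

# Expose the name so pipeline steps can pass it on to Packer
output "gallery_name" {
  value = azurerm_shared_image_gallery.bakery.name
}
```

Keeping the gallery in Terraform and out of the Packer run means it survives between bakes, which fits the long-lived approach described above.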

We then needed to create image definitions per image being deployed.

“Image definitions are a logical grouping for versions of an image. The image definition holds information about why the image was created and also contains Image metadata such as, what OS it is for, features it supports and other information about using the image”

This is where my problems started. Anyone who has worked in Azure will be familiar with the fact that VMs need to be provided with an OS_SKU, Publisher and Offer to select the operating system to install.

I believed when we set up the image definitions in Terraform that the OS_SKU, Publisher and Offer needed to align to the Base operating system we were using for creating the first custom images.

identifier {
  publisher = "Canonical"
  offer     = "UbuntuServer"
  sku       = "18.04-LTS"
}

This was making it difficult to create child images because the compute gallery will only allow image definitions to share one or two of the values for OS_SKU, Publisher and Offer, but not all three values. (But a child image is based off of the same original OS ?! — Cue confusion)

This was incorrect.

The OS_SKU, Publisher and Offer defined in your image definition are YOUR values.

So you can have the Publisher be your company name, the Offer be the team and the OS SKU be the Image Type, or whatever values you like!

identifier {
  publisher = "ContinoBakery"
  offer     = "Base_Ubuntu"
  sku       = "18_04_secure_base"
}

You actually define what the image is created from in the Packer values, and as long as the image definition is aligned on the overall type of OS (Windows / Linux) then it can be linked to the image.

Maybe it was just me that got confused, but it wasn’t clear to me when I read through the documentation, so here it is, plain and clear: use your own values, just make sure you align the naming strategy with everyone else building images in your gallery!
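To show how that plays out for a base image and a child image, here is a hedged Terraform sketch of two image definitions. The child definition ("Web_Ubuntu"), the gallery name and the resource group are hypothetical; the point is only that each definition gets its own identifier values while sharing the same OS type.

```hcl
# Terraform sketch: two image definitions in the same gallery.
# Gallery, resource group and the child image names are illustrative assumptions.
resource "azurerm_shared_image" "ubuntu_base" {
  name                = "Base_Ubuntu"
  gallery_name        = "bakery_gallery_example"
  resource_group_name = "rg-bakery-example"
  location            = "UK South"
  os_type             = "Linux"

  # Your own values, not the marketplace image's values
  identifier {
    publisher = "ContinoBakery"
    offer     = "Base_Ubuntu"
    sku       = "18_04_secure_base"
  }
}

resource "azurerm_shared_image" "ubuntu_web" {
  name                = "Web_Ubuntu"
  gallery_name        = "bakery_gallery_example"
  resource_group_name = "rg-bakery-example"
  location            = "UK South"
  os_type             = "Linux" # same OS type as the parent image

  # Shares the publisher with the base definition but differs in offer and sku,
  # so the combination of all three values stays unique within the gallery.
  identifier {
    publisher = "ContinoBakery"
    offer     = "Web_Ubuntu"
    sku       = "18_04_web"
  }
}
```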

It’s a New Generation

On to the next issue. After Packer had built the temporary instances in Azure, it was occasionally hitting issues when saving the images. It wasn’t occurring on all of our images, but a couple of them did not want to be created. It was odd, as the VMs spun up quite happily, so their parent image was obviously available; however, Packer did not want to save the image.

It turns out we had misaligned our VM generations between the values being sent to Packer for the build and those defined in our image definitions.

If you’re unaware of the differences between Hypervisor generations, Microsoft has a good article discussing them and the VM families on which each generation is available.

Another one to remember for the future: you cannot create a generation 2 VM and then try to save it in a generation 1 image definition. And of course, if your estate relies on VM sizes that are specific to a generation, then you’ll need to think about creating generation-specific images.
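As a sketch of keeping the two sides aligned, the Packer source fragment below starts from a Generation 2 platform image and publishes into a gallery image definition. It assumes that definition (here the hypothetical "Base_Ubuntu_Gen2") was declared in Terraform with hyper_v_generation = "V2"; the other names, the region and the version number are placeholders too.

```hcl
# Packer source fragment (a build block is still needed to use it).
variable "subscription_id" {
  type = string
}

source "azure-arm" "ubuntu_base_gen2" {
  use_azure_cli_auth = true # assumes `az login` has already been run

  os_type         = "Linux"
  image_publisher = "Canonical"
  image_offer     = "UbuntuServer"
  image_sku       = "18_04-lts-gen2" # Generation 2 variant of the platform image

  location = "UK South"
  vm_size  = "Standard_DS2_v2" # must be a size that supports Generation 2 VMs

  # Intermediate managed image; older plugin versions require it when publishing to a gallery
  managed_image_resource_group_name = "rg-bakery-example"
  managed_image_name                = "ubuntu-1804-base-gen2"

  shared_image_gallery_destination {
    subscription        = var.subscription_id
    resource_group      = "rg-bakery-example"
    gallery_name        = "bakery_gallery_example"
    image_name          = "Base_Ubuntu_Gen2" # definition must also be Generation 2
    image_version       = "1.0.0"
    replication_regions = ["uksouth"]
  }
}
```

If the image_sku pointed at a Generation 1 image while the definition stayed V2 (or the other way round), this is the kind of mismatch that would let the VM build but then fail at the point of saving the image.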

WinRM Connectivity

The last little burnt crust on our otherwise pristine cake: configuring Windows systems… Configuring our custom Linux images was fairly simple, as SSH connectivity allowed remote commands to be run against the instances out of the box with minimal additional configuration.

Our issue arose when it was time to work on the Windows VMs: allowing access to WinRM on the instance so that it could be configured remotely using the configuration scripts run from the machine running Packer.

We needed to enable WinRM straight after the VM was created, allowing connections in to configure the server. Looking at code online, we found Packer examples for AWS that used a feature called user_data_file, where you can provide a script to run on the instance at creation time. That’s perfect!

So we found a script that would set up the required WinRM settings and added it as a user_data_file, assuming it would be run on creation… This is not the case in Azure. Just passing a user_data_file doesn’t work: the VM still wouldn’t allow connections.

So after some investigation we found the problem. We needed to pass both a user_data_file containing the WinRM settings along with a custom_script command to run the user_data_file on the instance.

user_data_file = "winrm_bootstrap.ps1"
custom_script = "powershell -ExecutionPolicy Unrestricted -NoProfile -NonInteractive -Command \"$userData = (Invoke-RestMethod -Headers @{Metadata=$true} -Method GET -Uri http://169.254.169.254/metadata/instance/compute/userData?api-version=2021-01-01$([char]38)format=text); $contents = [System.Text.Encoding]::UTF8.GetString([Convert]::FromBase64String($userData)); set-content -path c:\\Windows\\Temp\\userdata.ps1 -value $contents; . c:\\Windows\\Temp\\userdata.ps1;\""

And we can connect! Huzzah!
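For completeness, these are the kind of communicator settings that would typically sit alongside those two lines in the same azure-arm source block. The values are assumptions borrowed from common Packer Windows examples rather than our exact configuration, and they need to match whatever your bootstrap script actually sets up (for example, whether it enables HTTPS).

```hcl
# Attributes inside the same azure-arm source block as user_data_file and custom_script.
# Values are illustrative; align them with your own WinRM bootstrap script.
os_type        = "Windows"
communicator   = "winrm"
winrm_use_ssl  = true     # connect over HTTPS
winrm_insecure = true     # accept the self-signed certificate on the temporary VM
winrm_timeout  = "5m"     # how long Packer waits for WinRM to become available
winrm_username = "packer"
```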

Again, this is maybe something others are already aware of, but it’s one of those potholes that tripped me up and that, now you know about it, you can avoid more easily.

Finishing Touch

So there we go, the cakes are ready and it’s time to review what we’ve baked.

We’ve discussed Image Bakeries and how creating images can help to speed up the deployment of applications and help to standardise Operating systems across an estate, looking at the process and why you might want inheritance based structures across your images.

Searching through our utensil shelves, we found the tooling that you might use to create your bakery: industry-leading tools like Packer and Ansible, and pipeline tools like GitHub or Azure DevOps, amongst many others. There are many ways to make a cake. If you wanted a Whisk but your organisation could only provide a Fork, you’ll still be able to make a half-decent bakery; it may just be more work to get it started.

And finally I’ve told you about my burnt edges, soggy bottoms and downright disgusting tales of woe from when I tried to do all of this myself, and the issues I hit along the way, so that you don’t struggle as I did.

I hope this blog has been helpful. If you are interested in more detailed personalised baking expertise then feel free to reach out to me on LinkedIn or to our Contino Team for more assistance.

But for now — Thanks for reading, and happy baking!
