GIB — The Golden Image Builder

Riaan Nolan
8 min read · Nov 11, 2023

In this post I will help you go from Zero to Hero building Golden Images, also known as Standard Operating Environments (SOEs). The topic is complex by nature, since it spans multiple operating systems, multiple clouds and multiple technologies. But if you walk through it with me, you will come out the other side a Top Gun Engineer, able to build any operating system on any cloud with confidence. If you get stuck, please reach out to me on my LinkedIn profile https://www.linkedin.com/in/riaannolan/ and I will help you, just like others have helped me before.

Ok, let’s dive in.

The common theme across many DevOps / Infrastructure as Code / Configuration as Code / Automation projects is versioning.

We want to create a versioned artefact that we can deploy, and our Infrastructure or Golden images are no different.

Enter versioned artefacts.

Simply put, we want to move and develop fast while keeping versioned checkpoints (artefacts) that we can deploy, test and, if necessary, roll back or roll forward.

Golden images / Standard Operating Environments (SOEs) are much the same. Gone are the days of manually maintaining them.

TL;DR The code can be found here:

https://github.com/star3am/golden-image-builder

GIB me Hooman!!! GIB is a good Doggo ❤

I’d like to introduce you to my take on this whole Golden Image thing: building versioned Standard Operating Environments (SOEs).

But before I delve into it, there is a concept I’d like to run past you.

  • Day 0: Building an image (SOE). This image is not “live” or “running” yet.
  • Day 1: The image is now booted into a VM / container. It is now live and needs to register with endpoints, e.g. monitoring, domain join, or OU transition for Group Policies to take effect.
  • Day 2 onwards: Something needs to change in this running VM or container, e.g. Yum, Apt or Windows updates, configuration changes, certificate rotations, anything at all.

Candidates for this project are: build agent VMs, Point of Sale (POS) systems, Kubernetes worker nodes, VMs that run workloads, really any VM or container that has to be spun up hundreds or thousands of times.

In addition to this Day 0, Day 1 and Day 2 dilemma, we need to consider complexity. The more complex something is, the more error-prone it becomes and the harder it is to manage.

For that reason, we want to keep our development toolchain standardised and settle on the minimum number of languages. I’ve picked:

  • HCL: HashiCorp Configuration Language (with HCL you can write Terraform modules and Packer templates)
  • YAML: YAML Ain’t Markup Language, originally Yet Another Markup Language (with YAML you can write automation pipelines for GitHub Actions, GitLab, Azure DevOps, CircleCI etc., as well as Ansible roles and playbooks)

Enter Automation Pipelines + Packer + Ansible + Terraform + AWX Ansible Tower.

Build Red Hat 7.9 and 8.3, Windows 2016, 2019 and 2022, and Ubuntu 18.04, 20.04 and 22.04 on Azure, GCP and AWS at once.

Packer, Ansible, Terraform — This Trio of Technology can configure and build any Operating System, on any Cloud, and can be used locally as a Top Gun Development Environment! (See my post about Top Gun Terraform Development Environments)

So what is HashiCorp Packer?

Packer is an open source tool for creating identical machine images for multiple platforms from a single source configuration. Packer is lightweight, runs on every major operating system, and is highly performant, creating machine images for multiple platforms in parallel. Packer does not replace configuration management like Chef or Puppet. In fact, when building images, Packer is able to use tools like Chef or Puppet to install software onto the image.

A machine image is a single static unit that contains a pre-configured operating system and installed software which is used to quickly create new running machines. Machine image formats change for each platform. Some examples include AMIs for EC2, VMDK/VMX files for VMware, OVF exports for VirtualBox, etc.

What is Red Hat Ansible?

Red Hat® Ansible® Automation Platform is an end-to-end automation platform to configure systems, deploy software, and orchestrate advanced workflows.

What is HashiCorp Terraform?

Terraform is an infrastructure-as-code software tool created by HashiCorp. Users define and provide data center infrastructure using a declarative configuration language known as HashiCorp Configuration Language.

Going back to our Day 0, Day 1 and Day 2 concept, let’s explain where we use what tool.

  • Day 0 — Automation Pipelines + Packer + Ansible
  • Day 1 — Automation Pipelines + Terraform + Ansible
  • Day 2 onwards — Automation Pipelines + Terraform + AWX Ansible Tower

For auditing, compliance and visibility I am using Terraform Cloud and AWX Ansible Tower. Let’s quickly address what those tools are.

What is Terraform Cloud?

Terraform Cloud is a web interface for Terraform. It has great features such as Single Sign-On, remote AND ENCRYPTED state storage, Workspaces, and Explorer (to see which workspaces failed and which versions of modules the workspaces are using), and MUCH, MUCH more.

What is AWX Ansible Tower?

AWX Ansible Tower is a web interface for Ansible. It has great features such as Single Sign-On, a web view into your Ansible jobs, Smart Inventories, Credentials, Projects and Secret Engines.

Ok! So enough of the theory! Let’s Do this!

1. We use VS Code Dev Containers. This means you don’t need to install Ansible, Packer and Terraform yourself; they are installed via a Dockerfile, and VS Code drops you into that container.

Inside the DevContainer of our project

2. We use GitHub Actions for our CI/CD automation pipeline, and you can see the whole pipeline file here: https://github.com/star3am/golden-image-builder/blob/master/.github/workflows/pipeline.yml

Our GitHub Actions Pipeline

3. We use the same Dockerfile for our Build Agent and Developer Environment, meaning, we get consistent results every time. You can view the Dockerfile here: https://github.com/star3am/golden-image-builder/blob/master/Dockerfile

We use an EVERYTHING in CODE mentality.

4. Our Packer Templates can build a Golden Image / Standard Operating Environment (SOE) on any popular cloud, and you can see my Ubuntu 22.04 Packer Template here:
https://github.com/star3am/golden-image-builder/blob/master/packer/linux/ubuntu/ubuntu-2204.pkr.hcl
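
If you have not seen a Packer template before, here is a minimal sketch of the general shape for AWS. It is not a copy of the template linked above. It assumes the Amazon and Ansible Packer plugins, and the variable `image_version`, the AMI name prefix `golden-ubuntu-2204`, the region, the instance type and the playbook path are all hypothetical placeholders.

```hcl
packer {
  required_plugins {
    amazon = {
      source  = "github.com/hashicorp/amazon"
      version = ">= 1.0.0"
    }
    ansible = {
      source  = "github.com/hashicorp/ansible"
      version = ">= 1.0.0"
    }
  }
}

# Hypothetical version input, e.g. set by the pipeline with -var.
variable "image_version" {
  type    = string
  default = "0.0.1"
}

locals {
  # Build stamp in the form 202311100026 (AMI names cannot contain colons).
  build_id = formatdate("YYYYMMDDhhmm", timestamp())
}

source "amazon-ebs" "ubuntu_2204" {
  region        = "ap-southeast-2" # assumption
  instance_type = "t3.small"       # assumption
  ssh_username  = "ubuntu"
  ami_name      = "golden-ubuntu-2204-${var.image_version}-${local.build_id}"

  # Start from the latest official Ubuntu 22.04 image published by Canonical.
  source_ami_filter {
    filters = {
      name                = "ubuntu/images/hvm-ssd/ubuntu-jammy-22.04-amd64-server-*"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    most_recent = true
    owners      = ["099720109477"] # Canonical
  }

  # Tags make the version visible on the resulting AMI.
  tags = {
    Version = var.image_version
    BuildId = local.build_id
  }
}

build {
  sources = ["source.amazon-ebs.ubuntu_2204"]

  # Day 0 configuration is applied by Ansible while the image is being built.
  provisioner "ansible" {
    playbook_file = "ansible/playbook.yml" # hypothetical path
  }
}
```

Putting the version and a build stamp into the AMI name and tags is one simple way to get the versioned artefact described earlier.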

5. Our Ansible Roles are “pulled down” or “installed” with Ansible Galaxy

- name: Ansible Galaxy install roles
  run: ansible-galaxy install -f -r ansible/roles/requirements.yml -p ansible/roles/

and the requirements.yml looks like

---
- src: 'https://github.com/ansible-lockdown/RHEL8-CIS'
  version: '2.5.1'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/RHEL7-CIS'
  version: '1.2.3'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/UBUNTU22-CIS'
  version: '1.2.0'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/UBUNTU20-CIS'
  version: '2.1.1'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/UBUNTU18-CIS'
  version: '1.4.0'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/Windows-2016-CIS'
  version: '1.2.1'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/Windows-2019-CIS'
  version: '1.3.0'
  scm: 'git'

- src: 'https://github.com/ansible-lockdown/Windows-2022-CIS'
  version: '1.0.0'
  scm: 'git'

- src: 'https://github.com/star3am/ansible-role-win_openssh'
  version: 'ssh-playbook-test'
  scm: 'git'

- src: 'https://github.com/star3am/ansible-role-example-role'
  version: 'master'
  scm: 'git'

6. My ansible-role-example-role supports Windows, Yum-based and Debian-based systems. It basically just adds an ansible user to the system and writes a “Fingerprint” to the image, meaning that running VMs can be traced back to the original build and versions.

ubuntu@hashiqube-aws:~$ cat /build-202311100026.json | jq
{
  "inventory_file": "/tmp/packer-provisioner-ansible4259354685",
  "inventory_dir": "/tmp",
  "ansible_host": "127.0.0.1",
  "ansible_user": "ubuntu",
  "ansible_port": 41081,
  "inventory_hostname": "none",
  "inventory_hostname_short": "none",
  "group_names": [
    "ungrouped"
  ],
  "ansible_facts": {
    "system": "Linux",
    "kernel": "6.2.0-1015-aws",
    "kernel_version": "#15~22.04.1-Ubuntu SMP Fri Oct 6 21:37:24 UTC 2023",
    "machine": "x86_64",
    "python_version": "3.10.12",
    "fqdn": "ip-172-31-5-254.ap-southeast-2.compute.internal",
    "hostname": "ip-172-31-5-254",
    "nodename": "ip-172-31-5-254",
    "domain": "ap-southeast-2.compute.internal",
    "userspace_bits": "64",
    "architecture": "x86_64",
    "userspace_architecture": "x86_64",
    "machine_id": "6f039c13c0ac4848a55d23f834de86b8",
    "is_chroot": false,
    "virtualization_type": "xen",
    "virtualization_role": "guest",
    "virtualization_tech_guest": [
      "xen"
    ],
    "virtualization_tech_host": [],
    "distribution": "Ubuntu",
    "distribution_release": "jammy",
    "distribution_version": "22.04",
    "distribution_major_version": "22",
    "distribution_file_path": "/etc/os-release",
    "distribution_file_variety": "Debian",
    "distribution_file_parsed": true,
    "os_family": "Debian",
    ....

7. Once the image has been built, I use Terraform and HashiQube’s multi-cloud Terraform module (https://hashiqube.com/ and https://registry.terraform.io/modules/star3am/hashiqube/hashicorp/latest) to spin up HashiQube on this new base AMI (our Golden Image). I do this for a few reasons:

module "awx_ansible_tower" {
  source                     = "star3am/hashiqube/hashicorp"
  deploy_to_aws              = true
  aws_instance_type          = "t2.large"
  use_packer_image           = var.use_packer_image
  deploy_to_azure            = false
  deploy_to_gcp              = false
  debug_user_data            = true
  ssh_public_key             = var.ssh_public_key
  ssh_private_key            = var.ssh_private_key
  debug_allow_ssh_cidr_range = "0.0.0.0/0"
  whitelist_cidr             = "101.189.198.17/32"
  vagrant_provisioners       = "basetools,docker,minikube,ansible-tower"
}
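
For readers wiring a freshly built image into their own Terraform code on AWS, one common approach is the `aws_ami` data source. This is a generic sketch, not taken from the HashiQube module, and the name pattern `golden-ubuntu-2204-*` is a hypothetical naming convention.

```hcl
# Look up the newest AMI in our own account whose name matches the pattern.
data "aws_ami" "golden" {
  most_recent = true
  owners      = ["self"]

  filter {
    name   = "name"
    values = ["golden-ubuntu-2204-*"] # hypothetical naming convention
  }
}

# Expose the resolved image ID, e.g. to pass into an instance or module.
output "golden_ami_id" {
  value = data.aws_ami.golden.id
}
```

Pinning `values` to one exact AMI name instead of a wildcard would give a reproducible deployment of a specific image version, which also makes rolling back a matter of changing a single string.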

8. I use Terraform Cloud for my remote Terraform runs. It’s free and blazing fast! https://app.terraform.io

Terraform Cloud Workspace apply

9. I use AWX Ansible Tower to do my Ansible runs. You can log into the AWX Ansible Tower web interface at http://Instance-IP:8043 with the username admin and the password that is output by the Terraform run.

Your AWX Ansible Tower username and Password

And once we log in, we can see our AWX Ansible Tower Dashboard and our successful job run, which was triggered by Terraform

AWX Ansible Tower Job Output

Example Terraform code to trigger an Ansible run via a null_resource

variable "tower_cli_remote" {
  type    = string
  default = "~/.local/bin/awx"
}

variable "tower_cli_local" {
  type    = string
  default = "/Users/riaannolan/bin/awx"
}

variable "tower_host" {
  type    = string
  default = "https://10.9.99.10:8043/"
}

# Log in with the awx CLI; the token in its JSON output becomes available to Terraform.
data "external" "tower_token" {
  program = ["/bin/bash", "-c", "${var.tower_cli_local} login --conf.host ${var.tower_host} --conf.insecure --conf.username admin --conf.password \"password\""]
}

locals {
  timestamp = timestamp()
}

# The timestamp trigger changes on every run, so the job is launched on every apply.
resource "null_resource" "awx_cli" {
  triggers = {
    timestamp = local.timestamp
  }

  # Launch job template 9 from the remote host over SSH.
  provisioner "remote-exec" {
    inline = [
      "${var.tower_cli_remote} --conf.host ${var.tower_host} -f human job_templates launch 9 --monitor --filter status --conf.insecure --conf.token ${data.external.tower_token.result.token}",
    ]

    connection {
      type     = "ssh"
      user     = "vagrant"
      password = "vagrant"
      host     = "10.9.99.10"
    }
  }

  # Launch the same job template from the local machine.
  provisioner "local-exec" {
    command = "${var.tower_cli_local} --conf.host ${var.tower_host} -f human job_templates launch 9 --monitor --filter status --conf.insecure --conf.token ${data.external.tower_token.result.token}"
  }
}
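
The example above hardcodes the admin password for brevity. Outside a demo you would probably want to keep it out of the code. Here is a hedged sketch of one way to do that with a sensitive input variable. The variable name `tower_password` is hypothetical, and this `data` block would replace the `tower_token` one above, not sit alongside it.

```hcl
# Hypothetical variable; supply it via the TF_VAR_tower_password
# environment variable or a Terraform Cloud workspace variable.
variable "tower_password" {
  type      = string
  sensitive = true
}

# Same login call as above, with the password taken from the variable.
data "external" "tower_token" {
  program = ["/bin/bash", "-c", "${var.tower_cli_local} login --conf.host ${var.tower_host} --conf.insecure --conf.username admin --conf.password \"${var.tower_password}\""]
}
```

Marking the variable as sensitive hides it in plan and apply output. The returned token is still stored in Terraform state, so encrypted remote state remains important.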

10. You can now build versioned Golden Images / Standard Operating Environments (SOEs), ALL in CODE!!!

Thank you for taking the time to go through this how-to. I hope that you have enjoyed it as much as I enjoyed putting it together.

You are welcome to connect with me on LinkedIn https://www.linkedin.com/in/riaannolan/
Credly profile: https://www.credly.com/users/riaan-nolan.e657145c
