Update GCP Managed Instance Groups using Cloud Asset Feeds

How to use Cloud Asset Feeds together with Pub/Sub and Cloud Functions to trigger the rebuild of Managed Instance Group VMs as soon as a new Golden Image is available.

McKinsey Digital
McKinsey Digital Insights
Oct 21, 2022


by Marco Marulli — Principal Lead II, Cloud Delivery, McKinsey & Company

This article is part of a series that focuses on VM-based compute solutions in GCP. In particular, the series describes how to build and deploy Golden Images and how to update Managed Instance Group VMs as new Golden Images become available.

What are Cloud Asset Feeds?

As your organization deploys resources to GCP, it would be nice to have a tool that allows your teams (e.g., IT ops, Compliance, Security, etc.) to view, monitor, and understand all these assets across projects and services. Luckily, this tool exists, and it’s called Cloud Asset Inventory. One of the tool’s central features is monitoring asset changes and notifying interested subscribers through Cloud Asset Feeds.

Landing Zone Structure

The organization’s Landing Zone has the following characteristics:

  • Hosts multiple applications, and each application has one GCP project per lifecycle environment.
  • Uses Shared VPC for foundational networking components.
  • Has a GCP “Golden Images Project” reserved to host Golden Images.

These are some details related to the Golden Images process in place:

  • A dedicated Golden Images Team (GIT) builds, tests, and deploys Golden Images to the Golden Images Project.
  • Application teams are responsible for their Projects’ compliance with the latest published Golden Images.
  • No cross-project notification mechanisms are currently available.
  • The application teams do not have a scheduling tool that would let them rebuild the Managed Instance Groups’ VMs at regular intervals.

In the first article of the series, we looked at how to build and deploy Golden Images to GCP using Packer and Cloud Build.

We concluded Part 1 with the Golden Images team publishing the first NGINX Golden Image and AT1, the Application Team for Application 1, using that Golden Image to build an Instance Template and then a MIG (Managed Instance Group).

Because new NGINX Golden Images will be released regularly, the AT1 team would like to find a way to update its NGINX MIG VMs automatically.

How can we automatically update the MIG Virtual Machines when a new version of the Golden Image is released?

The diagram below shows one way to do it using Cloud Asset Feeds and is followed by a four-step explanation of exactly how this works:

  1. Whenever a GCE Image is created in the Golden Images Project, Cloud Asset Inventory detects the change and triggers the Cloud Asset Feed.
  2. The Pub/Sub Topic in the Application Project App1, which is the destination of the Asset Feed, receives a notification.
  3. Pub/Sub triggers a Cloud Function that reads the notification’s content. If the notification is related to a new NGINX Image, the function creates a Cloud Scheduler Job so that the VMs can be rebuilt at a time we consider appropriate.
  4. The Cloud Scheduler Job runs and triggers Cloud Build, which will execute the Terraform code that rebuilds the MIG’s VMs.

Now that we have the GCP services and the process steps identified, it’s time to start the build.

Activities in the Golden Images Project

Here we only need to enable the Cloud Asset Inventory API (cloudasset.googleapis.com) and create the Cloud Asset Feed, for which we can use the Terraform code below:

Terraform code to create a Cloud Asset Feed for image creation events

After we create a Service Account for the Cloud Asset Inventory service, we grant it the Pub/Sub publisher role so that it can write to the Pub/Sub topic in the Application Project that needs to be notified of compute image creation events.

Then, the google_cloud_asset_project_feed Terraform resource defines our Cloud Asset Feed:

  • asset_types = ["compute.googleapis.com/Image"]: feed only Compute Engine Image events.
  • feed_output_config = pubsub_destination { topic … }: set the Pub/Sub topic in the Application GCP Project as the target of the feed.
  • condition = !temporal_asset.deleted && temporal_asset.prior_asset_state == google.cloud.asset.v1.TemporalAsset.PriorAssetState.DOES_NOT_EXIST: feed only creation events, that is, assets that are not deleted and did not exist before the change. Common Expression Language (CEL) is the expression language used to specify a feed condition for one or more Temporal Assets.
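As an illustration of the same settings outside Terraform, here is a minimal Python sketch that builds two plain dictionaries: the IAM binding granting the Pub/Sub publisher role, and a request body shaped like the one accepted by the Cloud Asset API `feeds.create` method. It is not the article's Terraform code. The feed ID, project IDs, project number, and topic name are hypothetical, and `contentType: RESOURCE` is an assumption made so that the notification carries the image's metadata.

```python
import json

# CEL condition from the list above: keep only creation events.
CREATION_ONLY = (
    "!temporal_asset.deleted && "
    "temporal_asset.prior_asset_state == "
    "google.cloud.asset.v1.TemporalAsset.PriorAssetState.DOES_NOT_EXIST"
)


def asset_service_agent(project_number: str) -> str:
    """IAM member string of the Cloud Asset Inventory service agent.

    project_number is the number of the project that owns the feed
    (the Golden Images Project in this article).
    """
    return (
        f"serviceAccount:service-{project_number}"
        "@gcp-sa-cloudasset.iam.gserviceaccount.com"
    )


def publisher_binding(project_number: str) -> dict:
    """IAM binding that lets the service agent publish to the topic."""
    return {
        "role": "roles/pubsub.publisher",
        "members": [asset_service_agent(project_number)],
    }


def feed_request(feed_id: str, images_project: str, topic: str) -> dict:
    """Request body for creating a feed under projects/<images_project>."""
    return {
        "feedId": feed_id,
        "feed": {
            "name": f"projects/{images_project}/feeds/{feed_id}",
            "assetTypes": ["compute.googleapis.com/Image"],
            # Assumption: RESOURCE content so the image family is included.
            "contentType": "RESOURCE",
            "feedOutputConfig": {"pubsubDestination": {"topic": topic}},
            "condition": {"expression": CREATION_ONLY},
        },
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    body = feed_request(
        "image-created", "golden-images", "projects/app1/topics/image-feed"
    )
    print(json.dumps(body, indent=2))
    print(json.dumps(publisher_binding("123456789012"), indent=2))
```

Note that the topic lives in the Application Project while the feed lives in the Golden Images Project, which is why the binding has to be granted on the topic in the Application Project.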

Activities in the Application 1 Project

Prerequisites

Make sure that the following APIs are enabled:

  • Pub/Sub: pubsub.googleapis.com
  • Cloud Functions: cloudfunctions.googleapis.com
  • Cloud Build: cloudbuild.googleapis.com

Cloud Function

Below is a sample Python Cloud Function that parses incoming Asset Feed notifications and creates a Cloud Scheduler Job if a notification is related to the creation of an NGINX Golden Image. Because the Pub/Sub topic receives creation events for every image that is created, the function has to filter them.
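A minimal sketch of the parsing and filtering logic, using only the standard library, could look like the following. It assumes a Pub/Sub-triggered function with the `(event, context)` signature, a feed configured to include the resource data, and an image family literally named `nginx`. The function names and the `schedule_rebuild` placeholder are hypothetical; the actual function would create the Cloud Scheduler Job at that point.

```python
import base64
import json

GOLDEN_FAMILY = "nginx"  # hypothetical image family name


def parse_feed_message(event: dict) -> dict:
    """Decode the Pub/Sub envelope and return the Asset Feed payload.

    Payloads that are not JSON objects are ignored (empty dict).
    """
    try:
        payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    except (KeyError, ValueError):
        return {}
    return payload if isinstance(payload, dict) else {}


def new_golden_image(message: dict, family: str = GOLDEN_FAMILY):
    """Return the image name if the message announces an image in `family`."""
    if message.get("deleted"):
        return None
    asset = message.get("asset", {})
    if asset.get("assetType") != "compute.googleapis.com/Image":
        return None
    data = asset.get("resource", {}).get("data", {})
    if data.get("family") != family:
        return None
    return data.get("name")


def schedule_rebuild(image_name: str) -> None:
    """Placeholder: the real function would create the Cloud Scheduler Job."""
    print(f"would schedule a MIG rebuild for image {image_name}")


def handle_asset_feed(event: dict, context=None) -> None:
    """Entry point for the Pub/Sub-triggered function."""
    image_name = new_golden_image(parse_feed_message(event))
    if image_name is not None:
        schedule_rebuild(image_name)


if __name__ == "__main__":
    # Hand-built sample notification, for illustration only.
    sample = {
        "asset": {
            "assetType": "compute.googleapis.com/Image",
            "resource": {"data": {"name": "nginx-v2", "family": "nginx"}},
        },
        "priorAssetState": "DOES_NOT_EXIST",
    }
    encoded = base64.b64encode(json.dumps(sample).encode("utf-8")).decode("ascii")
    handle_asset_feed({"data": encoded})
```

Keeping the filter in a pure function such as `new_golden_image` makes it easy to unit test without deploying anything.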

The Cloud Scheduler Job starts a Cloud Build Manual Trigger at a time that we specify. The trigger re-executes the Terraform code used to deploy all the resources in the project, which updates both the Instance Template and the Managed Instance Group. Because the Instance Template is set to use the latest Image in the image family, Terraform detects the new Image and forces the replacement of the Instance Template. The Managed Instance Group that uses the Instance Template is not replaced but updated in place: its VMs are updated one at a time, which preserves service continuity if the NGINX VMs are serving incoming web traffic.
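To make the hand-off from Cloud Scheduler to Cloud Build more concrete, the sketch below builds a job definition shaped like a Cloud Scheduler HTTP-target job that calls the Cloud Build `triggers.run` endpoint. All names (project, location, job ID, trigger ID, branch, service account) are hypothetical, and the article's own function may build the job differently.

```python
import base64
import json
from datetime import datetime, timezone


def one_off_cron(run_at: datetime) -> str:
    """Cron expression (minute hour day month weekday) for run_at.

    Cron has no year field, so the job would fire again a year later
    unless it is deleted after the rebuild.
    """
    return f"{run_at.minute} {run_at.hour} {run_at.day} {run_at.month} *"


def scheduler_job(project: str, location: str, job_id: str, trigger_id: str,
                  branch: str, sa_email: str, run_at: datetime) -> dict:
    """Job body that POSTs to the Cloud Build trigger's run endpoint."""
    # The HTTP body is carried base64-encoded in the job definition.
    body = json.dumps({"branchName": branch}).encode("utf-8")
    return {
        "name": f"projects/{project}/locations/{location}/jobs/{job_id}",
        "schedule": one_off_cron(run_at),
        "timeZone": "Etc/UTC",
        "httpTarget": {
            "uri": (
                "https://cloudbuild.googleapis.com/v1/"
                f"projects/{project}/triggers/{trigger_id}:run"
            ),
            "httpMethod": "POST",
            "headers": {"Content-Type": "application/json"},
            "body": base64.b64encode(body).decode("ascii"),
            # Service account that Cloud Scheduler uses to call Cloud Build.
            "oauthToken": {"serviceAccountEmail": sa_email},
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    job = scheduler_job(
        project="app1",
        location="europe-west1",
        job_id="rebuild-nginx-mig",
        trigger_id="terraform-apply",
        branch="main",
        sa_email="scheduler@app1.iam.gserviceaccount.com",
        run_at=datetime(2022, 10, 22, 2, 30, tzinfo=timezone.utc),
    )
    print(json.dumps(job, indent=2))
```

Scheduling the rebuild for an off-peak hour, as in the usage example, is one way to apply the "time that we specify" idea described above.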

Before deploying the function, we generate the requirements.txt file with all our Python packages and run the “gcloud functions deploy” command from the terminal.

Time for a test

Now that all the necessary GCP resources have been deployed, it’s time for a test. If we go to the GCE Images menu, we see only one NGINX Image in the Golden Images Family. This is our initial situation:

Initial situation: only one NGINX Golden Image

We then run the image build pipeline as described in part one of this series, and we see a new Golden Image is added to the list of available GCE Images:

A new NGINX Golden Image is built and deployed

As expected, the function processed a Pub/Sub publish event related to the creation of a GCE Image that is part of the NGINX family. This means that if we go to Cloud Scheduler, we should see a job scheduled to start a Cloud Build Trigger, and here it is:

Cloud Scheduler Job added by the Cloud Function

As we do not want to wait for the scheduled time, we click the “run now” button to run the job. The status of the job is immediately set to “success” because Cloud Scheduler only reports whether the Cloud Build Manual Trigger was started, not the result of the build. In this case Cloud Build ran successfully, and on the Instance Template screen we can see the new Instance Template created from the latest deployed Golden Image.

The newly created Instance Template references the latest Golden Image in the image family

And finally, if we go to the Managed Instance Group, we see that it has been updated with the latest Instance Template:

The Managed Instance Group references the latest Instance Template

Conclusion

This article described how to use Cloud Asset Feeds to update Managed Instance Group’s VMs as new Golden Images are released. However, it’s important to remember that:

  • Automatically rebuilding VMs is not always advisable, especially if an application team has no way to automate the software installation after the VMs are updated. In the case above, automation was possible because it only required installing NGINX and copying its configuration file.
  • If rebuilding VMs automatically is not an option, alternatives include using GCP VM Manager or third-party tools like Ansible.
  • If you use a single Terraform code base to deploy your entire project and possibly hundreds of resources, you may want to consider separating the Terraform state of the MIG. Because the Terraform plan will always automatically be applied, this will guarantee that only changes to the MIG are deployed.

Another topic worth discussing in the Golden Images journey is continuous monitoring. More specifically, two questions should be answered:

  • How do we prevent the deployment of VMs that use an image not among those approved by the organization?
  • How do we scan all the VMs running in the Landing Zone to identify those that are not Golden Images compliant?

These could be topics for upcoming articles, so stay tuned.
