Coordinating VM Clusters with Google Compute Engine’s Metadata Server

David Allen
Google Cloud - Community
5 min read · Feb 9, 2018

If you're new to Google's Compute Engine, one of the features you may have noticed is the ability to attach custom metadata to a running instance, via the edit dialog for an individual VM.

This article covers what you can do with this feature and why it's useful. We'll work through what's behind it, and then cover a handy coordination use case it enables: getting multiple VMs that participate in the same cluster to recognize each other at startup.

Background on the Metadata Server

Compute Engine’s documentation includes a section on storing and retrieving instance metadata, where you can see that:

Every instance stores its metadata on a metadata server. You can query this metadata server programmatically, from within the instance and from the Compute Engine API, for information about the instance, such as the instance’s host name, instance ID, startup and shutdown scripts, custom metadata, and service account information. (…) When you make a request to get information from the metadata server, your request and the subsequent metadata response never leaves the physical host running the virtual machine instance. Metadata information is also encrypted on the way to the virtual machine host, so you can be sure that your metadata information is always secure.

Suppose that in that dialog I set niftySetting to 5. Within the VM, we can fetch this value with plain curl. Remember to include the Metadata-Flavor header: it is required to fetch the value, and it makes the server return the result as application/text.

$ curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/attributes/niftySetting
5

In addition to being able to fetch custom metadata like this, a lot of the information you'd want to know about the VM is already provided for you. For example:
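
# A non-exhaustive sample of the standard built-in entries, queried from inside the VM:
$ curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/
$ curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/hostname
$ curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/id
$ curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/instance/zone

(The first request lists everything available under the instance branch; the rest fetch individual values such as the hostname, numeric instance ID, and zone.)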

Aside from clicking “edit” on a VM and adding key/values directly, you can also use the gcloud tool like so:

gcloud compute instances add-metadata my-instance \
--metadata niftySetting=5

As a second option, it’s also possible to define project-level metadata, which propagates down to all instances within the project, like so:

gcloud config set project my-project-id
gcloud compute project-info add-metadata --metadata projectSetting=42

If done in this way, all VMs will see that projectSetting=42.
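
From inside any VM in the project, that value is served from the project branch of the metadata server rather than the instance branch:

$ curl -s -H "Metadata-Flavor: Google" http://metadata.google.internal/computeMetadata/v1/project/attributes/projectSetting
42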

VM Coordination

A common use case for this metadata is passing configuration parameters to VMs from the outside as they start up. In the container world, it's more typical to pass environment variables into a container on startup. That doesn't work with VMs, but you can accomplish something similar by setting metadata on the VM and placing a simple bash script on it which, at startup, uses curl against the metadata server to fetch the value of those variables and does whatever configuration you need.
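
As a rough sketch of that pattern (the key name niftySetting and the startup.sh file here are illustrative, not part of any particular deployment):

# Create the instance with a custom value and a startup script
gcloud compute instances create my-instance \
  --metadata niftySetting=5 \
  --metadata-from-file startup-script=startup.sh

The startup script itself can then be as simple as:

#!/usr/bin/env bash
# startup.sh: fetch the custom value from the metadata server and use it for configuration
NIFTY=$(curl -sf -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/attributes/niftySetting)
echo "configuring the application with niftySetting=${NIFTY}"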

Neo4j Causal Clustering Example

In my case, I was working on deploying Neo4j Causal Cluster instances, which are simply 3 different VMs that participate in a highly available and fault-tolerant graph database cluster. When a Neo4j cluster first starts, though, the members need to be told about one another so that they can discover each other on the network and form the cluster. Normally you would do this by configuring a particular address or DNS name for each member; Neo4j takes this as part of an aptly named setting, causal_clustering.initial_discovery_members.
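
In a neo4j.conf file, that setting ends up looking something like this (the host names are illustrative; 5000 is Neo4j's default discovery port, and the same port appears in the template below):

causal_clustering.initial_discovery_members=core1:5000,core2:5000,core3:5000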

When deploying new VMs, though, it's tough to know ahead of time what those addresses should be, because they aren't allocated until the VMs start up.

Coordinating by Internal DNS

In order to deploy a cluster, we use a deployment manager template, which allows us to specify metadata for each VM. Because we are deploying 3 instances, we know what the instance names are in the template. The instance names in turn map to internal DNS names inside of Google Compute Engine, like so:

$ ping -c 1 node1
PING node1.c.my-gcp-project.internal (10.142.0.2) 56(84) bytes of data.
64 bytes from node1.c.my-gcp-project.internal (10.142.0.2): icmp_seq=1 ttl=64 time=0.019 ms

If you’re curious about how or why that piece works, see the internal DNS documentation.

Deployment Manager Template Metadata

Within the deployment manager template, then, a variable is set for each VM, which in the Jinja templates looks like this:

metadata:
  items:
  - key: causal_clustering_initial_discovery_members
    value: {{instanceName}}-1:5000,{{instanceName}}-2:5000,{{instanceName}}-3:5000

The VMs themselves are created by using the vm_multiple_instances.py module that Google provides as a sample, referenced from inside the Jinja templates.

On each VM, a shell script picks that variable up from the metadata server and plugs it into the standard Neo4j configuration file, ensuring that the cluster members communicate properly even without knowing their IPs or hostnames ahead of time. This knowledge is provided by the deployment manager templates at deploy time.
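
A minimal sketch of that script, assuming a Debian-style Neo4j install with its configuration at /etc/neo4j/neo4j.conf and the metadata key from the template above:

#!/usr/bin/env bash
# Sketch only: read the discovery members set at deploy time and append them to neo4j.conf
set -e
MEMBERS=$(curl -sf -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/attributes/causal_clustering_initial_discovery_members)
echo "causal_clustering.initial_discovery_members=${MEMBERS}" >> /etc/neo4j/neo4j.conf
systemctl restart neo4j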

This could have been done with a project-level setting too, but if you might deploy more than one set of inter-linked resources within a project, you'll prefer VM-level settings so that the different resource sets don't get confused with one another.

When Not to Use Instance Metadata

Probably the best use cases are dynamic configuration (as covered above) and static description of VMs (e.g. "this VM runs version 3.3.2 of Neo4j Enterprise").

For a variety of reasons, it's not a good idea to use metadata as a communication channel between VMs, for example by having multiple VMs set project metadata and query it. That approach is high in complexity and latency, and it basically boils down to using a set of global variables to coordinate state, which is a notorious source of problems in software.

The metadata server has a "wait for updates" feature that lets a VM be notified when a value changes, but it is easy to abuse. It depends on your use case, of course, but if you find yourself needing this functionality extensively, chances are good you'd be better off with a message queue or a similar component; it's best to avoid using metadata as a publish/subscribe channel.
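
For completeness, the blocking form of that feature is just a query parameter on an ordinary metadata request; the call below hangs until niftySetting changes:

$ curl -s -H "Metadata-Flavor: Google" "http://metadata.google.internal/computeMetadata/v1/instance/attributes/niftySetting?wait_for_change=true"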
