Google Cloud Deployment Manager

“The Missing Tutorials” series

Daz Wilkin
Google Cloud - Community
Sep 27, 2017


I decided to revisit an earlier post, “Deploying Docker Engine in swarm mode to GCP”, to try to take advantage of Google’s Runtime Config. I’d like to coordinate the generation of multiple workers by having them obtain the swarm join-token from a Runtime Config variable.

This gives me the opportunity to expand my knowledge of Deployment Manager as I’ll need to use the Service Management service to enable APIs, IAM to generate a Service Account (with enhanced scope) for the docker node VMs, Cloud Resource Manager to revise the project’s policy and, of course, Runtime Config to create and manage the tokens.

This post will be a stream-of-writing lessons learned as I encounter problems and solve them.

Python not JINJA

The Deployment Manager service supports Python scripts and JINJA templates and it supports mixing and matching both. I discourage you from using JINJA templates. It is likely that you’ll need to embrace the superset of functionality provided by Python and it’s likely better to have everything in Python.
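As a small illustration of what Python buys you, here is a minimal sketch of a template that generates a variable number of worker VMs in a loop. The property names (workers, zone, machineType, image) and the swarm-worker-N naming are assumptions made for this example; they are not taken from the deployment described in this post.

```python
COMPUTE_URL = 'https://www.googleapis.com/compute/v1'


def worker(project, zone, machine_type, image, index):
    """Returns one compute.v1.instance resource for worker number `index`."""
    return {
        # Hypothetical naming scheme, chosen for this sketch only.
        'name': 'swarm-worker-{}'.format(index),
        'type': 'compute.v1.instance',
        'properties': {
            'zone': zone,
            'machineType': '{}/projects/{}/zones/{}/machineTypes/{}'.format(
                COMPUTE_URL, project, zone, machine_type),
            'disks': [{
                'boot': True,
                'autoDelete': True,
                'initializeParams': {'sourceImage': image},
            }],
            'networkInterfaces': [{
                'network': '{}/projects/{}/global/networks/default'.format(
                    COMPUTE_URL, project),
                'accessConfigs': [{
                    'name': 'External NAT',
                    'type': 'ONE_TO_ONE_NAT',
                }],
            }],
        },
    }


def GenerateConfig(context):
    """Deployment Manager entry point: returns the resources to create."""
    props = context.properties
    return {
        'resources': [
            worker(context.env['project'], props['zone'],
                   props['machineType'], props['image'], i)
            for i in range(props['workers'])
        ]
    }
```

Expressing the same loop, plus any helper functions you later need, tends to be more comfortable in Python than in a templating language.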

Supported Resource Types

Deployment Manager (DM) scripts define an intended state of GCP resources. When you need to use DM with an unfamiliar resource, a good place to start is this page of supported resource types:

Alternatively, you can enumerate the list from the command-line:

There’s a mapping between services, their endpoints and their resources (types) on one side and Deployment Manager types on the other, but it’s not always obvious. It’s also not bijective: not all service resource types are mapped to Deployment Manager resource types; the service resource types are a superset. Thus, you can’t use Deployment Manager to manage all of GCP’s resources.

The Compute (Engine) v1 service is comprehensively and consistently mapped to Deployment Manager types:

The Deployment Manager v2 service has a resource type Manifests but this is not accessible as a supported Deployment Manager resource type:

The Deployment Manager v2 endpoint is:

However the IAM service endpoint is:

And the service defines some of its resource types with projects or organizations prefixes, e.g. projects.serviceAccounts and projects.serviceAccounts.keys:

But these map to Deployment Manager resource types without the “projects” prefix:

Creating Service Accounts

The Deployment Manager GitHub examples include the representation of service accounts (types) but this is not documented elsewhere and I will use it as an example of encountering a new resource type. In this case, we may create a service account without an accompanying key because the service account will be used by a Compute Engine VM.

Using the Cloud SDK command-line, the equivalent commands would be:

The documentation for [projects.]serviceAccounts summarizes (some of) the properties associated with this resource. These properties represent an instantiated resource:

But, to create a Service Account, it’s necessary to provide an accountId and a displayName. How do I know this? The create method documents this:

And, trusty API Explorer proves it:

iam.projects.serviceAccounts.create

Deployment Manager (appears to) use(s) a flat set of properties and so the API method’s hierarchy isn’t preserved:

and becomes:
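A minimal sketch of what the flattened resource might look like in a Python template follows. The type name iam.v1.serviceAccount is assumed from the GitHub samples mentioned above, and the template properties accountId and displayName are supplied by whoever calls the template; the resource name simply reuses the account ID.

```python
def GenerateConfig(context):
    """Returns a single IAM service account resource."""
    account_id = context.properties['accountId']
    return {
        'resources': [{
            'name': account_id,
            # Type name assumed from the GitHub samples.
            'type': 'iam.v1.serviceAccount',
            'properties': {
                # Flattened from the create method's request body, where
                # displayName is nested inside a serviceAccount object:
                # {"accountId": ..., "serviceAccount": {"displayName": ...}}
                'accountId': account_id,
                'displayName': context.properties['displayName'],
            },
        }]
    }
```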

Enabling Services

The IAM service is not enabled by default in projects. It is likely that, if you try to create a Service Account as described above, DM will balk with an error:

We need to use the Service Management service to enable the IAM API, and then make the serviceAccount resource that we created previously depend on IAM being enabled.

You won’t find a Deployment Manager type for the Service Management service:

Once again the GitHub samples provide the solution. I found this inadvertently while searching for a way to create Service Accounts. It’s not obvious though, and there’s only a single search result for this method:

This code works! I don’t know why it works but it does. I don’t understand the reference to deploymentmanager.v2.virtual.X but I am able to make sense of the properties. Once again, API Explorer helps:

servicemanagement.services.enable

Flattening the API method’s required properties (serviceName, consumerId) yields the code provided in the Google sample. NB I’m also using context.env to grab the ‘project’ (== GCP Project ID) from the runtime environment, as this is required for the value of the consumerId. I turned the code into a function and, in this case, I’m calling the function and passing $API == “iam” to enable iam.googleapis.com.
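The function, and the dependency from the service account onto it, might look roughly like the sketch below. The virtual type name deploymentmanager.v2.virtual.enableService is assumed from the GitHub sample, and the resource names (enable-iam, swarm-node) and display name are hypothetical.

```python
def enable_service(project, api):
    """Returns a resource that enables <api>.googleapis.com for `project`."""
    return {
        'name': 'enable-{}'.format(api),
        # Type name assumed from the GitHub sample.
        'type': 'deploymentmanager.v2.virtual.enableService',
        'properties': {
            'consumerId': 'project:{}'.format(project),
            'serviceName': '{}.googleapis.com'.format(api),
        },
    }


def GenerateConfig(context):
    """Enables IAM, then creates a service account that depends on it."""
    project = context.env['project']
    enable_iam = enable_service(project, 'iam')
    service_account = {
        'name': 'swarm-node',  # hypothetical name
        'type': 'iam.v1.serviceAccount',
        # Block creation of the account until the IAM API is enabled.
        'metadata': {'dependsOn': [enable_iam['name']]},
        'properties': {
            'accountId': 'swarm-node',
            'displayName': 'Swarm node',
        },
    }
    return {'resources': [enable_iam, service_account]}
```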

This is equivalent to the following Cloud SDK command:

Mutating (IAM) Policies

Okay “changing policies”, “updating policies” but “mutating” just sounds so much more fun! I’m waiting on some sample code on the GitHub site to show how to perform this. It’s complicated by the way this service works.

Customarily, the way to mutate an IAM policy is to:

  • Get the current policy
  • Mutate it
  • Put the revised policy back to the service

The policy document not only includes the current list of policy bindings for e.g. the project but also an etag, which is used as a concurrency mechanism. It’s effectively a hash of the policy. If, when you put the policy back to the service, the etag in your document differs from that of the service’s stored policy, the service knows that the policy was revised after you read it, and your changes will be rejected rather than applied.

This is a challenging mechanism to represent in Deployment Manager and the GitHub samples include a helper function that merges a service account into a policy. I understand from the Deployment Manager team that an alternative solution should be available soon.
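To make the "mutate it" step concrete, here is a stand-alone sketch of merging a member into a policy document. It is not the helper from the GitHub samples; the function name add_member is made up for this example, and it operates on a plain dict shaped like the policy the service returns (an etag plus a list of bindings).

```python
import copy


def add_member(policy, role, member):
    """Returns a copy of `policy` with `member` bound to `role`.

    The etag is carried over unchanged so that the service can reject the
    write if the stored policy changed after this copy was read.
    """
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault('bindings', [])
    for binding in bindings:
        if binding['role'] == role:
            # Role already bound: add the member only if it is missing.
            if member not in binding['members']:
                binding['members'].append(member)
            break
    else:
        # No binding for this role yet: create one.
        bindings.append({'role': role, 'members': [member]})
    return updated
```

The caller is still responsible for getting the policy first and putting the result back; if the put is rejected because of a stale etag, the whole get, mutate, put cycle has to be repeated.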

The Cloud SDK includes a convenience command, add-iam-policy-binding, that does the work behind the scenes:

I‘m told by the Deployment Manager team that this functionality is imminent and in the form of a new capability called ‘actions’. I’ll update this content when the feature’s released.

Runtime Config(urator)

Deployment Manager includes a feature called Runtime Config(urator). It provides functionality that is particularly relevant to Deployment Manager but it is, in fact, a standalone service. I described a way to use Runtime Config in combination with Global Scope as a way to pass configuration data to Cloud Functions.

Runtime Config provides a key improvement to deploying Docker swarm. When a swarm is initialized, the 1st (Docker Engine) node generates a manager token and a worker token that must be provided by other nodes to prove themselves when joining the cluster.

1st (genesis) node:

It’s easiest to obtain the token(s) from this genesis node by requerying it and, as a hint to how we’ll proceed, assigning the value to an environment variable:

These commands return the token(s) only.

With these tokens, from other (Docker Engine) nodes (on other VMs), we could then:

So, two outstanding questions, how do we:

  • make the tokens available to the other nodes?
  • block creation of the other swarm nodes on the tokens’ availability?

This is what Runtime Config provides us. Runtime Config is scoped to a single project and permits further scoping through namespaces. Let’s first create a namespace called ‘swarm’ for the Docker swarm mode tokens:

We can then create variables arbitrarily within this ‘swarm’ namespace. I chose to create a variable called ‘worker’ and another called ‘manager’ but to put these in a hierarchy under ‘token’. The ‘token’ prefix is redundant but..

I’m using the “--is-text” flag. The tokens are plaintext (alphanumeric) and this saves having to base64 decode the values when retrieved.

Runtime Config provides mechanisms for watching and waiting on variables but it was not immediately clear to me that either of these provides the functionality needed here. Instead — and somewhat unhappily — I decided to ‘hack’ a solution (please comment-ping me with improvements).

The Deployment Manager script creates token/manager and token/worker variables with a $DUMMY setting *before* it creates any VMs (obviously including the genesis node). The VMs all know that, if the value of a variable is $DUMMY, the genesis node is not yet ready, and they block and retry after one minute.

Advantage(s)

  • Simple

Disadvantages

  • Less elegant
  • Potentially infinite blocking
  • Requires shared knowledge of the $DUMMY value
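The block-and-retry logic can be sketched in a few lines. This version differs from the hack described above in one respect: it gives up after a fixed number of attempts, which addresses the infinite-blocking disadvantage. The function name wait_for_token, the placeholder value and the fetch callable (whatever reads the Runtime Config variable) are all assumptions made for this example.

```python
import time

DUMMY = 'DUMMY'  # Placeholder value; assumed here, must match the template.


def wait_for_token(fetch, dummy=DUMMY, interval=60, max_attempts=30,
                   sleep=time.sleep):
    """Polls `fetch()` until it returns something other than `dummy`.

    Raises RuntimeError after `max_attempts` polls, so a node cannot block
    forever if the genesis node never publishes its tokens.
    """
    for attempt in range(max_attempts):
        value = fetch()
        if value != dummy:
            return value
        # Genesis node is not ready yet; wait before the next poll.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise RuntimeError(
        'token still unset after {} attempts'.format(max_attempts))
```

Passing sleep in as a parameter keeps the function easy to exercise without actually waiting a minute between polls.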

NB The Runtime Config service must be enabled and this is what’s checked in the dependsOn when the Config is created.
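A sketch of how the config and its placeholder variables might be declared in a Python template follows. The type names runtimeconfig.v1beta1.config and runtimeconfig.v1beta1.variable and their properties are written from my reading of the Runtime Config types, so treat them as assumptions; the resource names, the enableService type name and the placeholder value are likewise hypothetical.

```python
DUMMY = 'DUMMY'  # Placeholder value; assumed here, shared with the VMs.


def GenerateConfig(context):
    """Returns the 'swarm' config and its two placeholder token variables."""
    enable = {
        'name': 'enable-runtimeconfig',
        # Type name assumed from the GitHub sample.
        'type': 'deploymentmanager.v2.virtual.enableService',
        'properties': {
            'consumerId': 'project:{}'.format(context.env['project']),
            'serviceName': 'runtimeconfig.googleapis.com',
        },
    }
    config = {
        'name': 'swarm-config',
        'type': 'runtimeconfig.v1beta1.config',
        # The config is only created once the API has been enabled.
        'metadata': {'dependsOn': [enable['name']]},
        'properties': {'config': 'swarm'},
    }
    variables = [{
        'name': 'token-{}'.format(role),
        'type': 'runtimeconfig.v1beta1.variable',
        'properties': {
            # Referencing the config also orders the variable after it.
            'parent': '$(ref.swarm-config.name)',
            'variable': 'token/{}'.format(role),
            'text': DUMMY,
        },
    } for role in ('manager', 'worker')]
    return {'resources': [enable, config] + variables}
```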

Then, in the startup script for the 1st (genesis) node, after the swarm init, the worker and manager tokens are requested and are used to replace the $DUMMY values:

The startup script for a worker can then pull the value from the Runtime Config variable for the token. It may be $DUMMY but, when it isn’t, it will be the correct worker token value:

And, clearly, the startup script for a manager flips the variables:
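The startup-script steps for the genesis node and the joining nodes can be summarized as the commands each one runs. The sketch below only builds the argument lists; the helper names are made up for this example, the gcloud spellings are assumed from the beta runtime-config command group, and 2377 is the usual swarm management port.

```python
RUNTIME_CONFIG = ['gcloud', 'beta', 'runtime-config', 'configs', 'variables']


def token_query(role):
    """Command that prints only the join token for 'worker' or 'manager'."""
    return ['docker', 'swarm', 'join-token', role, '--quiet']


def publish_token(role, token, config='swarm'):
    """Command the genesis node runs to replace the placeholder value."""
    return RUNTIME_CONFIG + [
        'set', 'token/{}'.format(role), token,
        '--config-name', config, '--is-text']


def read_token(role, config='swarm'):
    """Command a joining node runs to read the current variable value."""
    return RUNTIME_CONFIG + [
        'get-value', 'token/{}'.format(role), '--config-name', config]


def join_swarm(token, genesis_host, port=2377):
    """Command a node runs to join the swarm with the given token."""
    return ['docker', 'swarm', 'join', '--token', token,
            '{}:{}'.format(genesis_host, port)]
```

Each list can be handed to subprocess.check_output as-is; a worker would call read_token('worker') and a manager read_token('manager'), which is the only difference between the two.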

Let’s test it:

And, ssh’ing into swarm-master:

NB I’ve hacked the output to make it more presentable here: swarm-master is annotated as “Leader” and the three masters are all marked “Reachable”

Conclusions

Deployment Manager is powerful but would benefit from more comprehensive documentation for noobs like me. The service is well-designed but it’s not always intuitive (consistent).
