Are Terraform’s days numbered?

Alistair Grew
Appsbroker CTS Google Cloud Tech Blog
10 min read · Apr 4, 2023

An exploration of the Kubernetes Resource Model (KRM) and the Google Config Connector

Introduction

Are Terraform’s days numbered? That is quite a bold question when you consider its prevalence in the orchestration of infrastructure, especially in public cloud and in my preferred flavour of Google Cloud, where Google’s own Cloud Deployment Manager is practically unused (and yes, I appreciate CloudFormation and Azure Bicep both have a following in their respective clouds). I’m not going to go into detail on how Terraform works, other than to say it defines infrastructure in a declarative way, and to point people at a video of one of the original authors, Armon Dadgar, explaining it:

The key from my perspective, though, is that it is declarative: your desired configuration is defined as code and your currently known configuration (I make that distinction now as a precursor of what’s to come) is recorded in a ‘state’ stored somewhere. When you want to make a change, you update the code and execute the changes, and the state is then updated to reflect how your infrastructure is now configured.

Kubernetes, by comparison, was originally conceived by Google as an open-source version of its internal Borg container orchestration tool. It has fairly quickly gained market dominance as the way to orchestrate containers, with each of the three key public clouds having its own Kubernetes service (GKE, EKS, and AKS). Kubernetes is also often managed in a declarative way through YAML manifests defining resources such as services, deployments, and stateful sets, to name but a few. These manifests are applied to the cluster via the kube-apiserver component, which writes the desired cluster state to the etcd database; this is then picked up by the kube-controller-manager, which enacts that desired state and ensures it stays that way.

Source: https://kubernetes.io/docs/concepts/overview/components/
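To make that loop concrete, here is a minimal sketch of a Deployment manifest (the name and image are arbitrary examples, not taken from a real workload):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-web               # arbitrary example name
spec:
  replicas: 2                   # desired state: two pods running
  selector:
    matchLabels:
      app: hello-web
  template:
    metadata:
      labels:
        app: hello-web
    spec:
      containers:
        - name: hello-web
          image: nginx:1.25     # arbitrary example image

Running kubectl apply -f deployment.yaml hands this desired state to the kube-apiserver, and the controllers then keep two replicas running until the manifest says otherwise.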

What has made Kubernetes increasingly powerful, though, is the concept of Custom Resource Definitions (CRDs), which allow you to take Kubernetes beyond its original purpose of keeping containers in check. This brings us nicely round to Config Connector, an add-on to Kubernetes that allows you to manage Google Cloud resources through Kubernetes using CRDs. In theory, this could replace at least some of the role Terraform plays in the workflow here at CTS, so I set about conducting an experiment: can I provision a Google Cloud landing zone purely using Kubernetes and Config Connector?


Bootstrapping

Before I can get going I first need to bootstrap an initial environment. This is composed of two things: a ‘landing-zone’ folder and a project at the same level.

Project and Folder Structure
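For anyone following along, that bootstrap boils down to a couple of gcloud commands, roughly along these lines (the organisation ID is a placeholder for my own):

gcloud resource-manager folders create \
  --display-name="landing-zone" \
  --organization=ORGANIZATION_ID

gcloud projects create config-connector-test \
  --organization=ORGANIZATION_ID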

Within the config-connector-test project, I then set up the infrastructure for GKE and Config Connector before proceeding further. Details on how to do this are documented, but it basically involves standing up a GKE cluster with Workload Identity and the Config Connector add-on turned on. The main differences in my setup were provisioning it as a regional cluster for availability reasons and as a private cluster (with a public admin endpoint) for security reasons.
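For reference, the cluster creation ends up looking something like this; the cluster name, region, and CIDR are placeholders, and your networking options may well differ:

gcloud container clusters create config-connector-cluster \
  --project=config-connector-test \
  --region=europe-west2 \
  --workload-pool=config-connector-test.svc.id.goog \
  --addons=ConfigConnector \
  --enable-ip-alias \
  --enable-private-nodes \
  --master-ipv4-cidr=172.16.0.32/28 \
  --enable-master-authorized-networks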

My created cluster

One of the next steps was to add the IP of the cloud console I was using to the authorised networks list of the GKE cluster; this IP can be determined with curl:

curl ifconfig.me/ip
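That address then gets added to the cluster’s authorised networks, along these lines (the cluster name and region are the placeholders from earlier, and the IP is an example):

gcloud container clusters update config-connector-cluster \
  --region=europe-west2 \
  --enable-master-authorized-networks \
  --master-authorized-networks=203.0.113.10/32   # the address returned by curl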

When I got to the scoping step I opted to scope to the landing-zone folder (as opposed to a project), where I had already granted the Owner, Shared VPC Admin, and Project Creator roles.

The final step of the Google documentation is to check everything is working!

All systems go!

For reference, the namespace I created for any subsequently created resources has the somewhat unimaginative name ‘landingzone’.
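In practice, scoping that namespace to the folder is just an annotation; a minimal sketch, with a placeholder folder ID:

apiVersion: v1
kind: Namespace
metadata:
  name: landingzone
  annotations:
    # default all Config Connector resources in this namespace to the landing-zone folder
    cnrm.cloud.google.com/folder-id: "123456789012"   # placeholder folder ID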

First Test, Project Creation

From here on I followed a slightly different Google guide from the previous one to focus on resource creation. That guide, though, looks at Pub/Sub, whereas I want to create a project in my folder. Whilst it is possible to do this from terminal commands (and I did initially), the Google reference documentation is very good and I was able to quickly find the Project resource and example code. Let’s add some configuration and deploy this…

Example Code
Applying it against the cluster
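For reference, a Project manifest along these lines looks roughly like the following; the project ID, display name, folder ID, and billing account are all placeholders, so check the exact field formats against the reference docs:

apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
kind: Project
metadata:
  # the metadata name becomes the project ID, so it has to be globally unique
  name: my-landingzone-project
  namespace: landingzone
spec:
  name: "My Landing Zone Project"       # display name
  folderRef:
    external: "folders/123456789012"    # placeholder parent folder
  billingAccountRef:
    external: "AAAAAA-BBBBBB-CCCCCC"    # placeholder billing account

This is then applied with kubectl apply -f project.yaml, as in the screenshot above.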

Now, unfortunately, that didn’t work, although it does give us a helpful opportunity to demonstrate some troubleshooting. First, by getting all the projects, which sure enough confirms that the update has failed.

List all the project resources in the landingzone namespace

Next, I want to find out why by describing the resource.

Describing the resource I created
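The commands behind those two screenshots are just standard kubectl, something like the following (using the placeholder project name from the sketch above):

# list the Project resources in the landingzone namespace
kubectl get projects -n landingzone

# inspect the status conditions and events of the failing resource
kubectl describe project my-landingzone-project -n landingzone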

Sure enough, it didn’t take long to find the error: I had forgotten to enable the Resource Manager API.

Don’t forget to enable your APIs people!

Because of the way I had scoped the service account, the cluster doesn’t have the required permission to do this itself, so I enabled it manually. I also fixed a couple of subsequent errors linked to the billing API and, finally, a billing account permissions issue. The finished result, though, was a project with the correct billing account.
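For reference, enabling those APIs manually is a one-liner per service; in my case (assuming they are enabled against the project hosting Config Connector) something like:

gcloud services enable cloudresourcemanager.googleapis.com --project=config-connector-test
gcloud services enable cloudbilling.googleapis.com --project=config-connector-test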

So the theory at least works, but manually running kubectl apply -f for the YAML is quickly going to get boring; let’s build this into a pipeline.

Automating Deployment

The good news with automating deployment is that plenty of tools already exist to do this for Kubernetes, but for this example I am going to use Argo CD, which is a tool we like here at CTS. Again, setting up Argo CD is a well-documented process, so I am not going to detail it here, but in true Blue Peter fashion, here is an application I made earlier…

Argo CD Landing Zone Application
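The Application itself boils down to pointing Argo CD at the Git repo holding the manifests and at the landingzone namespace; a sketch, with the repo URL and path as placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: landing-zone
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/landing-zone-manifests.git   # placeholder repo
    targetRevision: main
    path: manifests
    directory:
      recurse: true              # pick up manifests in sub-folders
  destination:
    server: https://kubernetes.default.svc
    namespace: landingzone
  syncPolicy:
    automated: {}                # auto-sync changes from Git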

Now that is done, I need a target landing zone structure; thankfully I recently designed a basic one as part of a pre-sales opportunity, so I will reuse that:

Target Landing Zone Structure

So the first step is to create the folders, which is pretty straightforward as the parent folder ID is well known: I can just drop it in and give the folders the names above. After some Argo CD tweaking (enabling directory recursion and auto-sync), it springs into life and deploys the folders as requested.
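Each folder is just a Folder resource, parented via an annotation; one of them looks roughly like this (the names are illustrative and the parent folder ID is a placeholder):

apiVersion: resourcemanager.cnrm.cloud.google.com/v1beta1
kind: Folder
metadata:
  name: networking
  namespace: landingzone
  annotations:
    # parent this folder under the top-level landing-zone folder
    cnrm.cloud.google.com/folder-id: "123456789012"   # placeholder parent folder ID
spec:
  displayName: Networking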

Folders created using ArgoCD
Showing up in the GCP Console

With the folders created, it is then time to do the projects. This is where I hit my first real challenge: the Project resource expects the parent folder ID, which isn’t known until the folder itself has been created.


In Terraform this would be as simple as referring to the output of the folder resource, but after much Googling and looking through both documentation and example code, I couldn’t find an equivalent in the Config Connector world. This, in my opinion, greatly reduces Config Connector’s usefulness in hierarchical structures, or simply wherever interdependencies exist on values that are unknown until a resource has been created. Needless to say, if someone has found a good way to do this, please let me know!


Ignoring the above for a moment, defining the projects by manually putting in the folder ID worked fine, with one caveat: the metadata name is the project ID and therefore needs to be globally unique within Google Cloud.

Provisioned Projects in the Console
And in ArgoCD…

Now that I have stood up some projects, I am going to shift focus toward defining resources. Let’s start with one we commonly see in landing zones: a shared VPC. To do this we need to define three separate blocks of code, starting with the network itself:

apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeNetwork
metadata:
  annotations:
    cnrm.cloud.google.com/project-id: "<NETWORKING PROJECT>"
  name: sharedvpc
spec:
  autoCreateSubnetworks: true

One crucial element of the above is that I have had to annotate it with the project the resource is going into. Config Connector has the concept of ‘scopes’ at either an organisation, folder, or project level. When I originally set up Config Connector, I scoped the landingzone namespace to the top-level landing-zone folder, which means that by default resources will try to build there. Without this annotation, it tries to deploy the network into a non-existent ‘landingzone’ project.

Staying on scopes a little longer, the other method of separating scopes out is to use multiple namespaces. I like this very much in the context of developer enablement, where you could have projects scoped to individual developer teams, isolated within their own namespaces, as sketched below.
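As a rough illustration (the team and project names are made up), a per-team namespace scoped to that team’s own project is just another annotation:

apiVersion: v1
kind: Namespace
metadata:
  name: team-checkout
  annotations:
    # resources created in this namespace default to the team's own project
    cnrm.cloud.google.com/project-id: "checkout-dev-project"   # placeholder project ID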

Anyway, moving swiftly on, the code for the host project and service project configuration was also pretty straightforward:

apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeSharedVPCHostProject
metadata:
  annotations:
    cnrm.cloud.google.com/project-id: "<NETWORKING PROJECT>"
  name: sharedvpc-host
---
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeSharedVPCServiceProject
metadata:
  annotations:
    cnrm.cloud.google.com/project-id: "<NETWORKING PROJECT>"
  name: serviceproject-applications
spec:
  projectRef:
    name: <SERVICE PROJECT>

With that applied, the configuration works nicely as defined:

Screenshot showing shared VPC subnets shared with the Applications Project

Config Connector vs Terraform

From this point you probably get the idea: you can define every resource covered by Google’s documentation, with resource support broadly equivalent to the Terraform provider. Fundamentally, both support declarative infrastructure, which in my opinion is where the management gains are made. Then it becomes more a question of differentiating features: do you prefer HCL or YAML? Do you like Config Connector’s ability to automatically remediate drift to ensure your infrastructure stays as you have defined it? Do you have a lot of interdependencies based on created values that suit Terraform’s graphing functionality? Do you want to see your proposed infrastructure changes before applying them? Are you intending to use Kubernetes as your compute platform anyway? These and plenty of other questions are all worth asking before making your choice.

So what are my thoughts? Well, to answer my original question: no, I don’t think Terraform’s days are numbered. Its flexibility and provider support make it a useful common denominator for organisations operating in multiple clouds. In this post I set out to do a common task here at CTS, deploying the basis of a landing zone, which I was able to achieve with some caveats. Do I feel Config Connector was better at doing this than Terraform? No, not really. Terraform is pretty mature these days, and our typical workflow incorporates numerous pre-defined modules written by Google, which speeds us up further through greater abstraction. As mentioned above, I also found the apparent inability to pass information from one object to another a frustrating hurdle in more complex and hierarchical configurations. But it isn’t all bad news for Config Connector: I like its YAML-based syntax, and it suits application-centric infrastructure provisioning within an isolated project. For example, if I need a GCS bucket for a Kubernetes-based microservice, a developer could be enabled to self-serve it. With tooling like Anthos’ Policy Controller, you could even define constraints on how this infrastructure should be configured in order to provide strong safeguards against misconfiguration.

Config Controller

Whilst in the example above I demonstrated KRM using my own Kubernetes cluster and Config Connector, Config Controller is a tool within the Anthos family that basically allows you to do away with creating your own Kubernetes cluster, as per their helpful diagram:

Source: https://cloud.google.com/anthos-config-management/docs/concepts/config-controller-overview
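Standing up a Config Controller instance is a single command, roughly (the name is a placeholder, and you will need to pick a supported region):

gcloud anthos config controller create landing-zone-controller \
  --location=us-central1   # pick a supported region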

Looking to the future

So, crystal-ball-gazing time: what do I feel the future holds? Well, I quite like the idea of a hybrid approach, which is how I see this slotting into workflows. I would suggest that Terraform is currently better for platform-centric deployments, like landing zones or other infrastructure that changes infrequently. Config Connector, I feel, has a really compelling place in application-centric deployments, where you might be deploying small amounts of infrastructure in support of a Kubernetes-based service. I am already seeing some organisations take this step in the name of developer enablement, which can only be a good thing (with appropriate guardrails!) for delivering services to users faster.

Conclusion

Hopefully this post has been helpful to anyone considering Config Connector in understanding what it can and can’t do, and its strengths and weaknesses, with some troubleshooting sprinkled in along the way. I have certainly enjoyed the journey of experimenting with it. Until next time, keep it Googley :)

CTS is the largest dedicated Google Cloud practice in Europe and one of the world’s leading Google Cloud experts, winning 2020 Google Partner of the Year Awards for both Workspace and GCP.

We offer a unique full-stack Google Cloud solution for businesses, encompassing cloud migration and infrastructure modernisation. Our data practice focuses on analysis and visualisation, providing industry-specific solutions for Retail, Financial Services, and Media and Entertainment.

We’re building talented teams ready to change the world using Google technologies. So if you’re passionate, curious and keen to get stuck in — take a look at our Careers Page and join us for the ride!


Alistair Grew
Appsbroker CTS Google Cloud Tech Blog

GCP Architect based in the Manchester (UK) area. Thoughts here are my own and don’t necessarily represent my employer.