Terraform Lasagna - How to Layer your Deployments

Nicolas Jomeau
Published in ELCA IT
Apr 28, 2023

While many resources exist online to learn about Terraform, few actually explain how to use it in an industrial setup. In this article, we will cover how we use Terraform in Data Platform projects at ELCA for robust and fully featured deployments where multiple environments and different inter-connected Providers are required.

Terraform logo displayed on a mouth-watering lasagna. Yummy!
Credits to Paul Pan’s beautiful lasagna shot

Terraform is an Infrastructure-as-Code (IaC) tool that lets you define cloud and on-prem resources using a domain-specific configuration language. Since it is code-based, it can be versioned with tools like Git, enabling the use of DevOps best practices to manage the infrastructure throughout its lifecycle. Just as a lasagna has multiple layers of pasta, cheese, and sauce, deploying new resources with Terraform requires layering environments and Providers.

Layering environments

As an industry standard for software development, using separate environments for development, testing, and production infrastructure is mandatory to avoid bugs creeping into customer-facing services. Deploying new resources should follow the same pattern, splitting configurations into environment-specific layers:

  • A new resource is added and tested in the Development environment.
  • If successful, it is then deployed to the Test environment, where it receives production-like traffic and usage.
  • Finally, after approval, it is deployed to the Production infrastructure.

In itself, this process follows the same steps as an application’s CI/CD. The minor difference comes from the statefulness of Terraform: it requires the state of the previous deployment to avoid overwriting existing resources.

The aforementioned process can be implemented using Terraform paired with a CI/CD tool such as Azure DevOps, Jenkins, or GitHub Actions.

Terraform Variables are key to environment switching. By using a variable containing the environment name for naming or tagging resources, we ensure every resource is unique within a cloud or on-prem Provider. This in turn facilitates resource management, accounting, and monitoring of the infrastructure across environments. Naming conventions and definitions are specified in a single file within a locals block.

variable "project" {
type = string
description = "(Required) Project's name"
}

variable "environment" {
type = string
description = "(Required) Environment"
}

# Unique names ensure easier monitoring and management
locals {
prefix = "${var.project}-${var.environment}"

virtual_network_name = "${local.prefix}-vnet"
compute_subnet_name = "${local.virtual_network_name}-sub-compute"
}

If using a CI/CD tool, a deployment pipeline can provision these variables depending on the git branch it runs on, and thus automate the whole infrastructure management. The tool (or user) must provide a dedicated state file for each environment to avoid corrupting their state. Ideally, deployments should be gated behind deployment windows (e.g. work hours, only Monday to Thursday, …), manual approvals, and locks to avoid costly mistakes or deployments while data jobs are running.

Variables allow switching between environment configurations. Distinct state files should be used for each environment!
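
One way to keep state files separated, sketched below, is a partial backend configuration: the backend block is left empty and the pipeline injects the environment-specific state location at init time. The azurerm backend and the state file naming scheme here are our assumptions, not a requirement; adapt them to your storage.

# backend.tf — storage details are intentionally omitted and injected
# per environment at init time, e.g.:
#   terraform init -backend-config="key=myproject-dev.tfstate"
terraform {
  backend "azurerm" {}
}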

Layering Providers

When deploying our data platform with Terraform, we encountered a series of strange errors: after some time, deployments would start failing randomly with the `default auth: cannot configure default credentials` error message. Looking at our code (shown below) and the environment variables we were passing to our script, nothing unusual should have been happening…

provider "azurerm" {
subscription_id = ...
alias = "main"
features {}
}

resource "azurerm_databricks_workspace" "this" {
provider = azurerm.main
param1 = ...
param2 = ...
}

provider "databricks" {
azure_workspace_resource_id = azurerm_databricks_workspace.this.id
host = azurerm_databricks_workspace.this.workspace_url
}

As it turns out, Provider configurations are resolved before the data and managed resources that reference them are created. A Provider block may reference attributes of a managed resource, but those values are only known once the resource actually exists: on a fresh deployment, the dependent Provider is configured with unknown values and thus without credentials. While it is possible to declare a Provider which depends on another in a single Terraform project, doing so will inevitably lead to cryptic errors like the one above. And of course these errors will appear at the worst possible time! As of this article’s writing, Terraform doesn’t have an easy fix for this situation.

Two solutions are however available, with one adding another tasty layer:

  • Adding depends_on keywords to every resource using a dependent Provider. This reduces the frequency of the errors, but also adds a lot of redundant, easy-to-forget code.
  • Splitting the project into two separate deployments, one for the parent Provider and one for the dependent Provider. This adds a bit of code, such as copied locals/variables definitions, but it removes any dependency error that may still happen.

We chose the second solution, which was easy to implement (a minimal sketch follows below). Moreover, it also allows fine-grained administration within DevOps tools by running each deployment with a different technical account, with rights tailored to its part of the infrastructure.

Full deployment is done in two distinct steps, in this case for a Kubernetes Cluster on a cloud provider.
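
A minimal sketch of the split, assuming the parent deployment publishes what the dependent one needs through outputs and stores its state in an Azure Storage backend (the storage and output names below are hypothetical):

# Deployment 1 (azurerm): expose the workspace to the next layer
output "databricks_workspace_id" {
  value = azurerm_databricks_workspace.this.id
}

output "databricks_workspace_url" {
  value = azurerm_databricks_workspace.this.workspace_url
}

# Deployment 2 (databricks): read deployment 1's outputs from its state
data "terraform_remote_state" "platform" {
  backend = "azurerm"
  config = {
    resource_group_name  = "rg-tfstate"   # hypothetical state storage
    storage_account_name = "sttfstate"
    container_name       = "tfstate"
    key                  = "platform-${var.environment}.tfstate"
  }
}

provider "databricks" {
  azure_workspace_resource_id = data.terraform_remote_state.platform.outputs.databricks_workspace_id
  host                        = data.terraform_remote_state.platform.outputs.databricks_workspace_url
}

Since the workspace already exists by the time the second deployment runs, its Provider always receives concrete credentials.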

Wrapping it up

With layers in our Terraform pipelines, we found ways to make our deployments more robust and capable of managing multiple environments: distinct resource names combined with different state files allow us to easily switch between environments, and decoupling linked Providers fixes dependency issues. Hopefully you now have new techniques to bake your own Terraform Lasagna 😊

Article co-written and reviewed by Benjamin Audren & Antoine Hue
