Running Terraform CI/CD Pipelines on GCP with GitLab

Kodanda Rama
Google Cloud - Community
5 min read · Nov 21, 2022

Infrastructure as Code (IaC) has quickly emerged as the default approach for provisioning and managing resources on the cloud. It is fast, consistent and easy to replicate, and, combined with GitOps, it makes changes easy to control, track and audit right from the source code repository. More and more teams are embracing IaC and CI/CD for their infrastructure.

In this article, let’s look at leveraging GitOps and Terraform to automate the deployment and management of resources on Google Cloud Platform.

Prerequisites

  1. A basic understanding of IaC, fundamental Terraform concepts and commands
  2. Knowledge of Git and general CI/CD mechanisms
  3. A GCP project and a user account with access to create Compute Engine instances and a service account
  4. An Enterprise or SaaS GitLab account with access to create repositories and pipelines

Setup

  1. A GitLab instance running on a Compute Engine instance. The same setup will also work (with a few networking changes) on public GitLab as well as on an on-premises hosted version
  2. A GitLab repository that will contain the Terraform code
  3. A GitLab runner hosted on Docker in a Compute Engine instance, registered against the repository so that it can run the CI/CD jobs. This runner instance should have network connectivity to the GitLab instance
  4. Terraform code will be executed on this runner, so it is recommended to host it on a GCP Compute Engine instance. This allows Terraform to use the service account attached to the instance to authenticate and create resources in GCP. The instance should have the required API scopes enabled and the service account should have the necessary IAM permissions (see the sketch after this list)
  5. If the runner is not on GCP, Terraform would have to authenticate with a service account key passed to the pipeline as a secret variable, which is not ideal since the key would have to be maintained and rotated
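For illustration, below is a minimal Terraform sketch of such a runner instance with an attached service account. The names and project ID (gitlab-runner-sa, gitlab-runner, my-gcp-project, the zone) are placeholder assumptions, not values from this article's setup; the point is that the cloud-platform scope plus a narrowly scoped IAM role lets Terraform on the instance authenticate through the metadata server without any key file.

# Hypothetical service account for the runner; grant it only the roles
# the Terraform code actually needs (compute.admin for this example).
resource "google_service_account" "runner_sa" {
  account_id   = "gitlab-runner-sa"
  display_name = "GitLab runner service account"
}

resource "google_project_iam_member" "runner_compute_admin" {
  project = "my-gcp-project" # placeholder project ID
  role    = "roles/compute.admin"
  member  = "serviceAccount:${google_service_account.runner_sa.email}"
}

# Runner VM with the service account attached; the cloud-platform scope
# lets tools on this instance call GCP APIs via the metadata server.
resource "google_compute_instance" "gitlab_runner" {
  name         = "gitlab-runner"
  machine_type = "e2-standard-2"
  zone         = "us-central1-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-11"
    }
  }

  network_interface {
    network = "default"
  }

  service_account {
    email  = google_service_account.runner_sa.email
    scopes = ["cloud-platform"]
  }
}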

Terraform Code

Code repository link

In this example, we’ll create Compute Engine images from persistent disks using snapshots. This is a use case that comes up often during cloning and migration scenarios.
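The linked repository contains the complete code; the snippet below is only a minimal sketch of the idea, with placeholder disk, zone and resource names that are assumptions for illustration rather than the repository's actual values.

# Take a snapshot of an existing persistent disk (placeholder name and zone).
resource "google_compute_snapshot" "source_disk_snapshot" {
  name        = "source-disk-snapshot"
  source_disk = "source-disk" # existing persistent disk, placeholder
  zone        = "us-central1-a"
}

# Build a reusable compute image from that snapshot.
resource "google_compute_image" "image_from_snapshot" {
  name            = "image-from-snapshot"
  source_snapshot = google_compute_snapshot.source_disk_snapshot.self_link
}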

As a best practice, the Terraform state is stored in a GCS bucket configured as the remote backend, which supports state locking out of the box.
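A minimal backend block for this, assuming a pre-existing bucket (the bucket name and prefix below are placeholders), could look like:

terraform {
  backend "gcs" {
    bucket = "my-terraform-state-bucket" # placeholder; the bucket must exist before terraform init
    prefix = "image-creation/state"
  }
}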

Workflow & Pipeline Overview

The example below uses the GitHub flow branching strategy for the CI/CD workflow. More information on this and various Git branching strategies can be found here.

In short, GitHub flow works with two branches: a main branch and a feature branch based off the main branch. Code changes are committed to a local feature branch and pushed to its remote. A merge request (a.k.a. pull request) to the main branch is opened once the work is ready to be deployed; it is reviewed and merged to the main branch upon approval. Terraform apply is executed only on a merge to the main branch through the merge request. To protect it against direct pushes, the main branch should be a protected branch.

The GitLab pipeline job triggers, the sequence of execution and the stages of the pipeline are declared in the .gitlab-ci.yml configuration file located in the root of the repository.

Below is a sample configuration file

# Workflow image
image:
  name: hashicorp/terraform:0.13.2
  entrypoint:
    - "/usr/bin/env"
    - "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

# Workflow variables. They can be overwritten by passing pipeline variables in the GitLab repository
variables:
  TF_ROOT: $CI_PROJECT_DIR/image-creation
  TF_LOG: WARN
  TF_TIMEOUT: "-lock-timeout=600s"
  TF_PLAN_NAME: plan.tfplan
  TF_PLAN_JSON: plan.json
  REFRESH: -refresh=true
  ENVIRONMENT_NAME: "prod"

# Provides the Terraform version and reconfigures the backend state during init
# Note: The leading dot (.) ignores this as a "job" while the ampersand (&) is an anchor declaring the script as a variable to use elsewhere
.terraform-ver-init: &terraform-ver-init
  - cd $TF_ROOT
  - terraform version
  - terraform init --upgrade=True

# terraform init is run before any stage jobs
before_script:
  - *terraform-ver-init

# Cache files between jobs
cache:
  key: "$CI_COMMIT_SHA"
  # Globally caches the .terraform folder across each job in this workflow
  paths:
    - $TF_ROOT/.terraform

# Provides a list of stages for this GitLab workflow
stages:
  - validate
  - plan
  - apply

# Job: tf-fmt | Stage: validate
# Purpose: check the format (fmt) as a sort of linting test
tf-fmt:
  stage: validate
  script:
    - terraform fmt -recursive -check
  only:
    changes:
      - "*.tf"
      - "**/*.tf"

# Job: validate | Stage: validate
# Purpose: syntax validation for the Terraform configuration files
validate:
  stage: validate
  script:
    - terraform validate
  only:
    changes:
      - "*.tf"
      - "**/*.tf"
      - "**/*.tfvars"

# Job: plan | Stage: plan
# Purpose: runs terraform plan and outputs the plan and a JSON summary to
# local files which are later made available as artifacts.
plan:
  stage: plan
  dependencies:
    - validate
  before_script:
    - *terraform-ver-init
    - apk --no-cache add jq
    - alias convert_report="jq -r '([.resource_changes[]?.change.actions?]|flatten)|{\"create\":(map(select(.==\"create\"))|length),\"update\":(map(select(.==\"update\"))|length),\"delete\":(map(select(.==\"delete\"))|length)}'"
  script:
    - cd $TF_ROOT
    - terraform plan -out=$TF_PLAN_NAME $REFRESH
    - terraform show --json $TF_PLAN_NAME | convert_report > $TF_PLAN_JSON
  only:
    changes:
      - "*.tf"
      - "**/*.tf"
      - "**/*.tfvars"
  artifacts:
    reports:
      terraform: ${TF_ROOT}/$TF_PLAN_JSON
    paths:
      - ${TF_ROOT}/$TF_PLAN_NAME
      - ${TF_ROOT}/$TF_PLAN_JSON
    expire_in: 7 days # optional. GitLab stores artifacts of successful pipelines for the most recent commit on each ref. If needed, enable "Keep artifacts from most recent successful jobs" in the CI/CD settings of the repository.

# Job: apply | Stage: apply
# Purpose: executes the plan from the file created in the plan stage
apply:
  stage: apply
  dependencies:
    - plan
  script:
    - cd $TF_ROOT
    - terraform apply -auto-approve $TF_PLAN_NAME
  only:
    - main
Let’s look at some of the important components of the above file that control the workflow.

Running the terraform init script under before_script ensures that it is executed before any of the jobs defined in the stages.

.terraform-ver-init: &terraform-ver-init
  - cd $TF_ROOT
  - terraform version
  - terraform init --upgrade=True

before_script:
  - *terraform-ver-init

The below block specifies that the jobs are triggered only on changes to files ending in .tf or .tfvars. This is useful because we won’t be triggering the pipeline for unrelated changes like documentation updates.

only:
  changes:
    - "*.tf"
    - "**/*.tf"
    - "**/*.tfvars"

The terraform plan output and a simplified JSON summary of it are made accessible as reports and artifacts.

artifacts:
  reports:
    terraform: ${TF_ROOT}/$TF_PLAN_JSON
  paths:
    - ${TF_ROOT}/$TF_PLAN_NAME
    - ${TF_ROOT}/$TF_PLAN_JSON
  expire_in: 7 days

A detailed explanation of the supported keywords for the configuration file can be found here.

Implementation

A typical end-to-end workflow for the above configuration file would look like this:

  1. Changes to the codebase are committed to the feature branch
  2. The pipeline triggers on a push to the feature branch and runs terraform init and terraform plan after the formatting and syntax validation checks
Pipeline runs the validate and plan stages on the feature branch

  3. Once the plan output is verified and the change is ready to be deployed, a merge request is raised to the main branch. The plan execution output and the artifacts containing the plan file and the JSON summary are displayed on the merge request

Terraform plan logs and artifacts are displayed on the merge request

  4. Once the merge request is approved and the code is merged to the main branch, the pipeline runs the validate and plan stages, then runs terraform apply on the generated plan and creates the specified resources, in this case snapshots and images from the persistent disks

Triggers apply stage on the main branch and creates resources

Next Steps

Now that we have a working pipeline, there are a few things to consider to make it robust and optimised for production environments.

  1. A branching or folder-structure strategy for multi-environment pipelines
  2. Including Terraform tests as part of the pipeline
  3. Integrating static code analysis tools like Checkov or SonarQube
