Deploying Azure Infrastructure in Terraform through a YAML Azure DevOps Pipeline

Bob Code
Oct 4, 2023 · 10 min read


Using a pipeline for Terraform code streamlines infrastructure provisioning and management. By automating the deployment of infrastructure as code (IaC) through a well-structured pipeline, organisations gain efficiency, reliability and collaboration, along with more consistent and scalable cloud infrastructure. This article walks through how to set that up in Azure DevOps.

Overall Architecture in Azure

Overall Architecture Infra in Azure DevOps (ADO)

Overall Git Architecture in Azure DevOps (ADO)

First it is necessary to store the state file in a backend.

Why use a backend in Terraform?

The Terraform backend, a vital component of the Terraform workflow, is responsible for storing and managing infrastructure state. It stores the state remotely in Azure blob storage to facilitate collaboration, locking, versioning, and security.

- Security: Terraform backends store the state file remotely, preventing it from residing on local workstations or in version control, which enhances security. Backends offer access control and encryption features to secure the state data, ensuring only authorised users can access or modify it.

- Concurrency and Locking: Backends provide a locking mechanism to prevent multiple users from simultaneously modifying the infrastructure, reducing the risk of conflicting changes.

- Collaboration: Remote backends enable team members to collaborate effectively and avoid the manual distribution of state files.

- State History and Versioning: Many backends support state versioning and history tracking, allowing users to review changes over time and revert to previous states.

Steps to create and use a Terraform backend (an az CLI sketch follows the list)

- Create a dedicated resource-group and storage account for test and production

- Enable Soft Delete (90 days)

- Disable public blob access

- Create a container called tfstate (it will contain the backend)

- Add a delete Lock
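For reference, a minimal az CLI sketch of those steps might look like this (all names, the location and the retention period are placeholders; repeat or parameterise it for test and production):

# log in and pick the right subscription first (az login / az account set)
az group create --name rg-tfstate-test --location westeurope

# storage account with public blob access disabled
az storage account create \
  --name sttfstatetest \
  --resource-group rg-tfstate-test \
  --sku Standard_LRS \
  --allow-blob-public-access false

# enable soft delete (90 days) on blobs
az storage account blob-service-properties update \
  --account-name sttfstatetest \
  --resource-group rg-tfstate-test \
  --enable-delete-retention true \
  --delete-retention-days 90

# container that will hold the state file
az storage container create --name tfstate --account-name sttfstatetest

# delete lock on the storage account
az lock create \
  --name tfstate-delete-lock \
  --resource-group rg-tfstate-test \
  --resource-type Microsoft.Storage/storageAccounts \
  --resource sttfstatetest \
  --lock-type CanNotDelete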

Backend Documentation

- Create your resources with AZ command line

- Securing Terraform State in Azure

- Terraform on Azure Pipelines Best Practices

- Managing Terraform State in Azure: Best Practices for Multiple Environments

- Running Terraform in an Azure DevOps Pipeline: A Comprehensive Guide

- Azure DevOps Pipelines with Terraform and Stages

- Set up Backend (az storage accounts)

Once the backend has been created, the storage account has to be securely configured.

Backend Security

- Connect to your backend using a Service-Principal (see next step)

- Encryption: Enable Azure Storage Service Encryption (SSE)

- Access control: Implement role-based access control (RBAC) for your Azure Blob Storage using Azure Active Directory (Azure AD)

- Firewall: Limit access to the storage account by configuring a Service Principal

- Public network access: Enabled from selected virtual networks and IP addresses (e.g. your local IP for local testing)

- Network Routing: Internet routing

- Set Authentication method to Access key
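Those network settings can also be scripted; a hedged az CLI sketch (the account name and IP are placeholders, and note that Microsoft-hosted ADO agents have changing IPs, so plan the firewall rules accordingly):

# deny traffic by default, keep trusted Azure services allowed
az storage account update \
  --name sttfstatetest \
  --resource-group rg-tfstate-test \
  --default-action Deny \
  --bypass AzureServices

# allow your local IP for local testing
az storage account network-rule add \
  --account-name sttfstatetest \
  --resource-group rg-tfstate-test \
  --ip-address <your-local-ip>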

Once the backend has been securely set up, it is time to access it from Azure DevOps (ADO).

ADO uses what is called a Service Connection, which connects to a Service Principal (SP) in Azure.

What is a SP and why do you need it?

- A SP is an identity that is used to perform tasks in Azure on behalf of our pipeline

- It is set up in Azure Active Directory (now Microsoft Entra ID) under app registrations

- With it, the pipeline can connect to Azure and create/delete resources automatically

Advantages of SP

- Non-Interactive Authentication: an SP authenticates to Azure automatically, without any interactive login

- Granular Permissions: every SP can be granted a granular scope of permission on a resource, resource-group or subscription level

- Long-Term Access: there won’t be any tokens or passwords to update

A Service Principal will be automatically created when creating a service connection in ADO.

Creating a Service Connection in ADO

- Project settings > Service connections > New service connection > Azure Resource Manager > Service principal (automatic) > keep the resource group blank > enter a name > select ‘Grant access permission to all pipelines’

- e.g. Name in ADO: DionysosAzureServiceConnection

- e.g. Name in AAD: WineApp-******** — MPN in AAD

- Assign the Storage Blob Data Contributor role to the service principal (the app registration in Azure AD) used to connect ADO to the Azure container where the backend lives; see the az CLI sketch below
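That role assignment can also be done with the az CLI; a sketch with placeholder IDs and names, scoped to the tfstate container:

az role assignment create \
  --assignee "<service-principal-application-id>" \
  --role "Storage Blob Data Contributor" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<rg-name>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>/blobServices/default/containers/tfstate"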

How to Create an Azure Remote Backend for Terraform

Then, install the Terraform extension

- Organization settings > Extensions > marketplace > search for Terraform > select the organisation > install

Now we can create the YAML pipeline. The Terraform pipeline is made of three parts:

- the main file (azure-pipeline.yml), which passes variables (e.g. dev or prd) and calls the two other files

- the plan.yml file, which checks the Terraform code for errors and creates a plan file

- the deploy.yml file, which uses that plan file to deploy the resources

Here is the organisation of the azure-pipeline.yml

- trigger: upon pushing to the mentioned branch, the pipeline will start automatically (note that the trigger won’t fire the very first time; run the pipeline manually once)

- variables: we pass the service connection used in the pipeline and the backend configuration (please note the backend key is actually the name of the state file; the name is set here in the pipeline)

- pool: the OS of the VM (agent) that runs the pipeline, either Ubuntu or Windows. It matters because Ubuntu agents are faster, but Windows agents have tools that Ubuntu doesn’t. Also, Linux file paths use / whilst Windows paths use \.
name: $(BuildDefinitionName)$(SourceBranchName)$(date:yyyyMMdd)$(rev:.r)

trigger:
  branches:
    include:
    - master
    # - development

variables: # terraform variables
  ServiceConnectionName: 'anynameofsp'
  bk-rg-name: 'rg-name'
  bk-str-account-name: 'strname'
  bk-container-name: 'containername'
  bk-key: 'terraform.tfstate' # key is actually the name of the file, determined here

pool:
  vmImage: ubuntu-latest # This is the default if you don't specify a pool or vmImage.

stages:

- stage: validate_terraform
  displayName: 'Validate Terraform'

  jobs:
  - template: Terraform/plan.yml
    parameters:
      env: dev
  - ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/master') }}: # if branch is master then execute
    - template: Terraform/plan.yml
      parameters:
        env: prd

- stage: deploy_terraform
  displayName: 'Deploy Terraform'
  dependsOn:
  - validate_terraform # makes sure validate_terraform stage runs first
  condition: succeeded('validate_terraform') # stage runs only if validate_terraform is successful

  jobs:
  - template: Terraform/deploy.yml
    parameters:
      env: dev
  - ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/master') }}: # if branch is master then execute
    - template: Terraform/deploy.yml
      parameters:
        env: prd

How a YAML pipeline is organised

- Each pipeline has stages; each stage has jobs that run on agents; jobs contain steps, and each step is a task or script

- stage > jobs > steps > tasks

- Key concepts for new Azure Pipelines users
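A bare-bones skeleton of that hierarchy (all names are purely illustrative):

stages:
- stage: example_stage
  jobs:
  - job: example_job          # runs on an agent from the pool below
    pool:
      vmImage: ubuntu-latest
    steps:
    - script: echo "a step"   # a step can be a script...
    - task: Bash@3            # ...or a task
      inputs:
        targetType: inline
        script: echo "another step"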

How to ensure that a pipeline runs dev and prd differently and only if pushing to production?

- By adding an if statement: ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/master') }}

- This will run this step only if the branch is named master (in this case)

How to ensure that a pipeline runs only if the previous step worked?

- By adding condition: succeeded('validate_terraform')

How to use separate files in a pipeline

- In order to use another file (here plan.yml and deploy.yml) you can use the template keyword

- E.g. template: Terraform/plan.yml (Terraform is the name of the folder in the directory)

Now we can build our terraform plan.yml

- pass-on the parameters dev or prd using parameters

- first: install terraform (every terraform yml file needs to do so)

- second: initialise terraform (see init command)

- third: validate terraform (optional): checks if tf code is correct

- fourth: run plan (see plan command)

- fifth: move the plan file (terraform.tfplan) to the artifact staging directory

- finally: publish it as a build artifact

init command:

- Add the working directory where the main.tf is at (e.g. '$(System.DefaultWorkingDirectory)/Infrastructure')

- Add the backend configuration
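For this to work, the Terraform code itself only needs an empty azurerm backend block, because the pipeline task injects the actual backend values at init time. A minimal sketch (the provider version is an assumption):

# main.tf (or providers.tf); the backend values are supplied by the pipeline at init
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
  backend "azurerm" {}
}

provider "azurerm" {
  features {}
}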

plan command:

- Add the working directory where the main.tf is at (e.g. '$(System.DefaultWorkingDirectory)/Infrastructure')

- Add the name of the service connection

- command options: see command options plan

command options plan:

- -lock=false (so the agent running this stage doesn’t lock the backend)

- -var-file="vars/${{parameters.env}}.tfvars" to provide the tfvars file; without it the pipeline won’t proceed if you’re using variables

- -out=$(System.DefaultWorkingDirectory)/Infrastructure/terraform.tfplan to create the plan file that will be saved to artifacts and used by the deploy stage
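For local testing, the equivalent plan invocation would look roughly like this (run from the Infrastructure folder, dev used as an example):

terraform plan \
  -lock=false \
  -var-file="vars/dev.tfvars" \
  -out=terraform.tfplan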

save artifact:

- After plan completes, archive the entire working directory, including the .terraform subdirectory created during init, and save it somewhere where it will be available to the apply step. A common choice is as a “build artifact” within the chosen orchestration tool.
parameters:
  env: ''

jobs:
- job: ${{ parameters.env }}_validate_tf
  displayName: 'Validate ${{ parameters.env }} terraform scripts'
  pool:
    vmImage: windows-latest

  steps:
  - task: TerraformInstaller@1 # 1st: install Terraform
    displayName: 'Install Terraform' # Terraform has to be installed in every job
    inputs:
      terraformVersion: 'latest'

  - task: TerraformTaskV4@4
    displayName: 'Initialise Terraform' # init has to run in every job
    inputs:
      provider: 'azurerm'
      command: init
      workingDirectory: '$(System.DefaultWorkingDirectory)/Infrastructure' # where the Terraform code is
      # fetched from variables (azure-pipeline.yml)
      backendServiceArm: $(ServiceConnectionName)
      backendAzureRmResourceGroupName: '$(bk-rg-name)-${{ parameters.env }}'
      backendAzureRmStorageAccountName: '$(bk-str-account-name)${{ parameters.env }}'
      backendAzureRmContainerName: '$(bk-container-name)-${{ parameters.env }}'
      backendAzureRmKey: '${{ parameters.env }}$(bk-key)'

  - task: TerraformTaskV4@4
    displayName: 'Validate Terraform'
    inputs:
      provider: 'azurerm'
      command: 'validate'

  - task: TerraformTaskV4@4
    displayName: 'Plan Terraform'
    inputs:
      provider: 'azurerm'
      command: 'plan'
      workingDirectory: '$(System.DefaultWorkingDirectory)/Infrastructure'
      environmentServiceNameAzureRM: $(ServiceConnectionName)
      commandOptions: '-lock=false -var-file="vars/${{ parameters.env }}.tfvars" -out=$(System.DefaultWorkingDirectory)/Infrastructure/terraform.tfplan'
      # var-file: selecting the tfvars file for each environment
      # out: writing the plan file to the Infrastructure folder as terraform.tfplan

  - task: CopyFiles@2
    displayName: 'Moving Terraform code to artifact staging'
    inputs:
      Contents: 'Infrastructure/**'
      TargetFolder: '$(Build.ArtifactStagingDirectory)'
    # Plan and apply run on different machines,
    # so the plan output is saved to a file
    # and loaded again by the apply job

  - task: PublishBuildArtifacts@1
    displayName: 'Making artifact available to apply stage'
    inputs:
      PathtoPublish: '$(Build.ArtifactStagingDirectory)'
      ArtifactName: 'output-${{ parameters.env }}'
      publishLocation: 'Container'

Now we can build our terraform deploy.yml

- For this stage we are using a deployment job

- Deployment strategy: runOnce vs Canary vs rolling. The runOnce is the simplest deployment strategy and all steps are executed once (preDeploy, deploy, routeTraffic)

- environment: create an environment in ADO (Pipelines > Environments); read more under Deployment Environment

- Checkout: if self, the repository will be cloned within the job

- download: get the artifact produced by the plan stage

- artifact: before running apply, obtain the archive created in the previous step and extract it at the same absolute path. This re-creates everything that was present after plan, avoiding strange issues where local files were created during the plan step

- install terraform

- run terraform apply

Command options terraform apply:

- -lock=true: we want to lock the state while applying

- -lock-timeout=5m: wait up to five minutes for an existing lock on the backend to be released before failing

- the final argument is the plan file produced earlier: $(Pipeline.Workspace)/output-${{ parameters.env }}/Infrastructure/terraform.tfplan
parameters:
  env: ''

jobs:
- deployment: deploy_infrastructure_${{ parameters.env }}
  displayName: 'Deploy infrastructure for ${{ parameters.env }}'
  pool:
    vmImage: 'windows-latest'
  environment: 'deploy_infrastructure_${{ parameters.env }}' # Pipeline Environment (ADO)
  # You first need to be a Creator in the Security settings of the environment, otherwise you get:
  # "Job deploy_infrastructure_dev: Environment dev could not be found." => it needs to be created first
  # "The environment does not exist or has not been authorized for use."

  strategy:
    runOnce: # runOnce vs Canary deployment
      deploy:
        steps:
        - checkout: none # or self (clone the repo in the current job) # TO DO: try without
        - download: current # get the latest artifact
          artifact: 'output-${{ parameters.env }}' # fetch the output of the plan phase

        - task: TerraformInstaller@1 # 1st: install Terraform
          displayName: 'Install Terraform' # Terraform has to be installed in every job
          inputs:
            terraformVersion: 'latest'

        - task: TerraformTaskV4@4
          displayName: 'Apply Terraform'
          inputs:
            provider: 'azurerm'
            command: 'apply'
            workingDirectory: '$(Pipeline.Workspace)/output-${{ parameters.env }}/Infrastructure'
            environmentServiceNameAzureRM: $(ServiceConnectionName)
            commandOptions: '-lock=true -lock-timeout=5m $(Pipeline.Workspace)/output-${{ parameters.env }}/Infrastructure/terraform.tfplan'
            # lock-timeout: wait up to 5 minutes for the state lock to be released

Build a separate Terraform Destroy Pipeline

name: $(BuildDefinitionName)$(SourceBranchName)$(date:yyyyMMdd)$(rev:.r)

trigger: none

variables: # terraform variables
  ServiceConnectionName: 'nameofserviceconnection'
  bk-rg-name: 'rg-name'
  bk-str-account-name: 'sracountname'
  bk-container-name: 'tfstate'
  bk-key: 'terraform.tfstate' # key is actually the name of the file, determined here

pool:
  vmImage: ubuntu-latest # This is the default if you don't specify a pool or vmImage.

stages:

- stage: validate_terraform
  displayName: 'Validate Terraform'

  jobs:
  - ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/development') }}: # if branch is development then execute
    - template: Terraform/plan.yml
      parameters:
        env: dev
  - ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/master') }}: # if branch is master then execute
    - template: Terraform/plan.yml
      parameters:
        env: prd

- stage: destroy_terraform
  displayName: 'Destroy Terraform'

  jobs:
  - ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/development') }}: # if branch is development then execute
    - template: Terraform/destroy.yml
      parameters:
        env: dev
  - ${{ if eq(variables['Build.SourceBranch'], 'refs/heads/master') }}: # if branch is master then execute
    - template: Terraform/destroy.yml
      parameters:
        env: prd
And here is the matching destroy.yml template:

parameters:
  env: ''

jobs:
- deployment: destroy_infrastructure_${{ parameters.env }}
  displayName: 'Destroy infrastructure for ${{ parameters.env }}'
  pool:
    vmImage: 'windows-latest'
  environment: 'deploy_infrastructure_${{ parameters.env }}' # Pipeline Environment (ADO)
  # You first need to be a Creator in the Security settings of the environment:
  # "Job deploy_infrastructure_dev: Environment dev could not be found." => it needs to be created first
  # "The environment does not exist or has not been authorized for use."

  strategy:
    runOnce: # runOnce vs Canary deployment
      deploy:
        steps:
        - checkout: none # or self (clone the repo in the current job)
        - download: current # get the latest artifact
          artifact: 'output-${{ parameters.env }}' # fetch the output of the plan phase

        - task: TerraformInstaller@1 # 1st: install Terraform
          displayName: 'Install Terraform' # Terraform has to be installed in every job
          inputs:
            terraformVersion: 'latest'

        - task: TerraformTaskV4@4
          displayName: 'Destroy Terraform'
          inputs:
            provider: 'azurerm'
            command: 'destroy'
            workingDirectory: '$(Pipeline.Workspace)/output-${{ parameters.env }}/Infrastructure'
            environmentServiceNameAzureRM: $(ServiceConnectionName)
            commandOptions: '-lock=true -var-file="vars/${{ parameters.env }}.tfvars"'

Tips to help you build terraform yaml

- Use templates: in ADO go to Repo > Set up build > show assistant > get terraform suggestions, then get the yaml code and copy it in a .yml file

- Give a custom name to your pipeline

- Select the right Agent Pool

- How to set up triggers: a trigger can be a branch (e.g. branches: include: - name of branch) or a path (e.g. paths: include: - name of folder); see the snippet after this list

- Build the step validate

- Move the Terraform plan to an artifact as specified by the HashiCorp documentation

- apply in a deployment job

- Automated Terraform CLI Workflow

- To run Terraform locally on Windows: Edit system environment variables > Environment Variables > Path > New > add the folder where terraform.exe is located > open a new command line
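As an illustration of the trigger tip above, a trigger limited to a branch and a folder could look like this (branch and folder names are placeholders):

trigger:
  branches:
    include:
    - master
  paths:
    include:
    - Infrastructure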

Solving errors

Error while loading schemas for plugin components: Failed to obtain
provider schema: Could not load the schema for provider
registry.terraform.io/hashicorp/azurerm: failed to instantiate provider
"registry.terraform.io/hashicorp/azurerm" to obtain schema: fork/exec
.terraform/providers/registry.terraform.io/hashicorp/azurerm/3.72.0/linux_amd64/terraform-provider-azurerm_v3.72.0_x5:
permission denied.

Solution for ADO

- Using Ubuntu agents (Linux uses ‘/’ instead of ‘\’ in paths), Terraform couldn’t find the file. Switching to the correct agent (in my case Windows) solved it

Solution locally

- hashicorp_documentation

- In your terraform folder run the following command to find your permissions: ls -l .terraform\providers\registry.terraform.io\hashicorp\azurerm\3.72.0\windows_386

- Then run the PowerShell command to set up permissions: icacls .terraform\providers\registry.terraform.io\hashicorp\azurerm\3.72.0\windows_386

- Solution in the pipeline: give the service principal (the app registration in Azure AD) contributor rights on the backend blob container

Giving permissions to the environment

- Checks and manual validations for Deploy Terraform: the run pauses with ‘Permission needed’ for environment <nameofenv>; click Permit. Granting permission here enables the use of environment ‘deploy_infrastructure_prd’ for all waiting and future runs of this pipeline.
