Building End-to-End MLOps Pipelines for Sentiment Analysis on Azure with Terraform, Kubeflow v2, MLflow, and Seldon: Part 2

Rachit Ahuja
11 min read · Jun 5, 2023


Part 1: Introduction and Architecture planning

Part 2: Developing workflows for infrastructure deployment via CI-CD

Part 3: Setup MLflow on AKS

Part 4: Setup Kubeflow and Seldon on AKS

Part 5: End-to-End training pipeline and Inference Deployment using Seldon

This section of the blog post focuses on the process of creating infrastructure from scratch in a reusable manner. The objective is to establish a centralised code repository where code can be pushed, triggering an Azure pipeline that provisions the infrastructure based on the code changes. It is assumed that the user has already set up an account in the Azure portal and Azure DevOps, although assistance for setting up these accounts can be found at this link: https://learn.microsoft.com/en-us/azure/devops/get-started/?view=azure-devops.
The complete code for this implementation is available on GitHub: https://github.com/rahuja23/Blog-Post-Infrastructure/tree/main.

1. Infrastructure as Code (IaC)

Infrastructure as Code (IaC) is a rapidly growing practice in the industry: managing and provisioning infrastructure through code rather than manual processes. IaC is a core DevOps practice and is commonly employed in conjunction with continuous delivery. Adopting IaC offers numerous advantages, including increased speed, reliability, and scalability. One of the primary reasons for implementing IaC in an industrial setting, however, is to mitigate the errors and discrepancies that arise when infrastructure is provisioned or re-provisioned by hand. By leveraging IaC, organisations can minimise manual intervention, reduce the likelihood of human error, and ensure consistent, standardised infrastructure deployments.

1.1 Azure DevOps Setup

In this section we will setup our Azure DevOps organisation and create a repository for our infrastructure code:

  • First step is to set up an organisation in Azure DevOps (here). In our case we will call it “Blog-Post”.
  • Second step is to create a project inside this organisation. In our case we will call the project “Blog-Post-Infra”.
  • Next we have to connect our Azure DevOps account with our account in the Azure Portal. This is done by linking the Active Directory of our Azure Portal to the Azure DevOps organisation we just created, via Organisation settings → Azure Active Directory → Connect directory.
  • Once everything is connected, we create a repository where we will push our infrastructure code. We will name this repository “Infrastructure”; it will be the central repository for all the infrastructure we provision.

Now we have an Azure DevOps organisation with a project and a repository to contain our infrastructure code. What we still need is an authorised way for Azure DevOps to provision resources in the Azure Portal. This is needed because, while the code that provisions the infrastructure lives in an Azure DevOps repository, the actual infrastructure is not provisioned on Azure DevOps itself. Since we don’t want to provision the infrastructure locally but want Azure Pipelines to create it for us, the best practice is to use Azure Service Connections. To set up a service connection we go to: Project Settings → Service connections → New service connection. For this implementation we named our service connection “terraform-basic-testing-azure-connection”. This is the service connection our Azure pipeline will use when running the Terraform code in our repository to provision infrastructure on our behalf.
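Under the hood, a service connection of type “Azure Resource Manager” authenticates through a service principal in Azure Active Directory. As a minimal sketch (the service principal name and subscription ID below are placeholders, not values from this project), the same credentials that our pipeline later consumes as AppId, password and Tenant_id could be created with the Azure CLI:

____________________________________________________________________
# Select the subscription the pipeline should provision into
az login
az account set --subscription "<SUBSCRIPTION_ID>"

# Create a service principal with Contributor rights on the subscription.
# The JSON output contains appId, password and tenant, which map to the
# AppId, password and Tenant_id variables referenced by the pipeline later.
az ad sp create-for-rbac \
  --name "blog-post-infra-sp" \
  --role Contributor \
  --scopes "/subscriptions/<SUBSCRIPTION_ID>"
____________________________________________________________________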

  • The last step to complete our initial setup is to make Terraform available to our Azure CI-CD pipelines. For this purpose we install Terraform as an Azure DevOps extension, via: Organization Settings → Extensions → Browse marketplace. For this blog post we installed the extension by Microsoft DevLabs.

1.2 Infrastructure

In this section we will start implementing the actual code for our infrastructure. To keep the implementation as clean as possible, we first create Terraform modules for each infrastructure component that we want to provision; inside our environment directory we then have environment-specific Terraform files that call these modules and specify the configuration with which the infrastructure should be provisioned. The final structure of our directory looks as follows:

Infrastructure/
├─ .pipelines/
│  ├─ azure-pipelines.yml
├─ Base_Infra/
│  ├─ az-remote-backend-main.tf
│  ├─ az-remote-backend-output.tf
│  ├─ az-remote-backend-variables.tf
├─ Dev/
│  ├─ backend.tf
│  ├─ main.tf
│  ├─ outputs.tf
│  ├─ providers.tf
│  ├─ terraform.tfvars
│  ├─ variables.tf
├─ Infrastructure_modules/
│  ├─ kubernetes-cluster/
│  │  ├─ main.tf
│  │  ├─ variables.tf
│  ├─ container-registry/
│  │  ├─ main.tf
│  │  ├─ variables.tf
│  ├─ resource-group/
│  │  ├─ main.tf
│  │  ├─ variables.tf

The sub-directories inside our main repository “Infrastructure” are as follows:

  • .pipelines: This directory contains the Azure pipeline we will use to provision our infrastructure.
  • Base_Infra: This directory contains the base infrastructure on top of which all additional infrastructure is provisioned. The Terraform code inside provisions an Azure resource group, which ties our whole infrastructure together, and a storage account inside that resource group where we save all our Terraform state files. Provisioning this base infrastructure could also be automated, but in our implementation we provision it manually.
  • Dev: This directory contains the Terraform files that are provisioned via our Azure pipeline, with configuration that applies only to our dev environment. If we later want to extend to a multi-environment setup (e.g. staging), we would add an additional “Staging” directory with Terraform files for that environment.
  • Infrastructure_modules: This directory contains the Terraform files for the individual infrastructure components, packaged as modules.

We will start with the “Base_Infra” directory, which contains the code for the base infrastructure, i.e. a resource group and a storage account for our Terraform state files. Again, for the purpose of this blog post we built the base infrastructure manually, running terraform init, terraform plan and terraform apply locally rather than through a CI-CD pipeline; the corresponding commands are sketched below.
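A minimal sketch of that manual run, executed from inside the Base_Infra directory (assuming an authenticated Azure CLI session, which the azurerm provider picks up):

____________________________________________________________________
# Authenticate so the azurerm provider can find credentials
az login

# Initialise the working directory and download the required providers
terraform init

# Preview the changes and save the execution plan
terraform plan -out=base.tfplan

# Create the resource group, storage account and state container
terraform apply base.tfplan
____________________________________________________________________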

Base_Infra/az-remote-backend-main.tf
____________________________________________________________________
provider "azurerm" {
  features {}
}

terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.78.0"
    }
  }
}

# Generate a random suffix for the storage account name
resource "random_string" "tf-name" {
  length  = 8
  upper   = false
  number  = true
  lower   = true
  special = false
}

# Create a Resource Group for the Terraform State File
resource "azurerm_resource_group" "state-rg" {
  name     = "${lower(var.company)}-tfstate-rg"
  location = var.location

  lifecycle {
    prevent_destroy = false
  }

  tags = {
    environment = var.environment
  }
}

# Create a Storage Account for the Terraform State File
resource "azurerm_storage_account" "state-sta" {
  depends_on                = [azurerm_resource_group.state-rg]
  name                      = "${lower(var.company)}tf${random_string.tf-name.result}"
  resource_group_name       = azurerm_resource_group.state-rg.name
  location                  = azurerm_resource_group.state-rg.location
  account_kind              = "StorageV2"
  account_tier              = "Standard"
  access_tier               = "Hot"
  account_replication_type  = "ZRS"
  enable_https_traffic_only = true

  lifecycle {
    prevent_destroy = true
  }

  tags = {
    environment = var.environment
  }
}

# Create a Storage Container for the Core State File
resource "azurerm_storage_container" "core-container" {
  depends_on           = [azurerm_storage_account.state-sta]
  name                 = "core-tfstate"
  storage_account_name = azurerm_storage_account.state-sta.name
}

The variables used in this configuration are stored in the “az-remote-backend-variables.tf” file.

Base_Infra/az-remote-backend-variables.tf
____________________________________________________________________
# company
variable "company" {
  type        = string
  default     = "blogpost"
  description = "This variable defines the name of the company"
}

# environment
variable "environment" {
  type        = string
  default     = "dev"
  description = "This variable defines the environment to be built"
}

# azure region
variable "location" {
  type        = string
  description = "Azure region where resources will be created"
  default     = "West Europe"
}

Base_Infra/az-remote-backend-output.tf
____________________________________________________________________
output "terraform_state_resource_group_name" {
value = azurerm_resource_group.state-rg.name
}
output "terraform_state_storage_account" {
value = azurerm_storage_account.state-sta.name
}
output "terraform_state_storage_container_core" {
value = azurerm_storage_container.core-container.name
}
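After the apply completes, these outputs report the exact resource names (including the randomised storage account name) that we will need when configuring the remote backend and the pipeline variables later on. They can be read back at any time:

____________________________________________________________________
# Print all outputs of the base infrastructure state
terraform output

# Or read a single value, e.g. the generated storage account name
terraform output terraform_state_storage_account
____________________________________________________________________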

Now we will go through the infrastructure modules, i.e. the sub-directories inside the “Infrastructure_modules” directory, starting with the “kubernetes-cluster” sub-directory.

Infrastructure_modules/kubernetes-cluster/main.tf
____________________________________________________________________
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.78.0"
    }
  }
}

resource "azurerm_kubernetes_cluster" "aks" {
  name                = var.cluster_name
  kubernetes_version  = var.kubernetes_version
  location            = var.location
  resource_group_name = var.resource_group_name
  dns_prefix          = var.cluster_name
  node_resource_group = var.node_resource_group

  default_node_pool {
    name                = "system"
    node_count          = var.system_node_count
    vm_size             = "Standard_DS2_v2"
    type                = "VirtualMachineScaleSets"
    availability_zones  = [1, 2, 3]
    enable_auto_scaling = false
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    load_balancer_sku = "Standard"
    network_plugin    = "kubenet" # alternatively: "azure" (CNI)
  }
}
____________________________________________________________________
Infrastructure_modules/kubernetes-cluster/variables.tf
____________________________________________________________________
variable "resource_group_name" {
type = string
description = "RG name in Azure"
}

variable "location" {
type = string
description = "Resources location in Azure"
}

variable "cluster_name" {
type = string
description = "AKS name in Azure"
}

variable "kubernetes_version" {
type = string
description = "Kubernetes version"
}

variable "system_node_count" {
type = number
description = "Number of AKS worker nodes"
}

variable "node_resource_group" {
type = string
description = "RG name for cluster resources in Azure"
}
____________________________________________________________________
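Once a cluster has been provisioned through this module, a quick way to verify it (and to interact with it in the later parts of this series) is to pull its kubeconfig with the Azure CLI. The sketch below uses the resource group and cluster names we will set in terraform.tfvars:

____________________________________________________________________
# Merge the AKS credentials into the local kubeconfig
az aks get-credentials \
  --resource-group blogpost-terraform \
  --name terraform-aks

# Sanity check: the three system node pool nodes should be Ready
kubectl get nodes
____________________________________________________________________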
Infrastructure_modules/container-registry/main.tf
____________________________________________________________________
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "2.78.0"
    }
  }
}

resource "azurerm_container_registry" "acr" {
  name                = var.name
  resource_group_name = var.resource_group_name
  location            = var.location
  sku                 = "Premium"
  admin_enabled       = false

  # The geo-replication region must differ from the registry's home
  # region (here "West Europe"), otherwise Azure rejects the request.
  georeplications {
    location                = "northeurope"
    zone_redundancy_enabled = true
    tags                    = {}
  }
}
____________________________________________________________________
Infrastructure_modules/container-registry/variables.tf
____________________________________________________________________
variable "resource_group_name" {
type = string
description = "RG name in Azure"
}

variable "location" {
type = string
description = "Resources location in Azure"
}
variable "name" {
type = string
description = "Name for container registry"
}
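Note that the module keeps admin_enabled = false. For AKS to pull images from this registry without admin credentials, one common approach (not part of the Terraform code above) is to attach the registry to the cluster after both are provisioned, which grants the cluster’s kubelet identity AcrPull rights:

____________________________________________________________________
# Grant the AKS kubelet identity pull access to the registry
az aks update \
  --resource-group blogpost-terraform \
  --name terraform-aks \
  --attach-acr blogposttfregistry
____________________________________________________________________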

Currently, we only require these two specific infrastructure modules. Moving forward, we will explore the environment-specific Terraform files, where we will invoke these modules to provision infrastructure tailored to our respective environments.

Dev/backend.tf
____________________________________________________________________
# The backend configuration is intentionally left empty here; it is
# injected at "terraform init" time (see providers.tf and the pipeline).
terraform {
  // backend "azurerm" {
  //   resource_group_name  = "tf_state"
  //   storage_account_name = "tfstate019"
  //   container_name       = "tfstate"
  //   key                  = "terraform.tfstate"
  // }
}
____________________________________________________________________
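The empty, commented-out backend block is what Terraform calls a partial configuration: the backend values are supplied at “terraform init” time instead of being hard-coded. Our Azure pipeline injects them through the Terraform task (see Section 2), but the same initialisation could be done locally, using the names produced by the Base_Infra outputs:

____________________________________________________________________
terraform init \
  -backend-config="resource_group_name=blogpost-tfstate-rg" \
  -backend-config="storage_account_name=blogposttfhzf8zk75" \
  -backend-config="container_name=core-tfstate" \
  -backend-config="key=terraform.tfstate"
____________________________________________________________________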
Dev/main.tf
____________________________________________________________________
module "azure_container_registry" {
source = "../Infrastructure/container-registry"
resource_group_name = var.resource_group_name
location = var.location
name = var.registry_name
}


module "azurerm_kubernetes_cluster" {
resource_group_name = var.resource_group_name
location = var.location
cluster_name = var.cluster_name
kubernetes_version = var.kubernetes_version
system_node_count = var.system_node_count
node_resource_group = var.node_resource_group
source = "../Infrastructure/kubernetes-cluster"
}
____________________________________________________________________
Dev/providers.tf
____________________________________________________________________
provider "azurerm" {
features {}
}

terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "2.78.0"
}
}
backend "azurerm" {
}
}
____________________________________________________________________
Dev/variables.tf
____________________________________________________________________
variable "resource_group_name" {
type = string
description = "RG name in Azure"
}

variable "location" {
type = string
description = "Resources location in Azure"
}

variable "cluster_name" {
type = string
description = "AKS name in Azure"
}

variable "kubernetes_version" {
type = string
description = "Kubernetes version"
}
variable "registry_name" {
type = string
description = "Name for our azure container registry"
}

variable "system_node_count" {
type = number
description = "Number of AKS worker nodes"
}

variable "node_resource_group" {
type = string
description = "RG name for cluster resources in Azure"
}

One notable aspect of the Terraform files above is the extensive use of variables for configuration values such as resource group and container registry names. Terraform offers two primary ways to set these values: store them in a “.tfvars” file that is supplied when the scripts are executed, or pass them directly on the command line when invoking “terraform plan” or “terraform apply”.

In our specific case, we will generate a “.tfvars” file that will be accessed by our Azure pipelines when provisioning the infrastructure. This allows for seamless integration and automation of the provisioning process, as the necessary variable values can be conveniently supplied from the designated “.tfvars” file.
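Both styles look as follows; note that a file named exactly terraform.tfvars is auto-loaded, so the -var-file flag is only needed for differently named files:

____________________________________________________________________
# Option 1: values from a .tfvars file
terraform apply -var-file="terraform.tfvars"

# Option 2: values passed directly on the command line
terraform apply -var "location=West Europe" -var "system_node_count=3"
____________________________________________________________________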

Dev/terraform.tfvars
____________________________________________________________________
resource_group_name = "blogpost-terraform"
location = "West Europe"
cluster_name = "terraform-aks"
kubernetes_version = "1.24.3"
system_node_count = 3
node_resource_group = "blogpost-tfstate-resources-rg"
terraform_backend_storage= "blogposttfhzf8zk75"
terraform_backend_container = "core-tfstate"
registry_name = "blogpost-tfstate-registry"

With the completion of our Terraform scripts for infrastructure provisioning, the subsequent step involves building an Azure pipeline. This pipeline will be responsible for retrieving and executing these scripts to effectively provision the desired infrastructure.

2. Azure Pipelines to provision infrastructure

In this section, our focus shifts towards creating an Azure pipeline that will actively retrieve the infrastructure scripts from the previous section. This pipeline will play a pivotal role in executing these scripts, resulting in the actual provisioning of the desired resources.

.pipelines/azure-pipelines.yml
____________________________________________________________________
trigger:
  branches:
    include:
      - '*'

variables:
  - group: BlogPost-Infra-vg
  - name: terraformWorkingDirectory
    value: '$(System.DefaultWorkingDirectory)/Dev'
  - name: backendAzureRmResourceGroupName
    value: 'blogpost-tfstate-rg'
  - name: backendAzureRmStorageAccountName
    value: 'blogposttfhzf8zk75'
  - name: backendAzureRmContainerName
    value: 'core-tfstate'
  - name: backendAzureRmKey
    value: 'terraform.tfstate'

pool:
  vmImage: ubuntu-latest

stages:
  - stage: TerraformContinuousIntegration
    displayName: Terraform Module - CI
    jobs:
      - job: TerraformContinuousIntegrationJob
        displayName: TerraformContinuousIntegration - CI Job
        pool:
          vmImage: ubuntu-20.04
        steps:
          # Step 1: Install the Azure CLI and log in via the service principal
          - task: AzureCLI@2
            inputs:
              azureSubscription: '$(serviceConnection)'
              scriptType: bash
              scriptLocation: inlineScript
              workingDirectory: $(terraformWorkingDirectory)
              inlineScript: |
                sudo curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash
                az --version
                az login --service-principal -u $(AppId) -p $(password) --tenant $(Tenant_id)
          # Step 2: Install Terraform on the Azure Pipelines agent
          - task: TerraformInstaller@0
            displayName: 'install'
            inputs:
              terraformVersion: $(terraformVersion)
          # Step 3: Run terraform init to initialize the workspace
          - task: TerraformTaskV2@2
            displayName: 'Run Terraform init'
            inputs:
              command: init
              workingDirectory: $(terraformWorkingDirectory)
              backendServiceArm: $(serviceConnection)
              backendAzureRmResourceGroupName: '${{ variables.backendAzureRmResourceGroupName }}'
              backendAzureRmStorageAccountName: '${{ variables.backendAzureRmStorageAccountName }}'
              backendAzureRmContainerName: '${{ variables.backendAzureRmContainerName }}'
              backendAzureRmKey: '${{ variables.backendAzureRmKey }}'
          # Step 4: Run terraform validate to validate the HCL syntax
          - task: TerraformTaskV2@2
            displayName: 'Run Terraform validate'
            inputs:
              command: validate
              workingDirectory: $(terraformWorkingDirectory)
          # Step 5: Run terraform plan to preview the changes
          - task: TerraformTaskV2@2
            displayName: 'Run Terraform plan'
            inputs:
              command: plan
              workingDirectory: $(terraformWorkingDirectory)
              environmentServiceNameAzureRM: '$(serviceConnection)'
              commandOptions: -var location=$(azureLocation)
          # Step 6: Run terraform apply to provision the infrastructure
          - task: TerraformTaskV2@2
            displayName: 'Run Terraform apply'
            inputs:
              command: apply
              workingDirectory: $(terraformWorkingDirectory)
              environmentServiceNameAzureRM: '$(serviceConnection)'
              commandOptions: -var location=$(azureLocation)

The pipeline above installs Terraform, picks up the Terraform files from the directory given by the “terraformWorkingDirectory” variable, and provisions our infrastructure while storing the state files in blob storage. Note that sensitive values, such as the service connection name and the service principal credentials, are supplied as environment variables from a variable group. In our case, this variable group is named “BlogPost-Infra-vg”.

Azure DevOps offers a reliable way to create a secure variable group in which all the required environment variables can be safely stored. This approach protects sensitive data: secret variables are masked in pipeline logs, and Azure provides granular control over visibility, letting you determine which users, pipelines, or identities can access the values. This additional level of security safeguards the confidentiality and integrity of the environment variables within the Azure ecosystem.
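As a sketch, such a variable group could also be created from the command line with the Azure DevOps CLI extension. The variable names below match what our pipeline references; the Terraform version value is a placeholder, and real secrets (AppId, password, Tenant_id) should be added as secret variables rather than in plain text:

____________________________________________________________________
# Requires the Azure DevOps CLI extension: az extension add --name azure-devops
az pipelines variable-group create \
  --organization https://dev.azure.com/Blog-Post \
  --project Blog-Post-Infra \
  --name BlogPost-Infra-vg \
  --variables serviceConnection=terraform-basic-testing-azure-connection \
              terraformVersion=1.0.8 \
              azureLocation="West Europe"

# Secrets are better added individually, e.g.:
# az pipelines variable-group variable create --group-id <GROUP_ID> \
#   --name password --value <SP_PASSWORD> --secret true
____________________________________________________________________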

Once this setup is complete, the pipeline is triggered as soon as a developer pushes the Terraform code to the Azure repository.
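For example, since the trigger section includes all branches ('*'), any push kicks off a run:

____________________________________________________________________
git add .
git commit -m "Provision dev AKS cluster and container registry"
git push origin main
____________________________________________________________________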


Rachit Ahuja

Machine learning and Data Engineer at Data Reply GmbH