Policy As Code — Open Policy Agent In Cloud Native Stack

Syed Saad Ahmed
Published in Nerd For Tech
6 min read, Jul 3, 2021

When we think of automating a process end to end, we think of automating every step of it. The phrase that comes to mind is “No Human Intervention” anywhere in the process or pipeline. As DevOps and cloud enthusiasts, let’s consider deploying infrastructure, either on a cloud provider (AWS, Google Cloud Platform or Microsoft Azure) or on a cluster of bare-metal servers. Many tools and technologies exist today for automating and provisioning infrastructure cleanly and efficiently.

Furthermore, we have the concept of Infrastructure As Code: managing and provisioning cloud-based systems and similar infrastructure through declarative configuration files, rather than through physical hardware configuration or other configuration management tools. Several tools implement automation and orchestration on different platforms, ranging from open-source solutions to enterprise-level SaaS applications. In this blog we will look at OPA (Open Policy Agent), an open-source “Policy As Code” testing tool.

Open Policy Agent (OPA)

OPA is an open-source tool that brings us the concept of Policy As Code. It speeds up testing the policies and rules defined for an infrastructure by evaluating the infrastructure code before it reaches the production environment. For example, we can treat it as a pre-deployment step: check the policies and regulations first, and only then send the command that deploys that infrastructure. OPA gives us policy-based control for cloud native environments, and it can serve as a unified toolset and framework for policy across the cloud native stack.

Whether you are thinking about one service or all the services in your stack, you can use OPA to decouple policy from the service’s code, so you can release, analyze, and review policies without compromising the availability or performance of your infrastructure.

In this blog, we will be looking at implementing OPA with different provisioning and automation tools.

Merging OPA With Terraform Provisioning Workflow

Terraform is a popular open-source tool that applies Infrastructure As Code to automate and provision cloud, infrastructure and similar services. Developed by HashiCorp, it delivers consistent workflows to provision, secure, connect, and run any infrastructure for any application.

Fig-1 : Typical Infrastructure Resource Lifecycle from AWS Docs

With OPA in our Terraform workflow, we can write policies that test the changes Terraform is about to make before it makes them. These tests help check the sanity of the changes to be applied. Moreover, they reduce the burden of peer review by automating it, and they help catch problems that arise when applying Terraform to production after applying it to staging.

Testing your Terraform Configuration with OPA

OPA acts as a gate before Terraform actually provisions any infrastructure, so teams can identify compliance issues at the earliest opportunity. Here’s a workflow that uses OPA in a pipeline to determine whether the given Terraform (TF) code is valid.

Pipeline for Terraform configuration execution with Open Policy Agent.

Example of Testing Microsoft Azure Terraform Configuration

Here we take a sample Terraform configuration that creates a few of the resources required for a virtual machine. For the complete Terraform configuration with Azure, you can have a look here.

1) First, create sample Terraform code to provision a Virtual Machine in an Azure environment.

# Configure the Microsoft Azure Provider
terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
    }
  }
}

provider "azurerm" {
  features {}
}

# Create virtual machine
resource "azurerm_linux_virtual_machine" "myterraformvm" {
  name                  = "myVM"
  location              = "eastus"
  resource_group_name   = "test-resource-group"
  network_interface_ids = ["xxxxx"] # placeholder: list of network interface IDs
  size                  = "Standard_A0"

  os_disk {
    name                 = "myOsDisk"
    caching              = "ReadWrite"
    storage_account_type = "Standard_LRS"
    disk_size_gb         = 30
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  computer_name                   = "myvm"
  admin_username                  = "azureuser"
  disable_password_authentication = true

  admin_ssh_key {
    username   = "azureuser"
    public_key = file("") # path to your SSH public key
  }

  tags = {
    environment = "Demo"
  }
}

This Terraform script creates an Azure Virtual Machine running Ubuntu 18.04-LTS.

2) After setting up the configuration, we run terraform plan and save the output to a binary file.

terraform init
terraform plan --out tfplan.binary

3) Next, convert the Terraform plan to JSON. There are several options for that; here is one:

terraform show -json tfplan.binary > tfplan.json
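To get a feel for what the policy will read, it helps to look at the shape of the JSON plan. The sketch below uses a trimmed, hand-written stand-in for tfplan.json (not real output) and assumes the plan follows Terraform's JSON plan representation, with a top-level resource_changes list whose entries carry provider_name and change.after. The helper name summarize_changes is made up for illustration.

```python
import json

# Trimmed, hand-written stand-in for tfplan.json; a real plan has many more fields.
SAMPLE_PLAN = """
{
  "resource_changes": [
    {
      "address": "azurerm_linux_virtual_machine.myterraformvm",
      "type": "azurerm_linux_virtual_machine",
      "name": "myterraformvm",
      "provider_name": "registry.terraform.io/hashicorp/azurerm",
      "change": {
        "actions": ["create"],
        "before": null,
        "after": {"name": "myVM", "location": "eastus", "size": "Standard_A0"}
      }
    }
  ]
}
"""


def summarize_changes(plan):
    """Return (address, actions, planned size) for every resource change."""
    rows = []
    for rc in plan.get("resource_changes", []):
        change = rc.get("change", {})
        after = change.get("after") or {}  # "after" is null when a resource is deleted
        rows.append((rc["address"], change.get("actions", []), after.get("size")))
    return rows


for row in summarize_changes(json.loads(SAMPLE_PLAN)):
    print(row)
```

The two fields that matter for the policy in the next step are provider_name, which identifies the provider, and change.after, which holds the attribute values the resource will have once the plan is applied.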

4) Now comes the part where we write our policy check.

For example, we can check whether the VM we created above with the Terraform code uses our desired VM size (the size argument of the resource in particular), which is Standard_A0. For this purpose we will write a .rego file, which we can call instance_check.rego.

What is Rego? It is the language used to write OPA policies. More can be learned about Rego here. OPA also provides an online interactive environment where we can test policies; you can have a look here.

# Multi-provider rule to enforce instance type/size
# Uses the pre-1.0 Rego syntax (no if/contains keywords); OPA 1.0+ needs --v0-compatible.
package terraform.analysis

# The plan JSON is passed directly as input
import input as tfplan

# Allowed sizes by provider
allowed_types = {
    "azurerm": ["Standard_A0", "Standard_A1"],
}

# Attribute name for instance type/size by provider
instance_type_key = {
    "azurerm": "size",
}

# True when elem is a member of arr
array_contains(arr, elem) {
    arr[_] = elem
}

# Last path segment, e.g. "registry.terraform.io/hashicorp/azurerm" -> "azurerm"
get_basename(path) = basename {
    arr := split(path, "/")
    basename := arr[count(arr) - 1]
}

# Extracts the instance type/size
get_instance_type(resource) = instance_type {
    provider_name := get_basename(resource.provider_name)
    instance_type := resource.change.after[instance_type_key[provider_name]]
}

# One reason per resource whose size is not on the allow list
deny[reason] {
    resource := tfplan.resource_changes[_]
    instance_type := get_instance_type(resource)
    provider_name := get_basename(resource.provider_name)
    not array_contains(allowed_types[provider_name], instance_type)
    reason := sprintf(
        "instance type %q is not allowed",
        [instance_type]
    )
}
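For readers new to Rego, the same decision logic can be mirrored in plain Python. This is only a reading aid, not how OPA evaluates the policy. It assumes the plan JSON has the resource_changes structure the policy iterates over, and that azurerm_linux_virtual_machine exposes the VM size under the size attribute, as in the Terraform code above. The constants and function names simply echo the policy.

```python
ALLOWED_TYPES = {"azurerm": ["Standard_A0", "Standard_A1"]}

# Assumption: the attribute that holds the VM size, per provider.
INSTANCE_TYPE_KEY = {"azurerm": "size"}


def get_basename(path):
    """'registry.terraform.io/hashicorp/azurerm' -> 'azurerm'."""
    return path.split("/")[-1]


def deny(plan):
    """Collect one message per resource whose size is not on the allow list."""
    reasons = []
    for resource in plan.get("resource_changes", []):
        provider = get_basename(resource.get("provider_name", ""))
        key = INSTANCE_TYPE_KEY.get(provider)
        after = resource.get("change", {}).get("after") or {}
        if key is None or key not in after:
            # Mirrors Rego: an undefined lookup means the rule body does not match.
            continue
        instance_type = after[key]
        if instance_type not in ALLOWED_TYPES.get(provider, []):
            reasons.append('instance type "%s" is not allowed' % instance_type)
    return reasons


# Hand-written sample input with a size that is not on the allow list.
sample_plan = {
    "resource_changes": [
        {
            "provider_name": "registry.terraform.io/hashicorp/azurerm",
            "change": {"after": {"size": "Standard_B1s"}},
        }
    ]
}
print(deny(sample_plan))
```

An empty list means every resource passed; in the Rego version, that corresponds to an empty deny set.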

Now you can run the following command on your Mac or any other OS to evaluate the policy you have written:

opa eval --format pretty --data instance_check.rego --input tfplan.json "data.terraform.analysis"
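In a pipeline, the result of the evaluation has to become a pass/fail decision. One possible way to do that, sketched below, is to query only the deny set with JSON output and let a small script turn it into an exit code. The sketch assumes the JSON output of opa eval has the form {"result": [{"expressions": [{"value": ...}]}]}; the function names violations and gate, and the sample output, are made up for illustration.

```python
import json
import sys


def violations(opa_json):
    """Extract deny messages from the JSON output of an opa eval query on deny."""
    doc = json.loads(opa_json)
    found = []
    for result in doc.get("result", []):
        for expr in result.get("expressions", []):
            value = expr.get("value")
            if isinstance(value, list):  # a Rego set is serialized as a JSON array
                found.extend(str(v) for v in value)
    return found


def gate(opa_json):
    """Return a process exit code: 0 when nothing was denied, 1 otherwise."""
    messages = violations(opa_json)
    for message in messages:
        print("policy violation:", message, file=sys.stderr)
    return 1 if messages else 0


# Hand-written sample of what the evaluation might return for a disallowed size.
SAMPLE_OUTPUT = json.dumps({
    "result": [{"expressions": [{
        "value": ['instance type "Standard_B1s" is not allowed'],
        "text": "data.terraform.analysis.deny",
    }]}]
})
print("exit code:", gate(SAMPLE_OUTPUT))
```

In a real pipeline step, the script would read the output of opa eval --format json --data instance_check.rego --input tfplan.json "data.terraform.analysis.deny" from standard input and pass the return value of gate to sys.exit, so that a non-empty deny set stops the run before terraform apply.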

5) Output:

The output contains the reason, because the deny rule in the instance_check.rego file builds a reason message for every violation. If the instance type does not match, it prints a message like this:

"instance type Standard_B1s is not allowed"

In A Nutshell

After seeing the power of using OPA with Terraform, one can easily imagine using it as a pre-deployment check: run OPA before applying the Terraform plan and check all the policies specific to a project, in order to keep good control over the infrastructure. It is a powerful combination for provisioning cloud infrastructure and setting up cloud instances. One advantage of OPA is that it is open source and free to use. Furthermore, there are many other options available, such as using OPA with Kubernetes, Docker, Prometheus and other such tools and technologies. I hope this helps your research or your interest in automating policy enforcement without any manual or human intervention.

There are a lot of OPA sample policies already developed and listed here. One can also write one's own policy-checking .rego file. So keep automating the pipeline that creates and provisions your infrastructure.
