OPA Series Part 2: OPA Logic and Structure for Scalr

Published in scalr · 5 min read · Apr 12, 2023

By Ryan Fee

Open Policy Agent (OPA) is a general-purpose policy engine, with policies written in the declarative Rego language, that can be used across your cloud ecosystem to ensure controlled deployments. It has grown in popularity with the Terraform community as a way to check Terraform plans and ensure DevOps teams deploy according to organizational standards.

Part one of this series provided an overview of developing and testing OPA policies. This article provides a more detailed guide to writing OPA policies for Terraform for use with Scalr. It provides commonly used OPA expressions and explains the specific implementation of OPA used by Scalr.

How OPA Works

The basic concept of OPA is as follows.

“Evaluate expressions against some input data and generate some output”

In Scalr’s use of OPA, the input is the tfplan and tfrun data, and the output Scalr looks for is an array of strings generated by rules named “deny”.
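As a rough illustration of that shape, here is a minimal sketch of a “deny” rule. The check itself (reporting every resource the plan would create) is hypothetical and only there to show how a rule turns input data into reason strings; it uses the same Rego syntax as the other listings in this article.

```rego
package terraform

# Hypothetical check: report every resource change whose actions include "create".
# The variable in the header (reason) is what ends up in the "deny" output.
deny[reason] {
  resource := input.tfplan.resource_changes[_]
  resource.change.actions[_] == "create"
  reason := sprintf("%s would be created", [resource.address])
}
```

Each string added to “deny” is one violation message as far as Scalr is concerned.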

You can read more about OPA inputs here. There is an overview of the tfplan and tfrun data in the Scalr documentation, and the next article in this series will provide a detailed description of this data and its use with OPA.

If an OPA policy produces any output then Scalr considers the policy check to have been violated.

To be clear, OPA itself does not decide whether a policy has been violated or not. OPA simply evaluates the policy and returns any output from the rules. It is up to the calling system to decide how to interpret that output, and in Scalr any output from a rule called “deny” means the policy has been violated.

This means that all policies used in Scalr must contain at least one “deny” rule as shown in this pseudo example.

The input data (JSON):

{
  "demo_data": {
    "messages": [
      {
        "language": "english",
        "salutation": "Hello",
        "target": "world"
      },
      {
        "language": "italian",
        "salutation": "Ciao",
        "target": "mondo"
      }
    ]
  }
}

The policy:
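A minimal sketch of a policy that fits the input above and the output shown below. The rule bodies and the variable name msg are assumptions, not the original listing: one “deny” rule looks for an English message and a second looks for a French one.

```rego
package terraform

# Rule 1: succeeds for the "english" element of the messages array.
deny[reason] {
  msg := input.demo_data.messages[_]
  msg.language == "english"
  reason := sprintf("%s %s", [msg.salutation, msg.target])
}

# Rule 2: no element has language "french", so this rule adds nothing.
deny[reason] {
  msg := input.demo_data.messages[_]
  msg.language == "french"
  reason := sprintf("%s %s", [msg.salutation, msg.target])
}
```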

This simple example is a good guide to how OPA works. When a policy is evaluated, each rule called “deny” is processed. Any rule where all expressions are TRUE returns the text in its header. If we evaluate this policy the output is as follows.

$ opa eval --format pretty --data FILENAME.rego -i FILENAME.json data.terraform
{
  "deny": [
    "Hello world"
  ]
}

Only the first “deny” rule produced output because it is the only rule where every expression has been evaluated to TRUE, i.e. there is no message in the array for “french”. Processing of a rule terminates as soon as an expression result is FALSE or UNDEFINED.

Note that the output text is assigned to the variable in the rule header from within the body, using sprintf(). The reason text could instead be hard-coded in the header [“Hello world”], but assigning it in the body allows dynamic reason text to be generated.
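For comparison, a sketch of the hard-coded form, assuming the same demo_data input as above. The header carries a fixed string, so the body only decides whether the rule fires.

```rego
package terraform

# The reason text is fixed in the header; the body is just the condition.
deny["Hello world"] {
  msg := input.demo_data.messages[_]
  msg.language == "english"
}
```

The trade-off is that the message cannot include values from the input, such as a resource address.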

Terraform Example

We will now step through a complete example that uses Terraform plan input data (tfplan) and checks that all instances use one of the allowed IAM instance profiles.

The input data (abbreviated) looks like this.

{
  "tfplan": {
    "format_version": "0.1",
    "terraform_version": "0.12.28",
    "planned_values": {
      ...
    },
    "resource_changes": [
      {
        "address": "aws_instance.example",
        ...
        "change": {
          "actions": [
            "create"
          ],
          "before": null,
          "after": {
            "ami": "ami-05cf2c352da0bfb2e",
            ...
            "iam_instance_profile": "my_iam_profile_x",
            ...
          },
          "after_unknown": {
            ...
          }
        }
      }
    ],
    "prior_state": {
      ...
    },
    "configuration": {
      ...
    }
  }
}

Within tfplan is an array of resource_changes. There is one element in the array for each instance in the plan, and change.after.iam_instance_profile holds the planned value. The complete policy is as follows.

package terraform
 
import input.tfplan as tfplan
import input.tfrun as tfrun
 
allowed_iam_profiles = [
  "my_iam_profile",
  "my_iam_profile_2",
  "my_iam_profile_3"
]
 
array_contains(arr, elem) {
  arr[_] = elem
}
 
deny[reason] {
  resource := tfplan.resource_changes[_]
  iam := resource.change.after.iam_instance_profile
  not array_contains(allowed_iam_profiles, iam)
 
  reason := sprintf(
  "%-40s :: iam_instance_profile %s is not allowed.",
  [resource.address, iam]
  )
}
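One possible variation, shown only as a sketch and not as the form Scalr's example uses: holding the allowed values in a set instead of an array lets the membership check be written as a direct lookup, with no helper function.

```rego
package terraform

import input.tfplan as tfplan

# A set (curly braces) rather than an array; the values are the same as above.
allowed_iam_profiles = {
  "my_iam_profile",
  "my_iam_profile_2",
  "my_iam_profile_3"
}

deny[reason] {
  resource := tfplan.resource_changes[_]
  iam := resource.change.after.iam_instance_profile
  # Looking up a value in a set is undefined when it is absent, so "not" succeeds.
  not allowed_iam_profiles[iam]

  reason := sprintf(
    "%-40s :: iam_instance_profile %s is not allowed.",
    [resource.address, iam]
  )
}
```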

Arrays and Loops

As seen above the “tfplan” data will usually contain arrays, especially if multiple resources exist in the Terraform configuration. Arrays can also be declared in the OPA document.

For example, in the tfplan data the resource changes are an array (one element for each resource), as denoted by the opening [.

"resource_changes": [
  {
    "address": "aws_instance.example",
    "mode": "managed",
    "type": "aws_instance"
  },
  ...
]

You might also declare arrays in the policy, for example, to provide a list of allowed values to be checked against.

allowed_iam_profiles = [
  "my_iam_profile",
  "my_iam_profile_2",
  "my_iam_profile_3"
]

OPA has a construct you can use to process all elements of an array where the start of a loop is implied and all the array index handling is done under the covers by OPA. This is done using the [_] array index.

profile := allowed_iam_profiles[_]

This means iterating through allowed_iam_profiles, assigning each value to the variable profile in turn, and then processing all the following expressions in the rule for each value of profile.

Note that this construct will process ALL elements in the array. If any iteration hits a FALSE/UNDEFINED expression, that iteration is abandoned and the next one starts. However, if none of the loop iterations yields TRUE for all following expressions, the rule will not return any output.
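The behaviour can be sketched with a small standalone policy. The package name, the ports array, and the rule names are hypothetical and exist only for illustration.

```rego
package example

# Hypothetical data, declared in the policy purely for illustration.
ports = [22, 80, 443, 3389]

# Each element is tried in turn. Iterations that hit a FALSE expression
# are dropped; the remaining iterations add p to the set.
flagged[p] {
  p := ports[_]
  p != 80
  p != 443
}

# No element passes the comparison, so this set stays empty.
too_high[p] {
  p := ports[_]
  p > 5000
}
```

Here flagged ends up containing 22 and 3389, while too_high contains nothing because no iteration satisfies every expression.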

In the following example, the policy needs to iterate through all the aws_instance resources in the tfplan data to check the instance types. The relevant extract from the data looks like this.

"resource_changes": [
  {
    "address": "aws_instance.i1",
    "change": {
      "after": {
        "instance_type": "t1.small"
      }
    }
  },
  {
    "address": "aws_instance.i2",
    "change": {
      "after": {
        "instance_type": "t2.small"
      }
    }
  },
  {
    "address": "aws_instance.i3",
    "change": {
      "after": {
        "instance_type": "t3.micro"
      }
    }
  },
  {
    "address": "aws_instance.i4",
    "change": {
      "after": {
        "instance_type": "t4.large"
      }
    }
  }
]

The OPA Policy

package terraform
 
import input.tfplan as tfplan
 
array_contains(arr, elem) {
  arr[_] = elem
}
 
it_types = [
  "t2",
  "t3",
]
 
deny[reason] {
  # Take the family prefix of each instance type, e.g. "t2" from "t2.small"
  it := split(tfplan.resource_changes[_].change.after.instance_type, ".")[0]
  not array_contains(it_types, it)
  reason := sprintf("Instance type '%s' not allowed.", [it])
}

The output

$ opa eval --format pretty --data FILENAME.rego -i FILENAME.json data.terraform.deny
[
  "Instance type 't1' not allowed.",
  "Instance type 't4' not allowed."
]

Only two of the four instances are reported: array_contains() matches the “t2” and “t3” families, so the negated check succeeds only for “t1” and “t4”.

Multidimensional Arrays

The tfplan data can contain multidimensional arrays.

In this example, the tfplan data has an array of “resource_changes” and within that, there are instances that have arrays of “ebs_block_device”.

"resource_changes": [
  {
    "address": "aws_instance.db",
    "change": {
      "after": {
        "ebs_block_device": [
          {
            "delete_on_termination": false,
            "device_name": "/dev/sda7"
          },
          {
            "delete_on_termination": true,
            "device_name": "/dev/sda8"
          }
        ]
      }
    }
  },
  ...
]

Multiple [_] array indexes can be specified in the same expression or rule, which causes loops within loops: one for each array in the structure.

package terraform
import input.tfplan as tfplan
deny[reason] {
  r := tfplan.resource_changes[_]
  ebs := r.change.after.ebs_block_device[_]
  device := ebs.device_name
  reason := device
}

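Output (the last two device names come from further resource changes that are elided from the extract above):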

$ opa eval --format pretty --data FILENAME.rego -i FILENAME.json data.terraform.deny
[
  "/dev/sda7",
  "/dev/sda8",
  "/dev/sda2",
  "/dev/sda3"
]
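The nested iteration can also be written with named index variables, which some readers find easier to follow. This is a sketch under the assumption that the goal is to flag block devices that are kept when the instance is terminated; that check and the message wording are hypothetical.

```rego
package terraform

import input.tfplan as tfplan

deny[reason] {
  # i walks the outer array, j walks the inner array of each resource
  some i, j
  r := tfplan.resource_changes[i]
  ebs := r.change.after.ebs_block_device[j]
  ebs.delete_on_termination == false
  reason := sprintf("%s: %s is kept on termination", [r.address, ebs.device_name])
}
```

Against the aws_instance.db element shown in the extract, only /dev/sda7 satisfies the check.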

Comprehensions

Comprehensions are used to build arrays from a series of expressions that look like this.

some_array := [ <item> | <expressions assign value to item> ]

Of necessity, the expressions are usually some form of iteration.

t3_types := [ type |
  r := tfplan.resource_changes[_]
  startswith(r.change.after.instance_type, "t3.")
  type := r.change.after.instance_type
]

This example looks for instance types starting with “t3” and adds them to the array. As with rules, all expressions must be TRUE for a value to be added to the array.
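A comprehension becomes useful once its result feeds a rule. The sketch below counts the collected values and denies the plan when there are too many; the limit of 2 and the message text are assumptions made for illustration.

```rego
package terraform

import input.tfplan as tfplan

# Collect the instance type of every resource whose type starts with "t3."
t3_types := [ type |
  r := tfplan.resource_changes[_]
  startswith(r.change.after.instance_type, "t3.")
  type := r.change.after.instance_type
]

# Hypothetical limit: allow at most 2 t3 instances per plan.
deny[reason] {
  count(t3_types) > 2
  reason := sprintf("Plan contains %v t3 instances; at most 2 are allowed.", [count(t3_types)])
}
```

Because an array keeps duplicates, two instances of the same type are counted twice, which is what a per-plan limit needs.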

This article explained OPA in a level of detail that will allow you to start writing policies for Terraform. In the next article in the series, we take a deep dive into the Terraform plan JSON and how each section of the data can be used within OPA.

Learn more about Scalr and how we can help here.