Automating Terraform Policy Enforcement with Sentinel and ServiceNow
Introduction
Terraform is an incredible tool for automating infrastructure provisioning through Infrastructure-as-Code (IaC) and providers. IaC allows practitioners to declare the desired state of an environment in a way that is repeatable, testable, and in many cases idempotent. Once the environment has been declared, Terraform draws on hundreds of open source providers to provision VMs, firewalls, load balancers, disks, and more on cloud platforms like AWS, Azure, and GCP, and even on on-premises infrastructure like VMware.
This declarative approach to infrastructure allows teams to move faster and provides an agnostic workflow that can be used across any environment. However, as organizations grow, so does the likelihood of someone accidentally attaching an external (public) IP to an oversized VM.
Let’s face it; we’ve all left that m5.2xlarge running over the weekend and cursed the clouds for their quick-to-deploy, easy-to-forget nature.
In this post, we’ll follow the story of Spa Ghetti, Las Agna, Fett Uccine, and Pen Ne as they integrate Sentinel and ServiceNow with Terraform Cloud. Terraform Cloud is a SaaS offering from HashiCorp that provides an easy way to get started with Terraform, along with enterprise features like Sentinel, Cost Estimation, and Workspaces.
The CarbCrew will automate previously manual approval processes and enable their team to remove blockers while still having safeguards in place.
What
HashiCorp recently created a Terraform ServiceNow Integration, which provides essential building blocks for integrating Terraform Cloud and ServiceNow. Although we built the example for this post on Terraform Cloud, the integration is primarily intended for use with Terraform Enterprise, the self-hosted distribution of Terraform Cloud.
The GIF above depicts the end result and the functionality of this post. It walks through the following steps:
- Creating a Workspace with ServiceNow
- Adding AWS Credentials and an instance_type in Terraform Cloud
- Manually triggering a Terraform Run
- A Sentinel policy failure requires an override because the Terraform plan has an estimated cost of more than $20/month
- A manual approval request is created for Spa Ghetti in ServiceNow
- We manually approve the request in ServiceNow
- A REST request is made to the Terraform Policy Override API which overrides the failed Sentinel policy
If you want to follow along or implement this yourself, utilize this GitHub repo.
Why
Most large enterprises have some ticketing system that allows them to handle requests for infrastructure, support, and more. ServiceNow happens to be one of these systems and provides a heavily customizable interface. Now, if you’ve used ServiceNow, you’ll probably be asking yourself how a manual ticketing system for humans integrates with an automated provisioning and policy enforcement tool like Sentinel.
Sentinel was created to allow organizations to inject policy enforcement before infrastructure gets provisioned from Terraform. Similar to IaC, Sentinel introduces the idea of Policy-as-Code (PaC), which enables you to declare policies that understand their native environments. This allows you to enforce security (external IPs), definition (instance types), cost, and even best practice (forgotten VMs) policies.
Combining Sentinel with ServiceNow allows us to reduce the overall number of manual approvals required when deploying infrastructure. This combination also provides companies with a transition step between no automation and full automation.
Highlights
From Manual to Automated Policies
As we talked about above, Sentinel is a great way to codify and automate policies that already exist in spreadsheets and manual ticketing processes. In our project, we’re able to remove a lot of manual approvals that would typically be in place, and then inject them back into ServiceNow on an as-needed basis.
Cost
Spa Ghetti wants to enable his team to utilize cloud resources but in a reasonable way. Sentinel allows him to have provisioning policies without having to approve every request manually.
Below, you’ll see the cost_rule.sentinel policy used in this project. This policy will inspect Terraform Cloud’s Cost Estimates and will inject a manual approval into the Terraform Run if the proposed monthly cost is above $20. As we will see a little later on, we’ll use the Terraform API to then forward that approval to ServiceNow. Cost Estimation also exposes metrics around the delta monthly cost (change in monthly cost between Terraform plans) and on hourly costs.
    import "tfrun"
    import "decimal"

    monthly_cost = decimal.new(tfrun.cost_estimate.proposed_monthly_cost)

    main = rule {
        monthly_cost.less_than(20)
    }
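To make the rule's behavior at the boundary concrete, here is a minimal Python sketch that mirrors the same comparison. It is only an illustration, not part of the project: the function name, the constant, and the sample costs are hypothetical, and we assume the proposed monthly cost arrives as a decimal string.

```python
from decimal import Decimal

# Hypothetical constant mirroring the literal 20 in cost_rule.sentinel.
MONTHLY_LIMIT = Decimal("20")


def cost_rule_passes(proposed_monthly_cost: str) -> bool:
    """Return True when the proposed monthly cost is strictly below the limit.

    The cost is assumed to be a decimal string, so it is parsed with Decimal
    rather than float to avoid rounding surprises near the threshold.
    """
    return Decimal(proposed_monthly_cost) < MONTHLY_LIMIT


if __name__ == "__main__":
    # Made-up sample costs, chosen to sit below, at, and above the limit.
    for cost in ("8.35", "20.00", "70.08"):
        outcome = "pass" if cost_rule_passes(cost) else "needs override"
        print(cost, outcome)
```

Because the comparison is a strict less-than, a plan estimated at exactly $20.00 fails the rule just like a more expensive one.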
Security
Fett Uccine needs to ensure that specific security policies are enforced for audit purposes. In our example, we’re using Sentinel to look for public IPs being attached to instances in a dev environment.
Below, the public_ip.sentinel policy shows how we iterate through all of the resources defined in our Terraform plan and check whether associate_public_ip_address is set to true. If so, then Sentinel will block the deployment and force a manual approval.
    import "tfplan/v2" as tfplan

    violations = filter tfplan.planned_values.resources as _, r {
        r.type is "aws_instance" and
        r.values.associate_public_ip_address == true
    }

    main = rule { length(violations) == 0 }
Best Practices
Las Agna likes to make sure that best practices are being followed when deploying cloud resources. In this example, we’re using Sentinel to check for approved instance types.
Our last policy, instance_type.sentinel, is shown below, where we once again iterate through our resources. We filter that list down to aws_instance resources and check whether each instance_type is in our list of permitted instance types.
    import "tfplan/v2" as tfplan

    instance_types = [
        "m5.large",
    ]

    violations = filter tfplan.planned_values.resources as _, r {
        r.type is "aws_instance" and
        r.values["instance_type"] not in instance_types
    }

    main = rule { length(violations) == 0 }
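If you want a rough local check before a run ever reaches Sentinel, the same two filters can be approximated against Terraform's JSON plan output (for example, from terraform show -json). The Python below is a sketch under that assumption, not part of the integration: the function names and SAMPLE_PLAN are hypothetical, and unlike the flattened resource map Sentinel exposes, the JSON plan nests resources under root_module and child_modules, so we walk them recursively.

```python
from typing import Any, Dict, Iterator, List

# Same single permitted type as in instance_type.sentinel.
ALLOWED_INSTANCE_TYPES = {"m5.large"}


def iter_resources(module: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every resource in a module and, recursively, in its child modules."""
    for resource in module.get("resources", []):
        yield resource
    for child in module.get("child_modules", []):
        yield from iter_resources(child)


def find_violations(plan: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return addresses of aws_instance resources that break either policy."""
    violations: Dict[str, List[str]] = {"public_ip": [], "instance_type": []}
    root = plan.get("planned_values", {}).get("root_module", {})
    for resource in iter_resources(root):
        if resource.get("type") != "aws_instance":
            continue
        values = resource.get("values") or {}
        # Mirrors public_ip.sentinel: only an explicit true counts as a violation.
        if values.get("associate_public_ip_address") is True:
            violations["public_ip"].append(resource["address"])
        # Mirrors instance_type.sentinel: anything outside the list is flagged.
        if values.get("instance_type") not in ALLOWED_INSTANCE_TYPES:
            violations["instance_type"].append(resource["address"])
    return violations


# Hypothetical plan fragment used only to exercise the sketch.
SAMPLE_PLAN = {
    "planned_values": {
        "root_module": {
            "resources": [
                {"address": "aws_instance.web", "type": "aws_instance",
                 "values": {"instance_type": "m5.2xlarge",
                            "associate_public_ip_address": True}},
                {"address": "aws_s3_bucket.logs", "type": "aws_s3_bucket",
                 "values": {}},
            ],
            "child_modules": [
                {"resources": [
                    {"address": "module.app.aws_instance.api",
                     "type": "aws_instance",
                     "values": {"instance_type": "m5.large",
                                "associate_public_ip_address": False}},
                ]},
            ],
        }
    }
}

if __name__ == "__main__":
    print(find_violations(SAMPLE_PLAN))
```

With the sample plan, only aws_instance.web is flagged, and it appears under both policies. The Sentinel policies remain the actual enforcement point; a local check like this just shortens the feedback cycle.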
The Building Blocks
The Terraform ServiceNow Integration builds on top of a few generic constructs from ServiceNow’s API.
Catalog: a Catalog is a top-level collection that provides an entry point for users to interact with the integration.
Variables and Variable Sets: Variables and Variable Sets are embedded within Forms and provide fields for users to enter data. This could be anything from AWS IAM credentials to Workspace names.
Workflows: Workflows allow users to embed an ECMAScript 5 (ES5) script that can access variables from the invoking object. This integration uses two types of workflows: those scheduled on a specified interval and those triggered manually. Together, they let users interact with the Terraform API through periodic polling and through explicit manual triggers.
Polling allows us to continuously grab the latest status of a workspace and synchronize that with ServiceNow. Similarly, manual triggers allow users to create workspaces and, in our case, trigger policy overrides.
Scripts: Scripts are pieces of code that need to be shared across workflows. In our case, the integration is utilizing Scripts to wrap the Terraform API interactions in functions that are called from running Workflows.
REST Messages: REST Messages are the last major block and define the available REST requests. This includes setting Headers, Parameters, the Method, and more.
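As an illustration of what a REST Message has to capture, the Python sketch below models the policy override request as plain data: a method, a URL, and headers. The RestMessage class and the policy_override_message helper are hypothetical names of ours, not ServiceNow or Terraform APIs, and the endpoint path and headers reflect our reading of the Terraform Cloud Policy Checks API documentation, so verify them against the current docs before relying on them.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class RestMessage:
    """Hypothetical stand-in for the fields a ServiceNow REST Message defines."""
    method: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)


def policy_override_message(
    policy_check_id: str,
    token: str,
    host: str = "app.terraform.io",  # assumed Terraform Cloud hostname
) -> RestMessage:
    """Describe the request that overrides a soft-failed policy check.

    Nothing is sent here; the function only builds the request description.
    """
    return RestMessage(
        method="POST",
        url=f"https://{host}/api/v2/policy-checks/{policy_check_id}/actions/override",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/vnd.api+json",
        },
    )


if __name__ == "__main__":
    # Placeholder ID and token for illustration only.
    message = policy_override_message("polchk-EXAMPLE", "EXAMPLE_TOKEN")
    print(message.method, message.url)
```

Keeping the request as data separate from the code that sends it is roughly the split ServiceNow makes between REST Messages and the Scripts that invoke them.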
Feedback Loop with ServiceNow
The Terraform ServiceNow integration provides an excellent foundation for getting started. However, it primarily focuses on one-way interaction between Terraform Cloud and ServiceNow. This blog post and the resulting GitHub repo focus on extending that functionality to integrate features like Cost Estimation and Sentinel. In addition, we want to show how the provided building blocks can be combined to create a feedback loop with the Terraform API and a richer experience.
Our workflow polls the Terraform API until our workspace is in a policy_override state. Once we detect this state, we use the ServiceNow Glide API to generate an Approval for Spa Ghetti. We then continue polling until this approval is granted and close the loop by using the Terraform API to grant the policy override, which allows our infrastructure to be deployed.
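The actual workflow scripts run inside ServiceNow and live in the GitHub repo. As a language-neutral illustration of the loop described here, the Python sketch below models it with injected callables, so it makes no real API calls. Every name in it is hypothetical, and the approval states ("requested", "approved", "rejected") and the terminal run statuses are our assumptions; only policy_override comes from the text above.

```python
import time
from typing import Callable, Optional

# Assumed run statuses after which polling is pointless.
TERMINAL_RUN_STATUSES = {"applied", "errored", "discarded", "canceled"}


def run_feedback_loop(
    get_run_status: Callable[[], str],
    create_approval: Callable[[], str],
    get_approval_state: Callable[[str], str],
    override_policy: Callable[[], None],
    poll_seconds: float = 30.0,
    max_polls: int = 100,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the run, request approval on policy_override, then override.

    Returns "overridden", "rejected", "timed_out", or a terminal run status.
    """
    approval_id: Optional[str] = None
    for _ in range(max_polls):
        if approval_id is None:
            # Phase 1: wait for the run to stop on a failed policy.
            status = get_run_status()
            if status == "policy_override":
                approval_id = create_approval()
            elif status in TERMINAL_RUN_STATUSES:
                return status
        else:
            # Phase 2: wait for a human decision on the approval.
            state = get_approval_state(approval_id)
            if state == "approved":
                override_policy()
                return "overridden"
            if state == "rejected":
                return "rejected"
        sleep(poll_seconds)
    return "timed_out"


if __name__ == "__main__":
    # Scripted fakes standing in for the Terraform and ServiceNow calls.
    statuses = iter(["planning", "policy_override"])
    approvals = iter(["requested", "approved"])
    outcome = run_feedback_loop(
        get_run_status=lambda: next(statuses),
        create_approval=lambda: "approval-1",
        get_approval_state=lambda _id: next(approvals),
        override_policy=lambda: None,
        sleep=lambda _seconds: None,
    )
    print(outcome)
```

Injecting the four callables keeps the control flow testable on its own; in ServiceNow, the equivalent pieces would be the Scripts that wrap the Terraform API and the Glide calls that create and read the Approval.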
Here is a screenshot showing that a run in the production-environment workspace has been blocked by a failed Sentinel policy that requires a policy override.
Conclusion
Terraform Cloud and Sentinel are powerful tools for enabling teams to quickly deploy infrastructure while ensuring that necessary safeguards are in place. A large number of enterprises utilize ServiceNow as a ticketing system. Enabling rich integration between these two tools means that your teams can spend less time reviewing requests and more time doing what they enjoy.