Continuous compliance testing using InSpec on Google Cloud Platform

Svetlin Zamfirov
Oct 14, 2020

Evidencing compliance is never an easy task, especially when standing up complex infrastructure and networking. Whether this happens in the cloud or on-premises, we always have to adhere to a set of guidelines, both external (information security standards such as “PCI DSS” [1], “CIS Controls and CIS Benchmarks” [2] and others) and internal to the company. Validating these external or internal criteria often involves a long manual process and, depending on the size of the organization, multiple teams. This increases the complexity of the value stream, and each hand-off between teams adds wait time and context switching and reduces the overall flow of value. When we started improving our compliance story, we quickly ran into the problem of hand-offs and wait time. We didn’t want compliance to slow down the delivery of our infrastructure, so we needed an automated way of evidencing it.

Applications, systems and infrastructure change over time, and we didn’t want this process to be a one-off or to require manual intervention. With the team iterating steadily and introducing more changes and improvements to our platform as part of our 0–60 vision, we wanted to continuously validate our work from a compliance perspective and demonstrate an effective security posture. That slow but steady growth of our infrastructure made us think about how to scale our compliance automation work and address several problems: how to have test coverage for all of our infrastructure at all times, how to avoid losing compliance reports, and how to consolidate compliance reports from different parts of our platform for purposes such as analytics and visualization.

Infrastructure Automation meets InSpec

We use infrastructure-as-code (IaC) processes to automate the management and provisioning of all of our infrastructure built on top of Google Cloud Platform (GCP). We use tools such as Terraform, and GCP-native products such as Google Cloud Build to run our pipelines. We’ve created custom, reusable Terraform modules, each of which encapsulates the logic that stands up a particular infrastructure component in GCP (or a set of tightly coupled components). This helps us achieve an effective, fully automated pipeline for managing our infrastructure.

While researching how to add complementary compliance tests to our infrastructure-as-code implementation, we discovered InSpec, an open-source infrastructure testing framework. It is aimed at writing and running tests that validate specifically targeted compliance issues, and it lets you bake these tests into your release process. With InSpec in place, compliance checking no longer happens only at the end of a release cycle. The tests can also be enriched with metadata required by the security and compliance professionals in your organization.

We’ve created a custom implementation which lets us automatically run compliance tests in our GCP organization for all underlying infrastructure and evidence the results in a real-time dashboard. Having InSpec at the core of our implementation lets us run compliance tests for a variety of use cases. The pluggable nature of the tool allows us to incorporate tests based on standards such as CIS Benchmarks, PCI DSS and others into our test suite and propagate these additions to our infrastructure pipeline with ease. This gives us broad coverage and visibility and builds better collaboration and trust with security and compliance teams.

Early Implementation — Success without Scale

To give some context on our early journey and what we learned from it: we initially focused on writing infrastructure-related compliance tests for many of our infrastructure-as-code Terraform modules, and on adding tests based on the CIS Benchmarks for GCP [3] (Center for Internet Security benchmarks) for those modules. We chose the CIS Benchmarks because they have broad coverage of compliance-related scenarios and avoid the need to ‘invent your own’.

Our idea was to allow other GSK teams using our Terraform modules to benefit from shipping infrastructure which had been battle-tested and where security and compliance had already been validated as ‘baked in’. This was a good approach for getting coverage of each module, but over time it proved not to be very scalable. The number of IaC modules was growing rapidly, so changes to compliance tests had to be made manually in many places, and teams had to add extra Cloud Build steps to their infrastructure pipelines to get full coverage.

Scaling the Implementation — Inspec-Runner

Based on our early implementation, we now had the GCP CIS Benchmarks profile for InSpec being executed against a team’s GCP project, which hosts all of their infrastructure and consequently everything built by the underlying Terraform modules, all at once. To solve the problems of scale, our team built a small product we’ve called “inspec-runner”. It monitors the infrastructure-as-code pipelines across the whole platform and automatically starts compliance tests whenever our infrastructure changes.

Cloud Build publishes messages to a Google Cloud Pub/Sub topic out of the box when your build’s state changes. The “inspec-runner” is essentially a Google Cloud Function which consumes incoming messages from that topic. This is the core functionality of our “inspec-runner”, which is also visualized below.

The “inspec-runner” getting the Cloud Build information from PubSub and scheduling an InSpec scan on infrastructure changes.

As a result of this work, we have integrated compliance testing into our full infrastructure pipeline. As soon as infrastructure changes happen, a compliance test starts automatically to verify that the changes meet our standards.

In operation, once a message is received, the Cloud Function parses the data in the message and searches for “terraform apply”, the command which applies infrastructure changes with Terraform. If it finds the command, it triggers a new Cloud Build job which executes the InSpec suite. As a result, once the automated infrastructure changes are successfully applied, InSpec runs automatically as a follow-up step, executing all controls and generating a report which demonstrates whether the live (current) state of the infrastructure is compliant with your set of criteria. This is essentially ongoing and continuous compliance testing.
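To make the filtering step more concrete, here is a minimal Python sketch of the decision the Cloud Function makes. It assumes a Pub/Sub-triggered function whose event carries the Cloud Build resource as base64-encoded JSON in its `data` field, and it matches on each step's name, entrypoint and arguments. The function names are hypothetical and this is an illustration rather than our production code; starting the follow-up Cloud Build job is left out.

```python
import base64
import json


def decode_build(event: dict) -> dict:
    """Decode the base64-encoded JSON payload of a Pub/Sub event."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def ran_terraform_apply(build: dict) -> bool:
    """Return True if any build step appears to invoke `terraform apply`."""
    for step in build.get("steps", []):
        # Assumption: the command can be spread across the builder image
        # name, the entrypoint and the args, so all three are joined
        # before searching for the literal text.
        parts = [step.get("name", ""), step.get("entrypoint", "")]
        parts.extend(step.get("args") or [])
        command = " ".join(part for part in parts if part)
        if "terraform apply" in command:
            return True
    return False


def should_scan(event: dict) -> bool:
    """Scan only after a build that applied changes has succeeded."""
    build = decode_build(event)
    return build.get("status") == "SUCCESS" and ran_terraform_apply(build)
```

In this sketch, a build that only runs the InSpec suite does not match, so the follow-up job would not trigger itself again.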

Visualizing Compliance — A single pane of glass

InSpec’s built-in reporting is invaluable to security and compliance teams and can be enriched with metadata (such as adding tags, comments, description, impact etc. to your controls) which can help when dealing with a large number of reports. In order to help us visualise the overall compliance status for our GCP environment, we decided to consolidate the growing number of reports in a single location.

As part of our implementation, each InSpec report, which is essentially just a text file containing structured data, is automatically uploaded as an object into a single Google Cloud Storage (GCS) bucket. Uploading an object to a GCS bucket produces an event that a Cloud Function can be triggered by. We’ve created another Cloud Function which picks up each new report from the bucket and pushes it into BigQuery, Google’s data warehouse service. This process can be seen below:

Consolidating the InSpec reports into BigQuery and using Data Studio for visualization.
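As a rough illustration of the transformation that second Cloud Function performs, the Python sketch below flattens one report into rows suitable for a warehouse table. It assumes the report was produced with InSpec's JSON reporter (profiles containing controls, which in turn contain results), and the column names are a hypothetical schema rather than the one we use. Reading the object from the bucket and loading the rows into BigQuery are intentionally omitted.

```python
import json
from typing import Any, Dict, List


def report_to_rows(report_json: str, source_object: str) -> List[Dict[str, Any]]:
    """Flatten an InSpec JSON report into one row per control result."""
    report = json.loads(report_json)
    rows: List[Dict[str, Any]] = []
    for profile in report.get("profiles", []):
        for control in profile.get("controls", []):
            for result in control.get("results", []):
                rows.append({
                    # Hypothetical column: which bucket object the row came from.
                    "source_object": source_object,
                    "profile": profile.get("name"),
                    "control_id": control.get("id"),
                    "title": control.get("title"),
                    "impact": control.get("impact"),
                    "status": result.get("status"),
                    "code_desc": result.get("code_desc"),
                })
    return rows
```

Keeping one row per result makes the table easy to filter by status, control or profile once it is behind a dashboard.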

Once the data is in BigQuery, you can build analytics, custom reports and dashboards on top of it. Google Data Studio is a free tool that turns your data into informative, easy-to-read, easy-to-share and fully customizable dashboards and reports. It integrates with BigQuery quite easily, and we use it for our compliance dashboards. Below is an example of one of our compliance dashboards:

The dashboard serves as a single pane of glass into our compliance posture on GCP.

The dashboard shows where we stand in terms of compliance in real time, and we’re working on adding more information that can help our security and compliance teams use it as a single pane of glass for compliance on GCP. This is already saving us a lot of time compared with hand-offs, context switching and logging into multiple systems to gather all of that information manually.
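One of the simplest figures a dashboard like this can show is the pass rate per profile. The Python sketch below computes it over result rows, assuming each row has hypothetical `profile` and `status` fields and that skipped results are excluded from the ratio. In practice the same aggregation would more likely be expressed as a query in BigQuery; this is only meant to show the arithmetic.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def pass_rate_by_profile(rows: Iterable[Mapping[str, str]]) -> Dict[str, float]:
    """Return passed / (passed + failed) for each profile."""
    passed: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for row in rows:
        # Assumption: anything other than passed/failed (for example
        # skipped) counts neither for nor against compliance.
        if row["status"] not in ("passed", "failed"):
            continue
        total[row["profile"]] += 1
        if row["status"] == "passed":
            passed[row["profile"]] += 1
    return {profile: passed[profile] / total[profile] for profile in total}
```

Profiles whose results were all skipped are left out of the output rather than reported as 0% or 100%.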

Outcomes and next steps

We now ensure that any resources that we maintain with infrastructure-as-code or products that we are building are tested and validated on an ongoing basis. The steady growth of our cloud estate has resulted in this increasingly urgent need for compliance testing and reporting in an automated fashion. Without our “inspec-runner” and our visualization dashboards, it would have been impossible to fully visualise and validate compliance across the estate on an ongoing basis. Here’s a summary of what we have achieved:

  • Complete visibility of our security and compliance position across our GCP organization is provided to GSK security teams
  • A complete, interactive compliance dashboard of the whole estate is provided to security teams
  • An easy way to expand the scope of our compliance tests — if we add credit card processing to any of our products, it’s easy to roll out PCI-DSS compliance checking automatically
  • Changes to compliance tests are automatically propagated to anyone/anything running the tests

Are we “finished” with our compliance story? We’re far from it. We are planning to add more comprehensive tests to fill any remaining gaps, whether they are based on a public information security standard or on our own custom ones. These tests will help us verify the desired state of our implementations and confirm that they are set up according to best practice.

We’d love to hear your feedback! Please reach out to us if you have questions or even if you simply want to share your own journey.

Citations

[1] — The Payment Card Industry Data Security Standard (PCI DSS) is an information security standard for organizations that handle branded credit cards from the major card schemes.
[2] — Center for Internet Security benchmarks — consensus-developed secure configuration guidelines for hardening systems and infrastructure; Center for Internet Security controls — prescriptive, prioritized, and simplified set of cybersecurity best practices.
[3] — GCP CIS 1.1.0 Benchmark Inspec Profile — CIS Benchmark security guidelines for Google Cloud Platform written in code with the help of InSpec.

GSK Tech

GSK Tech blog