How to validate compliance of existing GCP resources from a CAI export using Cloud Scheduler and Cloud Batch?

Evaluating your existing GCP resources

Use Cloud Batch and Cloud Scheduler to perform one-shot or periodic compliance checks

Trinquier Vannick
Google Cloud - Community


Detective control on GCP

Compliance controls are important for organizations to ensure that they follow security standards, regulations, and internal policies. They help identify and investigate potential compliance violations, so that organizations can quickly address any compliance issues that arise.

There are multiple ways to enforce compliance controls:

  • Preventive Control targets the Infrastructure as Code (IaC) CI/CD pipeline before changes are applied to the infrastructure. Terraform Vet is a tool that can be used by organizations using Terraform
  • Real Time Control detects whether recently applied changes violate existing compliance controls. This article explains very well how it can be done using a Cloud Asset Inventory feed and Config Validator

Preventive control enforces that resources about to be deployed using IaC are compliant. Real time control ensures that resources that have just been created are also compliant. However, to have a complete view of the current state of an existing organization, the existing infrastructure also needs to be assessed.

This article will focus on executing Detective Control using Cloud Batch jobs. Our goal will be to ensure that buckets created in a project are created in a specific region.

The different steps in this article can also be found in this Git repository: https://github.com/vannicktrinquier/gcp-detective-control.

The code present in the repository and in this article is for demonstration only, not for production use.

Design Overview

The overall design is based on Google Cloud Batch executing a command-line application named CFT Scorecard to perform detective control. The checks are based on a Cloud Asset Inventory export to a Google Cloud Storage bucket. Finally, Cloud Scheduler is used to periodically create an execution and ensure the infrastructure stays compliant.

The following diagram presents the high level flow:

High level flow

Components

Various components are used to perform detective control. Here is a brief list and explanation of all the tools, libraries, and services used.

Cloud Batch

Google Cloud Batch is a managed service that makes it easy to run large-scale batch jobs on Google Cloud Platform. Cloud Batch abstracts away the complexities of managing and scheduling batch jobs, so you can focus on your application code. Exporting CAI data and running the CFT Scorecard command line is a long-running task, which is a great use case for Cloud Batch.

CFT Scorecard

CFT Scorecard is a command-line tool that can be used to scan Google Cloud Platform (GCP) environments for resources and IAM policies. It uses data from Cloud Asset Inventory (CAI) exports to test policies based on constraints and constraint templates from the Config Validator policy library.

Config Validator

Config Validator is a tool that checks Google Cloud Platform (GCP) resource configurations against Rego-based policies. It can be used to check for compliance with GCP’s security best practices, as well as to enforce specific requirements for your environment.

Config Validator Policy Library

Config Validator Policy library is a repository with a collection of policies that can be used to scan Google Cloud Platform (GCP) resources for compliance with security standards. The policies are written in Rego, a policy language that is supported by the Open Policy Agent (OPA) framework.

Policy library can be used to scan GCP resources for a variety of compliance standards, including:

  • CIS Benchmarks
  • Google Cloud Security Best Practices
  • And even your own internal security policies

Cloud Asset Inventory Export

Cloud Asset Inventory export to a Google Cloud Storage bucket allows you to export all asset metadata to files in a Cloud Storage bucket. This export can then be used to check GCP resources against policies. CAI is able to export various content types such as resources, IAM policies, organization policies and access policies.
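For illustration, an export of resource metadata for an organization can also be triggered manually with a command like the one below (the placeholders are illustrative; in this article the export is triggered by CFT Scorecard itself through its --refresh flag):

# Export all resource metadata of an organization to a Cloud Storage bucket.
gcloud asset export \
--organization=<ORGANIZATION_ID> \
--content-type=resource \
--output-path=gs://<BUCKET>/resource_inventory.json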

Performing validation with CFT Scorecard

We will use gcloud to create a Cloud Batch job that executes a script to perform detective control. The script is quite simple and does the following things:

  1. Download the CFT Scorecard command-line tool
  2. Clone the Config Validator policy library from Github.
    In a typical customer case, the clone might be more complex as the Git repository will not be public and Git authentication will have to be handled.
  3. Copy a constraint from the samples folder. The chosen constraint, storage_location.yaml, ensures that every Cloud Storage bucket is created in asia-southeast1
  4. Execute CFT Scorecard using the constraints and policies downloaded previously for the organization SCAN_ORGANIZATION_ID
  5. Copy the result to the bucket for further analysis
#!/bin/sh

echo 'Downloading CFT Scorecard'
curl -o cft https://storage.googleapis.com/cft-cli/latest/cft-linux-amd64
chmod +x cft

echo 'Downloading sample policy library'
git clone https://github.com/GoogleCloudPlatform/policy-library.git
cp policy-library/samples/storage_location.yaml policy-library/policies/constraints/
mkdir -p ./reports

./cft scorecard --policy-path ./policy-library --bucket ${CAI_BUCKET} --target-organization ${SCAN_ORGANIZATION_ID} --output-format json --output-path ./reports --refresh
gsutil cp ./reports/scorecard.json gs://${CAI_BUCKET}

The following constraint, storage_location.yaml, is used as a proof of concept in this article. It is based on the GCPStorageLocationConstraintV1 policy template and ensures that only Google Cloud Storage buckets created in asia-southeast1 are allowed. In this article, we will create a bucket in a different region and see that the CFT Scorecard tool reports it as a violation.

File storage_location.yaml

apiVersion: constraints.gatekeeper.sh/v1alpha1
kind: GCPStorageLocationConstraintV1
metadata:
  name: allow_some_storage_location
  annotations:
    description: Checks Cloud Storage bucket locations against allowed or disallowed locations.
    bundles.validator.forsetisecurity.org/healthcare-baseline-v1: security
spec:
  severity: high
  match:
    ancestries:
    - "organizations/**"
  parameters:
    mode: "allowlist"
    locations:
    - asia-southeast1
    exemptions: []

Configuring the Cloud Batch job

The gcloud command is quite limited, so we need to use a JSON file to provide the configuration of the Cloud Batch job to create. The following JSON file contains:

  • A bash script automatically injected inside a container during the execution
  • A service account associated with the job, used to query GCP APIs.

File batch-organization.json


{
  "taskGroups": [
    {
      "taskSpec": {
        "runnables": [
          {
            "script": {
              "text": "#!/bin/sh\n\necho 'Downloading CFT Scorecard'\ncurl -o cft https://storage.googleapis.com/cft-cli/latest/cft-linux-amd64\nchmod +x cft\n\necho 'Downloading sample policy library'\ngit clone https://github.com/GoogleCloudPlatform/policy-library.git\ncp policy-library/samples/storage_location.yaml policy-library/policies/constraints/\nmkdir -p ./reports\n\n./cft scorecard --policy-path ./policy-library --bucket ${CAI_BUCKET} --target-organization ${SCAN_ORGANIZATION_ID} --output-format json --output-path ./reports --refresh\ngsutil cp ./reports/scorecard.json gs://${CAI_BUCKET}"
            },
            "environment": {
              "variables": {
                "SCAN_ORGANIZATION_ID": "<ORGANIZATION_ID_TO_REPLACE>",
                "CAI_BUCKET": "<CAI_BUCKET_TO_REPLACE>"
              }
            }
          }
        ],
        "computeResource": {
          "cpuMilli": 8000,
          "memoryMib": 32768
        },
        "maxRetryCount": 1,
        "maxRunDuration": "36000s"
      },
      "taskCount": 1,
      "parallelism": 1
    }
  ],
  "logsPolicy": {
    "destination": "CLOUD_LOGGING"
  },
  "allocationPolicy": {
    "serviceAccount": {
      "email": "batch-detective-sa@<PROJECT_ID_TO_REPLACE>.iam.gserviceaccount.com"
    }
  }
}

Before creating and executing the Cloud Batch job, some specific information has to be updated in the JSON file (for example with sed, as shown right after this list):

  • the organization where compliance checks are performed <ORGANIZATION_ID_TO_REPLACE>
  • the Cloud Storage bucket used to store the CAI export and used by Config Validator to perform checks <CAI_BUCKET_TO_REPLACE>
  • the Cloud Batch service account used to call Google APIs. This service account requires various permissions to be able to execute compliance checks, and it is assigned to the VM created to perform the execution
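A minimal sketch of that substitution with GNU sed, assuming the SCAN_ORGANIZATION_ID, CAI_BUCKET and PROJECT_ID environment variables defined in the step-by-step section below are already set:

# Illustrative only: replace the placeholders in the job configuration in place.
sed -i \
  -e "s/<ORGANIZATION_ID_TO_REPLACE>/${SCAN_ORGANIZATION_ID}/g" \
  -e "s/<CAI_BUCKET_TO_REPLACE>/${CAI_BUCKET}/g" \
  -e "s/<PROJECT_ID_TO_REPLACE>/${PROJECT_ID}/g" \
  batch-organization.json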

Service account permissions

To get the permissions needed to execute the job, the following IAM roles have to be granted to the job service account:

  • Batch Agent Reporter (roles/batch.agentReporter) to allow the job to report its status to the control plane.
  • Logs Writer (roles/logging.logWriter) to log Cloud Batch job output and ensure everything is happening as expected.
  • Storage Admin (roles/storage.admin) to store files on Cloud Storage bucket.
  • Cloud Asset Viewer (roles/cloudasset.viewer) at the organization level to export CAI assets.

Additionally, the Cloud Asset service agent needs to be granted the following role to be able to export CAI assets to the bucket:

  • Storage Admin (roles/storage.admin)

Performing detective control step by step

Set environment variables and initialize the environment

export SCAN_ORGANIZATION_ID=<ORGANIZATION_ID_TO_REPLACE>
export PROJECT_ID=<PROJECT_ID_TO_REPLACE>
export CAI_BUCKET=<CAI_BUCKET_TO_REPLACE>
export PROJECT_NUMBER=`gcloud projects describe $PROJECT_ID \
--format='value(projectNumber)'`

gcloud config set project $PROJECT_ID

Enable the required APIs on your project and create the Cloud Asset Inventory service identity

gcloud services enable cloudasset.googleapis.com batch.googleapis.com logging.googleapis.com storage.googleapis.com compute.googleapis.com iam.googleapis.com
gcloud beta services identity create --service=cloudasset.googleapis.com

Assign permission to the service account used by Cloud Asset Inventory to store exported files in the GCS bucket

gcloud projects add-iam-policy-binding $PROJECT_ID \
--role roles/storage.admin \
--member \
serviceAccount:service-$PROJECT_NUMBER@gcp-sa-cloudasset.iam.gserviceaccount.com

Create a GCS bucket to store the CAI export. This bucket will store the CAI export as well as the results of the CFT Scorecard execution

gcloud storage buckets create gs://$CAI_BUCKET

Create the service account that will be used to run the Cloud Batch job

BATCH_DETECTIVE_SA=batch-detective-sa

gcloud iam service-accounts create ${BATCH_DETECTIVE_SA} \
--description="Service Account used by Detective Cloud Batch Job"

Assign the required permissions to the Cloud Batch job service account

gcloud projects add-iam-policy-binding $PROJECT_ID \
--role roles/batch.agentReporter \
--member \
serviceAccount:$BATCH_DETECTIVE_SA@$PROJECT_ID.iam.gserviceaccount.com

gcloud projects add-iam-policy-binding $PROJECT_ID \
--role roles/logging.logWriter \
--member \
serviceAccount:$BATCH_DETECTIVE_SA@$PROJECT_ID.iam.gserviceaccount.com

gcloud projects add-iam-policy-binding $PROJECT_ID \
--role roles/storage.admin \
--member \
serviceAccount:$BATCH_DETECTIVE_SA@$PROJECT_ID.iam.gserviceaccount.com

gcloud organizations add-iam-policy-binding $SCAN_ORGANIZATION_ID \
--role roles/cloudasset.viewer \
--member \
serviceAccount:$BATCH_DETECTIVE_SA@$PROJECT_ID.iam.gserviceaccount.com

Create a Cloud Batch job and monitor its execution. The job will take a little time, as it has to wait for the CAI export of the organization to complete and then execute CFT Scorecard.

gcloud batch jobs submit detective-job-orga \
--location asia-southeast1 \
--config batch-organization.json

As soon as the Cloud Batch job execution is done, the generated report will be present in the bucket with the name scorecard.json

Analyzing the results

The Cloud Batch job status can be found directly in the console. A few minutes after the creation of the job, the execution will be over and the status of the job will be Succeeded.

One shot Batch job execution
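The job status can also be checked from the command line, using the same job name and location as the submission above:

gcloud batch jobs describe detective-job-orga \
--location asia-southeast1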

At the end of the job execution, the CFT Scorecard report is copied to the bucket for further analysis. In this example, we can see that only a single violation has been detected.

$ gsutil cp gs://${CAI_BUCKET}/scorecard.json . 
$ cat scorecard.json
[
  {
    "Category": "Other",
    "Resource": "//storage.googleapis.com/my-bucket",
    "Message": "//storage.googleapis.com/my-bucket is in a disallowed location.",
    "metadata": {
      "ancestry_path": "organizations/1111/folders/2222/folders/3333/folders/4444/projects/5555",
      "constraint": {
        "annotations": {
          "bundles.validator.forsetisecurity.org/healthcare-baseline-v1": "security",
          "description": "Checks Cloud Storage bucket locations against allowed or disallowed locations.",
          "validation.gcp.forsetisecurity.org/originalName": "allow_some_storage_location",
          "validation.gcp.forsetisecurity.org/yamlpath": "policy-library/policies/constraints/storage_location.yaml"
        },
        "labels": {},
        "parameters": {
          "exemptions": [],
          "locations": [
            "asia-southeast1"
          ],
          "mode": "allowlist"
        }
      },
      "details": {
        "location": "US",
        "resource": "//storage.googleapis.com/my-bucket"
      }
    }
  }
]

Here, we found a violation for the resource //storage.googleapis.com/my-bucket because of the bucket location.
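For a quick summary of the violations, the JSON report can also be filtered from the command line, for instance with jq (a small sketch, assuming jq is installed):

# Print each violating resource together with its violation message.
gsutil cat gs://${CAI_BUCKET}/scorecard.json | jq -r '.[] | "\(.Resource): \(.Message)"'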

Scheduling the job for periodical execution

Above, we have seen how to perform a one-time execution. However, we might want to periodically execute detective control on the organization or on a specific critical project. All of that can be done easily, and without any coding, using the Cloud Scheduler managed service.

A dedicated service account for the Cloud Scheduler cron job needs the following roles:

  • Batch Job Editor (roles/batch.jobsEditor) on the project to ensure that the job can be created by Cloud Scheduler
  • Service Account User (roles/iam.serviceAccountUser) on the service account used by the batch job, to ensure that Cloud Scheduler is able to create a job and assign the Batch service account to it
REGION=asia-southeast1
SCHEDULER_SA=scheduler-detective-sa

gcloud iam service-accounts create ${SCHEDULER_SA} \
--description="Service Account used by Detective Cloud Scheduler"

gcloud projects add-iam-policy-binding $PROJECT_ID \
--role roles/batch.jobsEditor \
--member \
serviceAccount:$SCHEDULER_SA@$PROJECT_ID.iam.gserviceaccount.com

gcloud iam service-accounts add-iam-policy-binding $BATCH_DETECTIVE_SA@$PROJECT_ID.iam.gserviceaccount.com \
--role roles/iam.serviceAccountUser \
--member \
serviceAccount:$SCHEDULER_SA@$PROJECT_ID.iam.gserviceaccount.com

Creation of the Cloud Batch job is done by Cloud Scheduler using an HTTP target calling the Batch API. Authentication is done using OAuth with the associated service account. Here, a cron schedule is created to perform detective control every 3 hours.

gcloud services enable cloudscheduler.googleapis.com

gcloud scheduler jobs create http detective-job --schedule="0 */3 * * *" \
--uri="https://batch.googleapis.com/v1/projects/$PROJECT_NUMBER/locations/$REGION/jobs" \
--http-method=POST \
--location=$REGION \
--headers=User-Agent=Google-Cloud-Scheduler,Content-Type=application/json \
--oauth-service-account-email=$SCHEDULER_SA@$PROJECT_ID.iam.gserviceaccount.com \
--oauth-token-scope="https://www.googleapis.com/auth/cloud-platform" \
--message-body-from-file=batch-organization.json
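To test the schedule without waiting for the next run, the Cloud Scheduler job can also be triggered manually:

gcloud scheduler jobs run detective-job --location=$REGION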

After a while, it can be seen that regular Cloud Batch jobs are created:

Multiples Batch job executions

Conclusion

That's it! With a few gcloud commands, we have set up periodic detective controls ready to be used.

Other enhancements could be made, such as integrating those findings with Security Command Center or even BigQuery. It might involve some coding to integrate with those Google services.
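As a minimal sketch of the BigQuery option, assuming jq is available and a dataset named detective_dataset already exists (both are illustrative assumptions), the report could be flattened to newline-delimited JSON and loaded with bq:

# Keep only a few top-level fields and convert the JSON array to newline-delimited JSON.
gsutil cat gs://${CAI_BUCKET}/scorecard.json \
| jq -c '.[] | {category: .Category, resource: .Resource, message: .Message}' \
> violations.ndjson

# Load the violations into a BigQuery table.
bq load --autodetect --source_format=NEWLINE_DELIMITED_JSON \
detective_dataset.scorecard_violations violations.ndjson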
