Setting up a secure Pub/Sub flow with Go and Terraform
This article covers a fully managed message-processing pipeline that handles errors and notifications. The codebase follows the principle of least privilege and leaves the retry and dead-lettering logic to GCP.
Let’s say my retail company would like to receive events about updated product information. In this case I’d set up an API endpoint and expose it to the information provider, with authentication and authorization in place. For each update, I’d like to process the information into my company’s database. When processing fails, I’d save the failed message to a database, where it can be picked up for debugging, and notify our developer team via the proper channels.
The above scenario can be solved using a combination of Cloud Run (for the API layer and message processing), Pub/Sub (handling the message flow and invoking the relevant Cloud Run services) and Cloud Storage (acting as a database). To send the notification, we will use Mailgun. The Mailgun API key is stored in Google Secret Manager and accessed solely by the notifier API’s service account.
Let’s briefly go through the flow:
- message-generator receives an HTTP request to generate 10 messages. Those messages are published to the ordinary Pub/Sub topic.
- One of those 10 messages is hardcoded to fail: the message-processor API returns a 406 status code for it. The remaining messages are stored in the pubsub-ok bucket.
- The failing message is delivered five times with exponential backoff before it is published to the dead-letter topic.
- The processor of the deadletter-gs-saver subscription stores the message in the pubsub-error bucket. Creating a file in the error bucket causes GCS to post another message to the notification topic.
- The GCS-triggered message is processed by the notification service, which sends an email with a link to the file.
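The failing step in the flow can be sketched as a Pub/Sub push handler. This is a minimal illustration rather than the repository's actual code: the `failingPayload` marker and the function names are assumptions, and the write to the pubsub-ok bucket is left as a comment. The envelope struct follows the JSON shape Pub/Sub uses for push deliveries, where `data` arrives base64-encoded.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// pushEnvelope mirrors the JSON body Pub/Sub sends to a push endpoint.
type pushEnvelope struct {
	Message struct {
		Data       []byte            `json:"data"` // base64 in JSON, decoded by encoding/json
		Attributes map[string]string `json:"attributes"`
		MessageID  string            `json:"messageId"`
	} `json:"message"`
	Subscription string `json:"subscription"`
}

// failingPayload is a hypothetical marker for the one message hardcoded to fail.
const failingPayload = "message-7"

// statusFor picks the HTTP status for a decoded payload.
// Pub/Sub treats a 2xx response as an ack and anything else as a nack.
func statusFor(payload []byte) int {
	if string(payload) == failingPayload {
		return http.StatusNotAcceptable // 406: nack, so Pub/Sub redelivers
	}
	return http.StatusNoContent // 204: ack
}

// handlePush decodes the push envelope and acks or nacks the message.
func handlePush(w http.ResponseWriter, r *http.Request) {
	var env pushEnvelope
	if err := json.NewDecoder(r.Body).Decode(&env); err != nil {
		http.Error(w, "bad envelope", http.StatusBadRequest)
		return
	}
	// A real processor would write env.Message.Data to the pubsub-ok bucket here.
	w.WriteHeader(statusFor(env.Message.Data))
}

func main() {
	// "bWVzc2FnZS03" is the base64 encoding of "message-7".
	body := `{"message":{"data":"bWVzc2FnZS03","messageId":"1"},"subscription":"s"}`
	rec := httptest.NewRecorder()
	handlePush(rec, httptest.NewRequest(http.MethodPost, "/", strings.NewReader(body)))
	fmt.Println(rec.Code) // prints 406
}
```

Because the handler only signals success or failure through the status code, the number of delivery attempts, the backoff and the dead-letter routing all stay in the subscription's configuration instead of in application code.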
All parts of the solution are coded in a way that ensures separation of duties and the principle of least privilege. The publisher’s service account only has the permissions needed to publish messages to one specific topic. The processors only have permissions to write files into their respective buckets. If you look at the IAM page in the Google Cloud Console, you won’t see project-wide permissions assigned to any service account, because every necessary permission is applied at the resource level, which greatly limits the attack surface.
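To make the resource-level idea concrete, here is a small Go model of such a permission matrix. It is only an illustration: the service account names, the resource names and the exact roles are assumptions (the real bindings live in the Terraform configuration), although the roles shown are existing predefined IAM roles.

```go
package main

import (
	"fmt"
	"strings"
)

// binding grants one role to one service account on one resource.
type binding struct {
	Member   string // service account (hypothetical names)
	Role     string // predefined IAM role (assumed choice)
	Resource string // resource the role is bound to (hypothetical names)
}

var bindings = []binding{
	{"message-generator", "roles/pubsub.publisher", "topic/ordinary"},
	{"message-processor", "roles/storage.objectCreator", "bucket/pubsub-ok"},
	{"deadletter-gs-saver", "roles/storage.objectCreator", "bucket/pubsub-error"},
	{"notifier", "roles/secretmanager.secretAccessor", "secret/mailgun-api-key"},
}

// projectWide returns the bindings attached to a whole project
// instead of a single topic, bucket or secret.
func projectWide(bs []binding) []binding {
	var out []binding
	for _, b := range bs {
		if strings.HasPrefix(b.Resource, "project/") {
			out = append(out, b)
		}
	}
	return out
}

// allowed reports whether member holds role on exactly this resource.
func allowed(bs []binding, member, role, resource string) bool {
	for _, b := range bs {
		if b.Member == member && b.Role == role && b.Resource == resource {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("project-wide bindings:", len(projectWide(bindings)))
	fmt.Println("processor can write to error bucket:",
		allowed(bindings, "message-processor", "roles/storage.objectCreator", "bucket/pubsub-error"))
}
```

In this model the ordinary processor holds a role on the pubsub-ok bucket only, so a compromised processor cannot touch the error bucket, the secret or any topic.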
Fun! parts about the solution
The solution implements several neat tricks, recommended by Google:
- Each Cloud Run instance runs with a separate service account that is assigned a minimal set of permissions at the resource level (topic, subscription, bucket). E.g., the service account for the ordinary processor doesn’t have IAM roles to write to the pubsub-error bucket, and vice versa.
- Cloud Run instances handling processing and notifications are limited to internal ingress only.
- The Cloud Run notifier instance picks up the Mailgun token from Secret Manager, which is automagically injected as an environment variable.
- By using a multi-stage build in the Dockerfile, our image is only ~12 MB in size and therefore takes almost no time for Cloud Run to start.
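On the application side, the notifier only needs to read the injected environment variable and turn the GCS notification into a link. The sketch below assumes the variable is called `MAILGUN_API_KEY` and that the link uses the `storage.cloud.google.com` browser URL; both are illustrative choices, not taken from the repository. The attribute keys `eventType`, `bucketId` and `objectId` are the ones Cloud Storage sets on its Pub/Sub notifications. The actual Mailgun call is left out.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"os"
)

// mailgunKey reads the token that Cloud Run injects from Secret Manager.
// The variable name MAILGUN_API_KEY is an assumption for this sketch.
func mailgunKey() (string, error) {
	key, ok := os.LookupEnv("MAILGUN_API_KEY")
	if !ok || key == "" {
		return "", errors.New("MAILGUN_API_KEY is not set")
	}
	return key, nil
}

// fileLink builds a browser link to the object named in the
// attributes of a Cloud Storage Pub/Sub notification.
func fileLink(attrs map[string]string) (string, error) {
	if attrs["eventType"] != "OBJECT_FINALIZE" {
		return "", errors.New("not an object-creation event")
	}
	bucket, object := attrs["bucketId"], attrs["objectId"]
	if bucket == "" || object == "" {
		return "", errors.New("missing bucketId or objectId")
	}
	// url.URL escapes the path, so object names with spaces stay valid.
	u := url.URL{Scheme: "https", Host: "storage.cloud.google.com", Path: "/" + bucket + "/" + object}
	return u.String(), nil
}

func main() {
	if _, err := mailgunKey(); err != nil {
		fmt.Println("warning:", err)
	}
	link, err := fileLink(map[string]string{
		"eventType": "OBJECT_FINALIZE",
		"bucketId":  "pubsub-error",
		"objectId":  "msg-1.json",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(link)
}
```

Since the secret reaches the process as a plain environment variable, the notifier code needs no Secret Manager client of its own; the access control stays with the service account that Cloud Run runs under.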
Deploy it yourself!
Feel free to clone this repo, build the images using docker-compose, push them to GCR and apply the Terraform configuration, providing the necessary variables when you run the Terraform commands.