Building Inventa

Building inventa.shop to help millions of small entrepreneurs to succeed!

Infrastructure Resource Creation with Backstage


TL;DR

This article explores AWS resource creation (using SQS as an example) via the Backstage Software Templates plugin and how it helps application and platform developers deploy code faster, cleaner, and safer.

Photo by Danist Soh on Unsplash

Introduction

As a company grows in size, number of collaborators, and system complexity, it’s good (or dare I say, essential) for an engineering platform team to look at the Developer Experience as a whole.

One aspect of DevEx that has flourished over the past couple of years is the implementation of Internal Developer Platforms (IDPs), which help application developers deliver software more cleanly, safely, and quickly than before. They do this by decreasing the cognitive load on those teams through tools like a software catalog, service quality tiers, and software templates.

At Inventa, we built our IDP with Backstage as its central piece. In this article, we will explore how we leverage Backstage’s Software Templates plugin (plus some extensions 👀) and Terraform to let application developers create AWS resources on our infrastructure more easily and quickly.

Quick Recap

We already have an awesome article that goes into detail on our Terraform repo file layout and deployment, but the TL;DR is that we have a monorepo where each of our services declares the resources it needs in the following way:

"product-catalog" (* Project main path *)
├─ "catalog-info.yaml" (* Backstage configuration file *)
├─ "ecr.tf" (* Elastic Container Registry definitions *)
├─ "env"
│ ├─ "backends"
│ │ ├─ "dev.tfvars" (* Terraform backend file definitions *)
│ │ └─ "prod.tfvars" (* Terraform backend file definitions *)
│ ├─ "dev.tfvars" (* Terraform variables with values for development environment *)
│ └─ "prod.tfvars" (* Terraform variables with values for production environment *)
├─ "locals.tf" (* Terraform file to create expressions, so you can use the parameters multiple times within a module instead of repeating the expression. *)
├─ "route53.tf" (* Route53 definitions *)
├─ "secrets.tf" (* Secrets Manager definitions *)
├─ "sqs.tf" (* SQS Queue definitions *)
└─ "vars.tf" (* Terraform file to manage all variables *)

In the example above, we have a product-catalog service that creates ECR, Route53, Secrets Manager, and SQS resources through the ecr.tf, route53.tf, secrets.tf, and sqs.tf files, respectively. Beyond that, environment-specific variables are set in either env/dev.tfvars or env/prod.tfvars.

An important point of this architecture is that the recipe files (such as sqs.tf) are rarely (preferably never) changed; changes are made through the variables in env/dev.tfvars or env/prod.tfvars, depending on the environment.

For example, the sqs.tf looks something like this:

module "list_queue_deadletter" {
  # Module reference
  source   = "..."
  for_each = { for r in var.sqs_list_of_queues : r.name => r }
  name     = "${each.value.name}-dlq"
  ...
}

module "list_encrypted" {
  # Module reference
  source    = "..."
  for_each  = { for r in var.sqs_list_of_queues : r.name => r }
  name      = each.value.name
  sns_sub   = each.value.sns_sub
  event_sub = each.value.event_sub
  s3_sub    = each.value.s3_sub
  ...
  depends_on = [
    module.list_queue_deadletter
  ]
}

And the dev.tfvars/prod.tfvars files look like this:

...
# SQS vars
sqs_list_of_queues = [
  {
    "name": "product-catalog-created-event",
    "sns_sub": true,
    "event_sub": false,
    "s3_sub": null
  }
]
...

The idea is that every time we want to create a new SQS queue for this service, we only need to add a new entry to the sqs_list_of_queues array.
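To see why one new entry is enough, it helps to look at what the for_each expression in sqs.tf does: it turns the list into a map keyed by queue name, so every entry produces one queue and one matching dead-letter queue. Below is a rough Python analogue of that expansion. It is only an illustration (Terraform performs this evaluation itself), and the function name expand_queues is made up for this sketch.

```python
# Illustrative Python analogue of the Terraform for_each expansion in sqs.tf.
# Terraform does this itself; nothing here is part of the real pipeline.


def expand_queues(sqs_list_of_queues):
    """Mirror `{ for r in var.sqs_list_of_queues : r.name => r }` for both modules."""
    by_name = {queue["name"]: queue for queue in sqs_list_of_queues}
    return {
        # module "list_queue_deadletter": one DLQ per entry, named "<name>-dlq"
        "list_queue_deadletter": {name: f"{name}-dlq" for name in by_name},
        # module "list_encrypted": one queue per entry, carrying its subscription flags
        "list_encrypted": by_name,
    }


queues = [
    {
        "name": "product-catalog-created-event",
        "sns_sub": True,
        "event_sub": False,
        "s3_sub": None,
    },
]
resources = expand_queues(queues)
print(resources["list_queue_deadletter"])
```

Appending a second dictionary to queues would add one more key to each of the two maps, which is the Terraform-side effect of adding an entry to sqs_list_of_queues.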

Before the IDP…

Now, say that we have a new hire who wants to create a queue for a service that does not have any queues yet. The creation flow would be as follows:

  1. Clone the monorepo where we declare the resources for each service;
  2. Find the service folder and create a sqs.tf file (probably copying from other services);
  3. On the env/dev.tfvars and/or env/prod.tfvars create the sqs_list_of_queues array;
  4. Add the queue with the necessary information to the array;
  5. Open a PR so the platform team can review the changes;
  6. If everything in the step above is correct, merge the PR. The resources will be created automatically through the pipeline.

The flow works, but it has its flaws:

  1. Application developers should not necessarily know the ins and outs of how the infrastructure is created. The extra cognitive load that the process above brings takes away valuable time (and brain power) that could be focused on delivering value to the product.
  2. Code quality cannot be guaranteed at any level when the PR is opened, so platform developers spend more time on the review process.
  3. The process above has a lot of boilerplate and copy-and-paste, which leads to a more time-consuming and error-prone flow.

To address these points, we decided to use Backstage’s Software Templates feature to automate all the steps necessary to create an infrastructure resource. Before tackling that, though, we had a roadblock to address…

Terraform, Backstage, and Updating Files

Software Templates is a great Backstage feature, but it had a limitation for our specific use case. It’s all well and good when we want to create new files or delete existing ones during the template steps, using built-in actions like fetch:template, fetch:plain, and fs:delete. The problem comes when we want to surgically update an existing file.

Using our SQS example, if we wanted to add a new product-catalog-deleted-event queue, we would need to change the *.tfvars files to something like:

  ...
  # SQS vars
  sqs_list_of_queues = [
    {
      "name": "product-catalog-created-event",
      "sns_sub": true,
      "event_sub": false,
      "s3_sub": null
+   },
+   {
+     "name": "product-catalog-deleted-event",
+     "sns_sub": true,
+     "event_sub": false,
+     "s3_sub": null
    }
  ]
  ...

Sadly, these changes are not easy to do using just the actions that come with Backstage. Luckily, extensibility is a topic that the Backstage team takes very seriously, and the Software Templates plugin is no exception.

Introducing Roadie’s scaffolder-backend-module-utils actions package which, as the name implies, is an extension of the Software Templates plugin with actions that help work with files, JSON, YAML, etc. For our use case, we used the merge JSON action in conjunction with the fact that Terraform supports JSON files with Terraform variables.

So, for this to work, we migrated our dev.tfvars and prod.tfvars files to dev.tfvars.json and prod.tfvars.json, which now look something like this:

{
  ...
  "sqs_list_of_queues": [
    {
      "name": "product-catalog-created-event",
      "sns_sub": true,
      "event_sub": false,
      "s3_sub": null
    }
  ],
  ...
}

No other changes were made in the structure and now we are ready to implement the template on Backstage :)
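To make the "surgical update" concrete, here is a minimal Python sketch of the behaviour we rely on from the merge JSON action when arrays are merged: objects are merged key by key, and list values are appended to rather than replaced. It is a simplified model, not the package's actual implementation, and details such as de-duplication are not modelled. The helper name merge_with_array_concat and the some_other_var key are hypothetical.

```python
import json
from copy import deepcopy


def merge_with_array_concat(base, patch):
    """Return a copy of `base` with `patch` merged in; lists are concatenated."""
    result = deepcopy(base)
    for key, value in patch.items():
        current = result.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            # Nested objects are merged recursively.
            result[key] = merge_with_array_concat(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            # Existing list entries are kept and the new ones are appended.
            result[key] = current + deepcopy(value)
        else:
            # Scalars and missing keys simply take the patch value.
            result[key] = deepcopy(value)
    return result


existing = {
    "some_other_var": "untouched",  # hypothetical stand-in for the elided variables
    "sqs_list_of_queues": [
        {"name": "product-catalog-created-event", "sns_sub": True, "event_sub": False, "s3_sub": None}
    ],
}
patch = {
    "sqs_list_of_queues": [
        {"name": "product-catalog-deleted-event", "sns_sub": True, "event_sub": False, "s3_sub": None}
    ]
}

merged = merge_with_array_concat(existing, patch)
print(json.dumps(merged, indent=2))
```

The point of the sketch is that the template only has to supply the new queue entry; everything already in the file stays as it was.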

Template Implementation

With all that in place, the idea is to create a template that automatically configures the necessary files depending on what is needed:

  • If the service already has the SQS configuration it just updates the sqs_list_of_queues array with the new queue.
  • If not, it creates the array, and creates the sqs.tf file.

Here is an example of what this template YAML looks like:

apiVersion: scaffolder.backstage.io/v1beta3
kind: Template

# some metadata about the template itself
metadata:
  name: sqs-queue-template
  title: SQS Queue Template
  description: Template for creating a new SQS queue
spec:
  owner: engineering-foundation
  type: sqs-queue

  # these are the steps which are rendered in the frontend with the form input
  parameters:
    - title: Service information
      required:
        - entity
      properties:
        entity:
          title: Service
          description: Choose a service to create the queue for
          type: string
          ui:field: EntityPicker
          ui:options:
            allowArbitraryValues: false
            catalogFilter:
              - kind: Component
                spec.type: service
            defaultKind: Component
    - title: SQS queue general information
      required:
        - queueName
        - envs
      properties:
        queueName:
          title: Queue name
          type: string
          description: Name of the new queue
        envs:
          title: Environments
          description: Select environments to create the queue
          type: array
          minItems: 1
          items:
            type: string
            enum:
              - production
              - development
          uniqueItems: true
          ui:widget: checkboxes
        createConfiguration:
          title: Create initial SQS configuration
          description: Should the initial SQS configuration be created for the service?
          type: boolean
          default: false
    - title: SQS queue specific information
      properties:
        snsSub:
          title: SNS subscription
          description: Will the queue subscribe to an SNS topic?
          type: boolean
          default: false
        eventSub:
          title: EventBridge subscription
          description: Will the queue subscribe to an EventBridge event?
          type: boolean
          default: false
        s3Sub:
          title: S3 subscription
          description: Set the ARN of the S3 bucket (leave empty if the queue will not subscribe to an S3 bucket)
          type: string

  # here are the steps that are executed in series in the scaffolder backend
  steps:
    - id: fetch-vars
      name: Fetch service terraform variables
      action: fetch:plain
      input:
        url: https://github.com/inventa-shop/aws-resources/blob/main/terraform/${{ parameters.entity | parseEntityRef | pick('name') }}/env
        targetPath: ./output/env
    - id: add-queue-dev
      name: Add queue to list of queues (development)
      if: ${{ parameters.envs.includes('development') }}
      action: roadiehq:utils:json:merge
      input:
        path: ./output/env/dev.tfvars.json
        mergeArrays: true
        content:
          sqs_list_of_queues:
            - name: ${{ parameters.queueName }}
              sns_sub: ${{ parameters.snsSub }}
              event_sub: ${{ parameters.eventSub }}
              s3_sub: ${{ parameters.s3Sub if parameters.s3Sub else null }}
    - id: add-queue-prod
      name: Add queue to list of queues (production)
      if: ${{ parameters.envs.includes('production') }}
      action: roadiehq:utils:json:merge
      input:
        path: ./output/env/prod.tfvars.json
        mergeArrays: true
        content:
          sqs_list_of_queues:
            - name: ${{ parameters.queueName }}
              sns_sub: ${{ parameters.snsSub }}
              event_sub: ${{ parameters.eventSub }}
              s3_sub: ${{ parameters.s3Sub if parameters.s3Sub else null }}
    - id: add-empty-queue-list-dev
      name: Add empty queue list (development)
      if: ${{ parameters.envs.includes('development') === false and parameters.createConfiguration }}
      action: roadiehq:utils:json:merge
      input:
        path: ./output/env/dev.tfvars.json
        mergeArrays: true
        content:
          sqs_list_of_queues: []
    - id: add-empty-queue-list-prod
      name: Add empty queue list (production)
      if: ${{ parameters.envs.includes('production') === false and parameters.createConfiguration }}
      action: roadiehq:utils:json:merge
      input:
        path: ./output/env/prod.tfvars.json
        mergeArrays: true
        content:
          sqs_list_of_queues: []
    - id: fetch-skeleton
      name: Fetch new SQS configuration files
      if: ${{ parameters.createConfiguration }}
      action: fetch:plain
      input:
        url: ./skeleton
        targetPath: ./output
    - id: publish-pr
      action: publish:github:pull-request
      name: Create pull request
      input:
        repoUrl: github.com?repo=<REPO>&owner=<OWNER>
        branchName: feature/add-${{ parameters.queueName }}-queue
        targetBranchName: main
        title: Create ${{ parameters.queueName }} SQS queue
        description: |
          This PR adds a new ${{ parameters.queueName }} SQS queue for the ${{ parameters.entity | parseEntityRef | pick('name') }} service.
          Author: @${{ user.entity.metadata.name }}
          _Created automatically via Backstage_ 🚀
        sourcePath: ./output
        targetPath: terraform/${{ parameters.entity | parseEntityRef | pick('name') }}
        commitMessage: (feature) add ${{ parameters.queueName }} SQS queue
        forceEmptyGitAuthor: true

  # some outputs which are saved along with the job for use in the frontend
  output:
    links:
      - title: Pull Request
        url: ${{ steps['publish-pr'].output.remoteUrl }}

In this example, we get the service that needs the queue via the EntityPicker and, after that, the queue information and the environments in which it will be created. In the steps, we can see that the createConfiguration variable determines whether the skeleton (where the sqs.tf file resides) is used.
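The if conditions on the steps can be hard to follow at a glance, so here is a small Python sketch that restates them: given the selected environments and the createConfiguration flag, it returns the ids of the steps that would run. The function steps_to_run is purely illustrative and not part of Backstage; the step ids are the ones from the template above.

```python
def steps_to_run(envs, create_configuration):
    """Return the template step ids that run for the given form input."""
    steps = ["fetch-vars"]  # always fetch the service's env folder first

    # Add the queue entry to every selected environment.
    if "development" in envs:
        steps.append("add-queue-dev")
    if "production" in envs:
        steps.append("add-queue-prod")

    # On first-time setup, environments that were not selected still get an
    # empty sqs_list_of_queues so the new sqs.tf has a variable to read.
    if "development" not in envs and create_configuration:
        steps.append("add-empty-queue-list-dev")
    if "production" not in envs and create_configuration:
        steps.append("add-empty-queue-list-prod")

    # The skeleton (with sqs.tf) is only fetched on first-time setup.
    if create_configuration:
        steps.append("fetch-skeleton")

    steps.append("publish-pr")  # always finish by opening the pull request
    return steps


print(steps_to_run(["development"], create_configuration=True))
```

For a service that already has its SQS configuration, only the add-queue steps for the chosen environments run between fetching the variables and opening the PR.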

Below we have some screenshots of the actual template that we are using (a little more complex than the one we presented above but the basics are the same):

On this page, we select for which service the queue will be created
Here we set the queue name, envs on which it will be created, and if the initial configuration needs to be done
Here we set some configurations of the queue
Finally, we can check if everything is correct

After the PR is opened, the platform team can review it as usual, with the difference that only what the user entered in the template needs to be checked, since the rest is pre-configured by the template.

This logic can be extended to all kinds of resources in the company. The strategy solves the pain points mentioned at the beginning of the article: application teams can easily create resources and focus their time and effort on the business, while platform teams only have to review the code generated from user input.
