Complex AWS ECS container version management with AWS Systems Manager Parameter Store

Published in VRT Digital Products · 7 min read · Jul 26, 2022

Combining docker versioning with AWS Parameter Store

Introduction

Managing software versions in highly dynamic environments is a challenge for many development teams. It is best practice to encode the versions you use in a configuration as code system. The obvious drawback is that this configuration must be actively maintained to remain up to date. If software versions change often, for example multiple times per day, these updates need to be automated to be viable.

One approach is to do just that: have a continuous integration system update the values in your configuration files whenever a new version is available. This naturally results in many automated updates to your repository, which can cause race conditions, conflicts and potentially lost updates.

In this post we investigate an alternative. Instead of storing the versions directly in the configuration as code files, we put in place a reference to an external source of truth that holds the correct current version at all times. To be an effective alternative to the first approach, updating and reading this external source of truth should be relatively easy. At VRT we store our current production and testing Docker container image versions in the AWS Systems Manager Parameter Store, and we use a CloudFormation Custom Resource to extract this information when updating our CloudFormation stacks.

Docker Image Versioning

Before diving into the Parameter Store data extraction, I want to briefly touch upon how Docker images are versioned. Each image stored in a container registry can be referenced in several ways. The easiest is to use its tag, which is found after the last colon in the full image string. For example, the tag of the image myregistry:5000/group/image:version1.0 is version1.0. Note that the earlier colon in this string separates the registry host from its port and is not part of the tag. According to the official Docker documentation, a tag name must be valid ASCII and may contain lowercase and uppercase letters, digits, underscores, periods and dashes. It may not start with a period or a dash and may contain at most 128 characters. This leaves plenty of room for your own organizational preferences. Most commonly, however, a tag is relatively short and indicates a linear versioning order in one way or another. Tags can be overwritten, but this is considered bad practice.
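To make the tag rules above concrete, here is a minimal Python sketch that splits an image string into name and tag and checks the tag against the quoted naming rules. The function names are hypothetical, and the sketch assumes tag based references only (references by hash are not handled).

```python
import re
from typing import Optional, Tuple

# Tag rules as quoted from the Docker documentation above: ASCII letters,
# digits, underscores, periods and dashes; no leading period or dash; at
# most 128 characters.
TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def split_image_reference(reference: str) -> Tuple[str, Optional[str]]:
    """Split an image string into (name, tag); tag is None when absent."""
    name, separator, tag = reference.rpartition(":")
    # A slash after the last colon means that colon belongs to the
    # registry port (myregistry:5000/group/image), not to a tag.
    if not separator or "/" in tag:
        return reference, None
    return name, tag


def is_valid_tag(tag: str) -> bool:
    """Check a tag against the naming rules described above."""
    return TAG_PATTERN.fullmatch(tag) is not None


name, tag = split_image_reference("myregistry:5000/group/image:version1.0")
print(name, tag)             # myregistry:5000/group/image version1.0
print(is_valid_tag(tag))     # True
print(is_valid_tag("-rc1"))  # False: a tag may not start with a dash
```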

One special tag is latest. It is the default tag that is applied whenever an image is built or pushed without an explicit tag, so by design it is overwritten each time such an image is uploaded. Using it means you never have to specify a particular version: the “latest” version is fetched whenever a new instance is created. While this approach might seem similar to, and evidently simpler than, the one explained in this post, it is also inherently more limited. You effectively relinquish control over when an image version is promoted to a specific environment, because the tag moves as soon as a new image is uploaded. New container instances, created for example by an automatic scaling event, could then unexpectedly run a different version without this being explicitly communicated to the developers. It is thus not recommended to use the latest tag in production deployments.

A final possibility is to reference a container image using its specific hash, but this is inconvenient at best and provides no obvious benefits compared to the regular tagging system, so we’ll disregard it besides this brief mention.

Tags are thus the best way to specify your container image versions. As mentioned in the introduction, however, we don’t want to put these strings directly in our CloudFormation templates, as they would need constant updates. To solve this we leverage the AWS Systems Manager Parameter Store.

AWS Systems Manager Parameter Store

The AWS Systems Manager Parameter Store, henceforth just Parameter Store, provides hierarchical storage for configuration and secret management. Values can be stored either as plain text strings or as “secure” strings, which are encrypted. It integrates with other AWS services, including CloudFormation. This integration already sees widespread use for selecting EC2 AMIs. It is an AWS best practice to pass the desired AMI via a CloudFormation Parameter whose value is in fact the key of a Parameter Store entry, which is then resolved to the actual AMI identifier. This design pattern promotes convenient centralized management and makes governance easier to implement.

An example of leveraging the Parameter Store for determining EC2 instance AMIs could be the following:

"Parameters": {
  "ExampleAmiId": {
    "Default": "/path/to/prod/ami/key",
    "Description": "Example AMI",
    "Type": "AWS::SSM::Parameter::Value<AWS::EC2::Image::Id>"
  }
}

Such a parameter, when filled in correctly, resolves to the identifier you wish to use. In the CloudFormation Parameters tab, it looks like this:

+--------------+-----------------------+-------------------+
|     Key      |         Value         |  Resolved Value   |
+--------------+-----------------------+-------------------+
| ExampleAmiId | /path/to/prod/ami/key | ami-01234567abcd  |
+--------------+-----------------------+-------------------+

Referencing this value can then be done quite easily in your Launch Configuration:

"LaunchConfig": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "ImageId": {
      "Ref": "ExampleAmiId"
    },
    […]
  }
},

This integration works very well, since a Launch Configuration usually focuses on a single AMI. This monolithic design is less common with containers though. Our ECS Task Definitions usually specify multiple container images that run next to each other: one for the main application and several supporting containers (known as sidecars). The sidecar pattern encourages you to keep individual containers simple, as each should focus only on its single purpose. Additional functionality that is not directly related to the application is extracted to other, often generic, containers. Common examples are a reverse proxy like nginx, or a Prometheus exporter that exposes metrics for your monitoring systems.

This multitude of containers and corresponding container images introduces additional complexity in our original setup. We could still use the 1-to-1 mapping of parameter to value, but this would result in a very large number of additional parameters (multiplied by any duplication introduced to support multiple environments with different versions). Instead we decided to bundle logically coupled image versions in the same Parameter Store entry by encoding them in a JSON string. An example of this would be the following:

{"radio/air-socket-app":"buster-node-14.18.2-1658150132991","radio/air-socket-nginx":"bullseye-1658651524052"}

The JSON example above specifies two images: one for an application and one for a reverse proxy that is specifically configured for this application. Generic sidecars are stored in their own parameters, to improve reusability across teams and products. Updating these entries is handled by our CI tools: after each image build and test, the entries are automatically updated with the most recent values. We now have our image versions stored behind a collection of Parameter Store keys, wrapped in JSON structures. CloudFormation offers no direct support for parsing these into Mappings that our templates can use, so for this we turn to our Custom Resource, the final piece of the puzzle.
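The CI update step described above could look roughly like the following Python sketch. It is not our actual CI code. The function names are hypothetical, the entry is assumed to exist already, and ssm_client is assumed to be an object offering the boto3 SSM client calls get_parameter and put_parameter.

```python
import json


def set_image_tag(entry_json: str, image: str, tag: str) -> str:
    """Return the JSON entry with one image's tag replaced or added."""
    versions = json.loads(entry_json)
    versions[image] = tag
    # Compact separators keep the stored string short, as in the example.
    return json.dumps(versions, separators=(",", ":"))


def publish_image_tag(ssm_client, parameter_name: str, image: str, tag: str) -> None:
    """Read, modify and write back one Parameter Store entry.

    Assumes the entry already exists and holds a JSON object.
    """
    current = ssm_client.get_parameter(Name=parameter_name)["Parameter"]["Value"]
    ssm_client.put_parameter(
        Name=parameter_name,
        Value=set_image_tag(current, image, tag),
        Type="String",
        Overwrite=True,
    )


entry = '{"radio/air-socket-app":"old-tag","radio/air-socket-nginx":"bullseye-1658651524052"}'
print(set_image_tag(entry, "radio/air-socket-app", "buster-node-14.18.2-1658150132991"))
```

In a build job this could be called as publish_image_tag(boto3.client("ssm"), "/containers/prod/team/app", "radio/air-socket-app", new_tag). The read and write are two separate calls, so two builds that update the same entry at the same moment could still overwrite each other.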

JSON Parsing Custom Resource

The ParseDict Custom Resource class provides the support we need: it takes a list of Parameter Store keys and returns a dictionary that can be accessed from troposphere (for our Python based CloudFormation templates). It can be found, together with other useful Custom Resources, in the vrtdev GitHub repository, and the underlying Lambda implementation is publicly available as well. The Lambda function simply retrieves all the given parameters and combines their JSON bodies into a single Python dictionary, which is then returned. A second property, the Serial, allows for forced updates. CloudFormation only re-invokes a Custom Resource when one of its properties changes, so changing the Serial gives you fine control over when the Lambda code is re-evaluated. In normal circumstances, this would be with every CloudFormation update. Mapping the Serial to a timestamp is an easy way to achieve this.
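The core of that retrieve and combine step can be sketched in a few lines of Python. This is an illustration only, not the code of the actual Lambda function. The function names are hypothetical, ssm_client is assumed to offer the boto3 SSM call get_parameter, and letting later entries win on duplicate keys is an assumption.

```python
import json
from typing import Dict, Iterable


def merge_json_bodies(bodies: Iterable[str]) -> Dict[str, str]:
    """Combine several JSON objects into one dictionary."""
    merged: Dict[str, str] = {}
    for body in bodies:
        parsed = json.loads(body)
        if not isinstance(parsed, dict):
            raise ValueError("each parameter value must be a JSON object")
        # Assumption: later entries win when the same key appears twice.
        merged.update(parsed)
    return merged


def parse_dict(ssm_client, names: Iterable[str]) -> Dict[str, str]:
    """Fetch every named parameter and merge their JSON bodies."""
    bodies = [
        ssm_client.get_parameter(Name=name)["Parameter"]["Value"]
        for name in names
    ]
    return merge_json_bodies(bodies)


versions = merge_json_bodies([
    '{"radio/air-socket-app":"buster-node-14.18.2-1658150132991"}',
    '{"radio/air-socket-nginx":"bullseye-1658651524052"}',
])
print(versions["radio/air-socket-nginx"])  # bullseye-1658651524052
```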

Taking things back to our template, we can pass our desired list of parameters as a CommaDelimitedList, for example:

"ServiceImageParamsString": {
  "Default": "/containers/prod/team/app,/containers/prod/team/nginx,/containers/prod/generic/consul,/containers/prod/generic/nginx-exporter",
  "Description": "The SSM PS keys to read for the service image versions",
  "Type": "CommaDelimitedList"
},

The Custom Resource can then be defined as follows:

"ServiceImageParams": {
  "Properties": {
    "Names": {
      "Ref": "ServiceImageParamsString"
    },
    "Serial": "123456789",
    "ServiceToken": {
      "Fn::ImportValue": {
        "Fn::Sub": "${CustomResourcesStack}-SsmParseDictServiceToken"
      }
    }
  },
  "Type": "Custom::SsmParseDict"
},
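If you generate your templates from Python, the resource above can be built with a timestamp as its Serial. The sketch below uses plain dictionaries rather than troposphere, and the function name parse_dict_resource is hypothetical. The property names and the exported ServiceToken name are taken from the snippet above.

```python
import json
import time
from typing import Any, Dict, Optional


def parse_dict_resource(names_parameter: str, serial: Optional[str] = None) -> Dict[str, Any]:
    """Build the Custom::SsmParseDict resource shown above."""
    if serial is None:
        # A fresh timestamp changes the properties on every render, so
        # CloudFormation re-invokes the Custom Resource on each update.
        serial = str(int(time.time()))
    return {
        "Type": "Custom::SsmParseDict",
        "Properties": {
            "Names": {"Ref": names_parameter},
            "Serial": serial,
            "ServiceToken": {
                "Fn::ImportValue": {
                    "Fn::Sub": "${CustomResourcesStack}-SsmParseDictServiceToken"
                }
            },
        },
    }


resources = {"ServiceImageParams": parse_dict_resource("ServiceImageParamsString")}
print(json.dumps(resources, indent=2))
```

Passing an explicit serial instead pins the resource, so the versions are only re-read when you change that value yourself.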

Finally, you can extract information from this Custom Resource using the native CloudFormation function Fn::GetAtt:

"Image": {
  "Fn::Sub": [
    "customregistry:5000/radio/air-socket-nginx:${image}",
    {
      "image": {
        "Fn::GetAtt": [
          "ServiceImageParams",
          "radio/air-socket-nginx"
        ]
      }
    }
  ]
},
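With several containers per Task Definition, writing this Fn::Sub structure by hand gets repetitive. A small helper can generate it, again as a sketch with plain Python dictionaries rather than troposphere. The helper name image_reference and the derived container names are hypothetical.

```python
import json
from typing import Any, Dict


def image_reference(registry: str, image: str,
                    resource: str = "ServiceImageParams") -> Dict[str, Any]:
    """Build the Fn::Sub / Fn::GetAtt structure shown above for one image."""
    return {
        "Fn::Sub": [
            # "${{image}}" renders as the literal "${image}" placeholder.
            f"{registry}/{image}:${{image}}",
            {"image": {"Fn::GetAtt": [resource, image]}},
        ]
    }


# One container definition per image, all resolved from the same resource.
containers = [
    {
        "Name": image.split("/")[-1],
        "Image": image_reference("customregistry:5000", image),
    }
    for image in ("radio/air-socket-app", "radio/air-socket-nginx")
]
print(json.dumps(containers, indent=2))
```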

As you can imagine, this Custom Resource can also be reused to extract any strings from JSON formatted data spread over multiple Parameter Store entries. Here it solves our initial problem: referencing an external source of truth in our configuration as code to determine which Docker container image versions to use.

Conclusion

In this post we presented a Custom Resource developed at VRT that parses a collection of Parameter Store entries, formatted as JSON strings, and exposes them as a dictionary that can subsequently be queried using native CloudFormation functions like GetAtt. We leverage this Custom Resource to read which Docker image versions to launch in our ECS Tasks from a central source of truth, in this case the Parameter Store. By decoupling this configuration from our configuration as code, we simplify its management, while maintaining version control. The deployed container versions are strict, clearly reported to the developer and fully configurable per environment. The proposed solution thus promotes correctness in deployments, while still being flexible enough for developers to use during development.