Get your Auto Scaling Groups Automatically Patched on AWS with Terraform and SSM

Gustavo Zanotto
4 min read · Apr 8, 2023


This guide shows how to use AWS Patch Manager to automate OS patching of EC2 instances, both Linux and Windows, that run under an Auto Scaling Group. We will use Terraform to provision the AWS resources required to patch any supported Linux distribution or Windows version.

When instances run under an Auto Scaling Group, patching is not as simple as it is for standalone instances, because instances can be launched and terminated at any time. Typically, engineers have to perform several manual steps to patch an Auto Scaling Group successfully. Here, we will automate this procedure so that our Auto Scaling Groups are patched on a defined schedule without any human action.

Creating the AWS Resources

We will be leveraging IaC with Terraform to create and configure all the required infrastructure for our demonstration.

The module is published on the Terraform Registry as gutozntt/ssm-patch-asg/aws.

Module Usage

This module creates an SSM Automation Document that performs the automation steps, the IAM roles used by SSM and EC2, SSM Parameters for managing AMIs and Launch Templates, and Security Groups for orphan instances.

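module "ssm_patch_asg" {
  source = "gutozntt/ssm-patch-asg/aws"

  name = "ssm-patch-asg"

  # One entry per Auto Scaling Group to patch.
  targets = {
    myasg1 = {
      asg_name       = "myasg1"                # name of the existing ASG
      initial_ami    = "ami-06s22xxxx4e"       # AMI used for the first run
      schedule       = "cron(0 0 1 ? * SUN *)" # maintenance window schedule
      subnet_id      = "subnet-04xxxxss504"
      retention_days = 20                      # days before old AMIs and Launch Templates are cleaned up
    },
    myasg2 = {
      asg_name       = "myasg2"
      initial_ami    = "ami-06s22xxxx4e"
      schedule       = "cron(0 0 1 ? * SUN *)"
      subnet_id      = "subnet-04xxxxss504"
      retention_days = 20
    }
  }

  # Assumes a local.tags map is defined elsewhere in your configuration.
  tags = local.tags
}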

How it works

Once the module is deployed to your AWS account, the following steps happen for each ASG target:

  1. Terraform creates two SSM Parameters: one is initialized with the initial_ami that you set above and the other is initialized with a placeholder string. These SSM Parameters are updated with new values on every automation run.
  2. When the date and time match your cron schedule, the maintenance window triggers a maintenance window task that runs the SSM Automation Document.
  3. The SSM Automation Document starts by launching a new instance from your initial AMI.
  4. Once the instance is running, the automation checks that the instance is pingable and managed by SSM.
  5. The patches are installed using the default document AWS-RunPatchBaseline, which applies the default baseline for the OS in question. If you want to use a custom baseline, add patch_baseline = "your_baseline_name" to your target block.
  6. Once the patches are applied, the instance is stopped and a new AMI is created from it.
  7. Once the AMI is created, the instance is terminated.
  8. Next, the automation runs a Python script that uses the AWS library boto3.
  9. The script starts by creating a new launch template based on the current setup. It copies all the existing configuration, such as Volumes and Security Groups.
  10. The new launch template references the new AMI created in Step 6.
  11. Once the new launch template is created, the script updates the ASG to use it and starts an instance refresh. The instance refresh replaces the running instances one by one as the new ones become healthy. It's recommended to have ELB Health Checks enabled in your ASG when using this automation.
  12. Once the ASG is up to date, the script updates the SSM Parameters with the new AMI and the new Launch Template ID.
  13. Finally, the script cleans up old AMIs and Launch Templates so that they do not pile up indefinitely in the account. The clean up process is explained below.
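
To make Steps 11 and 12 more concrete, here is a minimal sketch of what that part of a boto3 script could look like. It is not the module's actual code: the function name, the SSM Parameter names passed in, and the polling interval are all assumptions made for illustration. The clients are passed in as arguments (for example boto3.client("autoscaling") and boto3.client("ssm")), which also makes the function easy to exercise with stubs.

```python
import time

# Instance refresh statuses after which polling stops.
TERMINAL_STATUSES = {"Successful", "Failed", "Cancelled",
                     "RollbackSuccessful", "RollbackFailed"}


def roll_asg_to_template(autoscaling, ssm, asg_name, launch_template_id,
                         ami_id, ami_param, lt_param, poll_seconds=30):
    """Switch the ASG to a new launch template, refresh it, record the result.

    `autoscaling` and `ssm` are boto3 clients. `ami_param` and `lt_param`
    are hypothetical SSM Parameter names, not the ones the module uses.
    """
    # Step 11: point the ASG at the new launch template.
    autoscaling.update_auto_scaling_group(
        AutoScalingGroupName=asg_name,
        LaunchTemplate={"LaunchTemplateId": launch_template_id,
                        "Version": "$Latest"},
    )
    # Step 11 (continued): start a rolling replacement of running instances.
    refresh_id = autoscaling.start_instance_refresh(
        AutoScalingGroupName=asg_name, Strategy="Rolling"
    )["InstanceRefreshId"]

    # Poll until the refresh reaches a terminal status.
    while True:
        refreshes = autoscaling.describe_instance_refreshes(
            AutoScalingGroupName=asg_name, InstanceRefreshIds=[refresh_id]
        )["InstanceRefreshes"]
        status = refreshes[0]["Status"]
        if status in TERMINAL_STATUSES:
            break
        time.sleep(poll_seconds)
    if status != "Successful":
        raise RuntimeError(f"Instance refresh {refresh_id} ended as {status}")

    # Step 12: only after a successful refresh, store the new AMI and
    # Launch Template ID in the existing SSM Parameters.
    ssm.put_parameter(Name=ami_param, Value=ami_id, Overwrite=True)
    ssm.put_parameter(Name=lt_param, Value=launch_template_id, Overwrite=True)
    return refresh_id
```

Updating the SSM Parameters only after the refresh succeeds keeps them pointing at the last known good AMI if a rollout fails, which is one reasonable design choice among several.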

AMIs and Launch Templates Clean Up

To avoid accumulating AMIs and Launch Templates in your AWS account, the automation tags every new AMI and Launch Template it creates with the following tag:

DeleteAfter = mm-dd-yyyy # This value will be today's date + retention_days previously set

When the boto3 script runs Step 13, it looks for AMIs and Launch Templates that carry this tag and deletes those whose DeleteAfter date has already passed.
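
The tagging and expiry check can be sketched with plain date arithmetic. This is an illustrative sketch rather than the module's actual code: the function names are hypothetical, and it assumes that "has passed" means today is strictly later than the date in the tag. The resource dictionaries follow the shape boto3 returns for AMIs and Launch Templates (a "Tags" list of "Key"/"Value" pairs).

```python
from datetime import date, datetime, timedelta

TAG_KEY = "DeleteAfter"
TAG_FORMAT = "%m-%d-%Y"  # mm-dd-yyyy, as used in the tag value


def delete_after_value(today, retention_days):
    """Return the tag value: today's date plus retention_days."""
    return (today + timedelta(days=retention_days)).strftime(TAG_FORMAT)


def is_expired(tag_value, today):
    """True if the DeleteAfter date is strictly before today (assumption)."""
    return datetime.strptime(tag_value, TAG_FORMAT).date() < today


def expired_ids(resources, today, id_key="ImageId"):
    """Collect the IDs of resources whose DeleteAfter tag has passed.

    Resources without the tag are ignored, so anything not created by
    the automation is never selected for deletion.
    """
    expired = []
    for resource in resources:
        tags = {t["Key"]: t["Value"] for t in resource.get("Tags", [])}
        if TAG_KEY in tags and is_expired(tags[TAG_KEY], today):
            expired.append(resource[id_key])
    return expired


if __name__ == "__main__":
    print(delete_after_value(date(2023, 4, 8), 20))
```

For Launch Templates the same helper can be called with id_key="LaunchTemplateId". The actual deletion calls are left out on purpose, since they depend on how the script handles related resources such as AMI snapshots.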

Recommendations

  1. Make sure you use ELB Health Checks in your ASGs (if applicable).
  2. Consider keeping old Launch Templates and AMIs for rollback purposes. For example, if you intend to run the maintenance monthly, consider a retention of 90 days (30*3) so that 3 Launch Templates and AMIs remain available for rollback. If you run it weekly, use 21 days of retention (7*3). If 3 feels like too many, apply the same rule with 2: 14 days (7*2) for a weekly schedule or 60 days (30*2) for a monthly one.
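
The retention rule of thumb above is simple arithmetic: the number of days between runs multiplied by the number of rollback copies you want to keep. A tiny helper, with a hypothetical name and not part of the module, makes the rule explicit:

```python
def retention_days(interval_days, rollback_copies):
    """Days of retention needed to keep `rollback_copies` past runs."""
    if interval_days <= 0 or rollback_copies <= 0:
        raise ValueError("interval_days and rollback_copies must be positive")
    return interval_days * rollback_copies


if __name__ == "__main__":
    # Monthly runs keeping 3 copies, then weekly runs keeping 3 copies.
    print(retention_days(30, 3))
    print(retention_days(7, 3))
```

The result is the value you would set as retention_days in the corresponding target block.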

