Automate Custom Patch Process of AWS EC2 Instances Using Amazon Systems Manager

Sep 14, 2021

Custom Patch Process using Amazon Systems Manager

In today’s world, security is one of the top priorities of a business. We all spend a certain amount of time preventing hackers from stealing sensitive information. Hackers exploit vulnerabilities found in a system or server and attack the network, which may take an entire company down. Yet one of the most important measures for preventing this is often given too little importance. That seemingly trivial measure is security patching.

In this post, I’ll show you a modernized solution to automate the patching process of Linux instances using AWS Systems Manager (SSM). We will also see how we can generate various reports using AWS APIs that can be useful for auditing. The same process can be applied to patching Windows instances as well.

Solution Overview

This solution shows how to orchestrate the end-to-end patching cycle of one or more AWS EC2 instances without needing to SSH into any Linux instance. I have used Python to automate this solution; however, you can use any supported programming language you are comfortable with. The solution has three parts:

Define and automate pre-patching journey
Define and automate patching journey
Define and automate post-patching journey

Pre-requisites

1. Make sure that AWS Systems Manager is set up and that the AWS Systems Manager agent (SSM Agent) is installed and up to date on all of the instances you want to patch
2. Create SSM Patch Baselines with appropriate fields for security patching. Please refer to the AWS documentation for details
3. Define a tagging strategy for your instances. In my case, I have defined a tagging strategy based on environments and applications. Say I have Dev, QA, UAT and Production environments and a variety of applications such as Nginx, web servers, backend servers etc. In such cases, I prefer tagging EC2 instances as below

Environment = DEV | QA | UAT | PROD
Application = NGINX | WEB_SERVER | BACKEND_SERVER
Patch Group = AMAZON_LINUX1 | AMAZON_LINUX2 | WINDOWS

We can use these tags to select the right application in the right environment. We can baseline the lowest environment to identify missing patches and then install the same list on all the higher environments. This tagging process helps us make sure that all the environments of an application have the same set of patches installed during a cycle

4. Make sure that your instance is tagged with the “Patch Group” key and an appropriate value so that the correct patch baseline for the instance’s OS is applied while patching. Please refer to the AWS documentation for more information on this
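The tagging strategy above can be turned into the `Targets` argument that the SSM calls later in this post accept. The snippet below is a minimal sketch: the helper name `build_tag_targets` is hypothetical, and it assumes the tag keys are spelled exactly `Environment` and `Application`. To my understanding, Run Command selects instances that match all of the listed tag keys, but please verify this against your own account before relying on it.

```python
def build_tag_targets(environment, application):
    """Return an SSM `Targets` list that selects instances by tag values.

    Assumes instances carry the `Environment` and `Application` tags
    described above; SSM expects tag keys in the form `tag:<TagKey>`.
    """
    return [
        {"Key": "tag:Environment", "Values": [environment]},
        {"Key": "tag:Application", "Values": [application]},
    ]


# Example: select the Nginx servers of the DEV environment
dev_nginx_targets = build_tag_targets("DEV", "NGINX")
```

The resulting list can be passed as `Targets=dev_nginx_targets` instead of listing instance ids one by one.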

Solutioning

1. Pre-Patching Journey

Activities included in Pre-Patching Journey are

Scan EC2 instance for missing patches
Generate a pre-patch report for auditing purposes
Approve list of patches to be installed

2. Patching Journey

Take an AMI backup of EC2 instance before patching
Install approved patches on EC2 instance(s)

3. Post-Patching Journey

Generate a Post-Patch report for auditing purposes
Perform sanity of EC2 instance(s)

Let’s have a look at the following workflow diagram to understand all these steps in a pictorial view

Now let’s understand how we can achieve this automation using various AWS APIs in a Python program. You can use this as a reference and call the equivalent APIs from any other supported programming language, e.g. Java or Node.js, or even from the AWS CLI

Implementation — Pre-Patching Journey

Scan EC2 instance for missing patches

Request — Execute SSM document in Scan Mode, No Reboot with instance id or appropriate tags in filters

import boto3

client = boto3.client('ssm')

response = client.send_command(
    # Typically you pass either InstanceIds or Targets (tags), not both
    InstanceIds=['string'],
    Targets=[
        {
            'Key': 'string',
            'Values': ['string']
        }
    ],
    DocumentName='AWS-RunPatchBaseline',
    DocumentVersion='$LATEST',
    TimeoutSeconds=900,
    Comment='<SOME COMMENT>',
    Parameters={
        'Operation': ['Scan'],
        'RebootOption': ['NoReboot']
    },
    OutputS3BucketName='<S3_BUCKET_NAME TO WRITE SSM COMMAND LOGS>',
    OutputS3KeyPrefix='<S3_BUCKET_KEY_PREFIX TO WRITE SSM COMMAND LOGS>',
)

Response — Retrieve command-id from the response

{
    'Command': {
        'CommandId': 'string',
    }
}
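To tie the request and response together, here is a minimal sketch of a helper that starts the scan and returns the command id. The function name `start_patch_scan` and its arguments are hypothetical; the SSM client is passed in so that the helper can be exercised without AWS credentials.

```python
def start_patch_scan(ssm_client, instance_ids, log_bucket, log_prefix):
    """Run AWS-RunPatchBaseline in Scan mode and return the command id.

    `ssm_client` is expected to be a boto3 SSM client, for example
    boto3.client("ssm"). Nothing is installed and nothing is rebooted.
    """
    response = ssm_client.send_command(
        InstanceIds=list(instance_ids),
        DocumentName="AWS-RunPatchBaseline",
        TimeoutSeconds=900,
        Comment="Pre-patch scan",
        Parameters={"Operation": ["Scan"], "RebootOption": ["NoReboot"]},
        OutputS3BucketName=log_bucket,
        OutputS3KeyPrefix=log_prefix,
    )
    # The command id is needed later to wait for the scan to finish
    return response["Command"]["CommandId"]
```

Keeping the command id in a variable (or in your own state store) is what allows the later steps to check the outcome of this specific scan.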

Wait until command is executed

Request — Execute boto3 waiters to wait until scanning operation is successful

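# The command_executed waiter polls one instance at a time,
# so it needs the instance id in addition to the command id
waiter = client.get_waiter('command_executed')
waiter.wait(
    CommandId='string',
    InstanceId='string'
)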

Response — None
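When a command targets several instances, an alternative to the waiter is to poll each invocation yourself. The following is only a sketch under my own assumptions: the helper name `wait_for_command`, the polling interval and the set of statuses treated as terminal are illustrative choices, not part of the original solution.

```python
import time

# Invocation statuses treated as final in this sketch
TERMINAL_STATUSES = {"Success", "Cancelled", "TimedOut", "Failed"}


def wait_for_command(ssm_client, command_id, instance_ids,
                     delay=5, max_attempts=120, sleep=time.sleep):
    """Poll get_command_invocation until every instance reaches a final status.

    Returns a dict mapping instance id to its final status and raises
    TimeoutError if some instances are still running after max_attempts polls.
    """
    pending = set(instance_ids)
    results = {}
    for _ in range(max_attempts):
        for instance_id in sorted(pending):
            invocation = ssm_client.get_command_invocation(
                CommandId=command_id, InstanceId=instance_id
            )
            if invocation["Status"] in TERMINAL_STATUSES:
                results[instance_id] = invocation["Status"]
        pending.difference_update(results)
        if not pending:
            return results
        sleep(delay)
    raise TimeoutError(f"Command {command_id} still running on {sorted(pending)}")
```

Note that right after `send_command`, `get_command_invocation` can fail for a short while because the invocation is not registered yet; a production version would probably catch that error and retry.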

Generate a pre-patch report for auditing purposes

Request — Describe instance patch states

response = client.describe_instance_patch_states(
    InstanceIds=[
        'string',
    ],
    NextToken='string',
    MaxResults=123
)

Response — This gives MissingCount of patches on the instance.

{
    'InstancePatchStates': [
        {
            'InstanceId': 'string',
            'PatchGroup': 'string',
            'BaselineId': 'string',
            'SnapshotId': 'string',
            'InstallOverrideList': 'string',
            'OwnerInformation': 'string',
            'InstalledCount': 123,
            'InstalledOtherCount': 123,
            'InstalledPendingRebootCount': 123,
            'InstalledRejectedCount': 123,
            'MissingCount': 123,
            'FailedCount': 123,
            'UnreportedNotApplicableCount': 123,
            'NotApplicableCount': 123,
            'OperationStartTime': datetime(2015, 1, 1),
            'OperationEndTime': datetime(2015, 1, 1),
            'Operation': 'Scan'|'Install',
            'LastNoRebootInstallOperationTime': datetime(2015, 1, 1),
            'RebootOption': 'RebootIfNeeded'|'NoReboot',
            'CriticalNonCompliantCount': 123,
            'SecurityNonCompliantCount': 123,
            'OtherNonCompliantCount': 123
        },
    ],
    'NextToken': 'string'
}

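Request — Describe instance patches for an instance id. The optional filters narrow the result by patch attributes such as Classification, KBId, Severity or State (not by instance tags)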

response = client.describe_instance_patches(
    InstanceId='string',
    Filters=[
        {
            'Key': 'string',
            'Values': [
                'string',
            ]
        },
    ],
    NextToken='string',
    MaxResults=123
)

Response — This returns the list of patches, evaluated against the patch baseline applied to the instance, together with the state of each patch

{
    'Patches': [
        {
            'Title': 'string',
            'KBId': 'string',
            'Classification': 'string',
            'Severity': 'string',
            'State': 'INSTALLED'|'INSTALLED_OTHER'|'INSTALLED_PENDING_REBOOT'|'INSTALLED_REJECTED'|'MISSING'|'NOT_APPLICABLE'|'FAILED',
            'InstalledTime': datetime(2015, 1, 1),
            'CVEIds': 'string'
        },
    ],
    'NextToken': 'string'
}

Combine the responses of both of the above APIs and write them to a file in an S3 bucket. This file can be referred to as the Pre-Patch report
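A report along these lines could be assembled as sketched below. The function names, the JSON layout and the S3 key are assumptions of mine; the sketch follows `NextToken` so that long patch lists are not truncated, and serializes datetime values as strings.

```python
import json


def collect_instance_patches(ssm_client, instance_id):
    """Return all patches of an instance, following NextToken pagination."""
    patches, token = [], None
    while True:
        kwargs = {"InstanceId": instance_id}
        if token:
            kwargs["NextToken"] = token
        page = ssm_client.describe_instance_patches(**kwargs)
        patches.extend(page.get("Patches", []))
        token = page.get("NextToken")
        if not token:
            return patches


def build_patch_report(ssm_client, instance_id):
    """Combine the patch state summary and the patch list into one dict."""
    states = ssm_client.describe_instance_patch_states(InstanceIds=[instance_id])
    return {
        "InstanceId": instance_id,
        # None when the instance has never been scanned
        "PatchState": next(iter(states.get("InstancePatchStates", [])), None),
        "Patches": collect_instance_patches(ssm_client, instance_id),
    }


def upload_report(s3_client, report, bucket, key):
    """Write the report to S3 as JSON; datetimes are stored as strings."""
    body = json.dumps(report, default=str, indent=2)
    s3_client.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return body
```

The same helpers can be reused unchanged after patching to produce the Post-Patch report under a different key.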

Implementation — Patching Journey

Create SSM Automation Document to create and tag AMI

Follow the AWS documentation to create an AWS SSM Automation document
Alternatively, you can use a readily available document from GitHub

P.S. This is a one-time setup per AWS account and region

Take an AMI backup of EC2 instance before patching

Request — Execute automation document with necessary inputs

response = client.start_automation_execution(
    DocumentName='<SSM_AUTOMATION_DOCUMENT_NAME>',
    DocumentVersion='$LATEST',
    # Parameter names must match the ones defined in your automation document
    Parameters={
        'InstanceId': ['<INSTANCE_ID>'],
        'NoReboot': ['true'],
        'AMINameValue': ['AMI_NAME'],
        'DeleteOnValue': ['DATE_WITH_STANDARD_FORMAT DD-MM-YYYY']
    }
)

Response — Read automation execution id from the response

{
    'AutomationExecutionId': 'string'
}
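The AMI name and the delete-on date are usually computed rather than typed by hand. Below is a minimal sketch of how the parameters could be built; the helper name, the `pre-patch-` naming scheme and the 7-day retention are assumptions for illustration, and the parameter keys simply mirror the ones used in the request above.

```python
from datetime import date, timedelta


def build_ami_backup_parameters(instance_id, retention_days=7, today=None):
    """Build the Parameters dict for the AMI backup automation document.

    Dates use the DD-MM-YYYY format mentioned above. `today` can be
    injected to make the result reproducible.
    """
    today = today or date.today()
    delete_on = today + timedelta(days=retention_days)
    return {
        "InstanceId": [instance_id],
        "NoReboot": ["true"],
        "AMINameValue": [f"pre-patch-{instance_id}-{today:%d-%m-%Y}"],
        "DeleteOnValue": [f"{delete_on:%d-%m-%Y}"],
    }
```

The returned dict can be passed as `Parameters=` to `start_automation_execution`, provided your automation document defines parameters with these names.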

Build your own logic to wait until AMI is available

Request — Check automation status

response = client.describe_automation_executions(
    Filters=[
        {
            'Key': 'ExecutionId',
            'Values': ['<AUTOMATION_EXECUTION_ID>']
        },
    ],
    MaxResults=123,
    NextToken='string'
)

Response — Keep polling until AutomationExecutionStatus in the response below reaches one of the listed final states

{
  'AutomationExecutionMetadataList': [
    {
      'AutomationExecutionId': 'string',
      'AutomationExecutionStatus': 'Success'|'TimedOut'|'Cancelled'|'Failed'
    }
  ]
}
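One possible shape for this waiting logic is sketched below. The helper name `wait_for_automation`, the polling interval and the attempt limit are my own assumptions; the final states are the four listed in the response above.

```python
import time

# Final states taken from the response shown above
AUTOMATION_FINAL_STATES = {"Success", "TimedOut", "Cancelled", "Failed"}


def wait_for_automation(ssm_client, execution_id,
                        delay=15, max_attempts=80, sleep=time.sleep):
    """Poll describe_automation_executions until the execution finishes.

    Returns the final status string; raises TimeoutError if the execution
    is still running after max_attempts polls.
    """
    for _ in range(max_attempts):
        response = ssm_client.describe_automation_executions(
            Filters=[{"Key": "ExecutionId", "Values": [execution_id]}]
        )
        executions = response.get("AutomationExecutionMetadataList", [])
        if executions:
            status = executions[0]["AutomationExecutionStatus"]
            if status in AUTOMATION_FINAL_STATES:
                return status
        sleep(delay)
    raise TimeoutError(f"Automation {execution_id} did not finish in time")
```

Patching should proceed only when the returned status is `Success`; any other final state means there is no usable backup to roll back to.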

Create a yaml file containing approved patches

As part of the SDLC process, we have multiple environments such as Development, QA, UAT and Production. To meet compliance requirements, we may need to install the same patch list across environments. To achieve this, build your own logic to perform the following operations

Identify missing patches from describe_instance_patches API
Filter only approved patches from the list
Generate a yaml file as shown below including id and title of the missing patch

patches:
- id: curl.x86_64
  title: curl.x86_64:0:7.61.1-12.94.amzn1
- id: kernel.x86_64
  title: kernel.x86_64:0:4.14.186-110.268.amzn1
- id: libcurl.x86_64
  title: libcurl.x86_64:0:7.61.1-12.94.amzn1
- id: microcode_ctl.x86_64
  title: microcode_ctl.x86_64:2:2.1-47.39.amzn1
- id: python27.x86_64
  title: python27.x86_64:0:2.7.18-1.138.amzn1
- id: python27-devel.x86_64
  title: python27-devel.x86_64:0:2.7.18-1.138.amzn1
- id: python27-libs.x86_64
  title: python27-libs.x86_64:0:2.7.18-1.138.amzn1

Upload the yaml file to an S3 bucket in your AWS account
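The three operations above could look roughly like the sketch below. It assumes, based on the sample file, that the `id` field corresponds to `KBId` and the `title` field to `Title` in the `describe_instance_patches` response; the function name and the idea of passing approved ids as a plain list are hypothetical. The YAML is written by hand to avoid a third-party dependency.

```python
def approved_patches_yaml(patches, approved_ids):
    """Render missing and approved patches in the yaml layout shown above.

    `patches` is the `Patches` list from describe_instance_patches;
    `approved_ids` is an iterable of KBId values that were approved.
    """
    approved = set(approved_ids)
    lines = ["patches:"]
    for patch in patches:
        if patch["State"] == "MISSING" and patch["KBId"] in approved:
            lines.append(f"- id: {patch['KBId']}")
            lines.append(f"  title: {patch['Title']}")
    return "\n".join(lines) + "\n"
```

The returned string can be uploaded with the S3 `put_object` API, and the same file can then be reused for every higher environment in the cycle.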

Install approved patches on EC2 instance(s)

Request — Execute SSM document in Install mode, rebooting if needed, with instance id or appropriate tags in filters

response = client.send_command(
    # Typically you pass either InstanceIds or Targets (tags), not both
    InstanceIds=['string'],
    Targets=[
        {
            'Key': 'string',
            'Values': ['string']
        }
    ],
    DocumentName='AWS-RunPatchBaseline',
    DocumentVersion='$LATEST',
    TimeoutSeconds=900,
    Comment='<SOME COMMENT>',
    Parameters={
        'Operation': ['Install'],
        # Valid values are 'RebootIfNeeded' and 'NoReboot'
        'RebootOption': ['RebootIfNeeded']
    },
    OutputS3BucketName='<S3_BUCKET_NAME TO WRITE SSM COMMAND LOGS>',
    OutputS3KeyPrefix='<S3_BUCKET_KEY_PREFIX TO WRITE SSM COMMAND LOGS>',
)

Response — Retrieve command-id from the response

{
    'Command': {
        'CommandId': 'string',
    }
}
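To restrict the installation to the approved list, the AWS-RunPatchBaseline document accepts an `InstallOverrideList` parameter that points to a yaml file like the one generated earlier. The sketch below shows one way to build the `Parameters` argument; the helper name and the example URL are hypothetical, so check the document's parameter reference for the URL styles it accepts.

```python
def build_install_parameters(override_list_url=None, reboot=True):
    """Build Parameters for AWS-RunPatchBaseline in Install mode.

    When `override_list_url` is given, only the patches listed in that
    yaml file are installed instead of everything the baseline approves.
    """
    parameters = {
        "Operation": ["Install"],
        "RebootOption": ["RebootIfNeeded" if reboot else "NoReboot"],
    }
    if override_list_url:
        parameters["InstallOverrideList"] = [override_list_url]
    return parameters


# Example with a placeholder location for the approved-patches file
install_parameters = build_install_parameters(
    "s3://<S3_BUCKET_NAME>/<APPROVED_PATCHES_YAML_KEY>"
)
```

Passing the same file location in every environment is what keeps the installed patch set identical from Dev through Production.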

Wait until command is executed

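Request — Execute boto3 waiters to wait until the install operation is successful

# The command_executed waiter polls one instance at a time,
# so it needs the instance id in addition to the command id
waiter = client.get_waiter('command_executed')
waiter.wait(
    CommandId='string',
    InstanceId='string'
)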

Response — None

Implementation — Post-Patching Journey

Generate a post-patch report for auditing purposes

Request — Describe instance patch states

response = client.describe_instance_patch_states(
    InstanceIds=[
        'string',
    ],
    NextToken='string',
    MaxResults=123
)

Response — This gives the InstalledCount and MissingCount of patches on the instance. Ideally, InstalledCount in the Post-Patch report should have grown by the MissingCount from the Pre-Patch report (assuming every missing patch was approved for installation), and MissingCount should be 0 in the Post-Patch report.

{
    'InstancePatchStates': [
        {
            'InstanceId': 'string',
            'PatchGroup': 'string',
            'BaselineId': 'string',
            'SnapshotId': 'string',
            'InstallOverrideList': 'string',
            'OwnerInformation': 'string',
            'InstalledCount': 123,
            'InstalledOtherCount': 123,
            'InstalledPendingRebootCount': 123,
            'InstalledRejectedCount': 123,
            'MissingCount': 123,
            'FailedCount': 123,
            'UnreportedNotApplicableCount': 123,
            'NotApplicableCount': 123,
            'OperationStartTime': datetime(2015, 1, 1),
            'OperationEndTime': datetime(2015, 1, 1),
            'Operation': 'Scan'|'Install',
            'LastNoRebootInstallOperationTime': datetime(2015, 1, 1),
            'RebootOption': 'RebootIfNeeded'|'NoReboot',
            'CriticalNonCompliantCount': 123,
            'SecurityNonCompliantCount': 123,
            'OtherNonCompliantCount': 123
        },
    ],
    'NextToken': 'string'
}

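Request — Describe instance patches for an instance id. The optional filters narrow the result by patch attributes such as Classification, KBId, Severity or State (not by instance tags)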

response = client.describe_instance_patches(
    InstanceId='string',
    Filters=[
        {
            'Key': 'string',
            'Values': [
                'string',
            ]
        },
    ],
    NextToken='string',
    MaxResults=123
)

Response — This returns the list of patches, evaluated against the patch baseline applied to the instance, together with the state of each patch

{
    'Patches': [
        {
            'Title': 'string',
            'KBId': 'string',
            'Classification': 'string',
            'Severity': 'string',
            'State': 'INSTALLED'|'INSTALLED_OTHER'|'INSTALLED_PENDING_REBOOT'|'INSTALLED_REJECTED'|'MISSING'|'NOT_APPLICABLE'|'FAILED',
            'InstalledTime': datetime(2015, 1, 1),
            'CVEIds': 'string'
        },
    ],
    'NextToken': 'string'
}

Combine the responses of both of the above APIs and write them to a file in an S3 bucket. This file can be referred to as the Post-Patch report
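For auditing, it can help to compare the two reports automatically. The following sketch takes the `InstancePatchStates` entries from the Pre-Patch and Post-Patch reports and summarizes the difference; the function name and the output keys are my own choices.

```python
def summarize_patch_run(pre_state, post_state):
    """Compare pre- and post-patch state entries of a single instance."""
    return {
        # How many patches were installed during this cycle
        "newly_installed": post_state["InstalledCount"] - pre_state["InstalledCount"],
        # Patches installed but waiting for a reboot to take effect
        "pending_reboot": post_state.get("InstalledPendingRebootCount", 0),
        "still_missing": post_state["MissingCount"],
        "failed": post_state["FailedCount"],
        "fully_patched": post_state["MissingCount"] == 0
        and post_state["FailedCount"] == 0,
    }
```

A cycle where `fully_patched` is false is a good candidate for a manual review before moving on to the next environment.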

Perform sanity of EC2 instance(s)

This step may vary from application to application. Please build your own system to perform an application sanity check after patching.

e.g. If I am running an Nginx server on EC2, then as a sanity check I can run the following commands through SSM on the EC2 instance

# Expect the "Active" string in the output of the command below for a successful sanity check
service nginx status
# Expect HTTP code "200" in the output of the command below for a successful sanity check
curl -I http://localhost:80
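These checks can be run through SSM with the AWS-RunShellScript document and evaluated in Python. The sketch below is only one possible approach: the function names are hypothetical, and the strings it looks for ("active (running)" from systemd, "is running" from older init scripts) are assumptions about what `service nginx status` prints on your distribution.

```python
def start_sanity_check(ssm_client, instance_id):
    """Run the two sanity commands on the instance and return the command id."""
    response = ssm_client.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={
            "commands": ["service nginx status", "curl -sI http://localhost:80"]
        },
    )
    return response["Command"]["CommandId"]


def nginx_is_running(output):
    """Assumed markers of a healthy service in `service nginx status` output."""
    text = output.lower()
    return "active (running)" in text or "is running" in text


def http_status_code(output):
    """Return the status code of the first HTTP status line, or None."""
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].startswith("HTTP/") and parts[1].isdigit():
            return int(parts[1])
    return None


def sanity_passed(stdout):
    """True when nginx reports running and localhost answers with HTTP 200."""
    return nginx_is_running(stdout) and http_status_code(stdout) == 200
```

Once the command has finished, the combined output is available as `StandardOutputContent` in the `get_command_invocation` response and can be passed to `sanity_passed`.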

Here I conclude this blog. I hope it helps you build your own automation journey for patching instances using AWS Systems Manager, which I have found to be one of the most useful services provided by AWS. Please feel free to post any queries on this topic, and I will be more than happy to answer them.

Written by Akesh Patil