Automate a Custom Patching Process for AWS EC2 Instances Using AWS Systems Manager
In today’s world, security is one of the top priorities of any business. We spend a considerable amount of time preventing attackers from stealing sensitive information. Attackers exploit vulnerabilities found in a system or server and attack the network, which can take an entire company down. Yet one of the most effective defenses often gets the least attention. That simple defense is security patching.
In this post, I’ll show you a modern way to automate the patching of Linux instances using AWS Systems Manager (SSM). We will also see how to generate reports with AWS APIs that are useful for auditing. The same process can be applied to Windows instances as well.
Solution Overview
This solution shows how to orchestrate the end-to-end patching cycle of one or more AWS EC2 instances without needing to SSH into any Linux instance. I chose Python to automate this solution, but you can use any supported programming language you are comfortable with. The solution has three parts:
- Define and automate pre-patching journey
- Define and automate patching journey
- Define and automate post-patching journey
Pre-requisites
- Make sure that AWS Systems Manager is set up and that the SSM Agent is installed and up to date on all of the instances you want to patch
- Create SSM patch baselines with appropriate rules for security patching. Please refer to the AWS documentation for details
- Define a tagging strategy for your instances. In my case, the tagging strategy is based on environments and applications. Say I have Dev, QA, UAT and Production environments and a variety of applications such as Nginx, web servers and backend servers. In that case, I tag EC2 instances as below
- Environment = DEV | QA | UAT | PROD
- Application = NGINX | WEB_SERVER | BACKEND_SERVER
- Patch Group = AMAZON_LINUX1 | AMAZON_LINUX2 | WINDOWS
We can use these tags to select the right application in the right environment. We can scan the lowest environment to identify missing patches and install the same list on all the higher environments. This tagging helps ensure that all environments of an application have the same set of patches installed during a cycle
- Make sure that each instance is tagged with the “Patch Group” key and an appropriate value, so that the correct patch baseline for the instance’s OS is applied while patching. Please refer to the AWS documentation for more information
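The tagging strategy above maps directly onto the `Targets` argument that SSM calls such as `send_command` accept, where each tag is addressed with a key of the form `tag:<tag name>`. The helper below is a minimal sketch of that mapping; `build_tag_targets` is a hypothetical name of my own, and the tag names are the ones from the list above.

```python
from typing import Dict, List


def build_tag_targets(tags: Dict[str, str]) -> List[dict]:
    """Turn {'Environment': 'DEV'} into SSM Targets entries keyed by 'tag:<name>'."""
    return [
        {"Key": f"tag:{name}", "Values": [value]}
        # Sort by tag name so the generated list is stable between runs.
        for name, value in sorted(tags.items())
    ]


targets = build_tag_targets({"Environment": "DEV", "Application": "NGINX"})
print(targets)
```

The resulting list can be passed as `Targets=targets` instead of listing instance IDs. As far as I know, an instance has to match every entry in the list to be selected, which is what makes the "right application in the right environment" selection work.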
Solutioning
1. Pre-Patching Journey
The pre-patching journey includes the following activities:
- Scan EC2 instance for missing patches
- Generate a pre-patch report for auditing purposes
- Approve list of patches to be installed
2. Patching Journey
- Take an AMI backup of EC2 instance before patching
- Install approved patches on EC2 instance(s)
3. Post-Patching Journey
- Generate a Post-Patch report for auditing purposes
- Perform a sanity check of the EC2 instance(s)
The following workflow diagram shows all these steps pictorially
Now let’s see how to achieve this automation using various AWS APIs in a Python program. You can use this as a reference and call the equivalent APIs from any other supported language, e.g. Java or Node.js, or from the AWS CLI
Implementation — Pre-Patching Journey
Scan EC2 instance for missing patches
- Request: Run the AWS-RunPatchBaseline SSM document with the Scan operation and no reboot, targeting either instance IDs or tags
import boto3

client = boto3.client('ssm')

response = client.send_command(
    # Select the instances either by ID or by tag; use one of the two
    InstanceIds=['string'],
    Targets=[
        {
            'Key': 'string',
            'Values': [
                'string',
            ]
        }
    ],
    DocumentName='AWS-RunPatchBaseline',
    DocumentVersion='$LATEST',
    TimeoutSeconds=900,
    Comment='<SOME COMMENT>',
    Parameters={
        'Operation': ['Scan'],
        'RebootOption': ['NoReboot']
    },
    OutputS3BucketName='<S3_BUCKET_NAME TO WRITE SSM COMMAND LOGS>',
    OutputS3KeyPrefix='<S3_BUCKET_KEY_PREFIX TO WRITE SSM COMMAND LOGS>',
)
- Response: Read the CommandId from the response
{
    'Command': {
        'CommandId': 'string',
    }
}
Wait until the command has executed
Request: Use a boto3 waiter to wait until the scan operation succeeds
waiter = client.get_waiter('command_executed')
waiter.wait(
    CommandId='string',
    InstanceId='string'  # required: the waiter checks the command invocation on this instance
)
Response — None
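Putting the scan request and the waiter together, the flow can be sketched as one small function. This is only an illustration under a few assumptions: `scan_and_wait` is a hypothetical name, the SSM client is passed in (for example `boto3.client("ssm")`), and a single instance ID is targeted rather than tags.

```python
def scan_and_wait(ssm, instance_id: str, comment: str = "pre-patch scan") -> str:
    """Start a Scan-only AWS-RunPatchBaseline run and block until it finishes."""
    response = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunPatchBaseline",
        Comment=comment,
        Parameters={"Operation": ["Scan"], "RebootOption": ["NoReboot"]},
    )
    command_id = response["Command"]["CommandId"]
    # The command_executed waiter needs both the command ID and the
    # instance ID, because it checks the invocation on that instance.
    ssm.get_waiter("command_executed").wait(
        CommandId=command_id, InstanceId=instance_id
    )
    return command_id
```

Passing the client in as an argument keeps the function easy to exercise with a stub before pointing it at a real account.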
Generate a pre-patch report for auditing purposes
Request — Describe instance patch states
response = client.describe_instance_patch_states(
InstanceIds=[
'string',
],
NextToken='string',
MaxResults=123
)
Response — This gives MissingCount of patches on the instance.
{
'InstancePatchStates': [
{
'InstanceId': 'string',
'PatchGroup': 'string',
'BaselineId': 'string',
'SnapshotId': 'string',
'InstallOverrideList': 'string',
'OwnerInformation': 'string',
'InstalledCount': 123,
'InstalledOtherCount': 123,
'InstalledPendingRebootCount': 123,
'InstalledRejectedCount': 123,
'MissingCount': 123,
'FailedCount': 123,
'UnreportedNotApplicableCount': 123,
'NotApplicableCount': 123,
'OperationStartTime': datetime(2015, 1, 1),
'OperationEndTime': datetime(2015, 1, 1),
'Operation': 'Scan'|'Install',
'LastNoRebootInstallOperationTime': datetime(2015, 1, 1),
'RebootOption': 'RebootIfNeeded'|'NoReboot',
'CriticalNonCompliantCount': 123,
'SecurityNonCompliantCount': 123,
'OtherNonCompliantCount': 123
},
],
'NextToken': 'string'
}
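Request: Describe the patches of an instance. InstanceId is required; the optional Filters narrow the result by patch attributes such as State, Severity or Classification, not by tags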
response = client.describe_instance_patches(
InstanceId='string',
Filters=[
{
'Key': 'string',
'Values': [
'string',
]
},
],
NextToken='string',
MaxResults=123
)
Response: This returns the list of patches evaluated against the patch baseline applied to the instance
{
'Patches': [
{
'Title': 'string',
'KBId': 'string',
'Classification': 'string',
'Severity': 'string',
'State': 'INSTALLED'|'INSTALLED_OTHER'|'INSTALLED_PENDING_REBOOT'|'INSTALLED_REJECTED'|'MISSING'|'NOT_APPLICABLE'|'FAILED',
'InstalledTime': datetime(2015, 1, 1),
'CVEIds': 'string'
},
],
'NextToken': 'string'
}
Combine the responses of both APIs above and write them to a file in an S3 bucket. This file serves as the pre-patch report
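A report like this can be assembled in a few lines. The sketch below is one possible shape, not a prescribed one: `collect_patch_report` and `upload_report` are hypothetical helper names, the clients are passed in (for example `boto3.client("ssm")` and `boto3.client("s3")`), and the bucket and key are placeholders you choose. It follows `NextToken` so that long patch lists are not truncated.

```python
import json


def collect_patch_report(ssm, instance_id: str) -> dict:
    """Gather the patch state summary and the full patch list for one instance."""
    state = ssm.describe_instance_patch_states(InstanceIds=[instance_id])
    patches, token = [], None
    while True:
        kwargs = {"InstanceId": instance_id}
        if token:
            kwargs["NextToken"] = token
        page = ssm.describe_instance_patches(**kwargs)
        patches.extend(page.get("Patches", []))
        token = page.get("NextToken")
        if not token:
            break
    return {
        "InstancePatchStates": state.get("InstancePatchStates", []),
        "Patches": patches,
    }


def upload_report(s3, report: dict, bucket: str, key: str) -> None:
    """Serialize the report as JSON and store it in S3."""
    # default=str converts the datetime values in the API responses to strings.
    body = json.dumps(report, indent=2, default=str)
    s3.put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
```

The same two helpers can be reused unchanged after patching; only the S3 key needs to differ so the two reports sit side by side.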
Implementation — Patching Journey
Create SSM Automation Document to create and tag AMI
- Follow the AWS documentation to create an SSM Automation document
- Alternatively, you can use a readily available document on GitHub
P.S. This is a one-time setup per AWS account and region
Take an AMI backup of EC2 instance before patching
Request: Start the automation document with the necessary inputs
response = client.start_automation_execution(
    DocumentName='<SSM_AUTOMATION_DOCUMENT_NAME>',
    DocumentVersion='$LATEST',
    # The parameter names below are the inputs defined by the automation document
    Parameters={
        'InstanceId': ['<INSTANCE_ID>'],
        'NoReboot': ['true'],
        'AMINameValue': ['AMI_NAME'],
        'DeleteOnValue': ['DATE_WITH_STANDARD_FORMAT DD-MM-YYYY']
    }
)
Response — Read automation execution id from the response
{
'AutomationExecutionId': 'string'
}
Build your own logic to wait until the AMI is available
Request: Check the automation status
response = client.describe_automation_executions(
    Filters=[
        {
            'Key': 'ExecutionId',
            'Values': ['<AUTOMATION_EXECUTION_ID>']
        },
    ],
    MaxResults=123,
    NextToken='string'
)
Response: Poll until AutomationExecutionStatus in the response below reaches one of the listed states
{
'AutomationExecutionMetadataList': [
{
'AutomationExecutionId': 'string',
'AutomationExecutionStatus': 'Success'|'TimedOut'|'Cancelled'|'Failed'
}
]
}
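The wait logic can be as simple as a polling loop around `describe_automation_executions`. The following is a minimal sketch, assuming the four terminal states listed above; `wait_for_automation`, the poll interval and the poll limit are illustrative choices of mine, and the SSM client is passed in (for example `boto3.client("ssm")`).

```python
import time

# The terminal states named in the response above.
TERMINAL_STATES = {"Success", "TimedOut", "Cancelled", "Failed"}


def wait_for_automation(ssm, execution_id, poll_seconds=15, max_polls=120,
                        sleep=time.sleep):
    """Poll an automation execution until it reaches a terminal state."""
    for _ in range(max_polls):
        response = ssm.describe_automation_executions(
            Filters=[{"Key": "ExecutionId", "Values": [execution_id]}]
        )
        executions = response.get("AutomationExecutionMetadataList", [])
        status = executions[0]["AutomationExecutionStatus"] if executions else None
        if status in TERMINAL_STATES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"automation {execution_id} did not finish in time")
```

The caller should continue to the patching step only when the returned status is `Success`; any other terminal state means the AMI backup cannot be relied on.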
Create a yaml file containing approved patches
As part of the SDLC process, we have multiple environments such as Development, QA, UAT and Production. To meet compliance requirements, we may need to install the same patch list across all environments. To achieve this, build your own logic to perform the following operations
- Identify missing patches from describe_instance_patches API
- Filter only approved patches from the list
- Generate a YAML file, as shown below, containing the id and title of each missing patch
patches:
  - id: curl.x86_64
    title: curl.x86_64:0:7.61.1-12.94.amzn1
  - id: kernel.x86_64
    title: kernel.x86_64:0:4.14.186-110.268.amzn1
  - id: libcurl.x86_64
    title: libcurl.x86_64:0:7.61.1-12.94.amzn1
  - id: microcode_ctl.x86_64
    title: microcode_ctl.x86_64:2:2.1-47.39.amzn1
  - id: python27.x86_64
    title: python27.x86_64:0:2.7.18-1.138.amzn1
  - id: python27-devel.x86_64
    title: python27-devel.x86_64:0:2.7.18-1.138.amzn1
  - id: python27-libs.x86_64
    title: python27-libs.x86_64:0:2.7.18-1.138.amzn1
- Upload the YAML file to an S3 bucket in your AWS account
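The filtering and file generation described above can be sketched as a pure function over the `Patches` list returned by `describe_instance_patches`. Two assumptions to flag: I map the YAML `id` to the `KBId` field and `title` to the `Title` field, which matches the sample above for Amazon Linux, and `approved_ids` is a hypothetical set that comes out of your own approval process. The YAML is rendered by hand to avoid a third-party dependency, which is adequate only while the values are plain strings like the package names shown.

```python
def render_override_list(patches, approved_ids):
    """Render approved, still-missing patches in the YAML layout shown above."""
    lines = ["patches:"]
    for patch in patches:
        if patch.get("State") != "MISSING":
            continue  # only patches that still need installing
        if patch["KBId"] not in approved_ids:
            continue  # skip anything that was not approved
        lines.append(f"  - id: {patch['KBId']}")
        lines.append(f"    title: {patch['Title']}")
    return "\n".join(lines) + "\n"


sample = [
    {"KBId": "curl.x86_64", "Title": "curl.x86_64:0:7.61.1-12.94.amzn1", "State": "MISSING"},
    {"KBId": "kernel.x86_64", "Title": "kernel.x86_64:0:4.14.186-110.268.amzn1", "State": "MISSING"},
]
print(render_override_list(sample, approved_ids={"curl.x86_64"}))
```

Generating the file once from the lowest environment and reusing it everywhere is what keeps the patch set identical across Dev, QA, UAT and Production.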
Install approved patches on EC2 instance(s)
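- Request: Run the AWS-RunPatchBaseline SSM document with the Install operation and a reboot if needed, targeting either instance IDs or tags. Pass the S3 URL of the approved-patches YAML file through the InstallOverrideList parameter so that only those patches are installed
response = client.send_command(
    # Select the instances either by ID or by tag; use one of the two
    InstanceIds=['string'],
    Targets=[
        {
            'Key': 'string',
            'Values': [
                'string',
            ]
        }
    ],
    DocumentName='AWS-RunPatchBaseline',
    DocumentVersion='$LATEST',
    TimeoutSeconds=900,
    Comment='<SOME COMMENT>',
    Parameters={
        'Operation': ['Install'],
        # Valid values are 'RebootIfNeeded' and 'NoReboot'
        'RebootOption': ['RebootIfNeeded'],
        'InstallOverrideList': ['<S3 URL OF THE APPROVED PATCHES YAML FILE>']
    },
    OutputS3BucketName='<S3_BUCKET_NAME TO WRITE SSM COMMAND LOGS>',
    OutputS3KeyPrefix='<S3_BUCKET_KEY_PREFIX TO WRITE SSM COMMAND LOGS>',
)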
- Response: Read the CommandId from the response
{
    'Command': {
        'CommandId': 'string',
    }
}
Wait until the command has executed
Request: Use a boto3 waiter to wait until the install operation succeeds
waiter = client.get_waiter('command_executed')
waiter.wait(
    CommandId='string',
    InstanceId='string'  # required: the waiter checks the command invocation on this instance
)
Response — None
Implementation — Post-Patching Journey
Generate a post-patch report for auditing purposes
Request — Describe instance patch states
response = client.describe_instance_patch_states(
InstanceIds=[
'string',
],
NextToken='string',
MaxResults=123
)
Response: This gives the InstalledCount and MissingCount of patches on the instance. Ideally, every patch counted in MissingCount in the pre-patch report shows up as installed in the post-patch report, and MissingCount is 0 in the post-patch report.
{
'InstancePatchStates': [
{
'InstanceId': 'string',
'PatchGroup': 'string',
'BaselineId': 'string',
'SnapshotId': 'string',
'InstallOverrideList': 'string',
'OwnerInformation': 'string',
'InstalledCount': 123,
'InstalledOtherCount': 123,
'InstalledPendingRebootCount': 123,
'InstalledRejectedCount': 123,
'MissingCount': 123,
'FailedCount': 123,
'UnreportedNotApplicableCount': 123,
'NotApplicableCount': 123,
'OperationStartTime': datetime(2015, 1, 1),
'OperationEndTime': datetime(2015, 1, 1),
'Operation': 'Scan'|'Install',
'LastNoRebootInstallOperationTime': datetime(2015, 1, 1),
'RebootOption': 'RebootIfNeeded'|'NoReboot',
'CriticalNonCompliantCount': 123,
'SecurityNonCompliantCount': 123,
'OtherNonCompliantCount': 123
},
],
'NextToken': 'string'
}
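Request: Describe the patches of an instance. InstanceId is required; the optional Filters narrow the result by patch attributes such as State, Severity or Classification, not by tags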
response = client.describe_instance_patches(
InstanceId='string',
Filters=[
{
'Key': 'string',
'Values': [
'string',
]
},
],
NextToken='string',
MaxResults=123
)
Response: This returns the list of patches evaluated against the patch baseline applied to the instance
{
'Patches': [
{
'Title': 'string',
'KBId': 'string',
'Classification': 'string',
'Severity': 'string',
'State': 'INSTALLED'|'INSTALLED_OTHER'|'INSTALLED_PENDING_REBOOT'|'INSTALLED_REJECTED'|'MISSING'|'NOT_APPLICABLE'|'FAILED',
'InstalledTime': datetime(2015, 1, 1),
'CVEIds': 'string'
},
],
'NextToken': 'string'
}
Combine the responses of both APIs above and write them to a file in an S3 bucket. This file serves as the post-patch report
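For auditing, it helps to compare the patch state entries from the two reports automatically. The sketch below takes one entry of `InstancePatchStates` from each report; `summarize_patch_run` and the keys of the returned summary are hypothetical names of my own. It counts patches that are installed but pending a reboot as installed, which is an assumption you may want to change.

```python
def summarize_patch_run(pre_state: dict, post_state: dict) -> dict:
    """Compare pre- and post-patch InstancePatchStates entries for one instance."""
    def installed(state: dict) -> int:
        # Assumption: a patch waiting for a reboot still counts as installed.
        return state["InstalledCount"] + state.get("InstalledPendingRebootCount", 0)

    failed_after = post_state.get("FailedCount", 0)
    return {
        "missing_before": pre_state["MissingCount"],
        "missing_after": post_state["MissingCount"],
        "failed_after": failed_after,
        "newly_installed": installed(post_state) - installed(pre_state),
        # A clean run leaves nothing missing and nothing failed.
        "clean": post_state["MissingCount"] == 0 and failed_after == 0,
    }


summary = summarize_patch_run(
    {"InstalledCount": 40, "MissingCount": 7},
    {"InstalledCount": 47, "MissingCount": 0, "FailedCount": 0},
)
print(summary)
```

A run where `clean` is false, or where `newly_installed` differs from `missing_before`, is worth a manual look before the next environment is patched.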
Perform a sanity check of the EC2 instance(s)
This step varies from application to application. Please build your own checks to verify the application after patching.
For example, if I am running an Nginx server on EC2, I can run the following commands through SSM on the EC2 instance as a sanity check:
# Expect the string "Active" in the output of the command below for a successful sanity check
service nginx status
# Expect HTTP code "200" in the output of the command below for a successful sanity check
curl -I http://localhost:80
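These two commands can themselves be sent through SSM with the AWS-RunShellScript document, so the sanity check also needs no SSH. The sketch below is illustrative only: `run_sanity_check` and `output_looks_healthy` are hypothetical names, the SSM client is passed in (for example `boto3.client("ssm")`), and the health test assumes systemd-style status output containing "active (running)", which avoids accidentally matching "inactive".

```python
def output_looks_healthy(output: str) -> bool:
    """Check command output for a running service and an HTTP 200 status line."""
    text = output.lower()
    has_200 = any(
        line.strip().startswith("http/") and " 200" in line
        for line in text.splitlines()
    )
    return "active (running)" in text and has_200


def run_sanity_check(ssm, instance_id: str) -> bool:
    """Run the Nginx checks on the instance through SSM and evaluate the output."""
    response = ssm.send_command(
        InstanceIds=[instance_id],
        DocumentName="AWS-RunShellScript",
        Parameters={"commands": ["service nginx status", "curl -sI http://localhost:80"]},
    )
    command_id = response["Command"]["CommandId"]
    ssm.get_waiter("command_executed").wait(
        CommandId=command_id, InstanceId=instance_id
    )
    result = ssm.get_command_invocation(CommandId=command_id, InstanceId=instance_id)
    return output_looks_healthy(result.get("StandardOutputContent", ""))
```

For other applications, only the command list and `output_looks_healthy` need to change; the send, wait and read sequence stays the same.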
That concludes this post. I hope it helps you build your own automation for patching instances with AWS Systems Manager, which I find to be one of the most useful services AWS provides. Please feel free to post any questions on this topic; I would be more than happy to answer them.