AWS Incident Response: Containment

Alex Groyz
Jul 14, 2023


The key elements of a successful cloud Incident Response program are Preparation, Operations, and Post Incident Activity. NIST and SANS have developed frameworks for Incident Response plans that align closely with these three phases. AWS Incident Response follows a similar structure, with Operations covering Detection, Analysis, Containment, Eradication, and Recovery. In addition to adopting concepts and best practices from the NIST guide, the AWS Security Incident Response guide offers various methods for automating Incident Response techniques.

This post focuses on the Containment activity and presents an automated Incident Response solution for AWS EC2 Instances, AWS Lambda Functions, IAM Users, and IAM Roles. The solution consists of an AWS Lambda function triggered by an AWS SNS message. The full solution is available on GitHub.

As stated in the AWS Security Incident Response guide, preparation is crucial when dealing with incidents. Once an event is detected during the Detection phase and analyzed in the Analysis phase, the automated solution for Containment can be utilized for the supported AWS services.

In this blog post, I explain how to automate the isolation of an EC2 Instance, an IAM User and Role, and a Lambda function using an AWS SNS message published by security responders once suspicious entities are identified. A single Lambda function handles isolation for all entity types. It uses AWS API calls to isolate each entity, as described in the sections below. The Incident Response Lambda can also isolate entities across different AWS accounts and regions.

Why AWS entity lockdown

Every organization prioritizes security, and AWS follows a shared responsibility model in which AWS is responsible for security of the cloud while customers are responsible for security in the cloud. This model gives customers full control over their security implementation, but it also means that security incident response can be challenging. To address this complexity, adopting Vectra’s entity lockdown automation capabilities can significantly enhance the customer’s incident response operations, leading to faster response times and simplified processes. By leveraging this solution, customers can better prepare themselves for security events.

The primary objective of this solution is to minimize and restrict the impact of a security event.

Here is how we do it:

AWS principle of least privilege: Responders only need permission to publish to an SNS topic, so access is tightly controlled.

Automated workflow: The solution can integrate cleanly into existing workflows and be activated automatically to ensure out-of-band control to stop attacks from escalating into impact.

Automated audit trail: The solution enables compliance auditing and logging.

Cross-region and cross-account: Work from a single location across your entire AWS environment to simplify usage.

Specific/Targeted: When credentials or service instances are compromised, the solution locks them out completely, which prevents attackers from exploiting alternative attack paths. Only the compromised credentials or service instances are affected, which minimizes the impact on business-as-usual operations.

Architecture and design

The solution uses an AWS SNS topic to trigger a Lambda function. The process begins when an AWS SNS message is published, either manually or by a security tool such as AWS Security Hub or a SOAR platform. Once the SNS message is published, the Lambda function is invoked and acts on the content of the message.

To properly process the SNS message, two specific message attributes are required. The first attribute is the ARN (Amazon Resource Name) of the entity that needs to be isolated. This ARN identifies the specific resource within the AWS environment. The second attribute is the ExternalId, which is set by the user during the deployment of the CloudFormation template. The ExternalId is a static value that the Incident Response Lambda function uses to validate the SNS message. A message is processed further only if its ExternalId matches the configured value.

The solution in this blog post accomplishes these tasks through the following logical flow of AWS services, illustrated in Figure 1.

  1. An AWS SNS message is published with an AWS entity ARN for Containment and ExternalId.
  2. An AWS Lambda function is invoked by the SNS message. The Lambda function executes the Incident Response business logic based on the entity type.
  3. AWS SNS audit emails are sent to the user throughout the process: one when the process starts, one on success or failure, and one if the message falls into the dead-letter queue.
Figure 1: High-level diagram
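
To make the validation step concrete, here is a minimal sketch of how a Lambda handler could read the two message attributes from the SNS event and check the ExternalId before doing anything else. The attribute names `Arn` and `ExternalId` and the function name `extract_request` are assumptions for illustration; the repository may name them differently.

```python
import hmac


def extract_request(event, expected_external_id):
    """Return the entity ARN from an SNS-triggered Lambda event, or raise if invalid."""
    # SNS passes message attributes to Lambda under Records[].Sns.MessageAttributes,
    # each one shaped like {"Type": "String", "Value": "..."}.
    attributes = event["Records"][0]["Sns"]["MessageAttributes"]

    # "ExternalId" and "Arn" are assumed attribute names, not taken from the repository.
    supplied = attributes.get("ExternalId", {}).get("Value", "")
    # Constant-time comparison avoids leaking the expected value through timing.
    if not hmac.compare_digest(supplied.encode(), expected_external_id.encode()):
        raise PermissionError("ExternalId does not match; message rejected")

    arn = attributes.get("Arn", {}).get("Value", "")
    if not arn.startswith("arn:"):
        raise ValueError("message does not carry a valid ARN attribute")
    return arn
```

Rejecting the message before parsing the ARN keeps unauthenticated input away from the containment logic.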

When the Lambda function is invoked, it performs the following business logic, illustrated in Figure 2.

Determine AWS entity type. Supported entities: Lambda, IAM User, IAM Role, and EC2.
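
One way to determine the entity type is to split the ARN into its standard fields and look at the service and resource prefix. The sketch below is an assumption about how this could be done, not the repository's code; only the `entity_value` key appears in the snippets in this post, and the other keys and the function name `parse_entity` are hypothetical.

```python
def parse_entity(arn):
    """Classify an ARN as one of the supported entity types."""
    # ARN layout: arn:partition:service:region:account-id:resource
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    _, _, service, region, account_id, resource = parts

    if service == "iam" and resource.startswith("user/"):
        # IAM names may be preceded by a path, so keep only the last segment.
        entity_type, value = "iam_user", resource.rsplit("/", 1)[-1]
    elif service == "iam" and resource.startswith("role/"):
        entity_type, value = "iam_role", resource.rsplit("/", 1)[-1]
    elif service == "lambda" and resource.startswith("function:"):
        # Drop an optional version or alias qualifier after the function name.
        entity_type, value = "lambda", resource.split(":")[1]
    elif service == "ec2" and resource.startswith("instance/"):
        entity_type, value = "ec2", resource.split("/", 1)[1]
    else:
        raise ValueError(f"unsupported entity: {arn!r}")

    # region is empty for IAM, which is a global service.
    return {
        "entity_type": entity_type,
        "entity_value": value,
        "account_id": account_id,
        "region": region,
    }
```

The account ID and region carried in the ARN are also what a cross-account, cross-region responder needs in order to decide where to make its API calls.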

If the entity type is IAM User or IAM Role, the Lambda function attaches the AWS managed AWSDenyAll policy to it.

p_arn = 'arn:aws:iam::aws:policy/AWSDenyAll'
# IAM User
client.attach_user_policy(PolicyArn=p_arn, UserName=entity['entity_value'])
# IAM Role
client.attach_role_policy(PolicyArn=p_arn, RoleName=entity['entity_value'])

If the entity type is an AWS Lambda function, the following logic is applied:

  • Attach the AWS managed AWSDenyAll policy to the Lambda execution role.
  • Set the function's reserved concurrency to 0. This prevents the function from being invoked.

client_iam.attach_role_policy(PolicyArn=p_arn, RoleName=role)
client_lambda.put_function_concurrency(FunctionName=entity['entity_value'], ReservedConcurrentExecutions=0)

The following logic is applied if the entity type is an AWS EC2 instance.

The EC2 isolation solution outlined here is based on the 2020 AWS re:Invent presentation.

The solution also works for continuously active connections, for example when the compromised EC2 Instance is being used for crypto mining or is part of a ransomware attack.

When the Incident Response Lambda function is invoked, and the entity type to isolate is an AWS EC2 instance, the following actions are performed.

  1. Attach a conditional policy that denies all actions to the Instance Profile of the compromised EC2 Instance.
  2. Set the connection tracking idle timeout to 60 seconds for each ENI attached to the EC2 instance. Then check the VPC for an existing security group for untracking connections and create one if it is absent. This group has a single ingress rule and a single egress rule, each set to 0.0.0.0/0. Applying it to the instance converts tracked connections into untracked ones.
  3. Check the VPC for an existing isolation security group and create one if it is absent, then apply it to the Instance. This security group has no ingress or egress rules, so assigning it completely isolates the EC2 Instance.
# Shorten the connection tracking idle timeout on every ENI attached to the instance.
enis = client.describe_network_interfaces(Filters=[{'Name': 'attachment.instance-id', 'Values': [entity['entity_value']]}])
for eni in enis['NetworkInterfaces']:
    eni_id = eni['NetworkInterfaceId']
    client.modify_network_interface_attribute(
        NetworkInterfaceId=eni_id,
        ConnectionTrackingSpecification={
            'TcpEstablishedTimeout': 60
        }
    )

# Find or create the security group that untracks connections, then apply it.
securityGroupsInVpc = client.describe_security_groups(Filters=[{'Name': 'vpc-id','Values': [vpcId]}, {'Name': 'group-name','Values': [untrack_connections_sg]}])['SecurityGroups']
if securityGroupsInVpc:
    securityGroupId = securityGroupsInVpc[0]['GroupId']
else:
    securityGroupId = _entity_lockdown_ec2_createSecurityGroupUntrackConnections(untrack_connections_sg, untrack_connections_sg_desc, vpcId, client)
print(f"Modifying Instance {entity['entity_value']} with incident response isolation untracking connections security Group: {securityGroupId}")
_entity_lockdown_ec2_modifyInstanceAttribute(entity['entity_value'], securityGroupId, client)

# Find or create the isolation security group, then apply it.
securityGroupsInVpc = client.describe_security_groups(Filters=[{'Name': 'vpc-id','Values': [vpcId]}, {'Name': 'group-name','Values': [isolation_sg]}])['SecurityGroups']
if securityGroupsInVpc:
    securityGroupId = securityGroupsInVpc[0]['GroupId']
else:
    securityGroupId = _entity_lockdown_ec2_createSecurityGroup(isolation_sg, isolation_sg_desc, vpcId, client)
print(f"Modifying Instance {entity['entity_value']} with incident response isolation security Group: {securityGroupId}")
_entity_lockdown_ec2_modifyInstanceAttribute(entity['entity_value'], securityGroupId, client)
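
The helper functions called above are defined in the solution's repository and are not shown in this post. As a rough illustration of what creating the two security groups involves, here is a minimal sketch using standard boto3 EC2 client calls. The function names are hypothetical and this is not the repository's implementation.

```python
# Rule that matches all protocols and ports for any IPv4 address.
ALL_TRAFFIC = [{"IpProtocol": "-1", "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}]


def create_untrack_security_group(ec2_client, name, description, vpc_id):
    """Create a group that allows all ingress and egress and return its ID."""
    group_id = ec2_client.create_security_group(
        GroupName=name, Description=description, VpcId=vpc_id
    )["GroupId"]
    # New VPC security groups already allow all egress, so only ingress is added.
    ec2_client.authorize_security_group_ingress(
        GroupId=group_id, IpPermissions=ALL_TRAFFIC
    )
    return group_id


def create_isolation_security_group(ec2_client, name, description, vpc_id):
    """Create a group with no ingress or egress rules and return its ID."""
    group_id = ec2_client.create_security_group(
        GroupName=name, Description=description, VpcId=vpc_id
    )["GroupId"]
    # Revoke the default allow-all egress rule so the group ends up empty.
    ec2_client.revoke_security_group_egress(
        GroupId=group_id, IpPermissions=ALL_TRAFFIC
    )
    return group_id
```

Here `ec2_client` would be something like `boto3.client('ec2', region_name=...)`. A security group can then be applied to an instance with `modify_instance_attribute(InstanceId=..., Groups=[group_id])`, which replaces the instance's current security groups.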

Deployment

To deploy the solution, you need to access two CloudFormation templates. If you have a single AWS account, you only need to deploy the first template, which sets up the Incident Response resources within the account.

However, if you have a multi-account setup and want a cross-account Incident Response solution, you must deploy both templates. The first template is deployed in your AWS security account to host the Incident Response resources. The second template creates a cross-account IAM Role in the additional accounts you want to include in the solution.

By following these steps, you can easily deploy the solution based on your specific AWS account configuration.

The solution README fully documents the AWS CloudFormation templates and stack details. You will find all solution resources in the GitHub repository.
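
For the multi-account case, the Lambda function in the security account presumably assumes the cross-account IAM Role created by the second template before making API calls in the target account. The sketch below shows the general pattern with AWS STS. The role name `ExampleIncidentResponseRole`, the session name, and the function name are placeholders, so check the README for the names the templates actually create.

```python
def cross_account_client_kwargs(sts_client, account_id, region,
                                role_name="ExampleIncidentResponseRole"):
    """Assume a role in the target account and return keyword arguments for boto3.client()."""
    # role_name is a placeholder, not the name used by the CloudFormation template.
    role_arn = f"arn:aws:iam::{account_id}:role/{role_name}"
    credentials = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName="incident-response",
    )["Credentials"]
    # Temporary credentials plus the target region are enough to build a client.
    return {
        "region_name": region,
        "aws_access_key_id": credentials["AccessKeyId"],
        "aws_secret_access_key": credentials["SecretAccessKey"],
        "aws_session_token": credentials["SessionToken"],
    }
```

With `sts_client = boto3.client('sts')`, the result can be passed on as `boto3.client('ec2', **kwargs)` to act on an instance in another account and region.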

Testing

To begin automating your processes, you need to deploy the solution’s AWS CloudFormation stack. Once that is done, there are a few options for publishing an SNS message:

1. AWS Console: If you prefer using the AWS console, you can publish an SNS message directly from the service resource screen.

2. AWS CLI or SDK: If you prefer using the AWS CLI or the AWS SDK, you can retrieve the Incident Response SNS ARN from the AWS CloudFormation stack’s output tab. This ARN can then be used to publish the SNS message programmatically.
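
As an illustration of the SDK option, the following sketch publishes a containment request with boto3. The attribute names `Arn` and `ExternalId` and the message body are assumptions; the README documents the exact names the Lambda function expects.

```python
def publish_containment_request(sns_client, topic_arn, entity_arn, external_id):
    """Publish an SNS message asking the Incident Response Lambda to isolate an entity."""
    return sns_client.publish(
        TopicArn=topic_arn,
        # The body is informational here; the request is carried in the attributes.
        Message=f"Containment requested for {entity_arn}",
        MessageAttributes={
            # Attribute names are assumed, not taken from the repository.
            "Arn": {"DataType": "String", "StringValue": entity_arn},
            "ExternalId": {"DataType": "String", "StringValue": external_id},
        },
    )
```

Here `sns_client` would be `boto3.client('sns', region_name=...)` for the region where the stack is deployed, and `topic_arn` is the Incident Response SNS ARN from the stack's output tab.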

For detailed step-by-step instructions on each approach, you can refer to the README file in the corresponding GitHub repository. It provides a comprehensive walkthrough for publishing SNS messages based on your preferred method.

Summary

To use the solution, a SOC analyst or a security service such as AWS Security Hub or a SOAR platform only needs SNS publish privileges to respond to an AWS incident. This eliminates the need for the SOC analyst to possess administrative privileges over the suspicious AWS services involved.

The solution leverages AWS native tools and services to establish an effective Incident Response mechanism. Furthermore, it offers support for cross-account and cross-region containment, allowing for comprehensive incident response coverage across multiple AWS environments.

However, note that deleting the CloudFormation stack from an AWS account does not remove the security groups created for EC2 isolation. These must be cleaned up manually in each account.
