Active Defense - Dynamically Locking AWS Credentials to Your Environment
In the spirit of security summer camp (BlackHat/Defcon), I wanted to publish some information on my continuation of past research around preventing credential compromise in the cloud. Dynamically locking credentials to your AWS environment is the approach detailed below. Previously I have posted about Preventing Credential Compromise in AWS, and this post continues with the same goal: making an exposed credential of little to no value to a threat actor.
The TL;DR of that earlier post is that it proposes two methods to prevent credential compromise:
- Enforcing where API calls are allowed to originate from.
- Protecting the EC2 Metadata service so that credentials cannot be retrieved via a vulnerability in an application such as Server Side Request Forgery (SSRF).
The method of dynamically locking credentials to your environment aims to address the gap in enforcing where API calls are allowed to originate. When you have a large environment, it can be difficult to know what your public IP addresses are, and these addresses might change rapidly, which makes generating a policy and keeping it up to date difficult, if not impossible. If you are not able to force egress through a central point for your entire VPC, you have a gap in the previous approach for servers deployed in an external subnet.
The goal of dynamically locking AWS credentials to an environment is to provide an application with an AWS credential that is created specifically for that point-in-time deployment. These credentials are created with restrictions so that they are not valid outside a set of conditions. This methodology can be applied to credentials served from Elastic Compute Cloud (EC2), Lambda, container orchestration platforms, or any other system with the ability to change/inject credentials.
By dynamically creating credentials, applications can determine where in a cloud environment the credential is going to be used and craft the restrictions to lock the credentials to that deployment.
The approach used in this post is to understand where an application is deployed and provide a means to create new credentials for the application to use. The following is an example of the steps an application might perform in order to dynamically lock its credential in AWS.
- On boot, reach out to determine where the application is deployed. This can be as simple as determining the public IP that AWS services will see when making requests or as complex as reaching out to a boot endpoint or reading environment variables to understand more context of the deployment such as VPC IDs.
- Call AssumeRole and inject a session policy with conditions built from the information gathered in the first step.
- Inject the credentials into operating system environment variables or serve them from a metadata proxy.
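As a rough illustration of the first step, the sketch below asks an external endpoint which public IP it sees for this host and turns the answer into a single-host CIDR that could be used in an `aws:SourceIp` condition. The choice of `https://checkip.amazonaws.com` and the helper names `fetch_public_ip` and `to_source_cidr` are assumptions made for this example; a real deployment might use its own boot endpoint instead.

```python
import ipaddress
import urllib.request

# Assumption: an endpoint that echoes the caller's public IP as plain text.
CHECK_IP_URL = "https://checkip.amazonaws.com"


def fetch_public_ip(url=CHECK_IP_URL, timeout=2):
    """Return the public IP (as text) that the remote endpoint sees for this host."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("ascii").strip()


def to_source_cidr(ip_text):
    """Validate an IP string and express it as a single-host CIDR block."""
    address = ipaddress.ip_address(ip_text.strip())
    # /32 for IPv4, /128 for IPv6: exactly one address.
    return f"{address}/{address.max_prefixlen}"
```

Calling `to_source_cidr(fetch_public_ip())` on boot would yield the value to place in the session policy. Validating the response first avoids injecting arbitrary text from the network into a policy document.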
It is possible to perform actions in AWS as another IAM role by using the AssumeRole action from the Security Token Service (STS). STS provides a temporary set of credentials for that role. The lifetime of these credentials is typically anywhere from 15 minutes to one hour. The lifetime can be greater depending on the type of credential that issued the AssumeRole call. Each AssumeRole call specifies a session name that can be tracked in the AWS API audit logging service (CloudTrail).
Services will make this call on your behalf when creating credentials to pass into applications. Examples:
- EC2 will make this call and use the server’s instance ID as the session name for the temporary credentials. These credentials become available to the application in the EC2 metadata service.
- Lambda will make this call and use the lambda function name as the session name for the temporary credentials. These credentials become available in the function via environment variables.
A trust relationship in AWS IAM is used to define which service, user, or role can assume a given IAM Role. In order to dynamically lock a credential for an application, the IAM Role Trust Relationship must allow the role to AssumeRole into itself. An example Trust Relationship for a Role with ARN arn:aws:iam::12344567890:role/RoleName operating in EC2 would look like the following:
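A trust relationship of this shape can be sketched as a Python dictionary and serialized to JSON. This is a minimal illustration, not necessarily the exact document used in production: `build_trust_policy` is a hypothetical helper name, and the account ID in the ARN is a placeholder.

```python
import json


def build_trust_policy(role_arn, service="ec2.amazonaws.com"):
    """Trust policy letting a service and the role itself call sts:AssumeRole."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    # The service that hands the raw credentials to the server.
                    "Service": service,
                    # The role itself, so it can re-assume with a session policy.
                    "AWS": role_arn,
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder ARN for illustration only.
print(json.dumps(build_trust_policy("arn:aws:iam::123456789012:role/RoleName"), indent=2))
```

The important part is the `AWS` principal: without the role's own ARN in the trust relationship, the credentials handed out by EC2 cannot be used to call AssumeRole on the same role.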
Session Policy Injection for AssumeRole
The STS AssumeRole call allows an optional JSON policy document that describes what the resulting temporary credentials can/cannot do.
When dynamically locking credentials to an application, the library/metadata proxy performs an AssumeRole action to create restricted credentials to serve to applications and injects a session policy to restrict where the credentials are valid from. The session policy JSON statement might look similar to the following:
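A session policy with three statements (a conditional Deny, an Allow for everything, and a Deny on AssumeRole for the role itself) can be sketched as follows. The helper name `build_session_policy` and all IP addresses, VPC IDs, and ARNs are placeholders chosen for this example, and the condition keys shown (`aws:SourceIp`, `aws:SourceVpc`, `aws:SourceVpce`) are one plausible way to express "where the credentials are valid from".

```python
import json


def build_session_policy(role_arn, source_ips, vpc_ids=(), vpce_ids=()):
    """Build a session policy that only works from the given IPs, VPCs, or endpoints."""
    # All operators in one Condition block are ANDed, so the Deny only applies
    # when the request matches NONE of the allowed locations.
    deny_conditions = {"NotIpAddress": {"aws:SourceIp": list(source_ips)}}
    not_equals = {}
    if vpc_ids:
        not_equals["aws:SourceVpc"] = list(vpc_ids)
    if vpce_ids:
        not_equals["aws:SourceVpce"] = list(vpce_ids)
    if not_equals:
        deny_conditions["StringNotEquals"] = not_equals

    return {
        "Version": "2012-10-17",
        "Statement": [
            # 1. Deny everything when the request originates somewhere else.
            {"Effect": "Deny", "Action": "*", "Resource": "*",
             "Condition": deny_conditions},
            # 2. Allow whatever the underlying role already permits.
            {"Effect": "Allow", "Action": "*", "Resource": "*"},
            # 3. Prevent re-assuming the role to shed these restrictions.
            {"Effect": "Deny", "Action": "sts:AssumeRole", "Resource": role_arn},
        ],
    }


policy = build_session_policy(
    "arn:aws:iam::123456789012:role/RoleName",  # placeholder role
    source_ips=["203.0.113.10/32"],             # placeholder public IP
    vpc_ids=["vpc-0123456789abcdef0"],          # placeholder VPC ID
)
print(json.dumps(policy, indent=2))
```

The resulting dictionary would be serialized with `json.dumps` and passed as the `Policy` argument of the AssumeRole call.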
The policy above does a few things:
- The first statement is the restriction that Denies any actions where the conditions are not met.
- The second statement allows all actions. Session policy injection provides a scoped subset of what the IAM Role already has. This policy allows ALL original IAM permissions that the role has. This might make you think least privilege is not being used, but least privilege in this case is scoped to the original permissions of the role.
- The last statement is Denying the action of AssumeRole to the instance Role. The reason for this is that only the raw credentials should be able to assume the role that the server is running as. If you do not provide the last statement, then you would be able to use the credentials to do an AssumeRole to the same Role without session policy injection and remove the restrictions. You’d need to do this from the server/environment from statement 1, but we still want to avoid the ability to remove these restrictions.
The above session policy creates credentials that are restricted to the environment described by the policy. This will result in credentials that are restricted to a single instance/lambda host/container host (if deployed in external subnet, talking directly to the internet), or restricted to the internal subnet/availability-zone (routing to the internet goes through a NAT Gateway).
Within EC2, a method for providing a locked credential to your application(s) is to host it using a metadata proxy. This metadata proxy will sit locally on the EC2 instance and proxy all traffic meant for the metadata service except for the security-credentials path where it serves the new locked credentials to the application.
An example boot sequence is as follows:
- The metadata proxy boots and requests boot information from a public endpoint. It sends information about where the server is deployed such as the applicable account ID and region. Information such as instance-id or other metadata on the server can be optionally sent as well.
- The public boot provider responds with the public IP seen by the request as well as information relating to the environment that the server is deployed in. This information can be regional/global to the account/company. This will contain the public IP of the server requesting the information, VPC IDs for the region/account(s), VPC endpoint IDs for the region/account(s), and potentially elastic IPs (static IPs assigned to the account that might be used by running services).
- The metadata proxy reaches out to the metadata service to see what IAM role the server is using.
- The metadata proxy performs an AssumeRole for the role that was found in the previous step. When the AssumeRole is called, the metadata proxy uses the information from the boot provider to restrict where the credentials are valid.
- The IAM/STS service provides the restricted credentials back to the metadata proxy to provide to requesting applications on the server.
- Applications on the server using an AWS SDK or custom code will reach out to the metadata proxy for credentials.
- The metadata proxy provides the restricted credentials back to the application for use.
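The credential-serving part of this sequence might be sketched as below. It assumes an STS client object with a boto3-style `assume_role` method is passed in, and the function names (`is_credentials_request`, `assume_locked_role`, `to_metadata_document`) are hypothetical. The HTTP server and the forwarding of all other metadata paths are left out.

```python
import json
from datetime import datetime, timezone

CREDENTIALS_PREFIX = "/latest/meta-data/iam/security-credentials/"
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def is_credentials_request(path):
    """True when the path asks for a role's credentials rather than other metadata."""
    return path.startswith(CREDENTIALS_PREFIX) and len(path) > len(CREDENTIALS_PREFIX)


def assume_locked_role(sts_client, role_arn, session_name, session_policy):
    """Call AssumeRole with an injected session policy and return the credentials."""
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        Policy=json.dumps(session_policy),
    )
    return response["Credentials"]


def to_metadata_document(credentials, now=None):
    """Shape STS credentials like the document the EC2 metadata service returns."""
    now = now or datetime.now(timezone.utc)
    return {
        "Code": "Success",
        "LastUpdated": now.strftime(TIME_FORMAT),
        "Type": "AWS-HMAC",
        "AccessKeyId": credentials["AccessKeyId"],
        "SecretAccessKey": credentials["SecretAccessKey"],
        "Token": credentials["SessionToken"],
        "Expiration": credentials["Expiration"].astimezone(timezone.utc).strftime(TIME_FORMAT),
    }
```

A proxy built this way would answer requests for which `is_credentials_request` is true with `to_metadata_document(...)` and pass every other request through to the real metadata service unchanged, so SDKs on the server need no changes.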
Behind the scenes when invoking a lambda function, AWS is doing an AssumeRole and passing credentials to the lambda function via environment variables. This set of credentials is cached and shared between lambda invocations and passed to each invocation. The credentials, similar to EC2, are valid for multiple hours. The session name in this set of temporary credentials is the name of the function.
The same methodology can be applied here as in EC2. In the following example a library (written in Python) decorates the handler function of the Lambda function so that an AssumeRole is performed before the function executes.
When the lambda function is invoked, lambda_lockdown runs and then calls lambda_handler inside of it. The logic of lambda_lockdown is:
- Reach out and determine public IP address seen by AWS for API calls.
- Optionally read some environment variables to determine VPC and/or VPC Endpoint information to include in the session policy injection
- Call AssumeRole with session policy injection restricting the credentials to this lambda host only. Set the session time to 15 minutes as this is the max time that a lambda function can run.
- Overwrite the AWS credential environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
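A decorator along these lines could be sketched as follows. This is not the library's actual code: here `lambda_lockdown` is written as a decorator factory that takes its dependencies as arguments (a boto3-style STS client, the role ARN, and a callable returning the public IP), which is an assumption made to keep the example self-contained. The optional VPC conditions are omitted.

```python
import functools
import json
import os


def lambda_lockdown(sts_client, role_arn, get_public_ip):
    """Decorator factory: lock credentials to this host before each invocation."""

    def decorator(handler):
        @functools.wraps(handler)
        def wrapper(event, context):
            source_cidr = f"{get_public_ip()}/32"  # assumes an IPv4 address
            policy = {
                "Version": "2012-10-17",
                "Statement": [
                    {"Effect": "Deny", "Action": "*", "Resource": "*",
                     "Condition": {"NotIpAddress": {"aws:SourceIp": [source_cidr]}}},
                    {"Effect": "Allow", "Action": "*", "Resource": "*"},
                    {"Effect": "Deny", "Action": "sts:AssumeRole", "Resource": role_arn},
                ],
            }
            credentials = sts_client.assume_role(
                RoleArn=role_arn,
                RoleSessionName=context.function_name,
                Policy=json.dumps(policy),
                DurationSeconds=900,  # 15 minutes, the maximum Lambda run time
            )["Credentials"]
            # Overwrite the credentials that the SDK will pick up.
            os.environ["AWS_ACCESS_KEY_ID"] = credentials["AccessKeyId"]
            os.environ["AWS_SECRET_ACCESS_KEY"] = credentials["SecretAccessKey"]
            os.environ["AWS_SESSION_TOKEN"] = credentials["SessionToken"]
            return handler(event, context)

        return wrapper

    return decorator
```

One caveat with this sketch: `sts_client` is assumed to hold the original, unrestricted credentials, because the locked credentials are themselves denied `sts:AssumeRole` by the last statement. Any AWS clients used by the handler should be created after the environment variables have been overwritten.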
Some companies might choose to run their own container platform and need to provide credentials into a container via a metadata proxy. This metadata proxy typically will assume a role meant for the container and then provide those credentials just for the single container. Having this infrastructure in place makes this methodology much easier to implement. The approach is similar to that for EC2 and Lambda.
- Determine where the container host is deployed. This can be accomplished by reaching out to determine public IP and VPC and/or VPC endpoint information from a boot endpoint or source the information from the system itself. Some container hosts might just describe the VPCs and VPC endpoints on boot and cache this info.
- AssumeRole into the role for the container as usual, but use session policy injection to restrict the credentials.
- Provide the newly restricted credentials to the container for use.
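For the case where the container host describes its VPCs and VPC endpoints on boot and caches the result, a minimal sketch might look like this. It assumes a boto3-style EC2 client is passed in, ignores pagination, and uses the hypothetical helper names `collect_vpc_context` and `cached`.

```python
import functools


def collect_vpc_context(ec2_client):
    """Gather the VPC and VPC endpoint IDs visible to this account and region."""
    vpcs = ec2_client.describe_vpcs()["Vpcs"]
    endpoints = ec2_client.describe_vpc_endpoints()["VpcEndpoints"]
    return {
        "vpc_ids": sorted(vpc["VpcId"] for vpc in vpcs),
        "vpce_ids": sorted(endpoint["VpcEndpointId"] for endpoint in endpoints),
    }


def cached(loader):
    """Wrap a zero-argument loader so it runs once and its result is reused."""
    return functools.lru_cache(maxsize=1)(loader)
```

On the host, something like `get_context = cached(lambda: collect_vpc_context(ec2_client))` would be called once per container launch, with only the first call reaching the EC2 API. The cached IDs then feed the `aws:SourceVpc` and `aws:SourceVpce` conditions of each container's session policy.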
Software Development Kit (SDK)
AWS provides SDKs for most languages to be used in your applications for making API calls to service endpoints within AWS. The SDKs have built in mechanisms for finding credentials to use when making API calls. The SDK will use the credentials to sign the request to be sent to the service endpoint. The SDKs typically look for credentials in the following order:
- Operating system environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN
- Local credential file. This is typically located at ~/.aws/credentials
- EC2 Metadata Service. This is a locally available network endpoint listening on 169.254.169.254 within your EC2 instance.
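The lookup order above can be illustrated with a simplified sketch. Real SDK credential chains contain more steps than the three listed here, so `find_credential_source` is a hypothetical helper that only reports which of these three sources would be consulted first. It does not contact the metadata service.

```python
import configparser
import os

ENV_KEYS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def find_credential_source(environ=None, credentials_file="~/.aws/credentials",
                           profile="default"):
    """Return the name of the first source in the lookup order that has credentials."""
    environ = os.environ if environ is None else environ

    # 1. Operating system environment variables.
    if all(environ.get(key) for key in ENV_KEYS):
        return "environment"

    # 2. Local credential file (a missing file is silently ignored by read()).
    parser = configparser.ConfigParser()
    parser.read(os.path.expanduser(credentials_file))
    if parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id"):
        return "credentials-file"

    # 3. Fall back to the EC2 metadata service at 169.254.169.254.
    return "metadata-service"
```

This ordering is why both injection points used in this post work without application changes: overwriting the environment variables takes precedence over everything else, and a metadata proxy is reached whenever the first two sources are empty.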
AWS credentials can work anywhere in the world by default which presents a problem should a credential be exposed outside of your control. The takeaway/call to action here is to be proactive in how you are protecting your credentials in your AWS environment to protect you and/or your company.
The method of dynamically locking credentials to an environment in AWS addresses a previous gap in prevention techniques and can mitigate the risk of an exposed credential while allowing time and priority to be shifted towards tackling other problems that exist when operating in the cloud.