Investigating CloudTrail Logs

  • Snapshots of running instances will be shared with unknown AWS accounts.
  • Powerful instances will mine Bitcoin on your bill.

Investigation showed a deliberate API-initiated mass-termination of all instances in One More Cloud’s AWS account.

CloudTrail is your most important resource in an AWS breach. See for yourself, from that same article:

We found CloudTrail logs, in correlation with logs from other systems, to be immensely useful in our post-incident security analysis and pursuit of attribution.

This is an incident response reference to understand an AWS breach through your CloudTrail logs.

Get Logs

The CloudTrail logging location is fairly easy to find or enable in the console. Or, run aws cloudtrail describe-trails and it will reveal the S3 buckets being logged to. If IsMultiRegionTrail is true, the trail's bucket will include logs for all regions, and IncludeGlobalServiceEvents adds events from global services like IAM. Logs land about every 15 minutes.
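As a rough sketch, the JSON that aws cloudtrail describe-trails prints can be summarized in a few lines of Python. The keys used here (trailList, Name, S3BucketName, IsMultiRegionTrail, IncludeGlobalServiceEvents) follow the documented response shape; the helper name summarize_trails and the sample values are made up for illustration.

```python
import json

def summarize_trails(describe_trails_json: str) -> list:
    """Return name, bucket, and coverage flags for each trail."""
    response = json.loads(describe_trails_json)
    return [
        {
            "name": trail.get("Name"),
            "bucket": trail.get("S3BucketName"),
            "multi_region": trail.get("IsMultiRegionTrail", False),
            "global_events": trail.get("IncludeGlobalServiceEvents", False),
        }
        for trail in response.get("trailList", [])
    ]

# Hypothetical, trimmed output of `aws cloudtrail describe-trails`.
sample = json.dumps({"trailList": [{
    "Name": "example-trail",
    "S3BucketName": "example-cloudtrail-bucket",
    "IsMultiRegionTrail": True,
    "IncludeGlobalServiceEvents": True,
}]})

for trail in summarize_trails(sample):
    print(trail)
```

A trail that reports multi_region as False is worth a second look, since activity in other regions would not land in its bucket.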

Problem: Anti-Forensics

Dealing with an attack in progress? You don’t want to spend time trusting logs that are about to disappear on you, or that were unreliable to begin with.

  • An attacker could also encrypt CloudTrail logs to a key of their own, which would allow logs to continue streaming without alarm.
  • An attacker may also have the ability to disable logging for that region specifically, or kill the trails overall with CloudTrail permissions.

Answer: Protect Your Logs

These steps should make Dan Grzelik’s article irrelevant. Most of this pertains to the permissions your IAM users and roles have access to. Production users and roles should not have access to CloudTrail, and vice versa.

  • Cryptographic integrity logs don’t actually protect anything. They only give you absolute certainty that your logs have been ruined; they won’t stop them from being ruined. Minimize access to the S3 bucket that CloudTrail writes to in order to prevent destruction of logs.
  • Minimize access to the CloudTrail API. If a key or role with write access to any CloudTrail API actions (DeleteTrail, StopLogging, UpdateTrail) is compromised, an attacker can stop or redirect your logging.
  • Pull logs into a centralized store like ELK / Splunk / Loggly / Papertrail. Completely segment it from your production environment. If you’re using a Lambda function to process these logs, make sure it cannot be tampered with by your production users and roles.
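Once logs are in a separate store, one thing worth watching for is the trail being tampered with. The sketch below assumes CloudTrail records have already been parsed from JSON into Python dicts; the function name and sample records are hypothetical.

```python
# API calls that stop, delete, or modify a trail (from the list above).
TAMPERING_EVENTS = {"DeleteTrail", "StopLogging", "UpdateTrail"}

def find_trail_tampering(records):
    """Return CloudTrail records that touch the trail configuration."""
    return [
        r for r in records
        if r.get("eventSource") == "cloudtrail.amazonaws.com"
        and r.get("eventName") in TAMPERING_EVENTS
    ]

# Hypothetical, heavily trimmed records.
records = [
    {"eventSource": "cloudtrail.amazonaws.com", "eventName": "StopLogging"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "DescribeInstances"},
]
print(find_trail_tampering(records))
```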

Investigating Logs

CloudTrail is very straightforward until you get into assumed roles or see abuse with cross account activity.

Reference Notes

These fields are all pulled from official documentation here, and here, with commentary for common incident response scenarios. These are written with the assumption that something may be already wrong about a log you’re looking at. None of these indicate bad behavior without a greater context of your incident.

eventName

This is the most important part of the record to glance at: it describes the action taken in the API. There are specific naming conventions to get used to, but generally changes to an account have an obvious prefix like Create*, Put*, Delete*, Update*, etc. Most of the read-only actions are Describe*, Get*, List*, etc.
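A quick way to use those naming conventions is a prefix filter. This is only a heuristic sketch: the prefix list comes from the read-only examples above, the function names are hypothetical, and an action outside those prefixes is not guaranteed to change anything.

```python
# Prefixes that usually indicate a read-only API call.
READ_PREFIXES = ("Describe", "Get", "List")

def looks_read_only(event_name: str) -> bool:
    """Guess from the name alone whether an API call only reads data."""
    return event_name.startswith(READ_PREFIXES)

def probable_changes(records):
    """Keep records whose eventName does not look read-only."""
    return [r for r in records if not looks_read_only(r.get("eventName", ""))]

records = [
    {"eventName": "DescribeInstances"},
    {"eventName": "TerminateInstances"},
    {"eventName": "ListBuckets"},
]
print([r["eventName"] for r in probable_changes(records)])
```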

awsRegion

You’ll want to keep an eye out if bad behavior is happening in a region you don’t normally use. You can define the regions you commonly use and alert on anything happening outside of them. Reminder: have CloudTrail enabled in all of your regions.
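A minimal sketch of that kind of alert, assuming records are already parsed into dicts. EXPECTED_REGIONS is a placeholder to replace with the regions you actually use, and the function name is hypothetical.

```python
from collections import Counter

# Assumption: the only region this account normally uses.
EXPECTED_REGIONS = {"us-east-1"}

def unexpected_region_counts(records, expected=EXPECTED_REGIONS):
    """Count events per region for regions outside the expected set."""
    return Counter(
        r["awsRegion"] for r in records
        if "awsRegion" in r and r["awsRegion"] not in expected
    )

records = [
    {"awsRegion": "us-east-1", "eventName": "DescribeInstances"},
    {"awsRegion": "ap-southeast-2", "eventName": "RunInstances"},
    {"awsRegion": "us-east-1", "eventName": "ListBuckets"},
]
print(unexpected_region_counts(records))
```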

sourceIPAddress

This field can hold DNS names for AWS resources, or the relevant forwarded IP address when something like the AWS console is used. For instance, if someone logs into the console and makes a bunch of changes, it won’t provide the IP address of whatever host the AWS console is running on, but instead will forward the IP address of the browser accessing the console.

  • If a key has leaked and actions are taken off the network, this address will likely be from some random ISP.
  • If a key is being used maliciously within your own infrastructure, you’ll see an internal IP and it may indicate another compromised host.
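The cases above can be triaged roughly with Python's ipaddress module. This is a sketch with hypothetical labels; note that is_private also covers loopback and other reserved ranges, so treat "internal" as a hint rather than proof.

```python
import ipaddress

def classify_source(source: str) -> str:
    """Bucket a sourceIPAddress value into a coarse category."""
    try:
        ip = ipaddress.ip_address(source)
    except ValueError:
        # Not an IP at all, e.g. a DNS name for an AWS service.
        return "name"
    return "internal" if ip.is_private else "external"

for value in ("10.0.4.17", "8.8.8.8", "autoscaling.amazonaws.com"):
    print(value, "->", classify_source(value))
```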

userAgent

The user agent of a malicious client may be a dead giveaway and IOC in a breach. In my decade of security work, most adversaries fail to mimic the look and feel of the clients they spoof. They misspell them, improperly format them, use old versions, none at all, or have a compulsive need to boost their self-esteem and put their hacker handle in them, or mess it up in some other humiliating way.
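Odd user agents tend to be rare ones, so a frequency count is a cheap first pass. A minimal sketch, assuming parsed records; the user agent strings and the max_count threshold are invented for illustration.

```python
from collections import Counter

def rare_user_agents(records, max_count=1):
    """Return user agents seen at most max_count times, sorted by name."""
    counts = Counter(r.get("userAgent", "<none>") for r in records)
    return sorted(ua for ua, n in counts.items() if n <= max_count)

# Hypothetical values; real user agent strings are longer.
records = [
    {"userAgent": "aws-cli/1.10.0"},
    {"userAgent": "aws-cli/1.10.0"},
    {"userAgent": "h4x0r-tool"},
]
print(rare_user_agents(records))
```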

errorCode

In an environment that manages its errors well, this is a great sign of bad behavior. An adversary without access to a GetUserPolicy call will bang into the error codes AccessDenied and UnauthorizedOperation* when they’re forced to enumerate their access, like a noisy port scan.
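To spot that kind of enumeration, one option is to count denied calls per access key. The sketch below matches on substrings because the exact error code spelling varies by service; the function name and sample key are hypothetical.

```python
from collections import Counter

# Substrings of the error codes mentioned above.
DENIED_MARKERS = ("AccessDenied", "UnauthorizedOperation")

def denied_counts_by_key(records):
    """Count permission errors per accessKeyId."""
    counts = Counter()
    for r in records:
        code = r.get("errorCode", "")
        if any(marker in code for marker in DENIED_MARKERS):
            key = r.get("userIdentity", {}).get("accessKeyId", "<unknown>")
            counts[key] += 1
    return counts

identity = {"accessKeyId": "AKIAEXAMPLE"}  # made-up key ID
records = [
    {"errorCode": "AccessDenied", "userIdentity": identity},
    {"errorCode": "Client.UnauthorizedOperation", "userIdentity": identity},
    {"eventName": "ListBuckets", "userIdentity": identity},
]
print(denied_counts_by_key(records).most_common())
```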

requestParameters and responseElements

These vary greatly per service, but it’s basically the parameters submitted to the API, and results received. However, responseElements only appears if something actually changed. This may be a workaround to filter for changes if readOnly is not able to be relied upon in the future, by looking only at CloudTrail logs with responseElements for changes.
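That workaround is a one-line filter once records are parsed. This sketch simply encodes the heuristic described above (a JSON null becomes None in Python); the function name is hypothetical, and it is not a guaranteed way to find every change.

```python
def with_response_elements(records):
    """Keep records that carry a non-null responseElements value."""
    return [r for r in records if r.get("responseElements") is not None]

records = [
    {"eventName": "DescribeInstances", "responseElements": None},
    {"eventName": "CreateUser", "responseElements": {"user": {"userName": "example"}}},
]
print([r["eventName"] for r in with_response_elements(records)])
```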

readOnly

Don’t lean too hard on this field. There should be a better way to filter for read-only or write actions in AWS logs with the readOnly value (since eventVersion 1.01) of a CloudTrail log, which is true if the operation only read data, and false if it modified something.

sharedEventID

When a role is assumed in your account from another AWS account, a log is fired off in both accounts, and they’re joined by a sharedEventID. This is a very useful forensic artifact. If you own both accounts, you can continue investigating a suspicious role assumption very quickly by tracking down the sharedEventID in both accounts.
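If you have logs from both accounts, pairing them up is a simple join on that field. A minimal sketch, assuming both sets of records are already parsed; the function name, event ID, and account IDs are made up.

```python
def join_on_shared_event_id(account_a_records, account_b_records):
    """Pair records from two accounts that share a sharedEventID."""
    index = {
        r["sharedEventID"]: r
        for r in account_a_records if "sharedEventID" in r
    }
    return [
        (index[r["sharedEventID"]], r)
        for r in account_b_records
        if r.get("sharedEventID") in index
    ]

account_a = [{"sharedEventID": "abc-123", "recipientAccountId": "111111111111"}]
account_b = [
    {"sharedEventID": "abc-123", "recipientAccountId": "222222222222"},
    {"eventName": "DescribeInstances"},
]
for ours, theirs in join_on_shared_event_id(account_a, account_b):
    print(ours["recipientAccountId"], "<->", theirs["recipientAccountId"])
```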

userIdentity

This section of the log relates to the “who” behind the API activity, and is important in identifying what was compromised. This field varies greatly depending on whether the identity was a role or an IAM user.

userIdentity.type

The type of identity used will dictate the next steps of your investigation. The official documentation is here.

  • IAMUser would mean an IAM user was compromised. This means secret credentials were stolen. You can hopefully track down the systems where this specific user’s credentials lived and narrow down how they were compromised.
  • AssumedRole means that a role was assumed with the Security Token Service, and an investigation would follow up to discover what EC2 instance, Lambda function, or IAM user had permission to assume that role. The accessKeyId would be temporary, and you’ll have to find the corresponding AssumeRole that granted it. Finding that log would be an important next step in an investigation.
  • A FederatedUser investigation is similar to AssumedRole, except the temporary credentials were granted through GetFederationToken by an IAM user or root.
  • If AWSAccount appears, the request came from a different AWS account altogether, hopefully one that you own. Role assumption is complicated to follow in cross account scenarios. If this role was not assumed from a malicious AWS account, you’ll be able to link the role assumption between both accounts with the sharedEventID field. This field will appear in the CloudTrail logs for both accounts. If you are ultimately investigating a backdoor, the other account may be fraudulent or compromised, and you’ll have to get a warrant or AWS support to continue investigating the malicious account. In your own account, you’ll have to find the backdoor permissions on the assumed role that allowed another account to assume it.
  • SAMLUser appears when the request was made with a SAML assertion. You may need to roll an investigation forward into wherever identity is federated for AWS (like Okta, OneLogin, etc.).
  • WebIdentityUser is used when the request is made by a web identity federation provider, and may involve actual customers or users. For instance, temporary credentials might be given to a user so they’ll have permission to access an S3 bucket. These are highly likely to be very low trust users, like customers or users of an application that have authenticated with Facebook or a mobile OAuth flow.
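For the AssumedRole case above, finding the call that granted a temporary key can be sketched as a search over STS events. This assumes the AssumeRole record carries the issued key under responseElements.credentials.accessKeyId, and that records are already parsed; the function name and key ID are hypothetical.

```python
def find_issuing_assume_role(records, temporary_key_id):
    """Return the STS AssumeRole* record that issued the given key, if any."""
    for r in records:
        if r.get("eventSource") != "sts.amazonaws.com":
            continue
        if not r.get("eventName", "").startswith("AssumeRole"):
            continue
        # responseElements may be null on failed calls.
        creds = (r.get("responseElements") or {}).get("credentials", {})
        if creds.get("accessKeyId") == temporary_key_id:
            return r
    return None

records = [
    {"eventSource": "sts.amazonaws.com", "eventName": "AssumeRole",
     "responseElements": {"credentials": {"accessKeyId": "ASIAEXAMPLEKEY"}}},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances",
     "responseElements": None},
]
print(find_issuing_assume_role(records, "ASIAEXAMPLEKEY"))
```

The userIdentity of the record it returns tells you who asked for the role, which is usually the next thing to chase.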

userIdentity.principalId

This is not as well documented by AWS as it should be, though it’s very important. These are unique identifiers that are affiliated with objects (similar to an ARN). Their prefixes give a hint as to what the object is: for instance, AIDA followed by an identifier is an IAM user, and AROA followed by an identifier is a role. I haven’t been able to find public documentation on all of these identifiers.
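A small lookup makes those prefixes easier to eyeball in bulk. This sketch covers only the two prefixes AWS lists for IAM users (AIDA) and roles (AROA); anything else is reported as unknown rather than guessed. The function name and sample IDs are invented.

```python
# Known principalId prefixes; deliberately incomplete.
PRINCIPAL_PREFIXES = {"AIDA": "IAM user", "AROA": "role"}

def principal_kind(principal_id: str) -> str:
    """Map the first four characters of a principalId to an object type."""
    return PRINCIPAL_PREFIXES.get(principal_id[:4], "unknown")

# Assumed-role principal IDs carry a session name after a colon.
for pid in ("AIDAEXAMPLEUSERID", "AROAEXAMPLEROLEID:session-name", "123456789012"):
    print(pid, "->", principal_kind(pid))
```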

userIdentity.arn

These are Amazon Resource Names and are well documented.

userIdentity.accountId

For a company operating out of a single AWS account, this should always be the same in your logs, unless you’ve been backdoored by a malicious AWS account whose resources assume a role in your own account. So, it may make sense to alert when new AWS accounts appear in your logs. You may see innocent behavior from external services like Evident.io or Cloudsploit that you’ve set up, but you can easily whitelist these after an initial pass.
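An alert like that can start as a set difference against a whitelist. A minimal sketch, assuming parsed records; the account IDs are placeholders and the function name is hypothetical.

```python
def unfamiliar_accounts(records, known_accounts):
    """Return account IDs seen in userIdentity that are not whitelisted."""
    seen = set()
    for r in records:
        account = r.get("userIdentity", {}).get("accountId")
        if account and account not in known_accounts:
            seen.add(account)
    return seen

# Placeholder IDs: your own account plus a vendor you have whitelisted.
KNOWN = {"111111111111", "222222222222"}

records = [
    {"userIdentity": {"accountId": "111111111111"}},
    {"userIdentity": {"accountId": "999999999999"}},
    {"userIdentity": {"invokedBy": "autoscaling.amazonaws.com"}},
]
print(unfamiliar_accounts(records, KNOWN))
```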

userIdentity.accessKeyId

These are either permanent credentials for root or an IAM user, or temporary STS credentials, such as those issued when a role is assumed. Permanent keys seem to have a prefix of AKIA, and temporary credentials seem to have the ASIA prefix, but this is not documented and may change in the future.

userIdentity.sessionContext

This element should only exist with assumed roles. It has details about the session that will be important in putting together a timeline that continues an investigation, since it will inform you exactly when the session started (and where STS activity should appear in CloudTrail to grant the session).
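Pulling the session start time out for a timeline can be sketched as below. It assumes the timestamp lives at sessionContext.attributes.creationDate in the UTC format CloudTrail uses elsewhere (for example 2016-05-01T12:00:00Z); the function name and sample record are hypothetical.

```python
from datetime import datetime, timezone

def session_start(record):
    """Return the session creation time as an aware datetime, or None."""
    attrs = (
        record.get("userIdentity", {})
        .get("sessionContext", {})
        .get("attributes", {})
    )
    created = attrs.get("creationDate")
    if created is None:
        return None
    parsed = datetime.strptime(created, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)

record = {"userIdentity": {"type": "AssumedRole", "sessionContext": {
    "attributes": {"creationDate": "2016-05-01T12:00:00Z",
                   "mfaAuthenticated": "false"},
}}}
print(session_start(record))
```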

userIdentity.sessionContext.sessionIssuer

This is more useful data to understand the specific object that gave credentials to assume the role. This is most likely a role, but could also be Root or an IAM user when GetFederationToken was used. The accountID will be important to note here.

userIdentity.invokedBy

This is the name of an AWS backend service that may have triggered the API call. These are rarely malicious and are generally things like AWS Config, Auto Scaling, or Elastic Beanstalk being noisy in your logs. It may be interesting to consider what an attacker could accomplish while hiding behind an invoked service, but I haven’t seen this in any incidents I’ve worked.

Conclusion

CloudTrail logs are a useful tool before, during, and after an incident. Turn them on, secure them, and make them accessible for investigations and troubleshooting.


@magoo

I’m a security guy, former Facebook, Coinbase, and currently an advisor and consultant for a handful of startups. Incident Response and security team building is generally my thing, but I’m mostly all over the place.

Starting Up Security

Guides for the growing security team

Ryan McGeehan