Cloud Digital Forensics and Incident Response — AWS IAM Privilege Escalation Leads to EC2 Ransomware Deployment

Adam Messer
Aug 5, 2024 · 13 min read


This article is the third in a series analyzing cloud Digital Forensics and Incident Response (DFIR) scenarios in AWS. The attack detailed in this write-up is a simulation, but it emulates real adversary techniques and real incident response tactics. In this piece, I’ll trace a threat actor’s steps through ransomware deployment, lateral movement from the AWS control plane into EC2 instances via AWS Systems Manager (SSM), and privilege escalation through IAM abuse. The forensic artifacts examined in this report include Windows Sysmon logs, the Update Sequence Number (USN) Journal, AWS CloudTrail logs, and AWS Systems Manager command history.

Scenario Background

On the morning of July 29th, 2024, multiple employees of Eager Enigma Enterprises (EEE) reported that the entirety of their file system stored in AWS EC2 instances had been encrypted, and a ransomware note was found in each user’s home directory. EEE administrators maintained offline backups of some critical business files, but a significant amount of data could not be recovered. Concerned with the possibility of further compromise, EEE leadership brought on a third-party DFIR team to assess the situation and answer the following questions:

  1. How did the threat actor gain access to the data on the EC2 servers?
  2. What other AWS resources were impacted by this compromise?
  3. Does the threat actor still have access to the environment?

AWS Incident Response Program

AWS maintains incident response guidelines aligned with NIST SP 800-61 “Computer Security Incident Handling Guide” to orient leaders and responders to best practices for cloud incident response.

The DFIR team was called upon to assist with the Detection & Analysis and Containment, Eradication, & Recovery stages of the AWS incident response lifecycle.

The AWS incident response operations lifecycle.

Analysis

Windows Forensics

Analysis began with the impacted EC2 Windows server that had housed EEE data. As reported by EEE business analysts, all files stored in their home directories had been encrypted, renamed, and appended with “encrypted.”

One of the directories encrypted by the threat actor.

A total of 205 files across four user accounts and one administrator account had been ransomed during the attack. The only unencrypted file left in each user’s home directory was a ransom note detailing the payment requirement — approximately $10,000 worth of Bitcoin delivered to a cryptocurrency wallet. The note also dictated that encryption keys would be delivered to the email address of an EEE executive when the ransom was fulfilled.

The ransom note found in each user’s home directory.

A forensic image of the infected server was downloaded and mounted for analysis. Kroll Artifact Parser and Extractor (KAPE) was run against the image to extract artifacts including Sysmon logs, the Master File Table (MFT), and the USN Journal. I began the investigation by loading the USN Journal into Timeline Explorer. The USN Journal records file system changes on a Windows volume for tracking and recovery purposes, and it can be used in forensic investigations to identify the modification and renaming of key files. In Timeline Explorer, I was able to quickly filter the results to the encrypted files:

This particular file was encrypted at 04:07:34 on the morning of July 29th.
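For reference, the $J data collected by KAPE can be parsed into a Timeline Explorer-friendly CSV with Eric Zimmerman’s MFTECmd — a minimal sketch, with hypothetical paths:

# Parse the USN Journal ($J) into a CSV for Timeline Explorer (paths are hypothetical)
.\MFTECmd.exe -f 'C:\cases\wf2\$Extend\$J' --csv 'C:\cases\wf2\out' --csvf wf2_usn.csv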

The USN Journal revealed that the encryption took place around 04:07 AM on July 29th. The threat actor was able to access and manipulate files across all four user accounts and an administrator account with what appeared to be a single command, meaning they must have launched the attack with elevated privileges. To test this hypothesis, I turned to the Sysmon logs on the system and parsed them with Eric Zimmerman’s EvtxECmd tool:

PS C:\Users\Administrator\forensics\net6\EvtxeCmd> .\EvtxECmd.exe -f 'C:\Users\Administrator\Desktop\wf2_results\Microsoft-Windows-Sysmon%254Operational.evtx' --csv ..\..\wf2_results\ --csvf wf2_sysmon.csv   

Reviewing these logs in Timeline Explorer, I filtered to Event Code 11 (FileCreate) entries. Encrypting hundreds of files generated a large volume of logs:

A portion of the Event Code 11s tied to the encryption of files listed in the TargetFilename column (abbreviated for brevity).

I had expected either an administrator account or a system-level account to be responsible for the attack, and it appeared to be the latter. Dozens of entries showed NT AUTHORITY\SYSTEM executing PowerShell to create the encrypted files. Each of these logs was associated with Process ID 6388, confirming that the attack had come from a single command.
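As a cross-check on the Timeline Explorer filtering, the same tally can be pulled straight from the exported .evtx with PowerShell — a minimal sketch, with a hypothetical file path:

# Count Sysmon FileCreate (Event ID 11) entries per ProcessId directly from the exported log
$evtx = 'C:\cases\wf2\Microsoft-Windows-Sysmon%4Operational.evtx'
Get-WinEvent -FilterHashtable @{ Path = $evtx; Id = 11 } |
    ForEach-Object {
        ([xml]$_.ToXml()).Event.EventData.Data |
            Where-Object { $_.Name -eq 'ProcessId' } |
            Select-Object -ExpandProperty '#text'
    } |
    Group-Object |
    Sort-Object Count -Descending |
    Select-Object Count, Name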

In Timeline Explorer, I filtered results to Event Code 1 (ProcessCreate) and Process ID 6388. This generated only a single result:

Process ID 6388 — PowerShell execution spawned from ssm-document-worker.exe.

AWS Systems Manager’s ssm-document-worker.exe process spawned a PowerShell process that subsequently encrypted 205 files on the server. AWS Systems Manager is a platform for managing AWS resources, enabling administrators to centralize and automate tasks across a fleet of nodes. ssm-document-worker.exe is the executable SSM uses to run commands on EC2 instances. This process always runs as NT AUTHORITY\SYSTEM, making it a critical attack avenue for adversaries. The Event Code 1 log also specified the exact command line run by PowerShell:

Process ID 6388 (continued) — command line arguments.

SSM Run Command is a feature that enables administrators (or threat actors) to run commands across fleets of EC2 nodes as a system-level user. It also leaves a valuable artifact for incident responders — every Run Command PowerShell script is stored locally under C:\ProgramData\Amazon\SSM. I opened the script’s directory on the mounted host:

SSM Run Command scripts will produce stderr and stdout output that is saved locally and in an S3 bucket (by default).

And the PowerShell script itself:

The ransomware attack was executed by configuration.ps1.

The script’s content revealed the file responsible for the encryption process — configuration.ps1.

To determine from where this file had originated, I opened the MFT in Eric Zimmerman’s MFTExplorer tool and navigated to C:\Windows\Temp.

configuration.ps1 in Timeline Explorer.

I noticed that configuration.ps1 and another file, FileCryptography.psm1, had been created at the same moment. Given the PowerShell module’s name and suspicious creation time, I assumed it was also an artifact of the attack.

Both artifacts in MFTExplorer.

Both files carried a Zone.Identifier Alternate Data Stream (ADS) with a ZoneId of 3. ZoneId 3 is the “Mark of the Web” — a Windows security mechanism that persists on files downloaded from the Internet and flags their potential security risk. With this, I knew that both files had originated outside of the local system.
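On a mounted copy of the files, the same marker can be read directly from the Zone.Identifier stream — a quick sketch, with a hypothetical mount path:

# List alternate data streams on the dropped script, then read the Mark of the Web
Get-Item -Path 'E:\Windows\Temp\configuration.ps1' -Stream *
Get-Content -Path 'E:\Windows\Temp\configuration.ps1' -Stream Zone.Identifier
# A ZoneId of 3 (URLZONE_INTERNET) indicates the file was downloaded from the Internet:
# [ZoneTransfer]
# ZoneId=3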

I then filtered Sysmon logs for Event Code 11s (FileCreate) with a Target Filename containing ‘configuration.ps1’. This left a single entry:

NT AUTHORITY\SYSTEM created configuration.ps1 via PowerShell.

Expanding the search to all Event Codes immediately before and after the file creation revealed a chain of events.

Events surrounding the creation of configuration.ps1.

These logs demonstrated that AWS SSM’s Run Command executed another PowerShell script via ssm-document-worker.exe, which issued a DNS request for a GitHub URL, made a network connection to GitHub (full URL excluded for privacy), and then created the ransomware script. I confirmed this by viewing the SSM PowerShell script:

GitHub URL and script details are censored for privacy.

This SSM command downloaded both configuration.ps1 and FileCryptography.psm1 from the threat actor’s GitHub repository and saved them in C:\Windows\Temp.

With this, I was able to answer EEE’s first question regarding the investigation — How did the threat actor gain access to the data on the EC2 servers? The adversary utilized AWS SSM’s Run Command feature to execute arbitrary commands as a system-level account on the EC2 server.
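To make that mechanism concrete, the sketch below shows how an operator (or attacker) holding a suitable role can push an arbitrary PowerShell payload to an instance through Run Command using AWS Tools for PowerShell. This is an illustration only — the instance ID and payload path are hypothetical, not the actor’s exact invocation:

# Requires ssm:SendCommand on the target instance and document
Import-Module AWS.Tools.SimpleSystemsManagement
Send-SSMCommand -DocumentName 'AWS-RunPowerShellScript' `
    -InstanceId 'i-0123456789abcdef0' `
    -Parameter @{ commands = @('powershell.exe -File C:\Windows\Temp\configuration.ps1') }
# The ssm-document-worker.exe process on the instance then executes the payload as NT AUTHORITY\SYSTEM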

It was not yet clear how the adversary had gained access to the control plane or how much access they possessed. To answer that question, I turned to AWS CloudTrail.

AWS Cloud Forensics

CloudTrail captures SSM API calls across various actions, such as inventory listings, instance modifications, and Run Commands. With sufficient logging in place, I would be able to identify the source of the malicious SSM Run Commands. The CloudTrail Lake console provides a SQL interface for querying the collected events.

The CloudTrail data lake query console.

In the CloudTrail query console, I began by identifying the most common types of events associated with each user to gain an understanding of what data and users were being logged.

SELECT userIdentity.arn, eventSource, count(*) as num_events
FROM $EDS_ID
GROUP BY userIdentity.arn, eventSource
ORDER BY num_events DESC

This provided the following output:

A snippet of the most common log sources (account ID censored for privacy).

EC2EnableSSM is a customer-managed service role assigned to EC2 instances within the environment that enables SSM to centralize management and observability, so it was logical to see hundreds of SSM events associated with this role. aaron.broadmoor and jeremiah.fort are two user accounts associated with EEE personas — both are AWS admins and commonly operate in the environment.

I then ran another aggregation query to determine which SSM-related events were utilized in the environment.

SELECT eventName, count(*) as eventCount 
FROM $EDS_ID
WHERE eventSource = 'ssm.amazonaws.com'
GROUP BY eventName
ORDER BY eventCount DESC

A sample of the result:

A portion of all SSM related events.

UpdateInstanceInformation and all read/list commands are common, benign functions of regular SSM usage. There were 17 SendCommand actions logged — these are tied to EC2 Run Command executions. Only two Run Command actions were seen on the compromised EC2 host, so the other 15 could either be part of benign network administration or malicious lateral movement to other EC2 hosts.

All 17 instances of SendCommand were spawned by the SSM_Management_Role_WF2 role. This is a legitimate role assumed by administrators when interacting with the SSM service. The following query was used to take a closer look at this role and each SendCommand action:

SELECT eventTime, userIdentity.sessioncontext.sessionissuer.username, 
userIdentity.type, eventName, userAgent,
map_values(requestParameters)[2] as DocumentName,
map_values(requestParameters)[5] as Parameters
FROM $EDS_ID
WHERE userIdentity.arn LIKE '%SSM_M%'

This returned:

Four of the 17 SendCommand events — all seventeen were similar to these.

These results revealed two important indicators:

  1. The source IP address (not listed in the image for security concerns) differed for two of the seventeen events.
  2. The User Agent (not fully listed in the image) for those same two events also differed from the rest.

This indicated that an anomalous source initiated two of the Run Command actions — potentially the two identified on the Windows server.

The details of these Run Command actions should be visible under “Parameters”, but AWS redacts Run Command parameters in CloudTrail logs for security reasons. Despite this, it’s still possible to view Run Command history.
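That history is also exposed through the SSM API, which can be convenient for exporting into a timeline — a minimal AWS Tools for PowerShell sketch, with a hypothetical command ID:

Import-Module AWS.Tools.SimpleSystemsManagement
# List recent Run Command executions with their documents and timestamps
Get-SSMCommand | Select-Object CommandId, DocumentName, RequestedDateTime, Status
# Drill into a single command's per-instance execution details
Get-SSMCommandInvocation -CommandId '12345678-aaaa-bbbb-cccc-1234567890ab' -Detail $true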

By navigating to the Systems Manager service, opening the Run Command history tab, and selecting one of the two suspicious commands, I was able to see all the metadata regarding the sent command.

SSM Run Command history tab.

Within “Command parameters”, I found the same malware execution command identified in the SSM PowerShell scripts on the EC2 instance.

The full command can be viewed in Run Command history.

I moved to the second anomalous command and viewed the parameters:

The second anomalous Run Command (with censored GitHub information).

As expected, the other suspicious Run Command contained the PowerShell commands to download the ransomware and its cryptography module. At this point, I knew exactly which control plane actions deployed the ransomware on the host, but I did not know how the adversary initially accessed the AWS environment.

An AWS user is an identity with long-term credentials for accessing AWS. A role, by contrast, is an identity with a set of permissions that users or services can assume, receiving temporary security credentials that grant the role’s permissions for a limited time. Assuming a role lets a principal temporarily act with privileges it might not normally hold, which is useful for delegation, cross-account access, and keeping credentials short-lived. The SSM_Management_Role_WF2 role was conducting these actions, but I had not yet identified which entity had assumed the role to do so.
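For context, role assumption from the caller’s side is a single STS call — a minimal AWS Tools for PowerShell sketch (the account ID and session name are hypothetical). Each such call is what produces the AssumeRole events queried below:

Import-Module AWS.Tools.SecurityToken
$session = Use-STSRole -RoleArn 'arn:aws:iam::123456789012:role/SSM_Management_Role_WF2' `
                       -RoleSessionName 'ssm-ops'
# $session.Credentials contains the temporary AccessKeyId, SecretAccessKey, and SessionToken
# that the caller then uses to act as the role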

I queried the CloudTrail logs for AssumeRole actions involving the abused SSM role:

SELECT eventTime, userIdentity.type, userIdentity.arn, eventName, 
map_values(requestParameters)[1] as assumedRole
FROM $EDS_ID
WHERE eventName='AssumeRole'
AND map_values(requestParameters)[1] LIKE '%SSM_M%'
ORDER BY eventTime DESC

This query provided the following results:

Though not pictured, CloudTrail also provided the assumed-role session’s access key and session token in these logs.

User jeremiah.fort had assumed the SSM management role twice around the time of the malicious Run Command actions. I expanded the query to view all activity associated with the jeremiah.fort account.

SELECT eventTime, userIdentity.arn, userIdentity.accesskeyid, eventSource, 
eventName, userAgent, errorCode
FROM $EDS_ID
WHERE userIdentity.arn LIKE '%jeremiah%'
ORDER BY eventTime DESC

The following image includes a portion of the results:

Anomalous activity regarding jeremiah.fort.

The logs demonstrated that an hour before the ransomware attack, the jeremiah.fort account ran a large number of service enumeration commands and enumerated its own privileges. The account ran GetCallerIdentity to check its identity, ran dozens of DescribeSnapshots commands (potentially to ensure there were no EC2/EBS snapshots present that would foil a ransomware attack), and enumerated both Lambda and IAM. The string of AccessDenied error codes indicates that the operator of the account did not know what privileges the account possessed — they likely ran an enumeration script to brute-force check which permissions the account held. The following image depicts even more AccessDenied actions from jeremiah.fort:

A string of AccessDenied actions typically associated with a privilege enumeration script.

At this point in the CloudTrail investigation, it was clear that the jeremiah.fort account was abused by a threat actor. I decided to expand the query to other AccessDenied events not involving jeremiah.fort:

SELECT eventTime, userIdentity.arn, userIdentity.accesskeyid, eventSource, 
eventName, userAgent, errorCode
FROM $EDS_ID
WHERE errorCode='AccessDenied' AND userIdentity.arn NOT LIKE '%jeremiah%'
ORDER BY eventTime DESC

A portion of the results:

Multiple AccessDenied events associated with another account.

Account aaron.broadmoor conducted a long string of failed IAM, Lambda, and EC2 enumeration commands similar to those of jeremiah.fort. I then looked at all of the aaron.broadmoor account’s actions on the morning of July 29th:

Malicious commands associated with aaron.broadmoor.

The aaron.broadmoor account, used for identity management in AWS, had been provisioned with broad permissions to run IAM commands. This enabled the threat actor to do the following:

  • List all user accounts within the environment.
  • List any policies attached to both the aaron.broadmoor and jeremiah.fort accounts.
  • List the AWS-managed and customer-managed roles active in the environment — most importantly the SSM_Management_Role_WF2 role.
  • List the policies attached to the SSM_Management_Role_WF2 role — this would have enabled the adversary to know exactly what this role was capable of doing (i.e. Run Command actions).

From the attacker’s perspective, they would have seen the following information about the SSM management role:

A portion of the adversary’s perspective from these enumeration commands.
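A minimal sketch of what that enumeration might look like from the adversary’s side using AWS Tools for PowerShell — the actor’s exact tooling is unknown, and these cmdlets simply mirror the kinds of IAM List/Get calls seen in CloudTrail:

Import-Module AWS.Tools.IdentityManagement
Get-IAMUserList | Select-Object UserName, Arn                       # ListUsers
Get-IAMAttachedUserPolicyList -UserName 'jeremiah.fort'             # ListAttachedUserPolicies
Get-IAMRoleList | Select-Object RoleName, Arn                       # ListRoles
Get-IAMAttachedRolePolicyList -RoleName 'SSM_Management_Role_WF2'   # ListAttachedRolePolicies
# The role's trust policy reveals which principals are allowed to assume it
Get-IAMRole -RoleName 'SSM_Management_Role_WF2' |
    Select-Object -ExpandProperty AssumeRolePolicyDocument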

Because of this enumeration, the adversary knew the policies associated with the SSM_Management_Role_WF2 role, and they knew that the jeremiah.fort account could assume it. All they had to do was find a way into the jeremiah.fort account. The overly broad IAM permissions granted to the aaron.broadmoor account enabled this privilege escalation:

Two CreateAccessKey events spawned by aaron.broadmoor.

The aaron.broadmoor account executed two CreateAccessKey actions — one against aaron.broadmoor itself and one against jeremiah.fort. These events created long-term security credentials usable with the AWS CLI. With these actions, the adversary established persistence across two accounts and escalated privileges to the jeremiah.fort account. The newly spawned access key IDs are logged in AWS (censored for security):

Parameters and response for CreateAccessKey events.
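The abuse itself needs nothing more than iam:CreateAccessKey over another user — a minimal sketch with AWS Tools for PowerShell, illustrating the technique rather than the actor’s exact tooling:

Import-Module AWS.Tools.IdentityManagement
# Mint fresh long-term credentials for the already-compromised user (persistence)
New-IAMAccessKey -UserName 'aaron.broadmoor'
# Mint credentials for a more privileged user (privilege escalation)
New-IAMAccessKey -UserName 'jeremiah.fort'
# Each call returns an AccessKey object containing the new AccessKeyId and SecretAccessKey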

With this analysis complete, an EEE-internal investigation determined that the aaron.broadmoor account’s initial set of access keys were most likely leaked through poor security hygiene by the account’s operator. This established the adversary’s initial access vector.

Executive Summary

In the late night and early morning of July 29th, an unknown threat actor leveraged leaked, legitimate AWS access keys to gain access to the AWS control plane via the AWS CLI. The adversary enumerated permissions, established persistence by spawning additional access keys, and successfully escalated privileges to the jeremiah.fort account. With this account’s permissions, the adversary assumed a legitimate AWS Systems Manager role with the capability to execute arbitrary commands across the fleet of EC2 instances in the EEE tenant. The threat actor selected an EC2 Windows server without an AWS snapshot backup and deployed ransomware across the system, encrypting 205 key business files. A ransom note demanding approximately $10,000 in Bitcoin was distributed across the system, and the adversary ceased operations. At the time of this writing, actions have been taken to remove the adversary’s access to the AWS environment, and the ransomed files that were backed up out-of-band have been restored.

How did the threat actor gain access to the data on the EC2 servers?

The threat actor abused the AWS Systems Manager Run Command functionality to run arbitrary commands on the affected EC2 server.

What other AWS resources were impacted by this compromise?

The only lost data was that encrypted by the ransomware. The adversary temporarily held persistent credentials in the environment and abused existing IAM permissions to escalate privileges, but no other resources were impacted.

Does the threat actor still have access to the environment?

Prior to the DFIR investigation, the adversary had access to the environment. At the time of this writing, all AWS long-term security credentials have been rotated, removing the adversary’s only source of persistent access.
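For responders in a similar position, disabling and then deleting every long-term key on the affected users is the core containment step — a minimal sketch with AWS Tools for PowerShell, using one of the scenario’s account names:

Import-Module AWS.Tools.IdentityManagement
# Enumerate and immediately disable all long-term keys for an affected user
Get-IAMAccessKey -UserName 'jeremiah.fort' |
    ForEach-Object { Update-IAMAccessKey -UserName $_.UserName -AccessKeyId $_.AccessKeyId -Status Inactive }
# Once replacement credentials are issued and validated, delete the old keys outright:
# Remove-IAMAccessKey -UserName 'jeremiah.fort' -AccessKeyId 'AKIAXXXXXXXXXXXXXXXX'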

Parting Thoughts

This attack, though simulated, was based on real cloud-attack techniques and involved real data and detection methods. In reality, I was both the attacker and the defender, simulating both offensive and defensive cloud and enterprise cyber tactics. This article demonstrated a threat actor’s ability to escalate privileges via IAM and pivot from the control plane to system-level access in the data plane via Systems Manager. With that level of privilege, an adversary could execute any number of attacks against the data plane. If you have comments, thoughts, or questions on this project or DFIR in the cloud — please reach out, I’d love to start a discussion.

(This is the third article in a series regarding cloud forensics and threat detection. If you’d like to read about another scenario, take a look at this piece):
