So You Inherited an AWS Account

Matt Fuller
Apr 28, 2020 · 16 min read

Many engineers have found themselves in the unenviable position of being handed the keys to an AWS environment with absolutely no explanation of its contents, documentation, or training. Whether an employee leaves the company, teams are restructured, or your company acquires another, you will need to quickly audit the account and get up to speed on its operation. Even worse, many of these inherited accounts are running production infrastructure that must be kept running during the transition period. Now that you’re responsible for this account, you will also be responsible for keeping it secure.

There is a wealth of documentation, training, guides, and other resources available online to learn about security in AWS cloud environments. But many of those resources assume that you are either building an account from scratch, were intimately involved in building the account from its inception, or can take great liberty in applying destructive changes. In our case, the reality is that you’re likely staring at eight years of accumulated infrastructure with absolutely no idea of what’s running or how to make changes without causing a production outage.

I’ve written this guide to help you filter through the mess, isolate the changes you need to make, and start to tame your environment. While I’ll assume that you have AWS experience, we’ll start with the security basics, along with changes that won’t impact running services, before moving to making tweaks that will require a bit more investigation and preparation. Our goal is to quickly triage the situation, implement the lowest risk but most impactful changes first, and then work our way toward a concrete security policy that can be used longer-term.

Note: The absolute best-case scenario when inheriting an account is to spin up a separate new account and migrate applications over time. However, I recognize that this is a pipe dream for many accounts, hence this guide.

This guide is not a substitute for a properly-designed security program. Instead, it is designed to be a quick-start guide for the first 30–90 days after assuming ownership over an account that may not have previously been properly managed.

Step 1: Get Stable Access

If you’re lucky, the target account is already configured to work with your organization’s Single Sign On (SSO) provider. In actuality, you’re more likely to have a sticky note with an email address and a password on it. Our first step is to confirm access to the account and embed our own user to avoid losing access. This step is especially crucial if you’re taking over an account because a previous employee left the company.

If you were given a user account and password to sign in with, it’s possible this is the root user account. This is not a good practice, but first we need to stabilize our access by running through the following steps:

  1. Log in with the email and password to determine if you’re using the root account.

Step 2: Stop Using the Root User

Your goal from this point forward is to stop using the root user entirely. To do this safely, we will need to make sure that nothing else is using the root user programmatically and then create an MFA token for the account which you will lock in a safe somewhere.

[Figure: If you’re lucky, you won’t find any access keys here.]
  1. Log in as the root user (hopefully for the last time).
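If you would rather not rely on eyeballing the console, the IAM GetAccountSummary API reports whether the root user has access keys and an MFA device. The following is a minimal sketch, assuming you have already fetched the "SummaryMap" dictionary (for example with boto3's `iam.get_account_summary()["SummaryMap"]`). The function name `root_findings` is my own, not part of any AWS SDK.

```python
def root_findings(summary_map):
    """Return warnings about the root user from an IAM account summary.

    `summary_map` is the "SummaryMap" dict returned by GetAccountSummary.
    Both keys used here are 1 when present/enabled and 0 otherwise.
    """
    findings = []
    if summary_map.get("AccountAccessKeysPresent", 0) > 0:
        findings.append("root user has access keys")
    if summary_map.get("AccountMFAEnabled", 0) == 0:
        findings.append("root user has no MFA device")
    return findings


# Hypothetical summary for an account in rough shape.
print(root_findings({"AccountAccessKeysPresent": 1, "AccountMFAEnabled": 0}))
```

An empty list means the root user has no access keys and MFA is enabled, which is the state you are working toward in this step.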

Step 3: Update Billing Information

While finance may be happy with someone else paying for your AWS usage for as long as it takes them to discover the charge, you want to get this information changed quickly. This info will be used by AWS to help identify you if you need to recover the account, and you don’t want to get into a digital stalemate if the previous owner tries to claim ownership because their credit card is still footing the bill.

You’re probably going to need to involve finance for this one, so get a fruit basket queued up so they prioritize your ticket and don’t faint when you explain the incoming $142k/month charge they’re about to see.

Once you get the correct billing info, make sure to add it and then remove all other payment methods including bank accounts and credit cards.

If the account is a member of an existing AWS Organization (and you can confirm it’s not one owned by your company), leave the Organization. If your company uses Organizations, be very careful about joining it at this stage; it’s possible that existing Service Control Policies may affect running services or workflows. If billing must be handled through the Organization, you’ll need to discuss adding the account with the Organization admin for this use case to take advantage of “billing only” features.

Once your billing information is changed, it’s time to log out of the root account and switch to using the IAM user created earlier.

Step 4: Enable CloudTrail Logging and Monitoring

Keep in mind that at this point, you still have no idea who or what has access to the account, what is running, and what kinds of activity are occurring in it. Let’s fix this by turning on AWS CloudTrail.

[Figure: No trails is no good.]
  1. Open the CloudTrail console and determine if an existing trail is configured. If it is, you’ll want to verify that the logs are being sent to a location you have access to. If you don’t recognize the location, modify the trail to send its logs to your organization’s centralized S3 bucket used for log collection. If your organization doesn’t have such a bucket, configure CloudTrail to log to a bucket in your own account for now.
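The same check can be scripted. Here is a rough sketch, assuming you pass in the "trailList" returned by CloudTrail's DescribeTrails API (for example boto3's `cloudtrail.describe_trails()["trailList"]`) along with the set of bucket names you actually control. The function name and the bucket names are hypothetical.

```python
def review_trails(trail_list, trusted_buckets):
    """Flag a missing trail, or trails that log to buckets you don't recognize.

    `trail_list` is the "trailList" from DescribeTrails; each entry carries
    a "Name" and the "S3BucketName" it delivers logs to.
    """
    if not trail_list:
        return ["no trails configured"]
    findings = []
    for trail in trail_list:
        name = trail.get("Name", "<unnamed>")
        bucket = trail.get("S3BucketName")
        if bucket not in trusted_buckets:
            findings.append(f"{name}: logs go to unrecognized bucket {bucket}")
    return findings


# Hypothetical trail left behind by the previous owner.
trails = [{"Name": "main", "S3BucketName": "old-owner-logs"}]
print(review_trails(trails, trusted_buckets={"corp-central-logs"}))
```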

There are many other AWS security solutions that may be helpful at this point, including:

One challenge you may have at this stage is separating true security incidents from the noise. These services tend to begin producing thousands of results in a busy environment, which could lead you on an endless goose chase. I recommend enabling them in “audit mode” where possible, and returning later once the account is more carefully pruned.

Step 5: Clean Up IAM Entities

I once did some consulting work for a company that had close to 1,200 IAM users in their account, each with access keys. I nearly bit off my tongue during that walkthrough. If you’re in this situation, it’s easy to put these steps off until later. But it’s truly important to get a handle on IAM. A single user or access key with excessive permissions could compromise the entire environment. Our goal in this step is to clean up users that have not been used in a while, delete access keys where possible, and begin to at least scope the policies attached to each user.

Initial Cleanup

[Figure: The IAM Credential Report will help you avoid carpal tunnel from clicking into every user.]
  1. Download the IAM Credential Report for your account, which will contain a number of very important details for each IAM user.
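The credential report is a CSV file, so it is easy to triage with a short script. Below is a minimal sketch that lists users with no password or access key activity inside a cutoff window. It relies on the report's `user`, `password_last_used`, `access_key_1_last_used_date`, and `access_key_2_last_used_date` columns; the 90-day default and the function names are my own assumptions, so adjust them to your organization's policy.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Columns in the IAM credential report that record recent activity.
LAST_USED_COLUMNS = (
    "password_last_used",
    "access_key_1_last_used_date",
    "access_key_2_last_used_date",
)


def parse_timestamp(value):
    """Parse a report timestamp; return None for placeholders such as N/A."""
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def stale_users(report_csv, now, max_idle_days=90):
    """Return users whose newest recorded activity is older than the cutoff."""
    cutoff = now - timedelta(days=max_idle_days)
    stale = []
    for row in csv.DictReader(io.StringIO(report_csv)):
        used = [parse_timestamp(row.get(col) or "") for col in LAST_USED_COLUMNS]
        used = [ts for ts in used if ts is not None]
        # No usable timestamp at all means the credentials were never used.
        if not used or max(used) < cutoff:
            stale.append(row["user"])
    return stale
```

Treat the output as a list of candidates to investigate rather than a delete list; a user that looks idle may still own a script that runs once a year.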

Sleuthing for Users

At this point, hopefully you’ve cleaned out a significant portion of users who had access to the account. To handle the remaining ones, it’s time to do some sleuthing.

[Figure: Disabling user console access might force them to find you.]
  1. Start with users who have both passwords and access keys. If you recognize them, send an email asking them what the keys are being used for and whether they can be disabled. Chances are they left a script running somewhere.

We’ll now be left with a more manageable set of users who have either password or access key access to AWS (but ideally not both at the same time). From this list, I recommend placing them into three categories:

  1. Humans who need console access for legitimate business purposes.

Preparing Account Policies

It won’t do much good to have users in Group 1 reset their passwords if they’re allowed to change the password to something simple. Be sure to first check the IAM Password Policy for the account and check all the applicable boxes per your organization’s password policy.

[Figure: You can tell how many days an employee has worked at a company by dividing the reset period by the last digit of their password.]
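To compare the account's current password policy against your organization's requirements, you can diff the two as dictionaries. This is a sketch, assuming `policy` is the "PasswordPolicy" dict returned by IAM's GetAccountPasswordPolicy API. The baseline values below are purely illustrative placeholders, not a recommendation, and `policy_gaps` is a hypothetical helper name.

```python
# Illustrative baseline only; substitute your organization's real policy.
BASELINE = {
    "MinimumPasswordLength": 14,
    "RequireSymbols": True,
    "RequireNumbers": True,
    "MaxPasswordAge": 90,
    "PasswordReusePrevention": 5,
}


def policy_gaps(policy, baseline):
    """Return the baseline settings that the account policy fails to meet."""
    gaps = []
    for key, wanted in baseline.items():
        actual = policy.get(key)
        if isinstance(wanted, bool):
            ok = actual is True
        elif key == "MaxPasswordAge":
            # A missing or zero value means passwords never expire.
            ok = actual is not None and 0 < actual <= wanted
        else:
            ok = actual is not None and actual >= wanted
        if not ok:
            gaps.append(key)
    return gaps


# Hypothetical policy found in an inherited account.
current = {"MinimumPasswordLength": 8, "RequireSymbols": True,
           "RequireNumbers": True, "MaxPasswordAge": 365}
print(policy_gaps(current, BASELINE))
```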

Contacting Users

For our Group 1 users, work with them to ensure:

  • Passwords that have not been reset within the expiration period are reset.

Tracking Down Access Keys

For Group 2 users, the hard part will be tracking down where the scripts are running. Fortunately, CloudTrail contains a wealth of information, including origin IP address, user agent headers, and other details that can be used to locate the user. When all else fails, you can always try doing a search of your organization’s GitHub installation in ̶h̶o̶p̶e̶s̶ fear the key has been committed there.
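As a starting point for that detective work, here is a small sketch that groups CloudTrail records by access key and collects the source IPs and user agents seen for each one. It assumes you have already loaded the "Records" list from one or more CloudTrail log files; the field names (`userIdentity.accessKeyId`, `sourceIPAddress`, `userAgent`) come from the CloudTrail record format, while the function name is my own.

```python
from collections import defaultdict


def summarize_key_usage(records):
    """Group CloudTrail records by access key ID.

    Returns {access_key_id: {"ips": set, "agents": set, "calls": int}}.
    Records without an access key (for example, service events) are skipped.
    """
    usage = defaultdict(lambda: {"ips": set(), "agents": set(), "calls": 0})
    for record in records:
        key_id = record.get("userIdentity", {}).get("accessKeyId")
        if not key_id:
            continue
        entry = usage[key_id]
        entry["ips"].add(record.get("sourceIPAddress"))
        entry["agents"].add(record.get("userAgent"))
        entry["calls"] += 1
    return dict(usage)
```

A key that only ever calls from one internal IP with a `Boto3/...` user agent is probably a cron job on a single server, which narrows the search considerably.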

For Group 3 users, the goal is to transition them to using IAM roles, deprecate the access keys, and delete the users. This may be easier said than done, especially if these are legacy applications with no automated deployment process.

When all else fails, if the keys cannot be deleted, the next best option is to scope their policies to just the services they need access to. Again, this isn’t an easy task, but there are tools that can help:

  1. Use the “Access Advisor” tool in IAM to see if the policies being granted to the user are actually being used.
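The Access Advisor data is also available through the IAM API as "service last accessed" details. The sketch below assumes you have already retrieved the "ServicesLastAccessed" list (for example via boto3's `generate_service_last_accessed_details` followed by `get_service_last_accessed_details`). Entries without a `LastAuthenticated` timestamp were not accessed within the tracking period, which makes them candidates for removal from the user's policies. The helper name is hypothetical.

```python
def unused_services(services_last_accessed):
    """Return service namespaces that are granted but have no recorded access."""
    return sorted(
        svc["ServiceNamespace"]
        for svc in services_last_accessed
        if "LastAuthenticated" not in svc
    )


# Hypothetical result for a user with a broad policy attached.
sample = [
    {"ServiceName": "Amazon S3", "ServiceNamespace": "s3",
     "LastAuthenticated": "2020-04-20T12:00:00Z"},
    {"ServiceName": "Amazon EC2", "ServiceNamespace": "ec2"},
    {"ServiceName": "AWS Lambda", "ServiceNamespace": "lambda"},
]
print(unused_services(sample))
```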

By now, you should be left with a more organized IAM environment, much more tightly-scoped IAM policies, and a properly configured account password policy so that humans can log in (with passwords and MFA) and machines can access the necessary APIs (with access keys).

Note: Many organizations use Single Sign On internally, which is a better method of configuring AWS access than password-based login for a variety of reasons, including user provisioning and deprecation. If SSO can be used, I recommend setting that up and transitioning your IAM users if possible.

Step 6: Locate Exposed Services

Aside from improperly-configured IAM users, your biggest security risk at this stage is likely to be services that are improperly configured to allow traffic from public endpoints. This includes:

  • S3 Buckets set to allow public access

There isn’t enough storage space on Medium to walk through the detailed steps of fixing all of these issues, but the goal at this point is to plug the most egregious gaps. There are a number of open source auditing tools that can be used to quickly discover at-risk resources, but your biggest objectives should be:

  • Closing ports and security group rules that are exposed publicly. You can use VPC Flow Logs (be careful, they can get expensive) to determine usage prior to closing ports.
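To find publicly exposed security group rules without clicking through every group, you can scan the output of EC2's DescribeSecurityGroups API. This is a minimal sketch, assuming `security_groups` is the "SecurityGroups" list from that response. It only looks at ingress rules open to `0.0.0.0/0` or `::/0`, and the function name is my own.

```python
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def public_ingress_rules(security_groups):
    """Return (group_id, protocol, from_port, to_port) for world-open rules.

    Ports are None when the rule covers all traffic (IpProtocol "-1").
    """
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            cidrs = [r.get("CidrIp") for r in perm.get("IpRanges", [])]
            cidrs += [r.get("CidrIpv6") for r in perm.get("Ipv6Ranges", [])]
            if OPEN_CIDRS.intersection(cidrs):
                findings.append((
                    group["GroupId"],
                    perm.get("IpProtocol"),
                    perm.get("FromPort"),
                    perm.get("ToPort"),
                ))
    return findings
```

Not every hit is a problem (a public load balancer on port 443 is expected), so use the results as a review queue and check usage before closing anything.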

Step 7: Lock Down Your Domains

Domains are the lifeblood of your organization’s applications and brand. If someone transfers that domain out of your Route53, a bad time is going to be had by everyone. In this step, your goals are to:

  1. Configure transfer locks on all of your supported domains.

Domain Settings

Enabling transfer locks will be an easy and non-destructive process. You can do this quickly via the Route53 console. The same is true for enabling auto-renewal.
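If you have more than a handful of domains, you can audit both settings from the Route 53 Domains ListDomains API before fixing them. The sketch below assumes `domains` is the "Domains" list from that response, where each entry includes `DomainName`, `TransferLock`, and `AutoRenew`. The helper name and the example domains are hypothetical.

```python
def unprotected_domains(domains):
    """Map each domain to the protections it is missing."""
    findings = {}
    for domain in domains:
        missing = []
        if not domain.get("TransferLock", False):
            missing.append("transfer lock")
        if not domain.get("AutoRenew", False):
            missing.append("auto-renew")
        if missing:
            findings[domain["DomainName"]] = missing
    return findings


# Hypothetical registrations found in the inherited account.
sample = [
    {"DomainName": "example.com", "TransferLock": True, "AutoRenew": True},
    {"DomainName": "example.net", "TransferLock": False, "AutoRenew": True},
    {"DomainName": "example.org", "TransferLock": False, "AutoRenew": False},
]
print(unprotected_domains(sample))
```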

Changing the technical and administrative contacts will be more time-consuming but is also a non-breaking change. Just be sure to use an email you have access to and that can receive email from outside sources so you can confirm the ownership.

If the domain is registered outside of Route53, you’ll need to track down the registrar and apply the changes there. If you’re up for a challenge, you can transfer the domains into Route53, but that is much more likely to lead to downtime if a mistake is made.

Domain Takeover via Unclaimed Resources

For domains that have records in Route53 pointing to S3 buckets, it is very important that you audit these records to ensure the bucket actually still exists. There is a very clever attack known as subdomain takeover, in which an attacker can take advantage of the global namespace in which S3 buckets operate to point your subdomain to a bucket they own.

You should take this opportunity to audit all domain records to ensure they are still in use and pointing to valid resources or endpoints.
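Here is a rough sketch of the S3 portion of that audit. It assumes `record_sets` is the "ResourceRecordSets" list from Route 53's ListResourceRecordSets API and `owned_buckets` is the set of bucket names in your account. It also assumes the bucket serving a custom domain is named after the record itself, which is how S3 routes requests for custom hostnames. The regular expression is my own approximation of S3 endpoint names and may miss unusual endpoint formats.

```python
import re

# Approximate match for S3 REST and website endpoints, such as
# "s3.amazonaws.com" or "s3-website-us-east-1.amazonaws.com".
S3_TARGET = re.compile(r"(^|\.)s3([.-][a-z0-9-]+)*\.amazonaws\.com\.?$")


def dangling_s3_records(record_sets, owned_buckets):
    """Return record names that point at S3 but have no matching owned bucket."""
    dangling = []
    for record in record_sets:
        targets = [rr["Value"] for rr in record.get("ResourceRecords", [])]
        alias = record.get("AliasTarget")
        if alias:
            targets.append(alias["DNSName"])
        if not any(S3_TARGET.search(target.lower()) for target in targets):
            continue
        # Assumption: the bucket is named after the record (minus trailing dot).
        expected_bucket = record["Name"].rstrip(".").lower()
        if expected_bucket not in owned_buckets:
            dangling.append(record["Name"])
    return dangling
```

Any name this returns is worth checking by hand: either the bucket lives in another account you control, or the record is dangling and should be removed before someone else claims the bucket name.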

Step 8: Find Expiring Certificates

AWS hides TLS certificates in two places:

  1. AWS Certificate Manager (ACM): a managed certificate service with its own dashboard in which certificates can be provisioned, renewed, and monitored.

Your challenge is to locate, rotate, and associate:

  1. Locate all certificates that are currently in use. I recommend using the APIs, including the list-server-certificates API call.
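Once you have the output of list-server-certificates, finding the certificates that are about to bite you is a simple date comparison. The sketch below assumes `metadata_list` is the "ServerCertificateMetadataList" from that call, with each `Expiration` already parsed into a timezone-aware datetime (as boto3 does). The 30-day window, the function name, and the sample certificate names are my own assumptions.

```python
from datetime import datetime, timedelta, timezone


def expiring_certificates(metadata_list, now, within_days=30):
    """Return names of certificates expiring within the window, soonest first.

    Certificates that have already expired are included as well.
    """
    deadline = now + timedelta(days=within_days)
    expiring = [cert for cert in metadata_list if cert["Expiration"] <= deadline]
    expiring.sort(key=lambda cert: cert["Expiration"])
    return [cert["ServerCertificateName"] for cert in expiring]


# Hypothetical certificates uploaded to IAM.
sample = [
    {"ServerCertificateName": "new-site",
     "Expiration": datetime(2021, 3, 1, tzinfo=timezone.utc)},
    {"ServerCertificateName": "legacy-elb",
     "Expiration": datetime(2020, 5, 10, tzinfo=timezone.utc)},
]
print(expiring_certificates(sample, now=datetime(2020, 4, 28, tzinfo=timezone.utc)))
```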

Step 9: Untangle The Web of Services

At this stage, we’ve avoided breaking things for as long as possible and done almost all we can without getting our hands too dirty. It’s time to start mapping existing running applications, shutting down unused services, and untangling the web of servers with names like “donotdeleteever.” Mistakes may be made.

There is really no ideal way to go about this process, but I generally like to do the following:

  1. Check every region for usage. Sometimes developers like to play cruel games of hide-and-seek by launching a c5d.24xlarge EC2 instance, which costs $4.608 per hour, in an otherwise unused region. If you discover resources like this, use CloudTrail, VPC Flow Logs, and CloudWatch metrics to determine whether they are in use. Once you’re confident, temporarily disable the resource by, for example, blocking network traffic to it. This gives you a good way to quickly restore access if you see a developer across the office immediately stand up and flip a desk.
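The region sweep is tedious by hand, so here is a sketch of how it could be scripted for EC2. To keep it testable without AWS credentials, the function takes a `client_factory` callable that returns an EC2 client for a given region; with boto3 installed that could be `lambda region: boto3.client("ec2", region_name=region)`. The function name and this structure are my own assumptions, and the sketch only covers EC2 instances, not every service.

```python
def instances_by_region(regions, client_factory):
    """Return {region: [(instance_id, instance_type), ...]} for non-empty regions.

    `client_factory(region)` must return an object with a
    `describe_instances(**kwargs)` method shaped like the EC2 API.
    """
    found = {}
    for region in regions:
        ec2 = client_factory(region)
        kwargs = {}
        instances = []
        while True:
            page = ec2.describe_instances(**kwargs)
            for reservation in page.get("Reservations", []):
                for inst in reservation.get("Instances", []):
                    instances.append((inst["InstanceId"], inst["InstanceType"]))
            token = page.get("NextToken")
            if not token:
                break
            # Follow pagination until the API stops returning a token.
            kwargs = {"NextToken": token}
        if instances:
            found[region] = instances
    return found
```

The list of regions can come from EC2's DescribeRegions API, so that newly launched regions are not silently skipped.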

Step 10: Monitor and Migrate

It’s important to recognize that you may never get this account into a “perfect” state. As I mentioned at the beginning of this article, there is no substitute for a brand new AWS account, provisioned from scratch to adhere to your organization’s security policies. Your goal should now be to migrate or deprecate services in this account as quickly as possible, with the eventual goal of full termination. This could be a multi-year effort.

For services that need to remain, monitoring will be key. If you can shift a majority of users and services to new accounts, this will reduce the attack surface and help protect your data. CloudTrail, with proper alerts, will help ensure that any unintended activity is quickly detected.

Being told you are now responsible for an account full of hundreds of legacy applications can be incredibly daunting. But hopefully, using the steps outlined here, you can begin to isolate and correct the worst security risks while containing and monitoring the rest. It’s not a substitute for an account that has been properly configured from the ground up, but what’s the alternative? Nuking the account and walking into the sunset?

If you liked this article, please subscribe to my mailing list for updates or follow me on Twitter.

The Startup

Get smarter at building your thing. Join The Startup’s +799K followers.

Matt Fuller

Written by

Founder of @CloudSploit , acquired by @AquaSecTeam . Former Infra / Security / Manager @Adobe , @Aviary & @Mozilla intern, @RITtigers grad, @NYC resident
