Every network has a big problem: how to balance security with the need for ease of use and management. Generally, the more secure a network is, the more difficult it is to manage and use. Conversely, the easier a solution is to manage and use, the less secure it tends to be. We faced this problem with AWS access management, and Mastermind was our attempt to find the sweet spot between security and ease of use. It turned out to be a surprisingly elegant solution that gives us the best of both worlds. We’ll show you how and why.
Simple Integration Example
In Mastermind, when a new application needs access to AWS resources, all that application needs to do is add an Access-Request file to its source, which defines the specific AWS resources the application needs. The file might look something like this:
The “common” section is applied to deployments in all environments, and the “prod” section is applied only in prod. This means that all environments have “s3:GetObject” access on “arn:aws:s3:::example-app-bucket/*”, but only prod has “s3:PutObject” in addition to “s3:GetObject”.
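To make the merge rule concrete, here is a minimal Ruby sketch. The YAML layout, the `resource` and `actions` keys, and the `effective_access` helper are all assumptions made for illustration; the real Access-Request schema may differ.

```ruby
require 'yaml'

# Hypothetical Access-Request file; the real schema may look different.
ACCESS_REQUEST = <<~YAML
  common:
    - resource: "arn:aws:s3:::example-app-bucket/*"
      actions: ["s3:GetObject"]
  prod:
    - resource: "arn:aws:s3:::example-app-bucket/*"
      actions: ["s3:PutObject"]
YAML

# Returns { resource ARN => sorted list of actions } for one environment,
# combining the "common" section with that environment's own section.
def effective_access(request, env)
  entries = Array(request['common']) + Array(request[env])
  merged = entries.each_with_object(Hash.new { |h, k| h[k] = [] }) do |entry, acc|
    acc[entry['resource']] |= entry['actions']
  end
  merged.transform_values(&:sort)
end

request = YAML.safe_load(ACCESS_REQUEST)
puts effective_access(request, 'prod').inspect
puts effective_access(request, 'staging').inspect
```

In this sketch, an environment without its own section (such as the hypothetical “staging”) simply receives the “common” permissions.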
That’s all that’s needed. And we’ll show you how we make it work. But first, here’s what our network access controls looked like before Mastermind.
The “Bad Old Days”
Over the years, we implemented a variety of ways to manage application access to AWS resources. None were “ideal.” Security was the primary need, but all the reasonably manageable solutions we came up with were lacking from a security standpoint, and the solutions that were acceptable from a security standpoint became unwieldy after adding some number of applications and environments.
To make matters worse, we ended up with multiple solutions with disparate implementations spread across the organization. We knew something had to change.
So we set out to design a single access system that could replace them all. To achieve what we wanted from a new system, we knew it would need to:
- Adhere to the principle of least privilege, meaning that access should be as closely tailored to the application’s requirements as possible (security).
- Eliminate the use of access keys, especially shared and long-lived keys (security and manageability).
- Reduce the amount of manual work needed to grant the necessary access (ease of use and manageability).
- Make software integration as seamless as possible (ease of use).
Our first attempt at a solution was to use static access keys. The problems with this approach were management and security. Keys needed to be stored and rotated, which was a management burden. Keys could also easily be reused by developers, which made it almost impossible to track what was using them.
We knew that by adding a separate key management solution and tighter restrictions on key usage, we could mitigate this problem. But that would increase our management overhead, as SREs would need to maintain both the management application and its contents. In other words, solving one problem created another.
We then decided to use roles. This created a problem similar to the one we had with keys: roles required significant effort to maintain. What roles did give us was knowledge of who was using which role, which prevented role reuse. In the end, though, SREs were still manually creating, assigning, and updating roles. That is, adding security made management more difficult.
It looked like we were stuck in a game of whack-a-mole. Hit the security mole, and ease of use suffered; whack down ease of use, and management became harder. Maybe this wasn’t solvable after all.
Then a simple thought changed everything. Who knows better what access an application needs than the developer of the application? Why not empower developers to decide what access they needed?
That is the moment we started to think about this problem from the right perspective. If we could simply let an app declare which resources it needed to access, the system could do the work of hooking up those resources with the appropriate roles.
This idea implied that applications would no longer need to know which roles to use. Developers would just need to write a simple file that declared the access to resources they needed. Mastermind could then work out what was necessary to connect the application to the requested resources.
And this led to more ideas: Could we automate the role of Mastermind and remove the need for human intervention entirely? Well, it turns out we could almost get there.
How It All Works
It is all well and good to wave our hands and say, “It just works.” But it’s not magic, so let’s explain it from the perspective of a new application needing resources.
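The first step for any new application to request access is to create an Access-Request file. As described earlier, this file lists the AWS resources and actions the application needs, with a “common” section and optional per-environment sections.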
The next step happens during the deployment of an application. In our case, we use Spacepods for deployment. Spacepods reads in the Access-Request file and possibly injects environment information. It then passes this data to Mastermind.
Once Mastermind has the contents of the Access-Request file, it does three things. First, it builds the necessary roles to access the requested resources. Then it generates an AWS CLI config file describing the roles and the resources they access. The generated AWS CLI config file looks something like this:
This AWS CLI config file has a “default” section that defines the role that has access to all local resources and an environment-specific section for any remote environments (resources in other AWS accounts) to which the application has requested access.
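As a rough illustration of that layout, the following Ruby sketch renders a config file with a “default” section and one remote profile. The `render_aws_config` helper, the account IDs, the role names, and the choice of `credential_source` and `source_profile` values are all assumptions for illustration, not Mastermind’s actual output. Only the section naming (`[default]`, `[profile NAME]`) and the setting names come from the standard AWS CLI config format.

```ruby
# Renders AWS CLI config text from a hash of profile name => settings.
# The "default" profile gets the header [default]; every other profile
# gets [profile NAME], as the AWS CLI config file format expects.
def render_aws_config(profiles)
  sections = profiles.map do |name, settings|
    header = name == 'default' ? '[default]' : "[profile #{name}]"
    lines = settings.map { |key, value| "#{key} = #{value}" }
    ([header] + lines).join("\n")
  end
  sections.join("\n\n") + "\n"
end

# Hypothetical account IDs and role names, for illustration only.
EXAMPLE_PROFILES = {
  'default' => {
    'role_arn' => 'arn:aws:iam::111111111111:role/example-app-local',
    'credential_source' => 'EcsContainer'
  },
  'prod' => {
    'role_arn' => 'arn:aws:iam::222222222222:role/example-app-prod',
    'source_profile' => 'default'
  }
}.freeze

puts render_aws_config(EXAMPLE_PROFILES)
```

Here the remote profile assumes its role using the credentials of the default profile; whether the generated file chains roles this way is a guess.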
At this point, Mastermind returns the AWS CLI config file to Spacepods, which adds it to the application’s container setup. When the application spins up, the file is already in place inside the container.
This means that when the application loads the AWS SDK (or any library that knows about this config file), all the permissions necessary to access the requested resources are immediately available. All this, without the application even referencing a role.
The following Ruby example sets up access to two different buckets.
If no profile is specified, the default is used, which means we don’t need to specify a profile at all if we are only accessing resources local to our container. In the case where we are requesting access to resources in different accounts, we would need to specify the profile (‘prod’).
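The profile rule described above can be sketched as a small helper. This is only an illustration: `client_options`, `LOCAL_ENV`, and the environment names are hypothetical. The commented-out lines show how the result would be passed to the AWS SDK for Ruby, which accepts a `:profile` option on client constructors and requires the `aws-sdk-s3` gem.

```ruby
# Environment this container runs in; a hypothetical value for illustration.
LOCAL_ENV = 'staging'

# Returns the options to pass to an AWS SDK client constructor.
# Local resources use the default profile, so no :profile key is needed;
# resources in another environment use that environment's named profile.
def client_options(target_env, local_env: LOCAL_ENV)
  target_env == local_env ? {} : { profile: target_env }
end

local_opts  = client_options('staging') # empty hash: default profile applies
remote_opts = client_options('prod')    # selects the "prod" profile

# With the aws-sdk-s3 gem installed, these would be used as:
#   require 'aws-sdk-s3'
#   local_s3  = Aws::S3::Client.new(**local_opts)
#   remote_s3 = Aws::S3::Client.new(**remote_opts)
puts local_opts.inspect
puts remote_opts.inspect
```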
The primary function of Mastermind is to take a list of AWS resources and corresponding actions and then to create roles that grant access to the requested resources. Here is a high-level overview of the key pieces:
As mentioned previously, the application defines its Access-Request file and is then deployed using our own tool, Spacepods.
Mastermind acts as a black box to create roles and link them to the application. It returns to Spacepods an AWS CLI config file that is inserted into the application’s container when it is spun up. That file is then used by the AWS SDK to give the application access to the needed AWS resources.
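What happens inside the black box is not shown here, but the core idea of turning requested resources and actions into a role’s permissions can be sketched. The `policy_document` helper and its input shape are assumptions; the output follows the standard IAM JSON policy structure (`Version`, `Statement`, `Effect`, `Action`, `Resource`).

```ruby
require 'json'

# Builds an IAM policy document that allows exactly the requested actions
# on each requested resource, with one statement per resource.
def policy_document(access)
  statements = access.map do |resource, actions|
    { 'Effect' => 'Allow', 'Action' => actions.sort, 'Resource' => resource }
  end
  { 'Version' => '2012-10-17', 'Statement' => statements }
end

# Hypothetical input: resource ARN => requested actions.
access = { 'arn:aws:s3:::example-app-bucket/*' => ['s3:PutObject', 's3:GetObject'] }
puts JSON.pretty_generate(policy_document(access))
```

Because the policy is generated from the request and nothing else, the role grants no more than the application asked for, which is the least-privilege property the design calls for.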
All this automation greatly reduces the need for managing roles and applications. The fact that the application defines which resources it wants simplifies management even further. The app manages its own needs, so by definition, it’s always correct.
However, not quite everything is handled automatically.
Every time an application decides it needs access to a new resource, a manual step is introduced. In this case, Mastermind will send an approval request to the appropriate people via Slack. Once the access has been approved, SRE is notified, and they push a button that allows Mastermind to do its magic. Whew, that’s really hard!
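One way to decide when that manual step is needed is to compare the new request against what was already approved. The sketch below is an assumption about how such a check could work, not a description of Mastermind’s internals; `newly_requested` is a hypothetical helper, and treating a new action on an already approved resource as needing approval is also a guess.

```ruby
# Returns the [resource, action] pairs present in the new request but not in
# the previously approved one. An empty result means nothing needs approval.
def newly_requested(approved, requested)
  pairs = lambda do |access|
    access.flat_map { |resource, actions| actions.map { |action| [resource, action] } }
  end
  pairs.call(requested) - pairs.call(approved)
end

bucket = 'arn:aws:s3:::example-app-bucket/*'
queue  = 'arn:aws:sqs:us-east-1:111111111111:example-queue' # hypothetical ARN

approved  = { bucket => ['s3:GetObject'] }
requested = { bucket => ['s3:GetObject', 's3:PutObject'], queue => ['sqs:SendMessage'] }

newly_requested(approved, requested).each do |resource, action|
  puts "needs approval: #{action} on #{resource}"
end
```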
Wait! There’s More: It’s Secure, Too!
Access is tailored to each application so that only the resources actually used are exposed.
Spacepods acts as a trusted intermediary between the application requesting resources and Mastermind. It is the only way the application’s container is built and deployed, so we can securely control the entire process of building roles and connecting AWS resources to an application.
Developers never directly work with or know about AWS roles. Technically, they can find out the roles used by their app, but these roles are useless outside the container for which they were created. This is because roles are directly tied to the container running the application.
It Really Is That Simple and Elegant
Application developers only need to create an Access-Request file to get access to AWS resources. No roles or keys are involved. Everything they need is automatically generated for them by Mastermind via Spacepods. Manual management is nearly eliminated (there is that button to push occasionally). And security is not compromised.
It’s like we managed to whack down all those moles. Mastermind truly is a simple, elegant, and secure solution for controlling access to AWS resources.
Thanks to Regis Wilson for reading drafts of this post.