It’s time to talk about AWS Access keys. In the Direct Shopper Technology team at The LEGO Group we are moving towards security best practices by giving our engineers one less thing to worry about and taking away their personal AWS Access keys (and making the process of acquiring temporary keys as easy as running a command).
Here are a few questions for you:
1. Do you have access to an Amazon Web Services (AWS) account?
2. Do you have any AWS access keys?
3. If you do, do you protect them as strictly as your other passwords?
4. Bonus point: Have you rotated your keys since you received them?
If you answered no to the last two questions then you may have figured out why taking away our developers’ keys to the (AWS) kingdom is a good thing.
Hint: AWS Access Keys protect the same level of access as your username and password
This article describes how we’ve replaced them with something more secure and in line with AWS best practices, with as little friction as possible.
LEGO.com is hosted entirely on AWS and a large proportion of our 30+ engineering team are working on building and maintaining our serverless services. This means they require various levels of access to our AWS accounts.
Firstly, why is an AWS account worth protecting?
AWS offers a vast collection of services and resources on demand that enables companies and developers to do many things from simply storing images or data to serving up massive multiplayer games to a global audience, or in our case, hosting a global e-commerce site and the backend processing behind it.
This means that these accounts hold many things of interest like personally identifiable information, credit card details, and especially in our case, classified images about unreleased products.
Not only is the collection of data a potential goldmine for anyone who shouldn’t have access but also the pay-for-use model means that when it’s your credit card on the line, you too would want to restrict who can potentially rack up your tab with bitcoin mining machines.
From a corporate perspective unauthorised access can lead to:
- Reputation damage
- Site downtime
So how can we protect our accounts?
Broadly speaking there are two methods of access control in AWS:
- Resource level access control (Policies on the resource determining who can access it)
- User level access control (Policies applied to user accounts/roles to restrict what they can do)
We will be focusing on the second point as it is the broadest and involves the most factors outside of our control as an AWS admin.
AWS User Management Basics
Let’s assume we want to securely store some images in an S3 bucket in a new AWS account.
In AWS you create a “root user” when you open a new account. This root user is the all-powerful super admin that can do absolutely everything and cannot be limited. As always, AWS has security in mind and there is a checklist in the Identity and Access Management (IAM) console which includes setting password policies and securing your root user account with Multi-Factor Authentication (MFA) so that only you (or someone with your password and token generator) can log in and gain these super powerful privileges.
It is best practice to use the root user account only when you absolutely have to and to use the IAM service to create other, lower privileged user accounts for your day to day usage. Let’s say you create an admin user for yourself and a user for Bob to also upload images to your account. Your account would now look something like this:
AWS’s IAM service breaks User Level Access Control down into Users, Policies, Groups and Roles.
A User is a resource that uniquely identifies a physical user. It holds a username and password for signing in to the AWS console and, optionally, generated Access Keys for calling the AWS APIs.
There are a whole load of security settings available around the User resource like password policies, multi-factor authentication and enabling or disabling Access Keys but the important thing to note here is there is no out-of-the-box way to expire Access Keys after a given date or amount of time.
This is for good reason as access keys are used for programmatic access to AWS resources so you’ll often set them in your default AWS credentials file and then forget about them. AWS Access Keys are just as important to keep safe as your password as they both protect the same level of access to your AWS account and the resources within.
Policies list the actions allowed or denied on specified AWS resources. While you can create your own tailored policies, there are AWS managed policies which cover common sets of permissions like “AdministratorAccess”, which provides all permissions on all resources, or “CloudWatchReadOnlyAccess”, which provides read-only permissions to AWS CloudWatch resources.
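To make this concrete, here is a minimal sketch of a tailored policy for the image bucket scenario above, written as a Python dictionary that serialises to the JSON document IAM expects. The bucket name and the choice of actions are assumptions for illustration only.

```python
import json


def build_upload_policy(bucket_name: str) -> dict:
    """Return an IAM policy document allowing uploads to and reads from one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only object-level upload and download, nothing else
                "Action": ["s3:PutObject", "s3:GetObject"],
                # Every object inside the (hypothetical) bucket
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "example-image-bucket" is a made-up name
print(json.dumps(build_upload_policy("example-image-bucket"), indent=2))
```

Attaching a document like this to a Group rather than to Bob directly keeps the permissions in one place when more uploaders join.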
Groups let you manage the Policies or permissions for a group of users as a whole entity rather than managing the permissions of all users independently. This becomes a significant time saver once you have more than one person (User) needing access to your AWS account.
Now, Roles are used slightly differently from the straightforward interactions of Groups, Users and Policies. Roles are essentially groups of permissions that can be assumed, either by other AWS services, or Federated Users (where the authentication is not managed by AWS). The key here is assuming a role generates temporary AWS Access Keys.
So now you have at least two users in your AWS account. A root user that you’ve used to create an admin user for your day to day AWS shenanigans and a dev user for Bob. You’ve created a group for admins and a group for developers with appropriate policies attached just in case you need to give more people access to your account. Groups are a great way to manage a small number of users.
That’s all you need for a personal AWS account but what happens when you start hosting projects and need to provide access to more developers, testers, or even end users? Do you trust them to protect their AWS credentials as much as you want to protect your data and resources?
Corporate AWS User Management
So let’s extend this to a corporate setting with a full team of 30 engineers who need differing levels of access across several AWS accounts.
The most common AWS set up I’ve encountered is the use of separate dev, test and production AWS accounts where developers can do pretty much anything in dev and test but have read-only or no access at all to production.
So how do you set up the required access for these 30 engineers in 3 AWS accounts with 3 tiers of access? Based on my brief introduction at the start of the article, there are two broad ways to go about this.
Using Users, Groups and Policies: Create a new AWS User for each of the 30 developers in each of the AWS accounts that they need access to and add them to groups with appropriate policies attached.
This approach is time consuming to set up and will have a high management overhead: user on-boarding/off-boarding will all need to involve an AWS account admin. It also means your users need to remember three different passwords just for these AWS accounts and that doesn’t lend itself to good password practices.
Using Roles: In a corporate scenario, we all have a corporate login that we use for email and intranet access. We can use this same login for AWS by setting up identity federation (explained in the next section of this article) which enables us to inherit different AWS Roles based on the Active Directory (AD) groups we are in at the corporate level.
This second option has a lot of benefits in the world of user management. The first being when a user leaves the company, there is normally a policy to deactivate their user in the corporate login service and this will automatically disable their user from being able to access AWS as they can no longer authenticate and assume the role with the permissions.
The second benefit is no double creation of users and no extra passwords for users to remember. The third and most important benefit is that when a user assumes a role, they can be issued temporary access keys. Remember what I said before about no out-of-the-box solution for expiring Access keys for AWS Users?
Federated User Access
The way we connect our corporate AD server and AWS is called Identity Federation or Federated User Access.
Federated User Access, also known as single sign-on, is essentially a way to outsource user authentication, for example “Sign in with GitHub | Facebook | Twitter | …”.
Some key terms to know:
- User / Client: The user wanting to log in and use the service
- Service Provider (SP): The service the User wants to access
- Identity Provider (IdP): The service that provides user authentication and already has data about the user in order to authenticate them
Take a look at the image below for an example of Federated User Access in action: a User wanting to access Medium.
The User wants to access Medium to write a blog. The Service Provider (SP) in this case is Medium and it provides blog creation and hosting but it doesn’t want to manage user login/authentication. The Identity Provider (IdP) in this case is Google, Facebook or Twitter and they specialise in user authentication and data and so they let other services send them requests to authenticate visitors to their sites.
How The LEGO Group does it
Let’s walk through the process of setting up and using Federated User Access to access our AWS accounts.
The first step is a greatly simplified setup and then there are 4 high level steps for the actual process of logging in:
- Establish a trust relationship
- Initiate a login attempt
- Successful authentication
- Send it all to the SP
- You’re in!
1. Establish a trust relationship
The first step is to establish a trust relationship between the IdP and SP, in this case our LEGO AD servers and our AWS accounts.
This is done by the teams looking after the IdP and SP exchanging some metadata files.
Typically the SP Metadata contains a map of user data fields that it will require when the user authenticates successfully (like username, email, AD Groups etc.). It will also contain a certificate for the IdP to verify the request and a URL so the IdP knows how to send the user back to the SP.
The IdP Metadata typically contains the URL where the SP needs to redirect the user to initiate the login process, and a certificate so the SP can verify the response is coming from the IdP.
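On the AWS side, the uploaded IdP metadata becomes an IAM SAML provider, and each federated Role names that provider in its trust policy. The sketch below builds such a trust policy as a Python dictionary. The account ID and provider name are placeholders, and the audience condition shown is the standard AWS sign-in endpoint; treat it as an illustration rather than our exact configuration.

```python
import json


def build_saml_trust_policy(account_id: str, provider_name: str) -> dict:
    """Return a role trust policy that lets one SAML provider's users assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The IAM SAML provider created from the IdP metadata
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:saml-provider/{provider_name}"
                },
                "Action": "sts:AssumeRoleWithSAML",
                # Only accept assertions addressed to the AWS sign-in endpoint
                "Condition": {
                    "StringEquals": {"SAML:aud": "https://signin.aws.amazon.com/saml"}
                },
            }
        ],
    }


# Placeholder account ID and provider name
print(json.dumps(build_saml_trust_policy("123456789012", "ExampleIdP"), indent=2))
```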
We have created AD Groups in our LEGO AD server that correspond to each of our AWS Roles in each of our AWS Accounts. Our engineers are then added to these AD groups in order to grant them the ability to assume the corresponding AWS Role in the corresponding AWS Account. This list of AD Groups that our engineer is a member of is configured as one of the user data fields in our SP Metadata.
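As an illustration of how AD Groups can correspond to AWS Roles, the sketch below assumes a naming convention of AWS-<account id>-<role name> for the groups. That convention is hypothetical (our real group names are not shown in this article); the point is only that group membership can be translated mechanically into role ARNs.

```python
import re

# Hypothetical convention: AWS-<12 digit account id>-<role name>
GROUP_PATTERN = re.compile(r"^AWS-(?P<account>\d{12})-(?P<role>[\w+=,.@-]+)$")


def roles_from_groups(groups):
    """Translate AD group names into the role ARNs they grant, skipping unrelated groups."""
    arns = []
    for group in groups:
        match = GROUP_PATTERN.match(group)
        if match:
            arns.append(f"arn:aws:iam::{match['account']}:role/{match['role']}")
    return arns


# An engineer in two AWS groups and one unrelated group
print(roles_from_groups([
    "AWS-123456789012-Developer",
    "AWS-210987654321-ReadOnly",
    "All-Staff",
]))
```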
2. Initiate a login attempt
There are two flavours of Identity Federation: SP-initiated login or IdP-initiated login. In both flavours you are trying to access the SP but the difference is whether you first go to the SP (which then redirects you to the IdP for login) OR the IdP (via a URL that tells the IdP where you are intending to be redirected after you log in successfully).
With SP-initiated login you go to the SP first, like in the Medium scenario, where you’re prompted with the “Sign in with X” messages.
With IdP-initiated login you go to a special URL provided by the IdP and send it your login credentials. We use IdP-initiated login as it has fewer redirects and our devs just bookmark the URL to navigate to for logging in.
There are a couple of ways to send login details but the way we do it is using NTLM.
NTLM stands for Windows NT LAN Manager but it’s commonly known as Windows Challenge/Response protocol. It’s basically a step up from Basic Auth.
If you haven’t heard of Basic Auth this is the setting of an Authorization header with the value set to the word Basic followed by a Base64-encoded string of your username and password separated by a colon. It is a well known way to pass login details but the only “security” around it is that the password is obfuscated — anyone can Base64-decode it and get the credentials.
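A short sketch of why Basic Auth only obfuscates the password: the same few lines of standard-library Python that build the header can also reverse it. The credentials are made up.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Authorization header for Basic Auth."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header_value: str):
    """Recover (username, password) from a Basic Auth header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme != "Basic":
        raise ValueError("not a Basic Auth header")
    decoded = base64.b64decode(token).decode("utf-8")
    # The username cannot contain a colon, so split on the first one
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("bob", "correct horse battery staple")
print(header)
# Anyone who sees the header can reverse it without knowing any secret
print(decode_basic_auth(header))
```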
With NTLM the password is never transmitted. In fact the process is as follows:
- The client sends a request to the IdP to initiate login using NTLM
- The IdP responds with a challenge string for the client to encrypt
- The client does so using the password of the User and sends it back
- The IdP can then determine if the encryption was done using the correct password
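The exchange above can be sketched with a generic challenge/response. This is not the real NTLM algorithm, which has its own hashing scheme and message formats; it swaps in HMAC-SHA256 from the Python standard library purely to show why the password itself never crosses the wire. All function names are illustrative.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str) -> bytes:
    """Stand-in for the password-derived secret that client and server both hold."""
    return hashlib.sha256(password.encode("utf-8")).digest()


def issue_challenge() -> bytes:
    """Server side: a fresh random challenge for every login attempt."""
    return secrets.token_bytes(16)


def answer_challenge(challenge: bytes, password: str) -> bytes:
    """Client side: prove knowledge of the password by keying a MAC over the challenge."""
    return hmac.new(derive_key(password), challenge, hashlib.sha256).digest()


def verify_response(challenge: bytes, response: bytes, stored_key: bytes) -> bool:
    """Server side: recompute the expected answer and compare in constant time."""
    expected = hmac.new(stored_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


stored = derive_key("s3cret")  # what the server keeps for this user
challenge = issue_challenge()
print(verify_response(challenge, answer_challenge(challenge, "s3cret"), stored))
print(verify_response(challenge, answer_challenge(challenge, "wrong"), stored))
```

Because the challenge is random each time, a captured response cannot simply be replayed for a later login.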
3. Successful authentication
If the Client authenticates successfully with the IdP, the IdP responds with the details that the SP listed in its metadata when the trust relationship was established in Step 1.
These user details are sent back to the client using a protocol named SAML (Security Assertion Markup Language). This is a standard for communicating Authentication data between parties and includes all the data the SP needs including session details in addition to the requested user details. Alternative names for our IdP and SP are SAML Authoriser and SAML Consumer respectively.
4. Send it all to the SP
The SAML Response from the IdP contained a list of AD Groups that the User is a member of. We present this list to the User so they can select the role that they wish to assume. We then send this selection and the entire SAML Response to the SP for verification and initialisation of the user session.
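As a rough sketch of this step, the snippet below pulls the assumable roles out of a base64-encoded SAML Response. It assumes the IdP has been configured to emit the attribute AWS looks for (https://aws.amazon.com/SAML/Attributes/Role), where each value pairs a role ARN with the SAML provider ARN. The sample XML is heavily trimmed and unsigned, so it only illustrates the parsing.

```python
import base64
import xml.etree.ElementTree as ET

ASSERTION_NS = "urn:oasis:names:tc:SAML:2.0:assertion"
ROLE_ATTRIBUTE = "https://aws.amazon.com/SAML/Attributes/Role"


def extract_roles(saml_response_b64: str):
    """Return (role_arn, principal_arn) pairs found in a base64-encoded SAML Response."""
    root = ET.fromstring(base64.b64decode(saml_response_b64))
    roles = []
    for attribute in root.iter(f"{{{ASSERTION_NS}}}Attribute"):
        if attribute.get("Name") != ROLE_ATTRIBUTE:
            continue
        for value in attribute.findall(f"{{{ASSERTION_NS}}}AttributeValue"):
            parts = [part.strip() for part in (value.text or "").split(",")]
            # The two ARNs may come in either order, so identify them by shape
            role_arn = next((p for p in parts if ":role/" in p), None)
            principal_arn = next((p for p in parts if ":saml-provider/" in p), None)
            if role_arn and principal_arn:
                roles.append((role_arn, principal_arn))
    return roles


# Trimmed, unsigned sample with placeholder ARNs
sample_xml = (
    '<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol" '
    'xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"><saml:Assertion>'
    '<saml:AttributeStatement>'
    '<saml:Attribute Name="https://aws.amazon.com/SAML/Attributes/Role">'
    '<saml:AttributeValue>arn:aws:iam::123456789012:saml-provider/ExampleIdP,'
    'arn:aws:iam::123456789012:role/Developer</saml:AttributeValue>'
    '</saml:Attribute></saml:AttributeStatement></saml:Assertion></samlp:Response>'
)
encoded = base64.b64encode(sample_xml.encode("utf-8")).decode("ascii")
print(extract_roles(encoded))
```

The client never needs to validate the signature itself here; the whole response is forwarded untouched and the SP performs the verification.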
With AWS we send the request to the Security Token Service (STS) to assume the selected AWS Role.
The AWS Security Token Service (STS) is a web service that enables you to request temporary, limited-privilege credentials for AWS Identity and Access Management (IAM) users or for users that you authenticate (federated users).
There are three methods you can use to assume a role:
- AssumeRole: typically for cross-account access
- AssumeRoleWithSAML: for users that have authenticated with a SAML Response
- AssumeRoleWithWebIdentity: for users that have authenticated with an OpenID Connect compatible response, e.g. Cognito, Facebook, Google
Since we receive back a SAML response we call the AWS STS AssumeRoleWithSAML method to assume our selected AWS Role.
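To show what goes into that call, the sketch below builds the form body for an AssumeRoleWithSAML request using only the standard library. The parameter names are those of the STS query API; the ARNs and the assertion are placeholders, and nothing is actually sent. In practice an SDK wraps this for you (for example boto3's assume_role_with_saml), so treat this as an illustration of the inputs rather than how our tool is written.

```python
from urllib.parse import parse_qs, urlencode

STS_ENDPOINT = "https://sts.amazonaws.com/"  # global endpoint; regional ones also exist


def build_assume_role_with_saml_body(role_arn: str, principal_arn: str,
                                     saml_assertion_b64: str,
                                     duration_seconds: int = 3600) -> str:
    """Return the URL-encoded POST body for an STS AssumeRoleWithSAML request."""
    if duration_seconds < 900:
        raise ValueError("STS requires a session duration of at least 900 seconds")
    return urlencode({
        "Action": "AssumeRoleWithSAML",
        "Version": "2011-06-15",
        "RoleArn": role_arn,              # the role the user selected
        "PrincipalArn": principal_arn,    # the IAM SAML provider for the IdP
        "SAMLAssertion": saml_assertion_b64,  # the response exactly as the IdP sent it
        "DurationSeconds": str(duration_seconds),
    })


body = build_assume_role_with_saml_body(
    "arn:aws:iam::123456789012:role/Developer",
    "arn:aws:iam::123456789012:saml-provider/ExampleIdP",
    "PHNhbWw+Cg==",  # placeholder, not a real assertion
)
print(parse_qs(body)["Action"])
```

Note that this call carries no AWS Access Keys at all: the signed SAML assertion is the proof of identity.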
5. You’re in!
The SP creates a session for you in their service and, in our case, responds with temporary AWS Access Keys carrying the permissions of the Role you assumed. The session lasts for the duration requested in the AWS STS AssumeRoleWithSAML request, which cannot exceed the maximum session duration configured on the AWS Role.
From a user’s perspective
This process solves the problem of having permanent AWS Keys on your machine but it’s not going to be adopted if it’s hard to use — or much harder than a set and forget secret. Let’s see what the User actually experiences when going through the login process.
In order to access AWS in a browser, we just go to the special URL for IdP-initiated login — this means (2) our browser connects to the identity provider to be authenticated and (3) the successful response is then presented to the user for role selection before being (4) passed to the service provider (AWS) which verifies the response came from the trusted identity provider before (5) allowing you to proceed to the AWS console.
From the user’s perspective this means (2) click on the bookmarked URL and silently login using single-sign on (SSO), (3) presented with a list of roles and AWS accounts based on the user’s AD groups, (4) selection of the desired role, (5) access to the AWS console with that role’s permissions.
For command line access to AWS a similar process is followed but without the browser redirects that make the browser login process so smooth. So we created a CLI Tool!
Our CLI Tool
AWS already provides a Command Line Tool written in Python that handles the federated user login flow. Since all of our stack is written in Node.js we created an equivalent script written in Node.js and supplied it to our developers as an npm package from our GitHub Package Registry.
This allows our developers to stay in the context of their favourite terminal and automatically generate temporary AWS access keys that they can immediately use to call AWS APIs.
What it looks like
Our CLI tool is called “octan” after our main monorepo. The octan login command triggers the IdP-initiated login flow to request and retrieve temporary AWS Access Keys.
First the user is prompted for their username and password and then the first request is sent out to our corporate AD servers. If a successful response is received then we let the user know they successfully logged in.
The user is then presented with a list of environments (not shown in the screenshot as the list is replaced with the selection, “dev” in this case). These environments correlate to our different AWS accounts that we have mapped to useful names of the format <project>-<env>.
When an environment has been selected the user is presented with a list of roles that they are able to assume if there is more than one choice. If they only have one eligible role then that role is automatically selected and the request to AWS is sent off immediately.
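The selection behaviour described here can be sketched as a small function. This is an illustrative Python version rather than our Node.js code; the prompt function is injectable so the logic can be exercised without a terminal.

```python
def choose_role(roles, prompt=input):
    """Pick a role: automatically if there is one choice, otherwise by asking the user."""
    if not roles:
        raise ValueError("no eligible roles for this environment")
    if len(roles) == 1:
        # Nothing to choose between, so skip the prompt entirely
        return roles[0]
    for index, role in enumerate(roles, start=1):
        print(f"{index}. {role}")
    while True:
        answer = prompt("Select a role: ").strip()
        if answer.isdecimal() and 1 <= int(answer) <= len(roles):
            return roles[int(answer) - 1]
        print("Please enter a number from the list.")


# A single eligible role is selected without prompting
print(choose_role(["Developer"]))
```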
If a successful response is received then we tell the user they successfully logged in to AWS, which means they successfully assumed the selected role, and that the AWS Access Key and Secret have been saved to their “default” awscli profile.
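Saving the keys to a profile amounts to rewriting one section of the shared AWS credentials file. Here is a minimal Python sketch of that step (our tool does the equivalent in Node.js). The key names are the ones the AWS CLI and SDKs read; the credential values are placeholders and the demo writes to a throwaway file rather than the real ~/.aws/credentials.

```python
import configparser
import tempfile
from pathlib import Path


def save_credentials(credentials_path, profile, access_key_id,
                     secret_access_key, session_token):
    """Write temporary credentials into one profile of an AWS credentials file."""
    path = Path(credentials_path)
    # interpolation=None so any '%' in a value is stored literally
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)  # a missing file is skipped, leaving an empty config
    # Replace the whole profile so no stale keys are left behind
    config[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        "aws_session_token": session_token,  # required for temporary keys
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as handle:
        config.write(handle)
    path.chmod(0o600)  # owner-only access (limited effect on Windows)


with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "credentials"
    save_credentials(demo_path, "default", "ASIAEXAMPLEKEYID",
                     "example-secret", "example-token")
    print(demo_path.read_text())
```

Other profiles in the file are left as they were, which is what makes an alternate profile option safe to offer.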
It’s also worth mentioning we have a couple of optional parameters to adjust the default behaviour. The main ones are:
- session duration — how long your credentials last for before they expire
- profile — an alternate awscli profile to save the credentials to
How it works
The excerpt from our implementation below should make sense based on the process described above.
First we send a login request to our IdP-initiated login endpoint. Our ADFS server then responds back with the SAML assertion which includes a list of AWS accounts and roles that the user is allowed to assume based on the AD groups they are a member of.
We then list these roles for the user to select the desired role and finally send the AWS STS AssumeRoleWithSAML call to AWS. The response to this is a temporary AWS Access Key and Secret which we set for the user in their default awscli profile.
Summary
- AWS Access Keys are secrets.
- Creating AWS IAM Users for individual users is time consuming, results in another password the user needs to remember/store and has maintenance overhead (additional steps for on-boarding/off-boarding).
- AWS Access Keys issued to AWS IAM Users never expire until they are disabled/deleted.
- Assuming AWS Roles involves issuing temporary AWS Access Keys.
- Identity Federation is the existence of a trust relationship between two services, an Identity Provider (IdP) and a Service Provider (SP).
- Identity Federation between our corporate AD Groups and our AWS Roles means we can use our corporate login details to assume an AWS Role and receive temporary AWS Access Keys.
- This means our developers don’t have long-lived personal AWS Access Keys that, if leaked, could grant unwelcome access to one of our AWS accounts.
- We created a Node.js CLI tool to enable our developers to easily request these temporary AWS Access Keys.
I hope this post has presented the highlights of moving away from native AWS Users and into using AWS Roles and Federated User Access for a more secure and low maintenance setup for allowing access to your AWS accounts.
Nicole Yip is a Senior Infrastructure Engineer at The LEGO Group, working on the team that builds LEGO.com.