This post was originally written on June 4, 2018 and has been moved from our old blog site to Medium.com.
In this blog post I will share our experience rolling out kube2iam on our kops-launched Kubernetes cluster as well as some of the implementation details.
IAM Roles for Amazon EC2 are a great way to distribute credentials to your applications running on AWS without having to manage individual accounts. An issue we encountered when using Kubernetes on EC2 is that any pods running on your EC2 instances that need AWS service access will need to share the same IAM role attached to the instance with wider permissions than needed, thus violating the principle of least privilege.
Here is what this looks like:
Here at Bench, security is a first-class citizen, so we decided to explore options for managing permissions in a more granular way and found the excellent kube2iam. kube2iam is an open-source project created by Jerome Touffe-Blin that helps secure applications running on Kubernetes and AWS by limiting the permissions each Kubernetes pod can assume and by proxying traffic to the EC2 metadata API. After a bit of research and internal planning, we decided to give kube2iam a try on our Kubernetes cluster.
- Kubernetes v1.9.6
- kops v1.9.0
- An example app running on Kubernetes:
- Datadog DaemonSet with datadog-agent v6.1.4
Rolling out kube2iam
1. Setting up the IAM roles
First, we need to provision an IAM role for kube2iam to work. In order to better understand the steps for this let’s review a few concepts involving IAM roles:
- A Trust policy is a document in JSON format that defines who can assume the role, such as IAM users from another AWS account or another IAM role. These entities are known as principals in AWS parlance.
- A Permissions policy is a document in JSON format that defines what actions the role can perform and on which resources.
As we use kops, our master and worker nodes already have IAM roles attached to them, so we decided to use these as the kube2iam roles. These roles exist so the kops-provisioned Kubernetes cluster can function correctly and interact with AWS services such as S3 or ELB.
kube2iam IAM role
Here is what the Trust/Permissions policies of our nodes.kops.domain.com IAM role look like:
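Trust policy — for a kops node role, the trust policy allows the EC2 service itself to assume the role. A sketch (your kops-generated document may differ slightly):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
```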
Permissions policy (truncated):
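The relevant addition for kube2iam is an sts:AssumeRole statement scoped to our Kubernetes role naming convention. A sketch (the account ID is illustrative):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["sts:AssumeRole"],
      "Resource": ["arn:aws:iam::111111111111:role/*.services.kops.domain.com"]
    }
  ]
}
```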
Note that we’ve restricted which roles this role can assume by allowing AssumeRole only on our Kubernetes IAM roles, i.e. *.services.kops.domain.com. This ensures that only roles related to Kubernetes can be assumed by kube2iam or the Kubernetes nodes. It’s important to follow a consistent naming convention when creating IAM roles so the wildcard statement works correctly. These additional permissions are added in the additionalPolicies section of your kops cluster templates.
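In a kops cluster spec this looks roughly like the following sketch — the additionalPolicies key takes a JSON policy string per instance group role (account ID illustrative):

```yaml
# kops cluster spec excerpt (sketch)
spec:
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": ["sts:AssumeRole"],
          "Resource": ["arn:aws:iam::111111111111:role/*.services.kops.domain.com"]
        }
      ]
```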
Datadog IAM role
Let’s create a new IAM role for our Datadog DaemonSet. We use the Terraform aws_iam_role resource but this can also be done using the AWS console or your Infrastructure as code tool of choice.
We’ve allowed our kops master and node roles to assume the Datadog role by adding the role ARNs as Principals, following the standard format arn:aws:iam::$account:role/$role-name. This is so kube2iam can assume each individual role and then provide restricted credentials to your applications.
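A Terraform sketch of this role — the account ID and role names are illustrative, and the permissions policy for the Datadog agent itself is omitted:

```hcl
# Datadog IAM role, assumable only by our kops node/master roles (sketch)
resource "aws_iam_role" "datadog_agent" {
  name = "datadog-agent.services.kops.domain.com"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::111111111111:role/masters.kops.domain.com",
          "arn:aws:iam::111111111111:role/nodes.kops.domain.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}
```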
Something I use in order to check what resource-level permissions a specific AWS API call supports is the excellent cloudonaut.io IAM reference.
2. Deploying kube2iam
Now that the IAM roles have been created, we need to deploy kube2iam on our cluster as a DaemonSet. We used the base configuration and then made a few additions:
- image: jtblin/kube2iam:latest
- name: NODE_NAME
- name: HOST_IP
- containerPort: 8181
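Putting the items above together, our kube2iam DaemonSet looks roughly like the following sketch, based on the upstream example (the base-role-arn account ID is illustrative; adjust the args to your cluster):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: kube2iam
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      containers:
        - name: kube2iam
          image: jtblin/kube2iam:latest
          args:
            - "--base-role-arn=arn:aws:iam::111111111111:role/"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--node=$(NODE_NAME)"
            - "--app-port=8181"
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true
```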
A few other things you may want to take into account:
- If you plan to use kube2iam on your master nodes, keep in mind this open issue that affects kops master node cycling. In case this issue affects you, you can keep kube2iam off your masters by not tolerating the master node’s NoSchedule taint, since DaemonSet pods are only scheduled on masters if they tolerate it.
- If you use Calico for networking, you can pass the container argument --host-interface=cali+ so kube2iam proxies traffic on the right host interface.
- The privileged securityContext is needed for the --iptables=true feature, which restricts the EC2 metadata endpoint through iptables. You may want to comment out the iptables setup if you don't want to enforce the EC2 metadata endpoint restriction before the rest of the kube2iam setup is completed.
- We use RBAC (Role-Based Access Control) on our cluster, so we had to create a ServiceAccount. This example is a good starting point.
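A minimal RBAC setup for kube2iam, sketched from the upstream example — it only needs read access to pods and namespaces:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube2iam
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube2iam
rules:
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube2iam
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube2iam
subjects:
  - kind: ServiceAccount
    name: kube2iam
    namespace: kube-system
```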
3. Adding the IAM role annotation
Now we need to add an annotation to each of our applications that need AWS access. This is so kube2iam can assume the role and provide temporary credentials to each pod. Here is what it would look like for our Datadog DaemonSet definition:
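An excerpt — the iam.amazonaws.com/role annotation goes on the pod template metadata, with the role name from our example:

```yaml
# Datadog DaemonSet excerpt (sketch)
spec:
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: datadog-agent.services.kops.domain.com
```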
If your applications run as Deployments, just add the IAM role annotation to the pod template (spec.template.metadata.annotations) of your Deployment definition.
4. Testing the setup
A way to test whether kube2iam is working as expected is to
kubectl exec into one of your pods and run the following (this assumes curl is installed in the container!):
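For example, from inside the pod, list the roles exposed by the metadata endpoint (this only works on a node running kube2iam):

```shell
# Should print the role from your pod's annotation
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/
```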
The output should be your role annotation. To test that the pod can actually retrieve temp credentials you can run:
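Appending the role name from our example returns the temporary credentials as JSON (AccessKeyId, SecretAccessKey, Token, and an Expiration timestamp):

```shell
curl http://169.254.169.254/latest/meta-data/iam/security-credentials/datadog-agent.services.kops.domain.com
```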
You should also see the metadata endpoint requests in the kube2iam pod logs:
kubectl logs kube2iam-p5cnt -n kube-system
time="2018-05-31T19:26:47Z" level=info msg="GET /latest/meta-data/iam/security-credentials/arn:aws:iam::111111111111:role/datadog-agent.services.kops.domain.com (200) took 92033 ns" req.method=GET req.path="/latest/meta-data/iam/security-credentials/arn:aws:iam::111111111111:role/datadog-agent.services.kops.domain.com" req.remote=100.96.2.13 res.duration=92033 res.status=200
Something we also like to do when making changes to the cluster is to roll the cluster, i.e. terminate and recreate each node in the cluster using
kubectl drain on one node at a time. This gives us peace of mind: if a node is restarted for any reason, there won't be any surprises when it picks up the new configuration.
kube2iam is now deployed on your Kubernetes cluster and your applications can only access the AWS services they need: