kube2iam: Secure Kubernetes Services on AWS

Alberto Alvarez
Bench Engineering
Jun 19, 2019

This post was originally written on June 4, 2018 and has been moved from our old blog site to Medium.com.


In this blog post I will share our experience rolling out kube2iam on our kops-launched Kubernetes cluster as well as some of the implementation details.

IAM Roles for Amazon EC2 are a great way to distribute credentials to your applications running on AWS without having to manage individual accounts. An issue we encountered when using Kubernetes on EC2 is that every pod that needs AWS service access has to share the IAM role attached to its instance, which carries wider permissions than any single pod needs, violating the principle of least privilege.

In other words, every pod on a node ends up with the node’s full set of AWS permissions.

Here at Bench, security is a first-class citizen, so we decided to explore options for managing permissions in a more granular way and found the excellent kube2iam. kube2iam is an open source project created by Jerome Touffe-Blin that helps secure applications running on Kubernetes on AWS by limiting the permissions each pod can assume and by proxying traffic to the EC2 metadata API. After a bit of research and internal planning, we decided to give kube2iam a try on our Kubernetes cluster.

The cast

  • Kubernetes v1.9.6
  • kops v1.9.0
  • An example app running on Kubernetes: a Datadog DaemonSet with datadog-agent v6.1.4

Rolling out kube2iam

1. Setting up the IAM roles

First, we need to provision an IAM role for kube2iam to work with. To better understand the steps involved, let’s review a few concepts around IAM roles:

  • A Trust policy is a JSON document that defines who can assume the role, for example IAM users from another AWS account or another IAM role. These entities are known as principals in AWS parlance.
  • A Permissions policy is a JSON document that defines which actions the role can perform and on which resources.

As we use kops, our master and worker nodes already have IAM roles attached to them, so we decided to use these as the kube2iam roles. These roles exist so that the kops-provisioned Kubernetes cluster can function correctly and interact with AWS services such as S3 or ELB.

kube2iam IAM role

Here is how the Trust and Permissions policies of our nodes.kops.domain.com IAM role look:

Trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Permissions policy (truncated):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:GetRepositoryPolicy",
        "ecr:DescribeRepositories",
        "ecr:ListImages",
        "ecr:BatchGetImage"
      ],
      "Resource": [
        "*"
      ]
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": "sts:AssumeRole",
      "Resource": "arn:aws:iam::111111111111:role/*.services.kops.domain.com"
    }
  ]
}

Note that we’ve restricted which roles this role can assume by allowing sts:AssumeRole only on our Kubernetes IAM roles, i.e. *.services.kops.domain.com. This ensures that only roles related to Kubernetes can be assumed by kube2iam or the Kubernetes nodes. It’s important to follow a consistent naming convention when creating IAM roles so that the wildcard statement works correctly. These additional permissions are added in the additionalPolicies section of your kops cluster spec.
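For reference, here is a minimal sketch of what that could look like in the kops cluster spec (the account ID and role ARN are the placeholders used throughout this post):

spec:
  additionalPolicies:
    node: |
      [
        {
          "Effect": "Allow",
          "Action": "sts:AssumeRole",
          "Resource": "arn:aws:iam::111111111111:role/*.services.kops.domain.com"
        }
      ]

An equivalent entry under master covers the master role if you also run kube2iam there.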

Datadog IAM role

Let’s create a new IAM role for our Datadog DaemonSet. We use the Terraform aws_iam_role resource, but this can also be done using the AWS console or your infrastructure-as-code tool of choice.

Trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    },
    {
      "Sid": "",
      "Effect": "Allow",
      "Principal": {
        "AWS": [
          "arn:aws:iam::111111111111:role/nodes.kops.domain.com",
          "arn:aws:iam::111111111111:role/masters.kops.domain.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Permissions policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeTags"
      ],
      "Resource": [
        "*"
      ]
    }
  ]
}

We’ve allowed our kops master and node roles to assume the Datadog role by adding their role ARNs as Principals, following the standard format arn:aws:iam::$account:role/$role-name. This is what allows kube2iam to assume each individual role and provide restricted credentials to your applications.
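As a rough illustration, here is how this role could be expressed with the Terraform resources mentioned above. This is a sketch rather than our exact code: it assumes a Terraform version with jsonencode (0.12+), and the resource names are made up for the example.

# Sketch: Datadog IAM role via Terraform (names and ARNs are the
# placeholders used throughout this post)
resource "aws_iam_role" "datadog_agent" {
  name = "datadog-agent.services.kops.domain.com"

  # Trust policy: allow the kops node and master roles to assume this role
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        AWS = [
          "arn:aws:iam::111111111111:role/nodes.kops.domain.com",
          "arn:aws:iam::111111111111:role/masters.kops.domain.com"
        ]
      }
      Action = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "datadog_agent" {
  name = "datadog-agent-permissions"
  role = aws_iam_role.datadog_agent.id

  # Permissions policy: only what the Datadog agent actually needs
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["ec2:DescribeTags"]
      Resource = ["*"]
    }]
  })
}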

To check which resource-level permissions a specific AWS API call supports, I use the excellent cloudonaut.io IAM reference.

2. Deploying kube2iam

Now that the IAM roles have been created, we need to deploy kube2iam on our cluster as a DaemonSet. We used the base configuration and then made a few additions:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: kube2iam
  namespace: kube-system
  labels:
    app: kube2iam
spec:
  minReadySeconds: 0
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        name: kube2iam
    spec:
      hostNetwork: true
      serviceAccountName: kube2iam
      containers:
        - image: jtblin/kube2iam:latest
          name: kube2iam
          args:
            - "--node=$(NODE_NAME)"
            - "--host-interface=cali+"
            - "--iptables=true"
            - "--host-ip=$(HOST_IP)"
            - "--debug"
          env:
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
          ports:
            - containerPort: 8181
              hostPort: 8181
              name: http
          securityContext:
            privileged: true

A few other things you may want to take into account:

  • If you plan to run kube2iam on your master nodes, keep in mind this open issue that affects kops master node cycling. If the issue affects you, you can keep kube2iam off your masters by relying on the masters’ NoSchedule taint (i.e., not adding a toleration for it).
  • If you use Calico for networking, you can pass the container argument --host-interface=cali+.
  • The privileged securityContext is needed for the --iptables=true feature, which restricts access to the EC2 metadata endpoint through iptables. You may want to comment out the iptables setup if you don’t want to enforce the metadata endpoint restriction before the rest of the kube2iam setup is complete.
  • We use RBAC (Role-Based Access Control) on our cluster, so we had to create a ClusterRole, a ClusterRoleBinding, and a ServiceAccount. This example is a good starting point; a minimal sketch follows this list.
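Here is a minimal sketch of those RBAC objects, modelled on the linked example; kube2iam needs read access to pods and namespaces:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube2iam
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: kube2iam
rules:
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: kube2iam
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube2iam
subjects:
  - kind: ServiceAccount
    name: kube2iam
    namespace: kube-system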

3. Adding the IAM role annotation

Now we need to add an annotation to each of our applications that needs AWS access, so that kube2iam can assume the corresponding role and provide temporary credentials to each pod. Here is how it looks for our Datadog DaemonSet definition:

kind: DaemonSet
spec:
  selector:
    matchLabels:
      app: datadog-agent
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: "arn:aws:iam::111111111111:role/datadog-agent.services.kops.domain.com"

If your applications run as Deployments, add the IAM role annotation under the pod template metadata (spec.template.metadata.annotations) of your Deployment definition, as in the sketch below.
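For example, a minimal sketch (the app name and role ARN are placeholders):

kind: Deployment
metadata:
  name: example-app
spec:
  template:
    metadata:
      annotations:
        iam.amazonaws.com/role: "arn:aws:iam::111111111111:role/example-app.services.kops.domain.com"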

4. Testing the setup

A way to test whether kube2iam is working as expected is to kubectl exec into one of your pods and run the following (this assumes curl is installed in the container!):

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/

The output should be the role from your annotation. To test that the pod can actually retrieve temporary credentials, you can run:

curl http://169.254.169.254/latest/meta-data/iam/security-credentials/arn:aws:iam::111111111111:role/datadog-agent.services.kops.domain.com
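If everything is wired up correctly, kube2iam answers with a credentials document in the same shape the EC2 metadata API uses, roughly like this (illustrative, abbreviated values):

{
  "Code": "Success",
  "Type": "AWS-HMAC",
  "AccessKeyId": "ASIA...",
  "SecretAccessKey": "...",
  "Token": "...",
  "Expiration": "2018-05-31T20:26:47Z"
}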

You should also see the metadata endpoint requests in the kube2iam pod logs:

kubectl logs kube2iam-p5cnt -n kube-system
time="2018-05-31T19:26:47Z" level=info msg="GET /latest/meta-data/iam/security-credentials/arn:aws:iam::111111111111:role/datadog-agent.services.kops.domain.com (200) took 92033 ns" req.method=GET req.path="/latest/meta-data/iam/security-credentials/arn:aws:iam::111111111111:role/datadog-agent.services. kops.domain.com" req.remote=100.96.2.13 res.duration=92033 res.status=200

Something we also like to do when making changes to the cluster is to roll the cluster, i.e. terminate and recreate each node in the cluster, using kubectl drain on one node at a time. This gives us peace of mind, as it ensures that if a node is restarted for any reason, there won’t be any surprises when it picks up the new configuration.
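The per-node routine looks roughly like this (the node name is a placeholder):

# Evict everything except DaemonSet pods from the node
kubectl drain ip-10-20-30-40.ec2.internal --ignore-daemonsets --delete-local-data

# Terminate the instance and let the ASG bring up a replacement, then
# confirm the new node is Ready before draining the next one
kubectl get nodes

kops rolling-update cluster can automate the same process.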

Summary

kube2iam is now deployed on your Kubernetes cluster, and your applications can only access the AWS services they need.

If you are interested in learning more about Bench Accounting or a career with our Engineering team, then please visit us at https://bench.co/careers/
