A Secure Kubernetes Cluster from Day 1

John Solis
Published in RevOps
4 min read · Apr 16, 2018

Adoption of containers and containerization technologies has skyrocketed over the last few years as developers recognize their many benefits for production systems. On the orchestration side, the key players have long been Docker Swarm, Amazon Elastic Container Service (ECS), and Kubernetes. My own work over the last three years involved deploying and maintaining production systems on Amazon ECS, as well as the occasional Docker Swarm cluster for side projects. However, with both Docker and Amazon recently announcing direct support for Kubernetes, it became clear where the mind-share is and that it was time to dive into the technology.

I started out by reading various “Kubernetes for Docker Swarm Users” articles (this series is particularly good), but soon realized that I was doing myself a disservice by trying to bend the old Swarm abstractions to fit in the new Kubernetes world. To fully understand the technology, and properly maintain systems in the future, I would need to start from scratch. Luckily, Kubernetes has an excellent set of tutorials and Kubernetes Basics is the best starting point.

After making my way through the tutorials, it was time to stand up my own cluster. Given my security background, my goal was to stand up a production cluster that implemented security best practices from Day 1 without sacrificing agility or speed for our team.

Role-Based Access Control for Kubernetes

There are many tools available for provisioning and managing clusters, but we opted to go with kops since it is recommended by the Kubernetes team and also under active development.

Creating clusters on AWS with kops is incredibly simple — set a few environment variables and run kops create. Unfortunately, as this analysis shows, the default installation configuration sacrifices a lot of security for simplicity and ease of deployment; it is not a suitable configuration for production clusters. The first issue we address is the lack of RBAC authorization:

  1. Create a cluster configuration with kops (detailed documentation):
$ export NAME=mycluster.example.com
$ export KOPS_STATE_STORE=s3://my-kubernetes-state-store
$ kops create cluster --zones us-west-2a ${NAME}

Note: The older kubernetes documentation recommended against smaller t2 instances due to weird behavior and unexpected failures that can occur when available CPU credits run low. This advice still applies to the newer versions.

WARNING: beware that t2 instances receive a limited number of CPU credits per hour and might not be suitable for clusters where the CPU is used consistently.

2. Edit the cluster configuration

$ kops edit cluster ${NAME}

Update the configuration to enable RBAC authorization

...
spec:
  authorization:
    rbac: {}
...

3. Build and validate the new cluster

$ kops update cluster ${NAME} --yes
$ kops validate cluster

Note: Enabling RBAC on an existing cluster is possible but may cause issues with services if permissions are not configured correctly.
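With RBAC enabled, access must be granted explicitly. As a minimal sketch (the `deployment-manager` role, `deploy-bot` user, and `page-score` namespace are illustrative names, not our actual configuration), a Role and RoleBinding that restrict a subject to managing deployments in a single namespace might look like:

```yaml
# Role granting read/write access to deployments in one namespace (illustrative)
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: page-score
  name: deployment-manager
rules:
- apiGroups: ["extensions", "apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
---
# Bind the role to a hypothetical deploy-bot user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: deployment-manager-binding
  namespace: page-score
subjects:
- kind: User
  name: deploy-bot
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: deployment-manager
  apiGroup: rbac.authorization.k8s.io
```

Scoping the binding to a namespace (rather than using a ClusterRoleBinding) keeps the blast radius small if the subject's credentials are ever compromised.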

Block Access to AWS Metadata API

This is a pretty scary one. If pods are able to query the AWS Metadata API endpoint then they essentially inherit the permissions of the role attached to the EC2 instance.

We addressed this issue by installing kube2iam on our cluster, which adds iptables rules to the host to block pods from directly accessing the API. Additionally, this enables fine-grained access control for individual pods by binding them to dedicated IAM roles through the use of manifest annotations. The details of this process will be the subject of a future post.
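Once kube2iam is running, a pod opts into a specific IAM role through the `iam.amazonaws.com/role` annotation that kube2iam documents. A minimal sketch, where the role name is a hypothetical example:

```yaml
# Pod requesting a dedicated IAM role via the kube2iam annotation
# (role name is illustrative)
apiVersion: v1
kind: Pod
metadata:
  name: page-score-worker
  namespace: page-score
  annotations:
    iam.amazonaws.com/role: page-score-s3-reader
spec:
  containers:
  - name: worker
    image: 123456789012.dkr.ecr.us-west-1.amazonaws.com/page-score:latest
```

Requests the pod makes to the metadata endpoint are then intercepted by kube2iam and answered with credentials for the annotated role only, rather than the instance's own role.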

Service Isolation with Namespaces

Kubernetes namespaces let us add an additional layer of isolation by grouping the pods and services that share resources into a dedicated namespace. Combined with network policies, pods outside of this namespace can be prevented from accessing any resources that are not already publicly accessible.
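Namespaces by themselves provide logical scoping; enforcing network-level isolation additionally requires a NetworkPolicy and a CNI plugin that supports it. A sketch that admits traffic to the namespace's pods only from pods in that same namespace (the namespace name follows the example below):

```yaml
# Allow ingress to page-score pods only from pods in the same namespace
# (illustrative; requires a NetworkPolicy-capable CNI plugin)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: same-namespace-only
  namespace: page-score
spec:
  podSelector: {}        # selects every pod in the namespace
  ingress:
  - from:
    - podSelector: {}    # permits traffic only from pods in this namespace
```

Because an empty `podSelector` matches all pods in the namespace, this single policy both denies cross-namespace ingress and leaves intra-namespace traffic untouched.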

The following is an example of how we define our Kubernetes manifest files for creating a simple deployment within a dedicated namespace. This is the same configuration (plus an additional service manifest) that powers our pricing page tool. Our process for creating secure Docker images, pushing to private repositories on Amazon ECR, and deploying to production will be described in a future post.

  1. Create a dedicated namespace manifest for services that should be isolated from each other: page-score-namespace.yaml
# PageScore namespace
apiVersion: v1
kind: Namespace
metadata:
  name: page-score

2. Deployment manifest using the namespace: page-score-deployment.yaml

# PageScore Deployment Manifest
# This manifest defines the production configuration of the pricing page score service
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: page-score-deployment
  namespace: page-score
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: page-score
    spec:
      containers:
      - image: 123456789012.dkr.ecr.us-west-1.amazonaws.com/page-score:latest
        imagePullPolicy: Always
        name: page-score

3. Use kubectl to apply the changes to the cluster

$ kubectl apply -f page-score-namespace.yaml
$ kubectl apply -f page-score-deployment.yaml

4. Verify that all changes were applied

$ kubectl get namespace | grep page-score
page-score   Active   1d
$ kubectl get deployment -n page-score
NAME                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
page-score-deployment   1         1         1            1           1d
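For completeness, the additional service manifest mentioned above could look roughly like the following. The port numbers and service name are assumptions for illustration, not our actual configuration:

```yaml
# Service exposing the page-score deployment inside the cluster
# (name and ports are illustrative)
apiVersion: v1
kind: Service
metadata:
  name: page-score-service
  namespace: page-score
spec:
  selector:
    app: page-score   # matches the label on the deployment's pod template
  ports:
  - protocol: TCP
    port: 80          # port exposed by the service
    targetPort: 8080  # port the container listens on
```

Because the service lives in the page-score namespace, other pods in that namespace can reach it by its short DNS name, while pods elsewhere must use the fully qualified name (and pass any network policies in place).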

Conclusion

In this article we have gone through the process of creating and configuring a new Kubernetes cluster to ensure that security best practices are followed for production clusters. We now have a cluster with RBAC authorization enabled and pod access to the AWS metadata endpoint restricted, and we have shown how deployments can be isolated quickly and easily using namespaces.

Security is a mindset. It is a process that involves continuous analysis of systems, access controls, and procedures. At RevOps, security is in our DNA, and we strive to implement security best practices from day one.
