Practical Kubernetes: Setting up K8s cluster on AWS using Terraform

Aleksa Vukotic
thestartupfactory.tech
9 min read · May 13, 2019

Containers and Docker have brought such a change to the software development lifecycle that it seems like ancient history when we had to think separately about how our code runs locally, in test and in production environments. With containers, we don’t just ship the code binary: we package the entire runtime environment, including the code, so that it can run anywhere in a predictable and repeatable way.

Containers freed us, the software development teams, from architectural and infrastructural constraints, and from considerations about shared hosts, networks and installed software compatibility. We were now able to fully embrace concepts like microservices, and to package and ship individual features as built container images.

An understandable result of this pattern shift has been a proliferation of containers: our applications now consist of tens or hundreds of containers, talking among themselves and with the outside world, each with different data, scalability and resiliency requirements. All of a sudden the burden of managing this at runtime became non-trivial and overwhelmed our DevOps and SRE teams, who spent more and more time writing tools to manage containers in production.

It was just a matter of time before we started generalising the patterns we were trying to apply, leading to the birth of container orchestration platforms, which promised to standardise the way we manage containers in the wild, so that we could finally abstract the infrastructure (almost) completely and treat our containerised applications the same way regardless of whether they run on a public or private cloud or in an internal data centre.

Kubernetes (k8s) is one such container orchestration platform. Born at Google, k8s quickly gathered a large community and became the leading platform available. Now that other big players like Microsoft and Amazon have joined in, Kubernetes has a big lead in the race to become the standard, all-encompassing container orchestrator.

In this series of blogs, starting with this one, we’ll demonstrate how we can use Kubernetes on different cloud providers to deploy various workloads. In addition to showcasing hands-on how to manage k8s clusters and the applications running on them, we focus on achieving all of that in an automated, repeatable and predictable manner: there will be no web portals or click-here-and-there exercises. Our goal is to be able to quickly create and destroy entire setups, including infrastructure and application code, within minutes.

As a first example, we will configure a Kubernetes cluster on AWS, powered by AWS EKS (Amazon Elastic Container Service for Kubernetes), using Terraform to describe our infrastructure as code. We will be using the community-supported Terraform AWS EKS module: https://github.com/terraform-aws-modules/terraform-aws-eks.

All the code we discuss is fully working and can be accessed on GitHub: https://github.com/aleksav/k8s-blogs/tree/master/01-setup-k8s-on-aws-eks

Prerequisites

To follow along you will need the latest version of Terraform, the AWS CLI with your credentials configured, kubectl, and an AWS authenticator for kubectl (e.g. aws-iam-authenticator) so that kubectl can talk to EKS.

Provisioning AWS EKS Cluster

For this section, the only requirements are the latest version of Terraform installed and AWS credentials configured (you will need the other prerequisites later).

The first step is the Terraform resource definition for the actual cluster; have a look at the code snippet below.
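The following is a minimal sketch of that module block, based on the community terraform-aws-eks module as it looked around the time of writing; the variable names (var.cluster_name, var.tags) and the security group reference are assumptions, and exact argument names can differ between module versions, so treat it as illustrative rather than a drop-in definition.

# Sketch of the EKS cluster definition; argument names follow the
# terraform-aws-modules/eks module of this era and may differ in newer versions.
module "eks" {
  source = "terraform-aws-modules/eks/aws"

  cluster_name = var.cluster_name   # populated from input.auto.tfvars
  tags         = var.tags           # populated from input.auto.tfvars

  # The VPC and subnets created by the VPC module (next section)
  vpc_id  = module.vpc.vpc_id
  subnets = module.vpc.private_subnets

  # Worker nodes are described with a launch template; the worker group list
  # itself is shown in the "Configuring worker nodes" section below.
  worker_group_count                 = 0   # skip the default launch-configuration worker group
  worker_group_launch_template_count = 1

  # Security group added to all worker nodes, on top of the one the module creates
  worker_additional_security_group_ids = [aws_security_group.all_worker_mgmt.id]
}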

The cluster name and tags get populated from input variables, which you can control via the input.auto.tfvars file.

There are three more properties that we need to pass to the EKS Terraform module so it can create a working cluster for us:

  • The VPC where our k8s cluster will reside (vpc_id, subnets)
  • The description of the EC2 instances we want to use as cluster worker nodes (worker_groups_launch_template)
  • The security group to apply to all worker nodes provisioned in the cluster (worker_additional_security_group_ids)

Let’s take a look at these one by one.

Configuring VPC and Subnets

For the purpose of this example, we are going to create and manage new VPC resources together with the Kubernetes cluster, using the existing Terraform VPC module, as shown below:
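Here is a minimal sketch of that VPC definition, using the terraform-aws-modules/vpc module; the name, CIDR ranges and availability zones are illustrative assumptions rather than the exact values from the repo.

# Sketch of the VPC for the cluster (names and CIDRs are illustrative).
module "vpc" {
  source = "terraform-aws-modules/vpc/aws"

  name = "${var.cluster_name}-vpc"
  cidr = "10.0.0.0/16"
  azs  = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]

  # Three private and three public subnets, one per availability zone
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  # A NAT gateway so instances in the private subnets can reach the internet
  enable_nat_gateway = true
  single_nat_gateway = true

  # The k8s-specific subnet tags are covered in the next step
}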

In addition to the name, CIDR and availability zones (azs), we need to specify a few additional important parameters: private/public subnets and some tags specific to k8s.

Once we have the cluster up and running and we are deploying application services to it, we can specify different service types. Without going into too much detail at this stage (we’ll get back to this when we start using cluster services), a k8s service can be public (e.g. a LoadBalancer service, accessible from the internet outside the cluster) or internal (e.g. ClusterIP, where the service is only available from within the cluster). In the AWS world, the distinction between public and internal services is managed using the concepts of public and private subnets within the VPC. We configured 3 public and 3 private subnets in our Terraform VPC code: the public ones will be used by AWS load balancers (ELBs, ALBs…) to route traffic to services accessible from outside, while the private ones will be used by internal load balancers.

In addition to creating public and private subnets, we need to tag them correctly so that Kubernetes can discover these infrastructure resources and use them for public or internal load balancing, respectively. The details of the required tags are as follows (see the sketch after this list):

  • All subnets AND the VPC must be tagged with kubernetes.io/cluster/<cluster-name>:shared so that they can be discovered by the k8s cluster (shared means that we can use the same VPC for multiple clusters)
  • Private subnets must be tagged with kubernetes.io/role/internal-elb:1 so that they can be used for internal load balancers by Kubernetes
  • Public subnets can be tagged with kubernetes.io/role/elb:1 so that only tagged subnets are used for public load balancers (if omitted, a subnet in each availability zone will be used instead)
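Inside our VPC module block, those tags translate into something like the following sketch (the var.cluster_name variable is an assumption and must match the EKS cluster name):

# Tags inside the module "vpc" block that let Kubernetes discover the subnets.
tags = {
  "kubernetes.io/cluster/${var.cluster_name}" = "shared"
}

# Public subnets: used for internet-facing load balancers
public_subnet_tags = {
  "kubernetes.io/role/elb" = "1"
}

# Private subnets: used for internal load balancers
private_subnet_tags = {
  "kubernetes.io/role/internal-elb" = "1"
}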

Configuring worker nodes

With AWS EKS, Amazon takes care of and manages our Kubernetes master nodes; however, it is up to us to specify and configure the worker nodes that will run our container workloads. There are two ways we can specify the number and properties of worker nodes using the EKS Terraform module: Launch Configurations or Launch Templates. Launch Templates are the newer mechanism (https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchTemplates.html) and are suggested as best practice, which is why we will use them in this example.

The community Terraform AWS EKS module supports both Launch Configurations and Launch Templates, and by default creates one of each. In order to use Launch Templates only, we set worker_group_count to zero, so that only the launch template worker group is taken into account.

This is what our Launch Template worker group looks like:
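The sketch below shows what that worker group definition might look like; it mirrors the 2-node, max 3-node setup described next, but the instance type is illustrative and the exact map keys come from the module's worker group defaults, so they may vary between module versions.

# Sketch of the launch-template worker group, passed to the EKS module
# via worker_groups_launch_template (inside the module "eks" block).
worker_groups_launch_template = [
  {
    name                 = "worker-group-1"
    instance_type        = "t3.medium"                 # standard AWS instance type (illustrative)
    asg_desired_capacity = 2                           # start with 2 worker nodes
    asg_min_size         = 2
    asg_max_size         = 3                           # allow scaling up to 3 nodes
    subnets              = module.vpc.private_subnets  # keep workers off the public internet

    # Extra security groups applied to the launched instances
    additional_security_group_ids = [
      aws_security_group.worker_group_mgmt_one.id,
      aws_security_group.worker_group_mgmt_two.id,
    ]

    # Workaround for the InvalidBlockDeviceMapping error discussed below
    root_encrypted = ""
  },
]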

Most settings are self-descriptive: we specify the instance type (one of the standard AWS instance types), the desired, minimum and maximum capacity for the Auto Scaling Group (ASG), and the subnets, as well as additional security groups to apply to all launched instances.

We are creating a 2-node cluster, which can scale to 3 nodes, and by default the Terraform module will create 2 on-demand instances to fulfil the desired capacity. You can fine-tune the launch template in much more detail with the Terraform module, including using spot instances or a mixed fleet (more details here: https://github.com/terraform-aws-modules/terraform-aws-eks). But for now, let’s keep this to a minimum.

You will notice that in the snippet above we are putting our worker instances into the private subnets (which we configured earlier); this is considered best practice, so that your worker nodes are not reachable from the public internet. This doesn’t mean that the services you deploy on the cluster must be internal as well: public services will be exposed through ELBs in the public subnets, as per the subnet tags we specified in the previous step.

The first time we tried the launch template configuration, we got the following error: InvalidBlockDeviceMapping: the encrypted flag cannot be specified since device /dev/xvda has a snapshot specified. This is due to the AWS API not allowing the encrypted flag to be passed for devices created from a snapshot, while the terraform-aws-eks module defaults the parameter to “false”; for that reason we have overridden the root_encrypted parameter with an empty string value.

Configuring security group for worker nodes

We have configured 3 simple security groups, each only allowing traffic on port 22 for management (SSH) purposes. At least one security group is required, to be added to all worker nodes (as part of the EKS module); the additional 2 security groups are there for illustration purposes.
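As a sketch, one of those security groups might look like the following; the resource name and CIDR range are illustrative assumptions.

# One of the management security groups: SSH access from internal ranges only.
resource "aws_security_group" "all_worker_mgmt" {
  name_prefix = "all_worker_management"
  vpc_id      = module.vpc.vpc_id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["10.0.0.0/8"]   # illustrative internal CIDR range
  }
}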

In this section we discussed the key elements of our Terraform code. We advise you to browse the full code on GitHub (https://github.com/aleksav/k8s-blogs/tree/master/01-setup-k8s-on-aws-eks): clone the repo and explore it in your favourite code editor.

IAM Policies for AWS EKS Cluster

Kubernetes and the worker nodes will access our AWS cloud resources to perform various tasks: managing Elastic Load Balancers, accessing certificates, networks, security groups, etc.

In order to allow access to these and other relevant cloud resources, we need to attach roles with the required policies to the worker nodes in the cluster.

The management of roles and privileges is very provider-specific in the cloud world, so this is the area where you’ll find the most differences between cloud providers, and the gotchas you’ll want to work around.

The code for this blog post includes the basic set of policies that will allow all key Kubernetes activities to be performed:

https://github.com/aleksav/k8s-blogs/blob/master/01-setup-k8s-on-aws-eks/terraform/eks-k8s-cluster/02-elb-iam-policy.tf

This is probably not the full list of policies for all use cases, so be prepared to add additional policies to this file if you’re running more specialised workloads on your k8s cluster.
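As an illustration of the shape such a policy takes (not the exact contents of the file linked above), the sketch below attaches an ELB-management policy to the worker node role; the worker_iam_role_name output name is an assumption based on the community module, so check the module's outputs for your version.

# Illustrative policy allowing the cluster to manage load balancers;
# the real file in the repo may differ.
resource "aws_iam_policy" "worker_elb_access" {
  name = "${var.cluster_name}-worker-elb-access"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["elasticloadbalancing:*", "ec2:Describe*"]
        Resource = "*"
      },
    ]
  })
}

# Attach the policy to the worker node IAM role created by the EKS module.
resource "aws_iam_role_policy_attachment" "worker_elb_access" {
  policy_arn = aws_iam_policy.worker_elb_access.arn
  role       = module.eks.worker_iam_role_name   # output name assumed; check the module docs
}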

Time to create our k8s cluster

Now that we have described our code, we can execute it and watch our infrastructure being provisioned in the cloud.

If you haven’t already, clone the code repo and navigate to the terraform/eks-k8s-cluster directory.

At this point you should have Terraform installed and the AWS authenticator configured (see the Prerequisites section at the top of the page).

This is your chance to change the cluster name or tags by updating the values in the input.auto.tfvars file.
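For reference, a minimal input.auto.tfvars might look like the sketch below; the variable names are assumed to match the repo, and the values are illustrative apart from the cluster name used later in this post.

# input.auto.tfvars (illustrative values)
cluster_name = "tsf-k8s-blog"

tags = {
  Environment = "blog"
  Team        = "tsf"
}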

So, first we need to validate our setup and code by running terraform init, followed by terraform plan (all from the eks-k8s-cluster directory).

Inspect the output of the terraform plan command; it will show you details of all the infrastructure that will be provisioned: the VPC, subnets, security groups, launch templates, EC2 instances and anything else that we will create on AWS. If you’re not on the AWS free tier, applying the changes will generate costs on your bill, so make sure you’re OK to proceed from this stage.

Let’s now run terraform apply; this will take around 10 minutes, so go and make a cup of coffee, and when you’re back you should see the Terraform success message:

If you navigate to the AWS console, you should see all the resources we created:

K8s cluster created using Terraform viewed from AWS console
EC2 instances provisioned for the k8s cluster

Connecting to the cluster

We have the cluster running; the next step is connecting to it and running kubectl commands.

At this point you need to have kubectl installed and configured to work with AWS authentication (see the Prerequisites section at the top of the page).

kubectl expects its configuration in ~/.kube/config. In order to fetch the correct config for our newly created cluster, we have two options; you can use either:

  • Get the config using the AWS CLI (replace the cluster name with your own if different):
aws eks update-kubeconfig --region eu-west-1 --name tsf-k8s-blog
  • The Terraform script we ran earlier provides a default kubeconfig as an output, in the file ./terraform/eks-k8s-cluster/kubeconfig_tsf-k8s-blog, which you can simply copy to the expected location:
cp kubeconfig_tsf-k8s-blog ~/.kube/config

Once you have the correct file, you can finally connect to your cluster:

kubectl cluster-info
kubectl get nodes

The output should be something like this:

And that’s it: we now have a working Kubernetes cluster with 2 nodes, powered by AWS EKS, which we can manage in an automated, repeatable manner using Terraform templates. Our infrastructure provisioning code (Terraform) can be pushed to a code repository, shared and improved within the team, making its long-term management much easier.

Remember that the k8s cluster uses resources that consume energy and cost money, so once you don’t need the infrastructure anymore, simply destroy it using the terraform destroy command. With your infrastructure-as-code setup, you can always recreate it in no time!

Mentions

This code is powered by the official terraform-aws-eks module, with its test fixture used as a starting point: https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/eks_test_fixture

All the code we discuss is fully working and can be accessed on GitHub: https://github.com/aleksav/k8s-blogs/tree/master/01-setup-k8s-on-aws-eks

Aleksa is a CTO at TheStartupFactory.tech
