Deploying a Production-Ready Amazon EKS Cluster using Terraform

Saravanan Mani
6 min read · Feb 6, 2023


Deploying a production-ready Amazon EKS cluster often requires significant time and effort: creating the cluster and node groups, deploying the add-ons, and configuring the additional security groups.

In this article, we will use Terraform to create the Amazon EKS Cluster along with the required Add-ons.

The Terraform template creates the following:

· Creates the required subnets and route tables

· Creates an EKS cluster with managed node groups

· Enables the Cluster Autoscaler

· Deploys the ALB Ingress Controller for L4-L7 application load balancing

· Deploys the EFS CSI driver for application persistent storage

· Deploys Prometheus and Grafana for monitoring

Proposed Deployment Architecture:

[Figure: proposed deployment architecture]

Prerequisites:

We need the following resources created as a one-time configuration.

1. Create the EKS Cluster VPC

2. Configure VPC

a. Create the default public subnet

b. Create Internet Gateway and update the default route table

c. Create the NAT gateway

d. Create the private subnet route table and update it with a route to the NAT gateway

3. Configure VPC Peering

a. EKS VPC to Tools VPC

b. EKS VPC to Shared Services VPC

4. Update the EKS VPC’s default route table with the routes for accessing the Tools and Shared Services VPC

5. Update the EKS VPC’s private route table with the routes for accessing the Tools and Shared Services VPC

6. Create a new SSH key pair or use the existing key pair

7. Create an S3 bucket for storing the terraform state backups

8. Get the EKS worker node AMI-Id from the official Ubuntu on Amazon Elastic Kubernetes Service (EKS) website — https://cloud-images.ubuntu.com/docs/aws/eks/

a. Select the AMI ID based on the EKS version and region. This AMI will be used to provision the EKS cluster worker nodes

9. Create the required IAM policy to be used by the application pods via a service account (a minimal sketch of such a policy follows this list)
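As an illustration, the policy from step 9 could be created with Terraform as well. This is only a minimal sketch: the resource name and the S3 permissions are hypothetical placeholders, not part of the original project, so substitute whatever permissions your application pods actually need.

resource "aws_iam_policy" "applications_access" {
  # The name matches the ApplicationsAccess_arn default referenced in variables.tf
  name        = "ApplicationsAccess"
  description = "Permissions granted to application pods via an IAM role for service accounts"

  # Hypothetical example permissions; replace with what your pods need
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect   = "Allow"
        Action   = ["s3:GetObject", "s3:ListBucket"]
        Resource = "*"
      }
    ]
  })
}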

Terraform modules used from the open-source community:

1. EKS — https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/18.29.0

2. AWS EKS Cluster-Autoscaler — https://github.com/DNXLabs/terraform-aws-eks-cluster-autoscaler

3. AWS ALB Ingress Controller — https://github.com/kubernetes-sigs/aws-load-balancer-controller

4. AWS EKS EFS CSI Driver — https://github.com/DNXLabs/terraform-aws-eks-efs-csi-driver

5. AWS EKS Grafana Prometheus — https://github.com/DNXLabs/terraform-aws-eks-grafana-prometheus

Custom terraform modules and other resources:

1. Subnets — Used to create required Private and Public Subnets for launching the EKS cluster

2. IAM — Used to create the required IAM roles and Policies for automating the deployment of add-ons using terraform Kubernetes provider

Note: All the public modules listed above are downloaded into this project and used as local modules, as illustrated below.
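For instance, instead of pointing the EKS module at the Terraform registry, main.tf references the vendored copy through a relative path. The registry form is shown only for contrast; the local path is the one this project uses:

# Registry source (not used in this project):
# module "eks" {
#   source  = "terraform-aws-modules/eks/aws"
#   version = "18.29.0"
# }

# Local (vendored) source, as used in main.tf below:
module "eks" {
  source = "./modules/eks"
  # ... inputs ...
}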

Terraform files and folder structure:

The following screenshot shows the folder structure that we have used in our environment. The GitHub link to the entire project is available in a later section of this blog.

[Screenshot: Terraform project folder structure]
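Reconstructed from the files and modules discussed in this article, the layout looks roughly like this; the exact folder names in the repository may differ:

eks-cluster/
├── main.tf
├── variables.tf
├── backend.tf
└── modules/
    ├── subnets/                (custom module)
    ├── iam/                    (custom module)
    ├── eks/                    (vendored terraform-aws-modules/eks)
    ├── cluster-autoscaler/     (vendored DNXLabs module)
    ├── efs-csi-driver/         (vendored DNXLabs module)
    └── grafana-prometheus/     (vendored DNXLabs module)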

main.tf

provider "aws" {
region = "${var.AWS_REGION}"
}

resource "random_id" "worker-nodes" {
keepers = {
# Generate a new id each time we switch to a new AMI id
ami_id = "${var.ami_id}"
}

byte_length = 4
}

locals {
name = "${var.cluster_name}"
cluster_version = "${var.cluster_version}"
region = "${var.AWS_REGION}"
}

# Subnet module

module "subnets" {
  source                = "./modules/subnets"
  vpc_id                = var.vpc_id
  route_table_id        = var.route_table_id
  private_subnet_1_cidr = var.private_subnet_1_cidr
  private_subnet_2_cidr = var.private_subnet_2_cidr
  private_subnet_3_cidr = var.private_subnet_3_cidr
  public_subnet_1_cidr  = var.public_subnet_1_cidr
  public_subnet_2_cidr  = var.public_subnet_2_cidr
  public_subnet_3_cidr  = var.public_subnet_3_cidr
  cluster_name          = var.cluster_name
  tag_shutdown          = var.tag_shutdown
  tag_poc               = var.tag_poc
  tag_customer          = var.tag_customer
  tag_env               = var.tag_env
  tag_platform_type     = var.tag_platform_type
}


################################################################################
# EKS Module
################################################################################

module "eks" {
  source = "./modules/eks"

  cluster_name    = local.name
  cluster_version = local.cluster_version

  vpc_id  = var.vpc_id
  subnets = module.subnets.private_subnets

  cluster_endpoint_private_access = true
  cluster_endpoint_public_access  = false
  cluster_create_security_group   = false
  cluster_security_group_id       = aws_security_group.controlplane.id
  manage_aws_auth                 = false
  enable_irsa                     = true

  node_groups = {
    node_group1 = {
      name_prefix      = var.node_group1_prefix
      desired_capacity = 1
      max_capacity     = 2
      min_capacity     = 1

      disk_size = 150
      disk_type = "gp2"

      launch_template_id      = aws_launch_template.worker-node-group-1.id
      launch_template_version = aws_launch_template.worker-node-group-1.default_version
      instance_types          = ["t2.xlarge"]
      capacity_type           = "ON_DEMAND"

      bootstrap_env = {
        CONTAINER_RUNTIME = "containerd"
        USE_MAX_PODS      = false
      }

      additional_tags = {
        "node-Application" = true
        Name               = "${var.node_group1_prefix}${random_id.worker-nodes.hex}"
        Description        = "eks cluster node group created by terraform"
        SHUTDOWN           = var.tag_shutdown
        POC                = var.tag_poc
        CUSTOMER           = var.tag_customer
        ENV                = var.tag_env
        PLATFORM_TYPE      = var.tag_platform_type
      }
    }

    node_group2 = {
      name_prefix      = var.node_group2_prefix
      desired_capacity = 1
      max_capacity     = 2
      min_capacity     = 1

      disk_size = 150
      disk_type = "gp2"

      launch_template_id      = aws_launch_template.worker-node-group-2.id
      launch_template_version = aws_launch_template.worker-node-group-2.default_version
      instance_types          = ["t2.xlarge"]
      capacity_type           = "ON_DEMAND"

      bootstrap_env = {
        CONTAINER_RUNTIME = "containerd"
        USE_MAX_PODS      = false
      }

      additional_tags = {
        "node-Application" = true
        Name               = "${var.node_group2_prefix}${random_id.worker-nodes.hex}"
        Description        = "eks cluster node group created by terraform"
        SHUTDOWN           = var.tag_shutdown
        POC                = var.tag_poc
        CUSTOMER           = var.tag_customer
        ENV                = var.tag_env
        PLATFORM_TYPE      = var.tag_platform_type
      }
    }
  }

  tags = {
    Name          = local.name
    Description   = "eks cluster created by terraform"
    SHUTDOWN      = var.tag_shutdown
    POC           = var.tag_poc
    CUSTOMER      = var.tag_customer
    ENV           = var.tag_env
    PLATFORM_TYPE = var.tag_platform_type
  }
}
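The launch templates referenced above (aws_launch_template.worker-node-group-1 and worker-node-group-2) are defined elsewhere in the project and are not reproduced in this article. As a rough sketch of what such a resource could look like, assuming it only pins the Ubuntu EKS AMI and the SSH key pair from the prerequisites:

resource "aws_launch_template" "worker-node-group-1" {
  name_prefix = var.node_group1_prefix

  # Ubuntu EKS AMI selected in the prerequisites
  image_id = var.ami_id

  # SSH key pair created or reused in the prerequisites
  key_name = var.ssh_access_key

  update_default_version = true

  block_device_mappings {
    device_name = "/dev/sda1"
    ebs {
      volume_size = 150
      volume_type = "gp2"
    }
  }
}

Pinning image_id here is also what ties the random_id keeper in main.tf to AMI changes: a new AMI produces a new launch template version and a new Name suffix on the worker nodes.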



variables.tf

variable "AWS_REGION" {
default = "us-west-2"
}

variable "vpc_id" {
type = string
default = ""
}

variable "tools_vpc_id" {
default = ""
}

variable "platform_vpc_id" {
default = ""
}

variable "ami_id" {
default=" "
}

variable "ssh_access_key" {
default = ""
}

variable "route_table_id" {
default = ""
}

variable "node_group1_prefix" {
default = "workers-APP1-"
}

variable "node_group2_prefix" {
default = "workers-APP2-"
}


variable "cluster_name" {
default = "eks-dev"
}

variable "cluster_version" {
default = "1.20"
}

variable "private_subnet_1_cidr" {
default = "172.16.2.0/24"
}

variable "private_subnet_2_cidr" {
default = "172.16.3.0/24"
}

variable "private_subnet_3_cidr" {
default = "172.16.4.0/24"
}

variable "public_subnet_1_cidr" {
default = "172.16.5.0/24"
}

variable "public_subnet_2_cidr" {
default = "172.16.6.0/24"
}

variable "public_subnet_3_cidr" {
default = "172.16.7.0/24"
}

#Tags

variable "tag_shutdown" {
default = "Never"
}

variable "tag_poc" {
default = "test@example.com"
}

variable "tag_customer" {
default = "Internal"
}

variable "tag_env" {
default = "DEV"
}

variable "tag_platform_type" {
default = "APP"
}

variable "namespace" {
default = "application"
}

variable "service_account_name" {
default = "ekssa"
}

variable "ApplicationsAccess_arn" {
default = "arn:aws:iam::XXXXXXXXXXX:policy/ApplicationsAccess"
}
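The last three variables (namespace, service_account_name, and ApplicationsAccess_arn) feed the IRSA wiring handled by the custom IAM module. That module is not reproduced here, but conceptually it creates something like the sketch below. The output names module.eks.oidc_provider_arn and module.eks.cluster_oidc_issuer_url are assumptions based on the vendored terraform-aws-modules/eks version; verify them against your local copy:

data "aws_iam_policy_document" "pod_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [module.eks.oidc_provider_arn] # assumed module output
    }

    # Restrict the role to the application service account
    condition {
      test     = "StringEquals"
      variable = "${replace(module.eks.cluster_oidc_issuer_url, "https://", "")}:sub"
      values   = ["system:serviceaccount:${var.namespace}:${var.service_account_name}"]
    }
  }
}

resource "aws_iam_role" "pod_role" {
  name               = "${var.cluster_name}-pod-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.pod_assume.json
}

resource "aws_iam_role_policy_attachment" "pod_policy" {
  role       = aws_iam_role.pod_role.name
  policy_arn = var.ApplicationsAccess_arn
}

The Kubernetes service account is then annotated with eks.amazonaws.com/role-arn pointing at this role, which is what lets the application pods use the ApplicationsAccess policy without node-level credentials.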

backend.tf

Update this file to store the Terraform state securely in the S3 bucket.

region = the region where the EKS cluster will be provisioned

bucket = the S3 bucket created in the prerequisites for storing the Terraform state

key = terraform/eks-cluster-state-cluster-<<environment name>>, e.g. terraform/eks-cluster-state-cluster-app-dev

terraform {
  backend "s3" {
    bucket = "eks-cluster-tf-state-backup"
    key    = "terraform/eks-cluster-state-cluster-dev"
    region = "us-west-2"
  }
}
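If you prefer not to hard-code these values, Terraform's standard -backend-config option can supply them at initialization time instead (generic Terraform behavior, not specific to this project):

terraform init \
  -backend-config="bucket=eks-cluster-tf-state-backup" \
  -backend-config="key=terraform/eks-cluster-state-cluster-dev" \
  -backend-config="region=us-west-2"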

Terraform Execution

After updating the above input files, execute the following Terraform commands in sequence.

terraform init initializes the provider and the other modules used in this automation.

terraform init

terraform plan performs a dry run of the template and shows the changes that will be made to the infrastructure.

terraform plan

terraform apply applies the changes to the infrastructure.

terraform apply

Enter “yes” to accept the changes.

Wait for the operation to complete.

Note: Provisioning can take up to 30 minutes; watch the console output for progress.

Accessing the provisioned EKS cluster:

Set the kubeconfig on the bastion host using the command below so that you can run kubectl commands:

aws eks update-kubeconfig --region <<region name>> --name <<cluster_name>>
Example: aws eks update-kubeconfig --region us-west-2 --name eks-dev
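Once the kubeconfig is updated, a quick sanity check with standard kubectl (nothing project-specific) confirms that the worker nodes from both node groups have joined the cluster and are Ready:

kubectl get nodes -o wide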

GitHub Repository:
