Provisioning AWS Infrastructure Using Terraform and Amazon Linux Based EKS Optimized Golden AMI Built by Packer (Argocd to Deploy Netflix)

Paul Zhao · Published in Paul Zhao Projects · 99 min read · Mar 14, 2024

This project provides a Terraform template to provision EKS and its resources using an Amazon Linux based, EKS optimized Golden AMI built by Packer. We will deploy Netflix via ArgoCD on the EKS worker nodes using Terraform.

Github Repos for this project:

Repo: Amazon Linux Based EKS Optimized Golden AMI Built by Packer

Repo: Argocd Installation by URL (Make sure you do not use the official version, since I tweaked the manifest files for this project)

Repo: Netflix Deployment using Kubernetes (We only need the Kubernetes folder of this repo)

Repo: Provisioning-AWS-Infrastructure-using-Terraform-Packer-Kubernetes-Ansible (Argocd to Deploy Netflix App)

Objectives:

  • Infrastructure Provisioning Automation: Implement automation for provisioning AWS infrastructure using Terraform to define and manage cloud resources efficiently.
  • Integration of Packer and Terraform: Integrate Packer-built Amazon Linux-based EKS Optimized Golden AMI with Terraform to ensure standardized and efficient deployment of worker nodes within Amazon EKS clusters.
  • Customized Worker Node Configuration: Utilize Terraform to leverage the launch template in Amazon EKS node groups, allowing for the customization of worker nodes’ configuration via userdata, facilitating the implementation of specific configurations or software installations (e.g., Ansible).
  • Enhanced Scalability and Efficiency: Utilize Amazon EKS to automatically manage the scaling and deployment of containerized applications, leveraging the optimized Golden AMI to ensure consistent performance and reliability across worker nodes.
  • Security and Compliance: Implement best practices for security and compliance by ensuring that infrastructure deployments adhere to AWS security standards and policies, leveraging Terraform’s infrastructure as code approach to enforce security controls and configurations consistently.
  • Documentation and Knowledge Transfer: Develop comprehensive documentation detailing the setup, configuration, and usage of the Terraform scripts, Packer-built Golden AMI, and Amazon EKS clusters to facilitate knowledge transfer and enable smooth maintenance and future enhancements.
  • Testing and Validation: Establish testing procedures to validate the correctness and reliability of the infrastructure provisioning process, including testing for infrastructure scalability, fault tolerance, and compatibility with containerized applications.
  • Continuous Integration and Deployment (CI/CD): Integrate the infrastructure provisioning process into CI/CD pipelines to automate the deployment and management of AWS infrastructure changes, ensuring rapid iteration and deployment cycles while maintaining reliability and consistency.
  • Enhance Continuous Deployment Efficiency: Implement a fully automated continuous deployment pipeline that leverages ArgoCD for deployment orchestration and Terraform for infrastructure as code management. This pipeline will enable seamless, automated updates and rollbacks of the Netflix service on Amazon EKS, minimizing downtime and human error while ensuring that the latest features and fixes are promptly and safely deployed to production.

Tools:

Packer

Packer is an open-source tool developed by HashiCorp that automates the creation of identical machine images or artifacts for multiple platforms from a single source configuration. These machine images can be in various formats, such as VirtualBox, VMware, AWS, Azure, Docker containers, and others. Packer allows developers and system administrators to define machine configurations as code, using configuration files written in JSON or HCL (HashiCorp Configuration Language). It then uses this configuration to automatically create machine images, ensuring consistency and reproducibility across different environments. Packer is commonly used in conjunction with other HashiCorp tools like Vagrant, Terraform, and Consul to streamline the development and deployment processes in infrastructure as code workflows.

Terraform

Terraform is an open-source infrastructure as code (IaC) tool developed by HashiCorp. It enables users to define and provision infrastructure resources declaratively using a high-level configuration language. With Terraform, you can describe the desired state of your infrastructure in configuration files, specifying the resources and their configurations such as servers, networks, storage, and more.

Key features of Terraform include:

  1. Declarative Configuration: Users define the desired state of their infrastructure in configuration files using a domain-specific language (DSL). Terraform then works to reconcile the current state of the infrastructure with the desired state declared in the configuration.
  2. Resource Graph: Terraform builds a dependency graph of all resources defined in the configuration files, enabling it to determine the order in which resources should be provisioned or updated.
  3. Execution Plans: Before making any changes to the infrastructure, Terraform generates an execution plan showing what actions it will take, such as creating new resources, updating existing ones, or destroying resources.
  4. State Management: Terraform keeps track of the state of the infrastructure it manages, storing this information in a state file. This state file allows Terraform to understand the current state of the infrastructure and make changes accordingly.
  5. Provider Ecosystem: Terraform supports a wide range of cloud providers, infrastructure technologies, and services through provider plugins. These plugins allow Terraform to interact with various APIs to provision and manage resources.
  6. Immutable Infrastructure: Terraform encourages the use of immutable infrastructure patterns, where infrastructure changes are made by creating new resources rather than modifying existing ones. This approach enhances reliability and makes it easier to roll back changes if necessary.

Overall, Terraform simplifies the process of managing infrastructure by treating it as code, enabling automation, versioning, and collaboration in infrastructure provisioning and management workflows.

AWS EKS

AWS EKS stands for Amazon Elastic Kubernetes Service. It is a managed Kubernetes service provided by Amazon Web Services (AWS) that simplifies the deployment, management, and scaling of containerized applications using Kubernetes on AWS infrastructure.

Key features of AWS EKS include:

  1. Managed Kubernetes Control Plane: AWS EKS manages the Kubernetes control plane for you, including the etcd cluster, API server, scheduler, and other components. This relieves users from the operational overhead of managing these components themselves.
  2. Integration with AWS Services: AWS EKS integrates with other AWS services such as Elastic Load Balancing (ELB), Identity and Access Management (IAM), CloudTrail, CloudWatch, and more, providing a seamless experience for deploying and managing Kubernetes applications on AWS.
  3. Security and Compliance: AWS EKS helps users implement security best practices by providing features such as encryption at rest and in transit, IAM integration for fine-grained access control, and support for Kubernetes network policies.
  4. High Availability and Scalability: AWS EKS is designed for high availability and scalability. It runs Kubernetes control plane instances across multiple Availability Zones (AZs) for redundancy and automatically scales to accommodate the workload demands of your applications.
  5. Compatibility with Standard Kubernetes Tools: AWS EKS is compatible with standard Kubernetes tools and APIs, allowing users to leverage existing Kubernetes skills, workflows, and ecosystem tools.
  6. Hybrid and Multi-Cloud Deployments: AWS EKS supports hybrid and multi-cloud deployments, allowing users to run Kubernetes clusters across AWS and on-premises environments, or even across different cloud providers.

Overall, AWS EKS simplifies the process of running Kubernetes clusters on AWS infrastructure, enabling users to focus on building and deploying containerized applications without worrying about the underlying infrastructure management complexities.

Prometheus

Prometheus is an open-source monitoring and alerting toolkit originally built by SoundCloud and now a part of the Cloud Native Computing Foundation (CNCF). It has become increasingly popular for its powerful and dynamic data model, query language (PromQL), and its ability to handle multi-dimensional data collection, querying, and alerting.

Prometheus works on a pull model, where it scrapes (pulls) metrics from configured targets at specified intervals, stores them in its database, and then allows for flexible and powerful querying and visualization of this data. This model contrasts with push-based systems, where data is sent to the monitoring system. However, Prometheus can also support push-based metrics via an intermediary gateway for certain use cases.

Key features of Prometheus include:

  1. Multi-Dimensional Data Model: Metrics are identified by a metric name and key/value pairs (also known as labels), enabling highly detailed data tracking and querying.
  2. PromQL (Prometheus Query Language): A powerful query language that allows users to select and aggregate time series data in real-time. It enables defining precise alerts or conditions that can trigger notifications.
  3. Service Discovery: Prometheus can discover targets to monitor in various ways, including DNS, Kubernetes, and Consul, making it highly adaptable to dynamic and cloud-based environments.
  4. High Availability and Federation: Support for running in a high-availability setup and federation capabilities, allowing Prometheus servers to scrape selected metrics from other Prometheus servers.
  5. Alerting: Prometheus’s Alertmanager handles alerts sent by client applications such as the Prometheus server. It supports routing, silencing, and inhibition of alerts, and it integrates with various notification platforms.
  6. Efficient Storage: Prometheus stores time series data in a highly efficient, compressed format on disk, and it has mechanisms to ensure the longevity and scalability of data storage.

Prometheus is widely used for its simplicity, reliability, and suitability for dynamic, cloud-based environments. It is commonly used in conjunction with Grafana for visualizing the collected data, providing a comprehensive monitoring solution that can drive insights into system performance, availability, and health.
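Once a Prometheus server is running, its HTTP API offers a quick way to try PromQL outside of the UI. A minimal sketch, assuming a server on the default port 9090 on localhost:

curl -s 'http://localhost:9090/api/v1/query?query=up'
# returns a JSON result listing every scrape target and whether it is currently up (1) or down (0)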

Node Exporter

Node Exporter is a Prometheus exporter that collects hardware and OS metrics exposed by *NIX kernels (Windows hosts are covered by the separate windows_exporter). It is a tool in the Prometheus monitoring system that allows you to measure various system metrics, including CPU, disk, and memory usage, as well as network statistics. Node Exporter is designed to run on a host machine to collect and expose metrics to a Prometheus server, enabling you to monitor the performance and health of that machine.

Prometheus is a powerful open-source monitoring and alerting toolkit, and Node Exporter is a crucial component for gathering the necessary system-level metrics that can help you understand your infrastructure’s state. You deploy Node Exporter on the machines you want to monitor, and it exposes a web interface on a specific port (by default, 9100) where Prometheus can pull metrics.

The metrics collected by Node Exporter include detailed information that can be vital for system monitoring and troubleshooting. These metrics are accessible via HTTP in a format that Prometheus can query and store. You can then visualize and analyze these metrics using tools like Grafana, which can connect to Prometheus, to create dashboards that provide insights into the health and performance of your infrastructure.
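To sanity-check a Node Exporter installation, you can hit its metrics endpoint directly. A quick sketch, assuming it runs on the default port 9100 of the machine you are on:

curl -s http://localhost:9100/metrics | grep '^node_cpu_seconds_total' | head
# raw Prometheus-format metrics; Prometheus scrapes this same endpoint on its own schedule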

Grafana

Grafana is an open-source platform for monitoring and observability that allows you to visualize, query, and alert on metrics and logs no matter where they are stored. It provides a powerful and elegant way to create, explore, and share dashboards and data with your team and the world. Grafana is designed to work with a wide variety of data sources, including Prometheus, InfluxDB, Elasticsearch, and many others.

Key features of Grafana include:

  1. Versatile Visualizations: Grafana supports a wide range of visualizations, from simple line graphs to complex histograms, geospatial maps, and even custom visualization plugins developed by the community.
  2. Dynamic Dashboards: Dashboards in Grafana are highly configurable and can include variables, making them dynamic and interactive. This allows users to switch between different data views and drill down into specific metrics for detailed analysis.
  3. Alerting: Grafana includes a built-in alerting engine that allows you to define alert rules based on your queries. These alerts can notify you through various channels (such as email, Slack, or PagerDuty) when certain thresholds are breached, helping you respond quickly to incidents.
  4. Multiple Data Sources: Grafana supports querying multiple data sources in a single dashboard, enabling complex cross-source analyses. This is particularly useful in environments where data is spread across different systems and formats.
  5. Collaboration and Sharing: Grafana makes it easy to share dashboards and data with others. You can control access with fine-grained permission settings and even share data and dashboards publicly if desired.
  6. Extensibility: Thanks to its plugin architecture, Grafana can be extended with additional data sources, panels, and apps. The community has contributed a wide range of plugins that add new features and integrations.

Grafana is widely used for IT operations monitoring, application performance monitoring (APM), and infrastructure health monitoring. It is also used in various other fields that require detailed analysis of time-series data, such as sensor data in IoT applications and financial data analysis. Its ability to bring together and visualize data from different sources makes Grafana a cornerstone tool in the observability and monitoring space.

SonarQube

SonarQube is an open-source platform for continuous inspection of code quality. It automates the process of detecting bugs, vulnerabilities, and code smells in your source code. Additionally, it offers metrics to help you understand the complexity of your code, the technical debt it might be carrying, and the overall health of your applications. By integrating SonarQube into your development process, you can ensure that your codebase maintains high standards of quality, readability, and security.

Key features of SonarQube include:

  1. Wide Range of Supported Languages: SonarQube can analyze code written in many programming languages, including Java, C#, JavaScript, Python, PHP, TypeScript, and many others, making it versatile for various development projects.
  2. Code Quality Metrics: It provides detailed metrics on code quality, including reliability (bugs), security vulnerabilities, and maintainability (code smells and technical debt). These metrics help developers and teams to prioritize fixes and improvements.
  3. Pull Request Analysis: SonarQube can be integrated into the CI/CD pipeline to analyze pull requests, providing feedback on potential issues before the code is merged into the main branch. This helps in catching and fixing issues early in the development process.
  4. Quality Gates: It allows the definition of quality gates, which are criteria or thresholds that code must meet before it can be released or merged. This ensures that only code that meets predefined standards of quality is deployed or integrated.
  5. Security Analysis: SonarQube includes rules to detect security vulnerabilities and code patterns that could lead to security breaches. This helps in proactively improving the security posture of your applications.
  6. Customizable Rules and Profiles: Teams can customize analysis rules and create profiles tailored to their specific needs or coding standards, ensuring that the analysis is relevant and aligned with project goals.
  7. Dashboard and Reports: It provides a comprehensive dashboard and detailed reports of the code analysis, offering insights into the health of the codebase and highlighting areas for improvement.

SonarQube can be run as a standalone tool or integrated into your development environment and CI/CD pipelines, supporting both automated and manual code review processes. It’s an essential tool for development teams aiming to improve code quality, maintainability, and security, ultimately leading to more reliable and robust software.

Trivy

Trivy is a comprehensive, open-source vulnerability scanner for containers and other artifacts, designed to make it easy to find vulnerabilities within your container images, file systems, and even source code repositories. It is developed by Aqua Security and has become popular in the DevSecOps community for its simplicity and effectiveness in identifying security issues.

Key features and capabilities of Trivy include:

  1. Detection of Vulnerabilities: Trivy can scan container images for known vulnerabilities, pulling data from a wide range of sources, including NVD, Red Hat, and Debian security databases. It can identify vulnerabilities in OS packages (like Debian, Alpine, and RHEL) and application dependencies for programming languages such as Ruby, Python, JavaScript, and Java.
  2. Comprehensive Scans: Beyond container images, Trivy can scan file systems and Git repositories for vulnerabilities, providing a broader security assessment capability.
  3. Easy Integration: Trivy is designed to be easily integrated into CI/CD pipelines, making it straightforward to include vulnerability scanning as part of the build and deployment process. This facilitates early detection of security issues, aligning with the “shift left” security approach.
  4. High Accuracy and Low False Positives: Trivy aims to reduce false positives and ensure high accuracy in its vulnerability detection, making it a reliable tool for developers and security teams.
  5. Misconfiguration Detection: Besides vulnerability scanning, Trivy can identify misconfigurations in infrastructure as code (IaC) files, such as Docker, Kubernetes, Terraform, and AWS CloudFormation templates, helping to ensure that cloud environments are configured securely.
  6. Secret Scanning: It can also scan for hardcoded secrets like passwords and API keys within your project’s files and Git repositories, helping to prevent potential security breaches.

Trivy is particularly valued for its simplicity of use; it can be run with a single command, requiring no extensive configuration to get started. Its ability to provide fast and accurate scans across multiple facets of the development pipeline makes it an essential tool in the effort to secure software delivery processes.
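As a flavor of that single-command workflow, here are the three scan modes mentioned above (the image name is just an example):

trivy image nginx:latest   # scan a container image for known CVEs
trivy fs .                 # scan the local file system / project directory
trivy config .             # scan IaC files (Terraform, Kubernetes, Dockerfile) for misconfigurations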

Prerequisites:

FYI: An Ubuntu-based walkthrough is provided in this project, but Windows and macOS work as well

  1. AWS CLI installation
  2. Configure AWS credentials programmatically so Terraform can communicate with AWS
  3. Packer installation
  4. Terraform installation

Step by step instructions:

FYI: All instructions assume you are on an Ubuntu server. If you are on a different OS, please adjust accordingly. For instance, on a CentOS server, you would use yum install instead of apt install

Install AWS CLI:

Official guidance: https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html

You can install the AWS CLI on Ubuntu with the following commands:

sudo apt install awscli

check aws cli

aws --version
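Note that the apt package installs AWS CLI v1. If you prefer v2, the bundled installer from the official guide above works as well; a minimal sketch:

curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
aws --version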

Configure AWS programmatically so Terraform can communicate with AWS

To configure aws cli with default profile

aws configure

Only the two values below need to be provided:

AWS Access Key ID []:
AWS Secret Access Key []:

Where should you obtain them?

Let us jump into the AWS console and open IAM

Make sure you have a user with admin access (if you are not allowed admin-level access, the user must be granted all the permissions needed to interact with the resources created in AWS)

FYI: If you don’t have admin-level access, you may need to test along the way, since programmatic access is limited to the policies attached to the user
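To confirm the credentials are wired up before moving on to Packer and Terraform, a quick identity check against the default profile is enough:

aws sts get-caller-identity
# should print the account ID, user ID and ARN of the IAM user you just configured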

Install Packer:

To build the Amazon Linux based EKS optimized Golden AMI, we need Packer installed.

Install packer on ubuntu server

Official Guidance: https://developer.hashicorp.com/packer/tutorials/docker-get-started/get-started-install-cli

You can install Packer on Ubuntu with the following commands:

Add the HashiCorp GPG key

curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -

Add the official HashiCorp Linux repository

sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"

Update and install

sudo apt-get update && sudo apt-get install packer

To verify Packer installation

packer --version
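FYI: On newer Ubuntu releases apt-key is deprecated; if the command above complains, the keyring-based setup below should achieve the same result (this mirrors HashiCorp's current install instructions):

wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt-get update && sudo apt-get install packer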

Let us now kick off our project!

Git clone the AMI build repo

Assuming you have git pre-installed. If not, please install it using the guidance here

git clone https://github.com/lightninglife/golden-ami-amazon-eks-optimized-from-aws-official-repo.git

FYI: You’d better clone my repo instead of the official repo, since I tweaked a few files to make it serve our needs (I will explain them in detail)

Git clone the Terraform code from the repo

git clone https://github.com/lightninglife/Provisioning-AWS-Infrastructure-using-Terraform-Packer-Kubernetes-Ansible.git

We now build our AMI

FYI: Let me explain the key file in this folder, which holds the variables we need to adjust

eks-worker-al2-variables.json

{
  "additional_yum_repos": "",
  "ami_component_description": "(k8s: {{ user `kubernetes_version` }}, docker: {{ user `docker_version` }}, containerd: {{ user `containerd_version` }})",
  "ami_description": "EKS Kubernetes Worker AMI with AmazonLinux2 image",
  "ami_regions": "us-east-1",
  "ami_users": "",
  "associate_public_ip_address": "",
  "aws_access_key_id": "{{env `AWS_ACCESS_KEY_ID`}}",
  "aws_region": "us-east-1",
  "aws_secret_access_key": "{{env `AWS_SECRET_ACCESS_KEY`}}",
  "aws_session_token": "{{env `AWS_SESSION_TOKEN`}}",
  "binary_bucket_name": "amazon-eks",
  "binary_bucket_region": "us-west-2",
  "cache_container_images": "false",
  "cni_plugin_version": "v1.2.0",
  "containerd_version": "1.7.*",
  "creator": "{{env `USER`}}",
  "docker_version": "20.10.*",
  "enable_fips": "false",
  "encrypted": "false",
  "kernel_version": "",
  "kms_key_id": "",
  "launch_block_device_mappings_volume_size": "8",
  "pause_container_version": "3.5",
  "pull_cni_from_github": "true",
  "remote_folder": "/tmp",
  "runc_version": "1.1.*",
  "security_group_id": "",
  "source_ami_filter_name": "amzn2-ami-kernel-5.10-hvm-2.0.20240131.0-x86_64-gp2",
  "source_ami_id": "ami-0cf10cdf9fcd62d37",
  "source_ami_owners": "137112412989",
  "ssh_interface": "",
  "ssh_username": "ec2-user",
  "ssm_agent_version": "",
  "subnet_id": "",
  "temporary_security_group_source_cidrs": "",
  "volume_type": "gp2",
  "working_dir": "{{user `remote_folder`}}/worker"
}

The file above holds all the variables we may tweak as needed

FYI: Keep this in mind: don’t ever change the value below unless your IP address is in China

"binary_bucket_region": "us-west-2"

Do not assume you need to change it to match the region you deploy to in AWS; this is the region where AWS hosts the EKS binaries, not the region you apply your infrastructure to

To select the base AMI used for this Golden AMI:

"source_ami_filter_name": "amzn2-ami-kernel-5.10-hvm-2.0.20240131.0-x86_64-gp2",
"source_ami_id": "ami-0cf10cdf9fcd62d37",
"source_ami_owners": "137112412989",

It is self-explanatory that the AMI name, AMI ID and AMI owner need to be supplied

FYI: Keep in mind, this AMI has to be Amazon Linux 2. From the AWS console, we may pinpoint it as shown below

Click Launch instances

As highlighted below, Amazon Linux 2 was selected and the AMI ID is shown. Make sure you double-check the region (AMI IDs are region specific — the same AMI has a different ID in each region)

Search ami id as shown below

Click Community AMIs Tab to find all info needed

ami name, ami id and owner of the ami

FYI: We’re using Community AMIs since they’re open to the public without a subscription
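If you prefer the CLI over the console, the same lookup can be done with describe-images; a sketch using the filter name and owner from the variables file, for us-east-1:

aws ec2 describe-images \
  --owners 137112412989 \
  --filters "Name=name,Values=amzn2-ami-kernel-5.10-hvm-2.0.20240131.0-x86_64-gp2" \
  --region us-east-1 \
  --query 'Images[0].[ImageId,Name,OwnerId]'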

Adjust the variable below to whichever region you would like to deploy your resources to in AWS

"ami_regions": "us-east-1"

There are other variables you may update accordingly, but the variables above deserve your particular attention

FYI: If you would like to install more tools on the worker nodes, you may add them in this file (I don’t recommend this approach since it will get mixed in with your other builds down the road; it’s much better to keep this build as a plain Amazon official EKS optimized Amazon Linux 2 server)

install-worker.sh

We may run the command below inside the amazon-eks-ami folder

make k8s=1.29

FYI: To make sure you build an AMI with a specific Kubernetes version, adjust k8s=1.29. This is crucial, as an older Kubernetes version may lead to errors.
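Under the hood the make target wraps a packer build against the variables file above. If you ever need to run Packer directly, something along these lines should work (a sketch; I'm assuming the template is the repo's eks-worker-al2.json, so check the Makefile for the exact invocation):

packer validate -var-file=eks-worker-al2-variables.json eks-worker-al2.json
packer build -var-file=eks-worker-al2-variables.json -var kubernetes_version=1.29 eks-worker-al2.json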

We may now check the builder instance in the region you would like to apply your resources to (e.g. us-east-1)

FYI: You may hit errors during the build; fix them based on the error output in the terminal

At the final build stage, you may observe that the instance is stopped

From the terminal, you may see the AMI becoming ready

From ami in AWS console

Now the build is completed

From terminal

From AWS console

Now that the Golden AMI is built, we’re all set for the Terraform deployment
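You can also confirm the new AMI from the CLI before moving on (assuming it was built into your own account in us-east-1):

aws ec2 describe-images --owners self --region us-east-1 \
  --query 'Images[].{Id:ImageId,Name:Name,State:State}'
# the new EKS worker AMI should show up with State "available"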

Install Terraform:

In case you don’t have wget and unzip installed

sudo apt-get update
sudo apt-get install -y wget unzip

Downloading the package

wget https://releases.hashicorp.com/terraform/1.7.4/terraform_1.7.4_linux_amd64.zip

Extract the Terraform binary

unzip terraform_1.7.4_linux_amd64.zip

Move Terraform to your path

sudo mv terraform /usr/bin/

Verify terraform version

terraform version

Note: Here is a trick for pushing code to GitHub when the repo contains files bigger than 100 MB

Download git-lfs package

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash

Install git-lfs package

sudo apt-get install git-lfs

Initialize Git LFS

git lfs install
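Git LFS only kicks in for the file patterns you tell it to track, so register the large files before committing; the pattern below is just an example:

git lfs track "*.zip"     # replace with the pattern(s) of your >100 MB files
git add .gitattributes    # the tracked patterns are recorded here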

Then we are all set to push the code to GitHub with the following commands

git add .
git commit -am "initial commit"
git remote add origin https://github.com/lightninglife/eks_netflix_devsecops_project_argocd.git
git push -u origin master

Folder tree

├── argocd_netflix
│ ├── api_server_endpoint.sh
│ ├── data.tf
│ ├── main.tf
│ ├── output_secrets.sh
│ ├── outputs.tf
│ ├── providers.tf
│ ├── s3.tf
│ ├── terraform.tfvars
│ ├── variables.tf
│ ├── versions.tf
│ ├── web-ec2.pem
├── eks_alb
│ ├── api_server_endpoint.sh
│ ├── assume_role_policy_updated.json
│ ├── certificate.pem
│ ├── data.tf
│ ├── main.tf
│ ├── outputs.tf
│ ├── providers.tf
│ ├── s3.tf
│ ├── terraform.tfvars
│ ├── variables.tf
│ ├── versions.tf
│ ├── web-ec2.pem
├── eks_cluster
│ ├── assume_role_policy.json
│ ├── data.tf
│ ├── main.tf
│ ├── outputs.tf
│ ├── providers.tf
│ ├── s3.tf
│ ├── terraform.tfvars
│ ├── variables.tf
│ ├── versions.tf
│ ├── web-ec2.pem
├── eks_deletion
│ ├── api_server_endpoint.sh
│ ├── data.tf
│ ├── main.tf
│ ├── providers.tf
│ ├── s3.tf
│ ├── terraform.tfvars
│ ├── variables.tf
│ ├── versions.tf
│ ├── web-ec2.pem
├── modules
│ ├── argocd_netflix
│ │ ├── argocd.tf
│ │ ├── data.tf
│ │ ├── variables.tf
│ │ ├── versions.tf
│ ├── alb
│ │ ├── alb.tf
│ │ ├── outputs.tf
│ │ └── variables.tf
│ ├── asg
│ │ ├── asg.tf
│ │ ├── outputs.tf
│ │ └── variables.tf
│ ├── eks
│ │ ├── bastion.tf
│ │ ├── eks.tf
│ │ ├── data.tf
│ │ ├── outputs.tf
│ │ └── variables.tf
│ ├── iam
│ │ ├── iam.tf
│ │ ├── outputs.tf
│ │ └── variables.tf
│ │ └── data.tf
│ ├── eks_alb
│ │ ├── alb_target_group.tf
│ │ ├── argocd_credentials.tf
│ │ ├── controller.tf
│ │ ├── data.tf
│ │ ├── get_argocd_admin_password.sh
│ │ ├── iam.tf
│ │ ├── ingress_class.tf
│ │ ├── ingress.tf
│ │ ├── local_value.tf
│ │ ├── outputs.tf
│ │ └── variables.tf
│ ├── eks_deletion
│ │ ├── data.tf
│ │ ├── main.tf
│ │ └── variables.tf
│ ├── sg
│ │ ├── outputs.tf
│ │ ├── sg.tf
│ │ └── variables.tf
│ └── vpc
│ ├── outputs.tf
│ ├── variables.tf
│ └── vpc.tf

Modules

Variables needed in alb.tf file

asg — subfolder

asg.tf

# netflix

resource "aws_launch_template" "netflix" {
name_prefix = var.aws_launch_template_netflix_name_prefix # "web-lt"
image_id = var.aws_launch_template_netflix_image_id # "ami-0c55b159cbfafe1f0" # Replace with a valid AMI ID
instance_type = var.aws_launch_template_netflix_instance_type # "t2.micro"

user_data = base64encode(var.aws_launch_template_netflix_user_data) # file("${path.module}/../web_userdata.sh")

vpc_security_group_ids = var.aws_launch_template_netflix_vpc_security_group_ids

block_device_mappings {
device_name = var.aws_launch_template_netflix_block_device_mappings_device_name # "/dev/sda1"
ebs {
volume_size = var.aws_launch_template_netflix_block_device_mappings_volume_size # 20
}
}

key_name = var.key_pair_name

# network_interfaces {
# security_groups = var.aws_launch_template_netflix_network_interfaces_security_groups
# associate_public_ip_address = true
# }


lifecycle {
create_before_destroy = true # var.aws_launch_template_web_create_before_destroy # true
}
}

The script above creates a single launch template for the EKS worker nodes

aws_launch_template_netflix_user_data

From the script above, you can see that we use a variable named aws_launch_template_netflix_user_data for the userdata, whose value is provided in terraform.tfvars under the eks_cluster folder

FYI: There is an alternative: the data "template_file" data source in Terraform. However, it is awkward to feed our variables through template_file here, and a secret access key is involved, so I chose not to use this method. It might still be useful for you in certain circumstances; please refer to the reference pages below for userdata variables

Issue with solutions: https://stackoverflow.com/questions/50835636/accessing-terraform-variables-within-user-data-provider-template-file

You can do this using a template_file data source:

data "template_file" "init" {
  template = "${file("router-init.sh.tpl")}"
  vars = {
    some_address = "${aws_instance.some.private_ip}"
  }
}

Then reference it inside the template like:

#!/bin/bash
echo "SOME_ADDRESS = ${some_address}" > /tmp/

Then use that for the user_data:

user_data = ${data.template_file.init.rendered}

Official terraform page: https://registry.terraform.io/providers/hashicorp/template/latest/docs/data-sources/file

Terraform EC2 userdata and variables: https://faun.pub/terraform-ec2-userdata-and-variables-a25b3859118a

Note that Terraform doesn’t recommend this option either:

Although in principle template_file can be used with an inline template string, we don't recommend this approach because it requires awkward escaping. Instead, just use template syntax directly in the configuration.

FYI: The Kubernetes node group will create an EKS managed ASG on our behalf, so we don’t have to create a resource like resource “aws_autoscaling_group”

outputs.tf — asg

output "launch_template_id_netflix" {
value = aws_launch_template.netflix.id
}

The script above outputs the ID of the launch template we are about to create for the web (Netflix) tier

variables.tf — asg

# netflix

variable "aws_launch_template_netflix_vpc_security_group_ids" {
description = "List of security group IDs for the AWS Launch Template used in netflix EKS setup"
type = list(string)
# You can provide a default value if needed:
# default = ["sg-xxxxxxxxxxxxxxx", "sg-yyyyyyyyyyyyyyy"]
}

variable "aws_launch_template_netflix_name_prefix" {
description = "Name prefix for the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_image_id" {
description = "AMI ID for the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_instance_type" {
description = "Instance type for the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_block_device_mappings_device_name" {
description = "Device name for block device mappings in the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_block_device_mappings_volume_size" {
description = "Volume size for block device mappings in the AWS launch template"
type = number
}

variable "aws_launch_template_netflix_create_before_destroy" {
description = "Lifecycle setting for create_before_destroy in the AWS launch template"
type = bool
}

variable "aws_autoscaling_group_netflix_desired_capacity" {
description = "Desired capacity for the AWS Auto Scaling Group"
type = number
}

variable "aws_autoscaling_group_netflix_max_size" {
description = "Maximum size for the AWS Auto Scaling Group"
type = number
}

variable "aws_autoscaling_group_netflix_min_size" {
description = "Minimum size for the AWS Auto Scaling Group"
type = number
}

variable "aws_autoscaling_group_netflix_launch_template_version" {
description = "Launch template version for the AWS Auto Scaling Group"
type = string
}

variable "aws_autoscaling_group_netflix_tag_key" {
description = "Tag key for the AWS Auto Scaling Group instances"
type = string
}

variable "aws_autoscaling_group_netflix_tag_value" {
description = "Tag value for the AWS Auto Scaling Group instances"
type = string
}

variable "aws_autoscaling_group_netflix_tag_propagate_at_launch" {
description = "Tag propagation setting for the AWS Auto Scaling Group instances"
type = bool
}

variable "aws_launch_template_netflix_user_data" {
description = "Userdata file"
type = string
}

variable "aws_autoscaling_group_netflix_vpc_zone_identifier" {
description = "subnet id"
type = list(string)
}

variable "aws_launch_template_netflix_network_interfaces_security_groups" {
description = "List of security group IDs to associate with network interfaces in the launch template"
type = list(string)
# You can set default security groups here if needed
}

variable "eks_cluster_netflix_name" {
description = "Name of the netflix EKS cluster"
type = string
}

variable "aws_eks_node_group_instance_types" {
description = "Instance types for the EKS node group"
type = string
}

# variable "kubernetes_network_policy_jenkins_network_policy_spec_ingress_app" {
# description = "The label selector for matching pods in the ingress rule."
# type = string
# }

variable "aws_eks_cluster_netflix_version" {
description = "The version of netflix to use with AWS EKS cluster"
type = string
# You can set your desired default value here
}

variable "key_pair_name" {
description = "Name of the AWS Key Pair to associate with EC2 instances"
type = string
# Set a default value if needed
}

Variables needed in asg.tf file

eks — subfolder

bastion.tf

# Bastion for netflix
resource "aws_instance" "eks_cluster_netflix_bastion_host" {
ami = var.aws_instance_eks_cluster_netflix_bastion_host_ami # "ami-12345678" # Specify an appropriate AMI for your region
instance_type = var.aws_instance_eks_cluster_netflix_bastion_host_instance_type # "t2.micro"
key_name = var.key_pair_name
subnet_id = var.aws_instance_eks_cluster_netflix_bastion_host_subnet_id # "subnet-12345678" # Specify the ID of your public subnet

security_groups = var.aws_instance_eks_cluster_netflix_bastion_host_security_groups # aws_security_group.bastion_host_sg.id

tags = {
Name = var.aws_instance_eks_cluster_netflix_bastion_host_tags # "bastion-host"
}

provisioner "file" {
source = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source # "/path/to/your/key.pem"
destination = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination # "/home/ec2-user/key.pem" # Adjust the destination path as needed


connection {
type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type # "ssh"
user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user # "ec2-user"
private_key = file(var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source)
host = self.public_ip
}

}
}

resource "null_resource" "force_provisioner" {
triggers = {
always_run = timestamp()
}

depends_on = [aws_instance.eks_cluster_netflix_bastion_host]
}


resource "null_resource" "trigger_remote_exec" {
depends_on = [aws_instance.eks_cluster_netflix_bastion_host]

triggers = {
always_run = timestamp()
}

provisioner "remote-exec" {
inline = var.aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline # "chmod 400 /home/ec2-user/web-ec2.pem"


connection {
type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type # "ssh"
user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user # "ec2-user"
private_key = file(var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source) # Specify the path to your private key
host = aws_eip.eks_cluster_netflix_bastion_eip.public_ip
}
}
}

resource "aws_eip" "eks_cluster_netflix_bastion_eip" {
instance = aws_instance.eks_cluster_netflix_bastion_host.id
}

# Define other resources such as route tables, security groups for EKS worker nodes, etc.
# Bastion for netflix
resource "aws_instance" "eks_cluster_netflix_bastion_host" {
ami = var.aws_instance_eks_cluster_netflix_bastion_host_ami # "ami-12345678" # Specify an appropriate AMI for your region
instance_type = var.aws_instance_eks_cluster_netflix_bastion_host_instance_type # "t2.micro"
key_name = var.key_pair_name
subnet_id = var.aws_instance_eks_cluster_netflix_bastion_host_subnet_id # "subnet-12345678" # Specify the ID of your public subnet

security_groups = var.aws_instance_eks_cluster_netflix_bastion_host_security_groups # aws_security_group.bastion_host_sg.id

tags = {
Name = var.aws_instance_eks_cluster_netflix_bastion_host_tags # "bastion-host"
}

provisioner "file" {
source = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source # "/path/to/your/key.pem"
destination = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination # "/home/ec2-user/key.pem" # Adjust the destination path as needed


connection {
type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type # "ssh"
user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user # "ec2-user"
private_key = file(var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source)
host = self.public_ip
}

}
}

The script above creates a bastion host for the EKS worker nodes, with the private key copied onto the server from the web-ec2.pem file

Note: host = self.public_ip is crucial, since the IP address we try to connect to is the bastion’s own public IP address

resource "null_resource" "force_provisioner" {
triggers = {
always_run = timestamp()
}
depends_on = [aws_instance.eks_cluster_netflix_bastion_host]
}

FYI: The script above was tweaked to make sure the userdata is run as expected when updating the Terraform pipeline

resource "null_resource" "trigger_remote_exec" {
depends_on = [aws_instance.eks_cluster_netflix_bastion_host]
triggers = {
always_run = timestamp()
}

provisioner "remote-exec" {
inline = var.aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline # "chmod 400 /home/ec2-user/web-ec2.pem"


connection {
type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type # "ssh"
user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user # "ec2-user"
private_key = file(var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source) # Specify the path to your private key
host = aws_eip.eks_cluster_netflix_bastion_eip.public_ip
}
}
}

FYI: We tweaked host because I ran into an IP address conflict when the instance’s original public IP was used instead of the EIP address on the Ubuntu server. To avoid this issue, we force host to the public IP address of the EIP

Also, the provisioner “file” block in the instance section connects to the server using the private key stored locally

provisioner "file" {
source = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source # "/path/to/your/key.pem"
destination = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination # "/home/ec2-user/key.pem" # Adjust the destination path as needed


connection {
type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type # "ssh"
user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user # "ec2-user"
private_key = file(var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source)
host = self.public_ip
}

The script below re-triggers the provisioner on the bastion every time we run the Terraform pipeline, and runs the command in the inline section to grant read-only permission on the web-ec2.pem file in the ec2-user home directory of the bastion

"chmod 400 /home/ec2-user/web-ec2.pem"
resource "null_resource" "trigger_remote_exec" {
depends_on = [aws_instance.eks_cluster_netflix_bastion_host]

triggers = {
always_run = timestamp()
}

provisioner "remote-exec" {
inline = var.aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline # "chmod 400 /home/ec2-user/web-ec2.pem"


connection {
type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type # "ssh"
user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user # "ec2-user"
private_key = file(var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_source) # Specify the path to your private key
host = aws_instance.eks_cluster_netflix_bastion_host.public_ip
}
}
}

Lastly, we create an EIP to attach to this bastion for public access

resource "aws_eip" "eks_cluster_netflix_bastion_eip" {
instance = aws_instance.eks_cluster_netflix_bastion_host.id
}
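Once the bastion and its EIP exist, you can hop through it to reach the worker nodes in the private subnets; a sketch (the EIP and the worker's private IP are placeholders):

ssh -i web-ec2.pem ec2-user@<bastion-eip>
# then, from the bastion, using the key the file provisioner copied over
ssh -i /home/ec2-user/web-ec2.pem ec2-user@<worker-private-ip>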

eks.tf — eks

# esk netflix
resource "aws_eks_cluster" "netflix" {
name = var.eks_cluster_netflix_name # "netflix-cluster"
role_arn = var.aws_eks_cluster_netflix_role_arn

vpc_config {
subnet_ids = var.subnets # Replace with your subnet IDs
security_group_ids = [var.aws_eks_cluster_netflix_security_group_ids]
}

version = var.aws_eks_cluster_netflix_version

enabled_cluster_log_types = var.aws_eks_cluster_netflix_enabled_cluster_log_types # ["api", "audit", "authenticator", "controllerManager", "scheduler"]

# Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling.
# Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups.
depends_on = [
var.aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy, # aws_iam_role_policy_attachment.eks-AmazonEKSClusterPolicy,
var.aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController # aws_iam_role_policy_attachment.eks-AmazonEKSVPCResourceController,
]

}

resource "aws_eks_node_group" "netflix" {
cluster_name = aws_eks_cluster.netflix.name
node_group_name = var.aws_eks_node_group_netflix_name # "netflix-node-group"
node_role_arn = var.aws_eks_node_group_netflix_role_arn
subnet_ids = var.subnets # Replace with your subnet IDs
# instance_types = [var.aws_eks_node_group_instance_types] # ["t2.micro"]
scaling_config {
desired_size = var.aws_eks_node_group_desired_capacity # 2
min_size = var.aws_eks_node_group_min_size # 1
max_size = var.aws_eks_node_group_max_size # 3
}
launch_template {
id = var.aws_eks_node_group_launch_template_name_prefix_netflix # "id"
version = var.aws_eks_node_group_launch_template_version # "$Latest"

}

depends_on = [
var.eks_worker_node_policy_attachment_netflix,
var.eks_cni_policy_attachment_netflix,
var.eks_ec2_container_registry_readonly_attachment_netflix
]
}

resource "aws_eks_addon" "netflix" {
cluster_name = aws_eks_cluster.netflix.name
addon_name = var.aws_eks_addon_netflix_addon_name # "vpc-cni"
addon_version = var.aws_eks_addon_netflix_addon_version # "v1.16.2-eksbuild.1" #e.g., previous version v1.9.3-eksbuild.3 and the new version is v1.10.1-eksbuild.1
}

To create the EKS cluster for the web app (Netflix), we use the script below

resource "aws_eks_cluster" "netflix" {
name = var.eks_cluster_netflix_name # "netflix-cluster"
role_arn = var.aws_eks_cluster_netflix_role_arn

vpc_config {
subnet_ids = var.subnets # Replace with your subnet IDs
security_group_ids = [var.aws_eks_cluster_netflix_security_group_ids]
}

version = var.aws_eks_cluster_netflix_version

enabled_cluster_log_types = var.aws_eks_cluster_netflix_enabled_cluster_log_types # ["api", "audit", "authenticator", "controllerManager", "scheduler"]

# Ensure that IAM Role permissions are created before and deleted after EKS Cluster handling.
# Otherwise, EKS will not be able to properly delete EKS managed EC2 infrastructure such as Security Groups.
depends_on = [
var.aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy, # aws_iam_role_policy_attachment.eks-AmazonEKSClusterPolicy,
var.aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController # aws_iam_role_policy_attachment.eks-AmazonEKSVPCResourceController,
]

}

FYI: Please keep the following in mind; otherwise, you will run into networking issues when trying to join worker nodes to the EKS cluster

security_group_ids = [var.aws_eks_cluster_netflix_security_group_ids]

The security group above must include an ingress rule that allows access from the VPC on port 443
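For reference, if you ever had to add that rule by hand instead of through the sg module, the CLI equivalent would look something like this (the group ID and VPC CIDR are placeholders):

aws ec2 authorize-security-group-ingress \
  --group-id <cluster-security-group-id> \
  --protocol tcp --port 443 \
  --cidr 10.0.0.0/16   # your VPC CIDR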

Also, make sure the Kubernetes version of the cluster matches the Kubernetes client version we installed when building the Golden AMI. If you can still recall, that version was 1.29.

If the versions do not match, you can expect errors when using kubectl

version = var.aws_eks_cluster_netflix_version
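A quick way to double-check both sides once the cluster and nodes are up (the cluster name here is the one commented in eks.tf):

aws eks describe-cluster --name netflix-cluster --region us-east-1 --query 'cluster.version' --output text
kubectl get nodes -o wide   # the VERSION column shows the kubelet version baked into the Golden AMI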

One more concern is over dependencies

depends_on = [
var.aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy, # aws_iam_role_policy_attachment.eks-AmazonEKSClusterPolicy,
var.aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController # aws_iam_role_policy_attachment.eks-AmazonEKSVPCResourceController,
]

The creation of the EKS cluster depends on the two policy attachments above; otherwise, it will not be created properly

The script below creates the worker node group using the launch template we explained previously

resource "aws_eks_node_group" "netflix" {
cluster_name = aws_eks_cluster.netflix.name
node_group_name = var.aws_eks_node_group_netflix_name # "netflix-node-group"
node_role_arn = var.aws_eks_node_group_netflix_role_arn
subnet_ids = var.subnets # Replace with your subnet IDs
# instance_types = [var.aws_eks_node_group_instance_types] # ["t2.micro"]
scaling_config {
desired_size = var.aws_eks_node_group_desired_capacity # 2
min_size = var.aws_eks_node_group_min_size # 1
max_size = var.aws_eks_node_group_max_size # 3
}
launch_template {
id = var.aws_eks_node_group_launch_template_name_prefix_netflix # "id"
version = var.aws_eks_node_group_launch_template_version # "$Latest"

}

depends_on = [
var.eks_worker_node_policy_attachment_netflix,
var.eks_cni_policy_attachment_netflix,
var.eks_ec2_container_registry_readonly_attachment_netflix
]
}

The creation of the EKS node group depends on the three policy attachments above; otherwise, it will not be created

To scale in/out, you may need to adjust the values below in terraform.tfvars

scaling_config {
desired_size = var.aws_eks_node_group_desired_capacity # 2
min_size = var.aws_eks_node_group_min_size # 1
max_size = var.aws_eks_node_group_max_size # 3
}
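If you'd rather not edit terraform.tfvars for a one-off change, the same values can be overridden on the command line; a sketch, assuming the root module in eks_cluster exposes variables with these names, as its terraform.tfvars suggests:

terraform apply \
  -var="aws_eks_node_group_desired_capacity=3" \
  -var="aws_eks_node_group_max_size=4"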

Then, we have the option to add resources for EKS add-ons

resource "aws_eks_addon" "netflix" {
cluster_name = aws_eks_cluster.netflix.name
addon_name = var.aws_eks_addon_netflix_addon_name # "vpc-cni"
addon_version = var.aws_eks_addon_netflix_addon_version # "v1.16.2-eksbuild.1" #e.g., previous version v1.9.3-eksbuild.3 and the new version is v1.10.1-eksbuild.1
}

outputs.tf — eks

output "eks_cluster_netflix_name" {
value = aws_eks_cluster.netflix.id
}

output "eks_cluster_netflix" {
value = aws_eks_cluster.netflix
}

output "eks_cluster_netflix_url" {
value = aws_eks_cluster.netflix.identity[0].oidc[0].issuer
}

output "eks_cluster_netflix_endpoint" {
value = aws_eks_cluster.netflix.endpoint
}

output "eks_cluster_netflix_certificate_authority" {
value = aws_eks_cluster.netflix.certificate_authority
}

output "eks_nodegroup_netflix_name" {
value = aws_eks_node_group.netflix.id
}

data "aws_eks_cluster" "eks_cluster_netflix" {
name = aws_eks_cluster.netflix.name # Replace with your EKS cluster name

depends_on = [aws_eks_cluster.netflix]
}

output "eks_cluster_security_group_ids" {
value = data.aws_eks_cluster.eks_cluster_netflix.vpc_config[0].security_group_ids
}

output "eks_asg_name" {
value = aws_eks_node_group.netflix.resources.0.autoscaling_groups.0.name
}

Here we output eks cluster and node group related values for use in our main.tf file
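Once the cluster is up, these outputs make it easy to point kubectl at it from the CLI (using the cluster name commented in eks.tf and the us-east-1 region assumed throughout):

aws eks update-kubeconfig --region us-east-1 --name netflix-cluster
kubectl get nodes   # the worker nodes from the node group should show up as Ready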

variables.tf — eks

variable "eks_cluster_netflix_name" {
description = "Name of the netflix EKS cluster"
type = string
}

variable "aws_eks_node_group_netflix_name" {
description = "Name of the netflix EKS node group"
type = string
}

variable "aws_eks_node_group_instance_types" {
description = "Instance types for the EKS node group"
type = string
}

variable "aws_eks_node_group_desired_capacity" {
description = "Desired capacity for the EKS node group"
type = number
}

variable "aws_eks_node_group_min_size" {
description = "Minimum size for the EKS node group"
type = number
}

variable "aws_eks_node_group_max_size" {
description = "Maximum size for the EKS node group"
type = number
}

variable "aws_eks_node_group_launch_template_name_prefix" {
description = "Name prefix for the EKS node group launch template"
type = string
}

variable "aws_eks_node_group_launch_template_version" {
description = "Version for the EKS node group launch template"
type = string
}

variable "aws_eks_node_group_device_name" {
description = "Device name for the EKS node group block device mappings"
type = string
}

variable "aws_eks_node_group_volume_size" {
description = "Volume size for the EKS node group block device mappings"
type = number
}

variable "subnets" {
description = "subnets"
type = list(string)
}

variable "aws_eks_cluster_netflix_role_arn" {
description = "EKS Cluster for netflix's role arn"
type = string
}

variable "aws_eks_node_group_netflix_role_arn" {
description = "EKS node group for netflix's role arn"
type = string
}

variable "aws_eks_cluster_netflix_version" {
description = "The version of netflix to use with AWS EKS cluster"
type = string
# You can set your desired default value here
}

variable "ec2_ssh_key" {
description = "Name of the EC2 SSH key pair"
type = string
# You can set a default value if needed
# default = "example-key-pair-name"
}

variable "eks_worker_node_policy_attachment_netflix" {
description = "IAM policy attachment name for worker nodes in netflix EKS setup"
type = string
}

variable "eks_cni_policy_attachment_netflix" {
description = "IAM policy attachment name for CNI (Container Network Interface) in netflix EKS setup"
type = string
}

variable "eks_ec2_container_registry_readonly_attachment_netflix" {
description = "IAM policy attachment name for read-only access to the EC2 container registry in netflix EKS setup"
type = string
}

variable "aws_eks_node_group_launch_template_name_prefix_netflix" {
description = "Prefix for the name of the AWS EKS Node Group launch template in netflix setup"
type = string
# You can provide a default prefix if needed
}

variable "aws_eks_addon_netflix_addon_name" {
description = "Name of the AWS EKS addon for netflix"
type = string
}

variable "aws_eks_addon_netflix_addon_version" {
description = "Version of the AWS EKS addon for netflix"
type = string
}

variable "aws_eks_cluster_netflix_security_group_ids" {
description = "Security group IDs for the EKS cluster used by netflix"
type = string
}

# bastion
variable "aws_instance_eks_cluster_netflix_bastion_host_ami" {
description = "The AMI ID for the bastion host"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_instance_type" {
description = "The instance type for the bastion host"
type = string
}

variable "key_pair_name" {
description = "The name of the AWS key pair used to access the bastion host"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_subnet_id" {
description = "The ID of the subnet where the bastion host will be launched"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_security_groups" {
description = "The ID of the security group(s) for the bastion host"
type = list(string)
}

variable "aws_instance_eks_cluster_netflix_bastion_host_tags" {
description = "Tags for the bastion host instance"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_provisioner_source" {
description = "Source path of the file to be provisioned to the bastion host"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination" {
description = "Destination path on the bastion host where the file will be copied"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline" {
description = "Inline script to be executed on the bastion host using remote-exec provisioner"
type = list(string)
}

# variable "kubernetes_manifest_netflix_manifest" {
# type = string
# description = "List of paths to Kubernetes manifest files for the retail store sample app"
# }

# variable "apply_kubernetes_manifest_netflix_command" {
# type = string
# description = "Command for applying the Kubernetes manifest for the retail store sample app"
# }

# variable "wait_for_deployments_netflix_command" {
# type = string
# description = "Command for waiting for deployments to be available for the retail store sample app"
# }

variable "aws_eks_cluster_netflix_enabled_cluster_log_types" {
description = "The log types of Netflix EKS cluster"
type = list(string)
}

variable "aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy" {
description = "ARN of the IAM policy attached to EKS cluster for AmazonEKSClusterPolicy"
type = any
}

variable "aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController" {
description = "ARN of the IAM policy attached to EKS cluster for AmazonEKSVPCResourceController"
type = any
}

# variable "kubernetes_manifest_argo_cd" {
# description = "Path to the Argo CD manifest file"
# type = any
# }

variable "aws_instance_eks_cluster_netflix_bastion_host_file_type" {
description = "The file type of the Netflix bastion host file"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_file_user" {
description = "The user for accessing the Netflix bastion host file"
type = string
}

Variables needed in eks.tf and bastion.tf files

iam — subfolder

iam.tf

# IAM for cluster
resource "aws_iam_role" "eks_cluster_netflix" {
name = var.aws_iam_role_eks_cluster_netflix_name # "netflix-cluster-role"
assume_role_policy = var.aws_iam_role_eks_cluster_assume_role_policy_netflix # file("${path.module}/assume_role_policy.json")
}

## This null resource will update the trust policy for eks worker role.
# resource "null_resource" "add_eks_cluster_assume_role_policy" {
# depends_on = [
# aws_iam_openid_connect_provider.cluster
# ]
# provisioner "local-exec" {
# command = "sleep 5;aws iam update-assume-role-policy --role-name ${var.aws_iam_role_eks_cluster_netflix_name} --policy-document '${self.triggers.after}' "
# }
# triggers = {
# # updated_policy_json = (replace(replace(var.aws_iam_role_eks_cluster_assume_role_policy_netflix,"\n", "")," ", ""))
# after = var.aws_iam_role_eks_cluster_assume_role_policy_netflix_updated
# }
# }


# Associate IAM Policy to IAM Role for cluster
resource "aws_iam_role_policy_attachment" "eks_AmazonEKSClusterPolicy" {
policy_arn = var.aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy # "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.eks_cluster_netflix.name
}

resource "aws_iam_role_policy_attachment" "eks_AmazonEKSVPCResourceController" {
policy_arn = var.aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController # "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController"
role = aws_iam_role.eks_cluster_netflix.name
}

# IAM for node group

resource "aws_iam_role" "eks_nodegroup_role_netflix" {
name = var.aws_iam_role_eks_nodegroup_role_netflix_name

assume_role_policy = jsonencode(var.aws_iam_role_eks_nodegroup_role_netflix_assume_role_policy)
# assume_role_policy = jsonencode({
# Statement = [

# {
# Action = "sts:AssumeRole"
# Effect = "Allow"
# Principal = {
# Service = ["ec2.amazonaws.com", "eks.amazonaws.com"]
# }
# }
# ]
# })
}

# resource "aws_iam_role_policy" "eks_nodegroup_role_netflix_policy" {
# name = "eks-nodegroup-role-netflix-describe"
# role = aws_iam_role.eks_nodegroup_role_netflix.name

# policy = jsonencode({
# Version = "2012-10-17",
# Statement = [
# {
# Action = [
# "eks:DescribeCluster",
# "eks:AccessKubernetesApi",
# "iam:CreateOpenIDConnectProvider",
# "sts:AssumeRoleWithWebIdentity"
# ],
# Effect = "Allow",
# Resource = "*" # You can specify the ARN of your EKS cluster if needed.
# }
# ]
# })
# }

resource "aws_iam_policy_attachment" "eks_worker_node_policy" {
name = var.aws_iam_policy_attachment_eks_worker_node_policy_name # "eks-worker-node-policy-attachment" # Unique name
policy_arn = var.aws_iam_policy_attachment_eks_worker_node_policy_policy_arn # "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
roles = [aws_iam_role.eks_nodegroup_role_netflix.id]
}


resource "aws_iam_policy_attachment" "eks_cni_policy" {
name = var.aws_iam_policy_attachment_eks_cni_policy_name # "eks_cni-policy"
policy_arn = var.aws_iam_policy_attachment_eks_cni_policy_policy_arn # "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
roles = [aws_iam_role.eks_nodegroup_role_netflix.id]
}

resource "aws_iam_policy_attachment" "eks_ec2_container_registry_readonly" {
name = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_name # "eks_worker_nodes_policy"
policy_arn = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn # "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
roles = [aws_iam_role.eks_nodegroup_role_netflix.id]
}

# open connection identity provider
resource "aws_iam_openid_connect_provider" "cluster" {
client_id_list = ["sts.${data.aws_partition.current.dns_suffix}"] # ["sts.amazonaws.com"]
thumbprint_list = [data.tls_certificate.cluster.certificates.0.sha1_fingerprint]
url = var.eks_netflix_url # aws_eks_cluster.cluster.identity.0.oidc.0.issuer
}

The script below creates the IAM role for the EKS cluster and attaches the assume-role (trust) policy to it.

resource "aws_iam_role" "eks_cluster_netflix" {
name = var.aws_iam_role_eks_cluster_netflix_name # "netflix-cluster-role"
assume_role_policy = var.aws_iam_role_eks_cluster_assume_role_policy_netflix # file("${path.module}/assume_role_policy.json")
}

FYI: Keep in mind that when we talk about a policy here, it is the trust relationship (assume-role policy) on the role, not a permissions policy under the Permissions tab.

Here is the JSON document we use for this trust policy:

{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": ["eks.amazonaws.com","ec2.amazonaws.com" ]
}
}
]
}

Both eks.amazonaws.com and ec2.amazonaws.com are allowed to assume this role.

We pass this policy in from a file rather than inlining it, as shown in the sketch below.
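Just as a minimal sketch (the module path and the remaining variables here are placeholders, not the repo's exact wiring), the calling main.tf can read that JSON file and hand it to the module:

module "iam" {
  source = "./iam" # hypothetical module path

  # Trust policy JSON kept as a file next to main.tf and passed in as a plain string
  aws_iam_role_eks_cluster_assume_role_policy_netflix = file("${path.module}/assume_role_policy.json")

  # ...remaining iam variables omitted...
}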

The script below attaches the two managed policies required by the EKS cluster role.

FYI: These policies appear under the Permissions tab of the role.

resource "aws_iam_role_policy_attachment" "eks_AmazonEKSClusterPolicy" {
policy_arn = var.aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy # "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
role = aws_iam_role.eks_cluster_netflix.name
}
resource "aws_iam_role_policy_attachment" "eks_AmazonEKSVPCResourceController" {policy_arn = var.aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController # "arn:aws:iam::aws:policy/AmazonEKSVPCResourceController"role       = aws_iam_role.eks_cluster_netflix.name}

The script below creates the IAM role for the EKS node group and sets its assume-role policy.

resource "aws_iam_role" "eks_nodegroup_role_netflix" {
name = var.aws_iam_role_eks_nodegroup_role_netflix_name

assume_role_policy = jsonencode(var.aws_iam_role_eks_nodegroup_role_netflix_assume_role_policy)
}

As you can see, we could also inline the assume-role policy directly instead of passing it in through a variable.

The choice is yours, but I prefer keeping the policy documents as files next to main.tf, since that lets us manage everything consistently through variables.

The script below attaches the three policies the node group IAM role needs.

resource "aws_iam_policy_attachment" "eks_worker_node_policy" {
name = var.aws_iam_policy_attachment_eks_worker_node_policy_name # "eks-worker-node-policy-attachment" # Unique name
policy_arn = var.aws_iam_policy_attachment_eks_worker_node_policy_policy_arn # "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy"
roles = [aws_iam_role.eks_nodegroup_role_netflix.id]
}


resource "aws_iam_policy_attachment" "eks_cni_policy" {
name = var.aws_iam_policy_attachment_eks_cni_policy_name # "eks_cni-policy"
policy_arn = var.aws_iam_policy_attachment_eks_cni_policy_policy_arn # "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy"
roles = [aws_iam_role.eks_nodegroup_role_netflix.id]
}

resource "aws_iam_policy_attachment" "eks_ec2_container_registry_readonly" {
name = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_name # "eks_worker_nodes_policy"
policy_arn = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn # "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
roles = [aws_iam_role.eks_nodegroup_role_netflix.id]
}

Note: Here is the part that deserves the most attention.

I spent a lot of time figuring out that this OIDC provider must be created in IAM and associated with the ALB controller in the EKS cluster; otherwise the controller simply cannot communicate with AWS.

# open connection identity provider
resource "aws_iam_openid_connect_provider" "cluster" {
client_id_list = ["sts.${data.aws_partition.current.dns_suffix}"] # ["sts.amazonaws.com"]
thumbprint_list = [data.tls_certificate.cluster.certificates.0.sha1_fingerprint]
url = var.eks_netflix_url # aws_eks_cluster.cluster.identity.0.oidc.0.issuer
}

data.tf — iam

data "tls_certificate" "cluster" {
url = var.eks_netflix_url # aws_eks_cluster.cluster.identity.0.oidc.0.issuer

}

data "aws_caller_identity" "current" {}
output "account_id" {
value = data.aws_caller_identity.current.account_id
}

data "http" "lbc_iam_policy" {
url = var.data_http_lbc_iam_policy_url # "https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/install/iam_policy.json"

# Optional request headers
request_headers = {
Accept = var.data_http_lbc_iam_policy_request_headers_accept # "application/json"
}
}

data "aws_partition" "current" {}

As you can see above, we use variables for the URL and the other values; their concrete values are set in main.tf of the eks_cluster folder.

The real values ultimately come from the EKS module's outputs.

outputs.tf — iam

output "eks_netflix_cluster_iam_role_arn" {
value = aws_iam_role.eks_cluster_netflix.arn
}

output "eks_netflix_nodegroup_iam_role_arn" {
value = aws_iam_role.eks_nodegroup_role_netflix.arn
}

# output "eks_netflix_nodegroup_instance_iam_role_arn" {
# value = aws_iam_instance_profile.eks_instance_profile.arn
# }

output "eks_worker_node_policy_attachment_netflix" {
value = var.aws_iam_policy_attachment_eks_worker_node_policy_policy_arn
}

output "eks_cni_policy_attachment_netflix" {
value = var.aws_iam_policy_attachment_eks_cni_policy_policy_arn
}

output "eks_ec2_container_registry_readonly_attachment_netflix" {
value = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn
}

output "oidc_provider_arn" {
value = aws_iam_openid_connect_provider.cluster.arn
}

output "eks_AmazonEKSClusterPolicy" {
value = aws_iam_role_policy_attachment.eks_AmazonEKSClusterPolicy
}

output "eks_AmazonEKSVPCResourceController" {
value = aws_iam_role_policy_attachment.eks_AmazonEKSVPCResourceController
}

These outputs from the IAM module are consumed in the main.tf file, as sketched below.
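As a rough sketch (module paths, variable names, and output names below are assumptions, not the repo's exact ones), the root main.tf might wire these outputs into the EKS module like this:

module "iam" {
  source = "./iam"
  # ...variables described above...
}

module "eks" {
  source = "./eks_cluster"

  # Role ARNs produced by the iam module feed the cluster and node group
  eks_cluster_role_arn   = module.iam.eks_netflix_cluster_iam_role_arn
  eks_nodegroup_role_arn = module.iam.eks_netflix_nodegroup_iam_role_arn
}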

variables.tf — iam

#iam
variable "aws_iam_role_eks_cluster_netflix_name" {
description = "Iam role name for esk cluster"
type = string
}

variable "aws_iam_role_eks_cluster_assume_role_policy_netflix" {
description = "file of the policy netflix"
type = string
}

variable "aws_iam_role_eks_nodegroup_role_netflix_name" {
description = "Name of the IAM role associated with EKS nodegroups for netflix"
type = string
# You can set a default value if needed
# default = "example-role-name"
}

variable "eks_netflix_url" {
type = string
}

variable "tags" {
type = map(string)
default = {
Environment = "production"
ServiceType = "backend"
// Additional tags...
}
}

variable "aws_iam_role_eks_cluster_assume_role_policy_netflix_updated" {
type = string
description = "IAM role policy for assuming roles in the EKS cluster for Netflix (updated)"
}

variable "eks_cluster_netflix" {
description = "Netflix EKS cluster"
type = any
}

variable "data_http_lbc_iam_policy_url" {
description = "The URL for the IAM policy document for the data HTTP load balancer controller"
type = string
}

variable "data_http_lbc_iam_policy_request_headers_accept" {
description = "The value for the 'Accept' header in the IAM policy document for the data HTTP load balancer controller"
type = string
}

variable "aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy" {
description = "ARN of the IAM policy attached to an EKS cluster role allowing control plane to make API requests on your behalf"
type = string
}

variable "aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController" {
description = "ARN of the IAM policy attached to an EKS cluster role allowing the VPC resource controller to make API requests on your behalf"
type = string
}

variable "aws_iam_role_eks_nodegroup_role_netflix_assume_role_policy" {
description = "The assume role policy document for the Netflix EKS node group role"
type = any
}

variable "aws_iam_policy_attachment_eks_worker_node_policy_name" {
description = "The name of the IAM policy attachment for the EKS worker node policy"
type = string
}

variable "aws_iam_policy_attachment_eks_worker_node_policy_policy_arn" {
description = "The ARN of the IAM policy attached to EKS worker nodes"
type = string
}

variable "aws_iam_policy_attachment_eks_cni_policy_name" {
description = "The name of the IAM policy attachment for the EKS CNI policy"
type = string
}

variable "aws_iam_policy_attachment_eks_cni_policy_policy_arn" {
description = "The ARN of the IAM policy attached to EKS CNI"
type = string
}

variable "aws_iam_policy_attachment_eks_ec2_container_registry_readonly_name" {
description = "The name of the IAM policy attachment for EC2 Container Registry readonly access"
type = string
}

variable "aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn" {
description = "The ARN of the IAM policy attached to EC2 Container Registry for readonly access"
type = string
}

variables for iam.tf file

security group (sg) — subfolder

sg.tf

# Define a security group for HTTP/HTTPS access
resource "aws_security_group" "all" {
name = var.security_group_name
description = var.security_group_description
vpc_id = var.vpc_id

# Allow incoming HTTP (port 80) traffic
ingress {
from_port = var.port_80
to_port = var.port_80
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# Allow incoming HTTPS (port 443) traffic
ingress {
from_port = var.port_443
to_port = var.port_443
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# Allow SSH access for netflix (port 22)
ingress {
from_port = var.port_22
to_port = var.port_22
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Grafana (port 3000)
ingress {
from_port = var.port_3000
to_port = var.port_3000
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Jenkins (port 8080)
ingress {
from_port = var.port_8080
to_port = var.port_8080
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# # Allow HTTP access for Netflix (port 8081)
# ingress {
# from_port = var.port_8081
# to_port = var.port_8081
# protocol = var.security_group_protocol
# cidr_blocks = [var.web_cidr]
# }


# Allow HTTP access for SonarQube (port 9000)
ingress {
from_port = var.port_9000
to_port = var.port_9000
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Prometheus (port 9090)
ingress {
from_port = var.port_9090
to_port = var.port_9090
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Node Exporter (port 9100)
ingress {
from_port = var.port_9100
to_port = var.port_9100
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Argocd (port 10250)
ingress {
from_port = var.port_10250
to_port = var.port_10250
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# Allow HTTP access for Netflix (port 30007)
ingress {
from_port = var.port_30007
to_port = var.port_30007
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Argocd Manifest (port 9443)
ingress {
from_port = var.port_9443
to_port = var.port_9443
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# # Allow MySQL access for RDS (port 3306)
# ingress {
# from_port = var.port_3306
# to_port = var.port_3306
# protocol = var.security_group_protocol
# cidr_blocks = [var.private_ip_address]
# }

# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1" # All protocols
cidr_blocks = ["0.0.0.0/0"] # Allow traffic to all destinations
}
}


# Define a security group for eks clusters
# Allow incoming HTTPS (port 443) traffic
resource "aws_security_group" "eks_cluster" {
name = var.security_group_name_eks_cluster
description = var.security_group_description_eks_cluster
vpc_id = var.vpc_id

ingress {
from_port = var.port_443
to_port = var.port_443
protocol = var.security_group_protocol
cidr_blocks = [var.vpc_cidr_block]
}
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1" # All protocols
cidr_blocks = ["0.0.0.0/0"] # Allow traffic to all destinations
}
}

# workder node to the bastion

# resource "aws_security_group_rule" "allow_ssh_from_bastion" {
# type = "ingress"
# from_port = var.port_22
# to_port = var.port_22
# protocol = var.security_group_protocol
# security_group_id = aws_security_group.all.id
# self = true
# # source_security_group_id = aws_security_group.all.id

# }

# resource "null_resource" "force_rule_allow_ssh_from_bastion_update" {
# triggers = {
# always_run = timestamp()
# }

# depends_on = [aws_security_group_rule.allow_ssh_from_bastion]
# }

FYI: self = true guarantees that port 22 stays open to members of the security group itself, which lets the bastion reach the worker nodes even after the Terraform pipeline runs an update.

The script below creates the security group used for general access.

# Define a security group for HTTP/HTTPS access
resource "aws_security_group" "all" {
name = var.security_group_name
description = var.security_group_description
vpc_id = var.vpc_id
# Allow incoming HTTP (port 80) traffic
ingress {
from_port = var.port_80
to_port = var.port_80
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# Allow incoming HTTPS (port 443) traffic
ingress {
from_port = var.port_443
to_port = var.port_443
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# Allow SSH access for netflix (port 22)
ingress {
from_port = var.port_22
to_port = var.port_22
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Grafana (port 3000)
ingress {
from_port = var.port_3000
to_port = var.port_3000
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Jenkins (port 8080)
ingress {
from_port = var.port_8080
to_port = var.port_8080
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for SonarQube (port 9000)
ingress {
from_port = var.port_9000
to_port = var.port_9000
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Prometheus (port 9090)
ingress {
from_port = var.port_9090
to_port = var.port_9090
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Node Exporter (port 9100)
ingress {
from_port = var.port_9100
to_port = var.port_9100
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Argocd (port 10250)
ingress {
from_port = var.port_10250
to_port = var.port_10250
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

# Allow HTTP access for Netflix (port 30007)
ingress {
from_port = var.port_30007
to_port = var.port_30007
protocol = var.security_group_protocol
cidr_blocks = [var.private_ip_address]
self = true
}

# Allow HTTP access for Argocd Manifest (port 9443)
ingress {
from_port = var.port_9443
to_port = var.port_9443
protocol = var.security_group_protocol
cidr_blocks = [var.web_cidr]
}

Basically, we open ports 80 and 443 to everyone, and ports such as 22, 3000, 8080, 9000, 9090 and 9100 only to my own local IP address (and to the security group itself). Sample terraform.tfvars values for these ports are sketched after the list below.

port 80 for web (HTTP) traffic

port 443 for HTTPS traffic

port 22 for SSH access

port 3000 for the Grafana service

port 8080 for the Jenkins service

port 9000 for the SonarQube service

port 9090 for the Prometheus service

port 9100 for the Node Exporter service

port 30007 for the Netflix service

port 10250 and port 9443 for the Argocd service
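For reference, a terraform.tfvars sketch for these values could look like the following (the IP address is only a placeholder; substitute your own public IP as a /32 CIDR):

security_group_protocol = "tcp"
web_cidr                = "0.0.0.0/0"
private_ip_address      = "203.0.113.10/32" # placeholder for your own IP

port_80    = 80
port_443   = 443
port_22    = 22
port_3000  = 3000
port_8080  = 8080
port_9000  = 9000
port_9090  = 9090
port_9100  = 9100
port_9443  = 9443
port_10250 = 10250
port_30007 = 30007
port_3306  = 3306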

FYI: Don't forget egress!

When working in the AWS console you may not even notice this part, because a default outbound rule is pre-defined for you.

If you skip it in Terraform, though, you will run into outbound access issues.

# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1" # All protocols
cidr_blocks = ["0.0.0.0/0"] # Allow traffic to all destinations
}

The script above allows outbound traffic to all destinations.

Now for the key point: the EKS cluster security group must allow access from the VPC on port 443. I am stating it one more time because it cost me a lot of time to figure out, and it is not clearly documented on the official AWS site.

resource "aws_security_group" "eks_cluster" { 
name = var.security_group_name_eks_cluster
description = var.security_group_description_eks_cluster
vpc_id = var.vpc_id
  ingress {
from_port = var.port_443
to_port = var.port_443
protocol = var.security_group_protocol
cidr_blocks = [var.vpc_cidr_block]
}
# Allow all outbound traffic
egress {
from_port = 0
to_port = 0
protocol = "-1" # All protocols
cidr_blocks = ["0.0.0.0/0"] # Allow traffic to all destinations
}
}

The script below adds an SSH ingress rule to the security group so the bastion server can reach the EKS worker nodes.

resource "aws_security_group_rule" "allow_ssh_from_bastion" {
type = "ingress"
from_port = var.port_22
to_port = var.port_22
protocol = var.security_group_protocol
security_group_id = aws_security_group.all.id
source_security_group_id = aws_security_group.all.id
}

FYI: This allows port 22, but only from the bastion's security group. The trick is that both security_group_id and source_security_group_id refer to the same security group created previously:

security_group_id = aws_security_group.all.id
source_security_group_id = aws_security_group.all.id

outputs.tf — sg

output "security_group_id" {
value = aws_security_group.all.id
}

output "security_group_ids" {
value = [aws_security_group.all.id]
}
output "security_group_id_eks_cluster" {
value = aws_security_group.eks_cluster.id
}

We output the values above so they can be referenced in the main.tf file.

variables.tf — sg

variable "security_group_name" {
description = "Name of the AWS security group"
type = string
}

variable "security_group_description" {
description = "Description of the AWS security group"
type = string
}

variable "security_group_name_eks_cluster" {
description = "Name of the AWS security group for eks cluster"
type = string
}

variable "security_group_description_eks_cluster" {
description = "Description of the AWS security group for eks cluster"
type = string
}

variable "vpc_id" {
description = "ID of the VPC where the security group will be created"
type = string
}

variable "port_80" {
description = "Port for HTTP traffic (e.g., 80)"
type = number
}

variable "port_443" {
description = "Port for HTTPS traffic (e.g., 443)"
type = number
}

variable "port_22" {
description = "Port for SSH access (e.g., 22)"
type = number
}

variable "port_3000" {
description = "Port for HTTP access for Grafana (e.g., 3000)"
type = number
}


variable "port_8080" {
description = "Port for HTTP access for Jenkins (e.g., 8080)"
type = number
}

variable "port_10250" {
description = "Port for HTTP access for Argocd (e.g., 10250)"
type = number
}

variable "port_30007" {
description = "Port for HTTP access for Netflix (e.g., 30007)"
type = number
}

# variable "port_8081" {
# description = "Port for HTTP access for Netflix (e.g., 8081)"
# type = number
# }

variable "port_9000" {
description = "Port for HTTP access for Netflix (e.g., 9000)"
type = number
}

variable "port_9090" {
description = "Port for HTTP access for Prometheus (e.g., 9090)"
type = number
}

variable "port_9100" {
description = "Port for HTTP access for Node Exporter (e.g., 9100)"
type = number
}

variable "port_9443" {
description = "Port for HTTP access for Argocd Manifest (e.g., 9443)"
type = number
}

variable "port_3306" {
description = "Port for MySQL access for RDS (e.g., 3306)"
type = number
}

variable "security_group_protocol" {
description = "Protocol for the security group rules (e.g., 'tcp', 'udp', 'icmp', etc.)"
type = string
}

variable "web_cidr" {
description = "CIDR block for incoming HTTP and HTTPS traffic"
type = string
}

variable "private_ip_address" {
description = "CIDR block for private IP addresses (e.g., for SSH, Jenkins, MySQL)"
type = string
}

variable "vpc_cidr_block" {
description = "CIDR block for the VPC"
type = string
}

variables for sg.tf file

vpc — subfolder

vpc.tf

resource "aws_vpc" "all" {
cidr_block = var.vpc_cidr_block
tags = {
Name = var.vpc_name
}
}

The aws_vpc resource above is just the starting template. The complete vpc.tf, shown below, also provisions the public/private subnets, an Internet Gateway (IGW), NAT Gateways, Elastic IP addresses (EIP), public/private route tables, and their route table associations.

Here is the full vpc.tf:

resource "aws_vpc" "all" {
cidr_block = var.vpc_cidr_block
tags = {
Name = var.vpc_name
}
}

resource "aws_subnet" "public" {
count = length(var.public_subnet_cidr_blocks)

vpc_id = aws_vpc.all.id
cidr_block = var.public_subnet_cidr_blocks[count.index]
availability_zone = var.availability_zones[count.index]
map_public_ip_on_launch = true

tags = {
Name = var.aws_subnet_public_name # "public_subnets"
"${var.aws_subnet_public_eks_alb}" = var.aws_subnet_public_eks_alb_value # "kubernetes.io/role/elb" = 1
}

}

resource "aws_subnet" "private" {
count = length(var.private_subnet_cidr_blocks)

vpc_id = aws_vpc.all.id
cidr_block = var.private_subnet_cidr_blocks[count.index]
availability_zone = var.availability_zones[count.index]

tags = {
Name = var.aws_subnet_private_name # "private_subnets"
"${var.aws_subnet_private_eks_alb}" = var.aws_subnet_private_eks_alb_value # "kubernetes.io/role/internal-elb" = 1
}
}


resource "aws_internet_gateway" "all" {
vpc_id = aws_vpc.all.id
tags = {
Name = var.igw_name
}
}

resource "aws_nat_gateway" "all" {
count = length(var.availability_zones)
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id

}

resource "aws_eip" "nat" {
count = length(var.availability_zones)
}


resource "aws_route_table" "public" {
vpc_id = aws_vpc.all.id

route {
cidr_block = "0.0.0.0/0"
gateway_id = aws_internet_gateway.all.id
}

}

resource "aws_route_table" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.all.id

route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = aws_nat_gateway.all[count.index].id
}

}

resource "aws_route_table_association" "public" {
count = length(aws_subnet.public)

subnet_id = aws_subnet.public[count.index].id
route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
count = length(aws_subnet.private)

subnet_id = aws_subnet.private[count.index].id
route_table_id = aws_route_table.private[count.index].id
}

# # Define a null_resource to handle the deletion of the VPC
# resource "null_resource" "delete_vpc" {
# # This local-exec provisioner will execute a command after the VPC is destroyed
# triggers = {
# vpc_id = aws_vpc.all.id
# }

# # This local-exec provisioner will execute a command after the VPC is destroyed
# provisioner "local-exec" {
# command = <<EOF
# aws ec2 delete-vpc --vpc-id ${self.triggers.vpc_id}
# EOF
# # Specify that this provisioner should run only when the null_resource is destroyed
# when = destroy
# }
# }

Notes: For the ALB Controller to be able to create ALBs, we need to tag the public subnets for internet-facing ALBs and the private subnets for internal ALBs (sample terraform.tfvars values follow the reference link below).

Private subnets tag:

  • Key: “kubernetes.io/role/internal-elb”
  • Value: “1”

Public subnets tag:

  • Key: “kubernetes.io/role/elb”
  • Value: “1”

Reference page: https://repost.aws/knowledge-center/load-balancer-troubleshoot-creating
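In terraform.tfvars, the tag variables used by the subnet resources above then carry exactly these key/value pairs (taken from the commented hints in the code):

aws_subnet_public_eks_alb        = "kubernetes.io/role/elb"
aws_subnet_public_eks_alb_value  = 1
aws_subnet_private_eks_alb       = "kubernetes.io/role/internal-elb"
aws_subnet_private_eks_alb_value = 1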

output "vpc_id" {value = aws_vpc.all.id}
# Output the subnet IDs created by the aws_subnet resourceoutput "subnet_ids" {# alue = [for idx, subnet_id in aws_subnet.all[*].id : subnet_id if element(aws_subnet.all[*].availability_zone, idx) != element(aws_subnet.all[*].availability_zone, 0)]value = aws_subnet.private[*].id}
# Output the subnet IDs created by the aws_subnet resourceoutput "subnet_id" {value = aws_subnet.private[0].id}
# Output internet gatewayoutput "igw" {value = aws_internet_gateway.all.id}
# Output cidr blockoutput "vpc_cidr_block" {value = aws_vpc.all.cidr_block}
output "public_subnet" {value = aws_subnet.public[0].id}

These outputs are referenced in the main.tf file.

variables.tf — vpc

# VPC variables
variable "vpc_cidr_block" {
description = "CIDR block for the VPC"
type = string
}

variable "vpc_name" {
description = "Name for the VPC"
type = string
}

# Subnet variables
variable "public_subnet_cidr_blocks" {
description = "List of CIDR blocks for public subnets"
type = list(string)
}

variable "private_subnet_cidr_blocks" {
description = "List of CIDR blocks for private subnets"
type = list(string)
}

# variable "subnet" {
# description = "Name of the subnet"
# type = string
# }


# Internet Gateway variables
variable "igw_name" {
description = "Name for the Internet Gateway"
type = string
}

# Route Table variables
variable "rt_name" {
description = "Name for the Route Table"
type = string
}

# Route Table Association variables
variable "rt_association" {
description = "Name prefix for Route Table Association"
type = string
}

variable "web_cidr" {
description = "Cidr block for web"
type = string
}

variable "availability_zones" {
type = list(string)
}

variable "aws_subnet_public_name" {
description = "Name of the public subnet"
type = string
}

variable "aws_subnet_public_eks_alb" {
description = "Name of the public subnet associated with the EKS Application Load Balancer (ALB)"
type = string
}

variable "aws_subnet_public_eks_alb_value" {
description = "Value of the public subnet associated with the EKS ALB"
type = number
}

variable "aws_subnet_private_name" {
description = "Name of the private subnet"
type = string
}

variable "aws_subnet_private_eks_alb" {
description = "Name of the private subnet associated with the EKS Application Load Balancer (ALB)"
type = string
}

variable "aws_subnet_private_eks_alb_value" {
description = "Value of the private subnet associated with the EKS ALB"
type = number
}

variables needed in vpc.tf file

eks_alb — subfolder

alb_target_group.tf

# Netflix
resource "aws_autoscaling_attachment" "alb_attachment" {
autoscaling_group_name = var.aws_autoscaling_attachment_alb_attachment_autoscaling_group_name
lb_target_group_arn = aws_lb_target_group.netflix_tg.arn
}

# Step 1: Create an Application Load Balancer (ALB)
resource "aws_lb" "eks" {
name = var.aws_lb_eks_name # "eks-netflix"
internal = var.aws_lb_eks_internal_bool # false
load_balancer_type = var.aws_lb_eks_load_balancer_type # "application"
security_groups = [var.security_group]
subnets = var.public_subnets
}

# Step 2: Create a Target Group
resource "aws_lb_target_group" "netflix_tg" {
name = var.aws_lb_target_group_netflix_tg_name # "netflix-target-group"
port = var.aws_lb_target_group_netflix_tg_port # 30007
protocol = var.aws_lb_target_group_netflix_tg_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_netflix_tg_health_check["path"] # "/"
port = var.aws_lb_target_group_netflix_tg_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_netflix_tg_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_netflix_tg_health_check["interval"] # 30
timeout = var.aws_lb_target_group_netflix_tg_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_netflix_tg_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "netflix_alb_attachment" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.netflix_tg.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_netflix" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_netflix_port # 30007 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_netflix_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_netflix_default_action_type # "forward"
target_group_arn = aws_lb_target_group.netflix_tg.arn
}
}

# SonarQube
resource "aws_autoscaling_attachment" "alb_attachment_sonarqube" {
autoscaling_group_name = var.aws_autoscaling_attachment_alb_attachment_autoscaling_group_name
lb_target_group_arn = aws_lb_target_group.eks_tg_sonarqube.arn
}


# Step 2: Create a Target Group
resource "aws_lb_target_group" "eks_tg_sonarqube" {
name = var.aws_lb_target_group_eks_tg_sonarqube_name # "eks-target-group-sonarqube"
port = var.aws_lb_target_group_eks_tg_sonarqube_port # 9000
protocol = var.aws_lb_target_group_eks_tg_sonarqube_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_eks_tg_sonarqube_health_check["path"] # "/"
port = var.aws_lb_target_group_eks_tg_sonarqube_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_eks_tg_sonarqube_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_eks_tg_sonarqube_health_check["interval"] # 30
timeout = var.aws_lb_target_group_eks_tg_sonarqube_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_eks_tg_sonarqube_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_eks_tg_sonarqube_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_eks_tg_sonarqube_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "eks_alb_attachment_sonarqube" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.eks_tg_sonarqube.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_sonarqube" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_sonarqube_port # 9000 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_sonarqube_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_sonarqube_default_action_type # "forward"
target_group_arn = aws_lb_target_group.eks_tg_sonarqube.arn
}
}


# Grafana
resource "aws_autoscaling_attachment" "alb_attachment_grafana" {
autoscaling_group_name = var.aws_autoscaling_attachment_alb_attachment_autoscaling_group_name
lb_target_group_arn = aws_lb_target_group.eks_tg_grafana.arn
}


# Step 2: Create a Target Group
resource "aws_lb_target_group" "eks_tg_grafana" {
name = var.aws_lb_target_group_eks_tg_grafana_name # "eks-target-group-grafana"
port = var.aws_lb_target_group_eks_tg_grafana_port # 3000
protocol = var.aws_lb_target_group_eks_tg_grafana_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_eks_tg_grafana_health_check["path"] # "/api/health"
port = var.aws_lb_target_group_eks_tg_grafana_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_eks_tg_grafana_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_eks_tg_grafana_health_check["interval"] # 30
timeout = var.aws_lb_target_group_eks_tg_grafana_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_eks_tg_grafana_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_eks_tg_grafana_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_eks_tg_grafana_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "eks_alb_attachment_grafana" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.eks_tg_grafana.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_grafana" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_grafana_port # 3000 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_grafana_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_grafana_default_action_type # "forward"
target_group_arn = aws_lb_target_group.eks_tg_grafana.arn
}
}

# Prometheus
resource "aws_autoscaling_attachment" "alb_attachment_prometheus" {
autoscaling_group_name = var.aws_autoscaling_attachment_alb_attachment_autoscaling_group_name
lb_target_group_arn = aws_lb_target_group.eks_tg_prometheus.arn
}


# Step 2: Create a Target Group
resource "aws_lb_target_group" "eks_tg_prometheus" {
name = var.aws_lb_target_group_eks_tg_prometheus_name # "eks-target-group-prometheus"
port = var.aws_lb_target_group_eks_tg_prometheus_port # 9090
protocol = var.aws_lb_target_group_eks_tg_prometheus_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_eks_tg_prometheus_health_check["path"] # "/status"
port = var.aws_lb_target_group_eks_tg_prometheus_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_eks_tg_prometheus_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_eks_tg_prometheus_health_check["interval"] # 30
timeout = var.aws_lb_target_group_eks_tg_prometheus_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_eks_tg_prometheus_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_eks_tg_prometheus_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_eks_tg_prometheus_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "eks_alb_attachment_prometheus" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.eks_tg_prometheus.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_prometheus" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_prometheus_port # 9090 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_prometheus_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_prometheus_default_action_type # "forward"
target_group_arn = aws_lb_target_group.eks_tg_prometheus.arn
}
}


# Node Exporter
resource "aws_autoscaling_attachment" "alb_attachment_node_exporter" {
autoscaling_group_name = var.aws_autoscaling_attachment_alb_attachment_autoscaling_group_name
lb_target_group_arn = aws_lb_target_group.eks_tg_node_exporter.arn
}


# Step 2: Create a Target Group
resource "aws_lb_target_group" "eks_tg_node_exporter" {
name = var.aws_lb_target_group_eks_tg_node_exporter_name # "eks-target-group-node-exporter"
port = var.aws_lb_target_group_eks_tg_node_exporter_port # 9100
protocol = var.aws_lb_target_group_eks_tg_node_exporter_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_eks_tg_node_exporter_health_check["path"] # "/"
port = var.aws_lb_target_group_eks_tg_node_exporter_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_eks_tg_node_exporter_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_eks_tg_node_exporter_health_check["interval"] # 30
timeout = var.aws_lb_target_group_eks_tg_node_exporter_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_eks_tg_node_exporter_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_eks_tg_node_exporter_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_eks_tg_node_exporter_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "eks_alb_attachment_node_exporter" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.eks_tg_node_exporter.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_node_exporter" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_node_exporter_port # 9100 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_node_exporter_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_node_exporter_default_action_type # "forward"
target_group_arn = aws_lb_target_group.eks_tg_node_exporter.arn
}
}

The structure is the same for each service, so we'll walk through just one of them.

Netflix

# Netflix
resource "aws_autoscaling_attachment" "alb_attachment" {
autoscaling_group_name = var.aws_autoscaling_attachment_alb_attachment_autoscaling_group_name
lb_target_group_arn = aws_lb_target_group.netflix_tg.arn
}

# Step 1: Create an Application Load Balancer (ALB)
resource "aws_lb" "eks" {
name = var.aws_lb_eks_name # "eks-netflix"
internal = var.aws_lb_eks_internal_bool # false
load_balancer_type = var.aws_lb_eks_load_balancer_type # "application"
security_groups = [var.security_group]
subnets = var.public_subnets
}

# Step 2: Create a Target Group
resource "aws_lb_target_group" "netflix_tg" {
name = var.aws_lb_target_group_netflix_tg_name # "netflix-target-group"
port = var.aws_lb_target_group_netflix_tg_port # 30007
protocol = var.aws_lb_target_group_netflix_tg_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_netflix_tg_health_check["path"] # "/"
port = var.aws_lb_target_group_netflix_tg_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_netflix_tg_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_netflix_tg_health_check["interval"] # 30
timeout = var.aws_lb_target_group_netflix_tg_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_netflix_tg_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "netflix_alb_attachment" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.netflix_tg.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_netflix" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_netflix_port # 30007 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_netflix_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_netflix_default_action_type # "forward"
target_group_arn = aws_lb_target_group.netflix_tg.arn
}
}

In the script above, we attach the Application Load Balancer to the Auto Scaling Group that is automatically created for the EKS worker nodes.

The ASG's name is read from the Terraform state file produced by the eks_cluster stage's outputs, and that value in turn comes from the node group created by the node group resource.

Note: Reading the Terraform state file is one way to reference values generated in previous stages; a sketch of this pattern follows.
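A minimal sketch of that pattern, reusing the backend settings that appear later in this article (the output name eks_nodegroup_asg_name is hypothetical and depends on what the eks_cluster stage actually exports):

data "terraform_remote_state" "eks_cluster" {
  backend = "s3"
  config = {
    bucket = "eks-netflix-argocd"
    key    = "eks-cluster"
    region = "us-east-1"
  }
}

# The ASG name from the previous stage can then be fed into this module's variable
locals {
  netflix_asg_name = data.terraform_remote_state.eks_cluster.outputs.eks_nodegroup_asg_name
}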

Then we go through the following script

# Step 1: Create an Application Load Balancer (ALB)
resource "aws_lb" "eks" {
name = var.aws_lb_eks_name # "eks-netflix"
internal = var.aws_lb_eks_internal_bool # false
load_balancer_type = var.aws_lb_eks_load_balancer_type # "application"
security_groups = [var.security_group]
subnets = var.public_subnets
}

# Step 2: Create a Target Group
resource "aws_lb_target_group" "netflix_tg" {
name = var.aws_lb_target_group_netflix_tg_name # "netflix-target-group"
port = var.aws_lb_target_group_netflix_tg_port # 30007
protocol = var.aws_lb_target_group_netflix_tg_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_netflix_tg_health_check["path"] # "/"
port = var.aws_lb_target_group_netflix_tg_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_netflix_tg_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_netflix_tg_health_check["interval"] # 30
timeout = var.aws_lb_target_group_netflix_tg_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_netflix_tg_health_check["matcher"] # "200"
}

}

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "netflix_alb_attachment" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.netflix_tg.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_netflix" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_netflix_port # 30007 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_netflix_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_netflix_default_action_type # "forward"
target_group_arn = aws_lb_target_group.netflix_tg.arn
}
}

Here we create an internet-facing Application Load Balancer. Just a reminder: for Argocd to work, the public subnets involved must be tagged as introduced previously.

Then we create a target group for the Netflix service.

# Step 2: Create a Target Group
resource "aws_lb_target_group" "netflix_tg" {
name = var.aws_lb_target_group_netflix_tg_name # "netflix-target-group"
port = var.aws_lb_target_group_netflix_tg_port # 30007
protocol = var.aws_lb_target_group_netflix_tg_protocol # "HTTP"
vpc_id = var.vpc_id

health_check {
path = var.aws_lb_target_group_netflix_tg_health_check["path"] # "/"
port = var.aws_lb_target_group_netflix_tg_health_check["port"] # "traffic-port"
protocol = var.aws_lb_target_group_netflix_tg_health_check["protocol"] # "HTTP"
interval = var.aws_lb_target_group_netflix_tg_health_check["interval"] # 30
timeout = var.aws_lb_target_group_netflix_tg_health_check["timeout"] # 5
healthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["healthy_threshold"] # 2
unhealthy_threshold = var.aws_lb_target_group_netflix_tg_health_check["unhealthy_threshold"] # 2
matcher = var.aws_lb_target_group_netflix_tg_health_check["matcher"] # "200"
}

}

Here we use a single object variable for the health check settings. Otherwise you would end up defining eight separate variables per service; in a production environment, that quickly balloons into hundreds or even thousands of variables.

The values for this object are assigned in the terraform.tfvars file; the matching variable declaration is sketched after the example.

aws_lb_target_group_eks_tg_sonarqube_health_check = {
path = "/"
port = "traffic-port"
protocol = "HTTP"
interval = 30
timeout = 5
healthy_threshold = 2
unhealthy_threshold = 2
matcher = "200"
}
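The matching variable declaration is not shown here, but a sketch of how it could be typed (an object keeps all eight settings together) looks like this:

variable "aws_lb_target_group_eks_tg_sonarqube_health_check" {
  description = "Health check settings for the SonarQube target group"
  type = object({
    path                = string
    port                = string
    protocol            = string
    interval            = number
    timeout             = number
    healthy_threshold   = number
    unhealthy_threshold = number
    matcher             = string
  })
}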

Lastly, we attach the target group to the ALB and create an ALB listener for it.

# Step 3: Attach the Target Group to the ALB
resource "aws_lb_target_group_attachment" "netflix_alb_attachment" {
# target_group_arn = aws_lb_target_group.eks_tg.arn
# target_id = var.eks_worker_node_id # You need to specify the target for your Target Group, adjust as needed

count = length(var.eks_worker_node_id)
target_group_arn = aws_lb_target_group.netflix_tg.arn
target_id = var.eks_worker_node_id[count.index]
}

resource "aws_lb_listener" "http_listener_netflix" {
load_balancer_arn = aws_lb.eks.arn # Specify the ARN of your ALB
port = var.aws_lb_listener_http_listener_netflix_port # 30007 # Specify the port for the HTTP listener
protocol = var.aws_lb_listener_http_listener_netflix_protocol # "HTTP" # Specify the protocol (HTTP)

default_action {
type = var.aws_lb_listener_http_listener_netflix_default_action_type # "forward"
target_group_arn = aws_lb_target_group.netflix_tg.arn
}
}

Note: The port depends on your application; Netflix, for example, is exposed on port 30007, so that is the value used here. If the wrong port is provided, the target group's health check will report the targets as unhealthy. The health check path can also differ from one service to another.

Reminder: here we only use the ALB's DNS name over plain HTTP. For a production environment, you would add HTTPS with a custom domain name, using a certificate verified through AWS ACM and attached to the ALB; a sketch of such a listener follows.
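As a hedged sketch of that production setup (the acm_certificate_arn variable is an assumption, not part of this repo), an HTTPS listener might look like:

resource "aws_lb_listener" "https_listener_netflix" {
  load_balancer_arn = aws_lb.eks.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.acm_certificate_arn # hypothetical variable holding an ACM-issued certificate ARN

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.netflix_tg.arn
  }
}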

The alb_target_group.tf file covers the following services:

Netflix — port 30007

SonarQube — port 9000

Grafana — port 3000

Prometheus — port 9090

Node Exporter — port 9100

argocd_credentials.tf

This file is fairly self-explanatory: it extracts the Argocd admin password from the Kubernetes secret in the argocd namespace and then writes it to a designated S3 bucket as an object. That way we can securely reference it in the next step, when we deploy the Netflix application with Argocd through Terraform.

Again, this is just my solution; it could certainly be done in other ways. Feel free to come up with your own :)

Let us now dive in, component by component.

resource "null_resource" "get_argocd_admin_password" {
provisioner "remote-exec" {
inline = [
"ssh -o StrictHostKeyChecking=no -i \"web-ec2.pem\" ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_get_argocd_admin_password_remote_exec_inline}\""
]


connection {
type = var.null_resource_get_argocd_admin_password_connection_type # "ssh"
user = var.null_resource_get_argocd_admin_password_connection_user # "your_ssh_user"
private_key = file(var.null_resource_get_argocd_admin_password_connection_private_key)
host = data.aws_instance.bastion.public_ip
# insecure = true

}
}
}

As you can tell from the script above, we take advantage of a null_resource to connect to the bastion we created previously (to stress it again: the bastion, not the worker node). Once we land on the bastion, we run the inline script:

"ssh -o StrictHostKeyChecking=no -i \"web-ec2.pem\" ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_get_argocd_admin_password_remote_exec_inline}\""

Note: Please check every single character above; I ran into a great number of issues when applying it.

Thanks to Terraform's verbose logging, the bash file used to run this command is saved under /tmp/ on the bastion, so you can inspect it there if you hit any stubborn issues.

Also, a few more points to add

StrictHostKeyChecking=no

The option above skips the host key fingerprint prompt. I tried other alternatives, but this is the one that works reliably.

\"web-ec2.pem\"

If you have some exposure to bash, you know escaping issues come up occasionally. This is a good example: the double quotes need to be escaped using \

For more reference of escape, please refer to https://www.baeldung.com/linux/bash-escape-characters

data.aws_instance.eks_netflix.private_ip

Here we use a data source to retrieve the private IP of our worker node:

data "aws_instance" "eks_netflix" { 
filter {
name = "tag:eks:cluster-name"
values = ["eks-netflix-cluster"]
}

filter {
name = "instance-state-name"
values = ["running"]
}

}

Just a reminder: it is better to apply more than one filter. I ran into an issue where a terminated worker node instance was still matched; with the running-state filter added, only the active worker node with the correct cluster name is returned, as expected.

At the end of this script

var.null_resource_get_argocd_admin_password_remote_exec_inline}

This value is seen in terraform.tfvars

null_resource_get_argocd_admin_password_remote_exec_inline    = "sudo -u ec2-user /usr/bin/kubectl get secret argocd-initial-admin-secret -o  jsonpath='{.data.password}' -n argocd | base64 -d > /tmp/secrets.txt"

A few added points for this command line

sudo -u ec2-user /usr/bin/kubectl

The fully qualified sudo -u ec2-user /usr/bin/kubectl form is preferred over a bare kubectl, since I ran into permission issues when running plain kubectl from userdata or inline commands.

base64 -d

Make sure base64 -d is applied: in the secret under the argocd namespace, the password is stored base64-encoded.

base64 -d decodes it for us so we get the actual password.

We then redirect the output (the password) to a file under /tmp/ on the worker node (again: the worker node, not the bastion), which keeps it reasonably safe.

The script below is a commonly used trick to force the resource to run every time the Terraform pipeline executes:

resource "null_resource" "force_provisioner_get_password" {
triggers = {
always_run = timestamp()
}

depends_on = [null_resource.get_argocd_admin_password]
}

The script below uploads the password saved under /tmp/ on the worker node to the S3 bucket; a sample of the inline command is sketched after the resource.

resource "null_resource" "output_argocd_admin_password" {
provisioner "remote-exec" {
inline = [
# var.null_resource_output_argocd_admin_password_remote_exec_inline # echo $ARGO_PWD
"ssh -o StrictHostKeyChecking=no -i \"web-ec2.pem\" -y ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_output_argocd_admin_password_remote_exec_inline}\""
]
connection {
type = var.null_resource_output_argocd_admin_password_connection_type # "ssh"
user = var.null_resource_output_argocd_admin_password_connection_user # "your_ssh_user"
private_key = file(var.null_resource_output_argocd_admin_password_connection_private_key)
host = data.aws_instance.bastion.public_ip
}
}

depends_on = [null_resource.get_argocd_admin_password]
}
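The inline command itself lives in terraform.tfvars. A sketch of what it could contain (the bucket name and object key are assumptions, and the worker node's role would also need s3:PutObject on that bucket):

null_resource_output_argocd_admin_password_remote_exec_inline = "aws s3 cp /tmp/secrets.txt s3://eks-netflix-argocd/argocd/secrets.txt"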

controller.tf

# Install AWS Load Balancer Controller using HELM

# Resource: Helm Release
resource "helm_release" "loadbalancer_controller" {
depends_on = [aws_iam_role.lbc_iam_role]
name = var.helm_release_loadbalancer_controller_name # "aws-load-balancer-controller"

repository = var.helm_release_loadbalancer_controller_repository # "https://aws.github.io/eks-charts"
chart = var.helm_release_loadbalancer_controller_chart # "aws-load-balancer-controller"

namespace = var.helm_release_loadbalancer_controller_namespace # "kube-system"

# Value changes based on your Region (Below is for us-east-1)
set {
name = var.helm_release_loadbalancer_controller_set_image_name # "image.repository"
value = var.helm_release_loadbalancer_controller_set_image_value # "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller"
# Changes based on Region - This is for us-east-1 Additional Reference: https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html
}

set {
name = var.helm_release_loadbalancer_controller_set_service_account_create_name # "serviceAccount.create"
value = var.helm_release_loadbalancer_controller_set_service_account_create_value # "true"
}

# below value is to fix an issue facing argocd deployment using manifest
set {
name = var.helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_name # "enableServiceMutatorWebhook"
value = var.helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_value # "false"
}

set {
name = var.helm_release_loadbalancer_controller_set_service_account_name # "serviceAccount.name"
value = var.helm_release_loadbalancer_controller_set_service_account_value # "aws-load-balancer-controller"
}

set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = "${aws_iam_role.lbc_iam_role.arn}"
}

set {
name = "vpcId"
value = "${var.vpc_id}"
}

set {
name = "region"
value = "${var.aws_region}"
}

set {
name = "clusterName"
value = "${var.aws_eks_cluster_auth_cluster_name}"
}

}

The script above installs the AWS Load Balancer Controller in the EKS cluster using Helm.

There are a few values we need to adjust accordingly

  # Value changes based on your Region (Below is for us-east-1)
set {
name = var.helm_release_loadbalancer_controller_set_image_name # "image.repository"
value = var.helm_release_loadbalancer_controller_set_image_value # "602401143452.dkr.ecr.us-west-2.amazonaws.com/amazon/aws-load-balancer-controller"
# Changes based on Region - This is for us-east-1 Additional Reference: https://docs.aws.amazon.com/eks/latest/userguide/add-ons-images.html
}

Here, you need to adjust the region for the image pulled from ECR, since the repository address is region-specific; a sample terraform.tfvars value is sketched below.
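For example, a terraform.tfvars entry for us-east-1 might look like the following (confirm the registry account ID for your region on the AWS add-on images page linked in the comment above):

helm_release_loadbalancer_controller_set_image_name  = "image.repository"
helm_release_loadbalancer_controller_set_image_value = "602401143452.dkr.ecr.us-east-1.amazonaws.com/amazon/aws-load-balancer-controller"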

The setting below is essential for the permissions granted to the ALB controller. If it is not configured properly, you will see constant errors on the ingress resources.


set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = "${aws_iam_role.lbc_iam_role.arn}"
}

The assume-role (trust) policy for that IAM role is defined as follows in main.tf of the eks_alb folder:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"eks.amazonaws.com",
"ec2.amazonaws.com"
]
},
"Action": "sts:AssumeRole"
},
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${data.aws_iam_openid_connect_provider.eks_cluster_netflix.url}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"${replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, "arn:aws:iam::951507339182:oidc-provider/", "")}:aud": "sts.amazonaws.com",
"${replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, "arn:aws:iam::951507339182:oidc-provider/", "")}:sub": "system:serviceaccount:kube-system:aws-load-balancer-controller"
}
}
}
]
}
EOF

Make sure the policy document above is configured correctly.

Data to retrieve account info

data.aws_caller_identity.current.account_id
data "aws_caller_identity" "current" {}

Data to retrieve url of provider info

data.aws_iam_openid_connect_provider.eks_cluster_netflix.url

Here we read the Terraform state stored in S3 from the previous stage:

data "terraform_remote_state" "s3" {
backend = "s3" # Set your backend configuration here
config = {
bucket = "eks-netflix-argocd"
key = "eks-cluster"
region = "us-east-1"
}
}
data "aws_iam_openid_connect_provider" "eks_cluster_netflix" {
arn = data.terraform_remote_state.s3.outputs.oidc_provider_arn
}

Lastly, we use the replace() function to strip arn:aws:iam::951507339182:oidc-provider/ from the ARN, so only the provider portion is passed through as expected:

replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, "arn:aws:iam::951507339182:oidc-provider/", "")

Below are the settings for the VPC ID, cluster name, and region:

  set {
name = "vpcId"
value = "${var.vpc_id}"
}

set {
name = "region"
value = "${var.aws_region}"
}

set {
name = "clusterName"
value = "${var.aws_eks_cluster_auth_cluster_name}"
}

For the VPC ID and cluster name we can retrieve the values with data sources, while the region is simply assigned a value; one possible data-source approach is sketched below.
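One possible way to do that with data sources (a sketch; this project may instead pass the values through remote state outputs):

data "aws_eks_cluster" "netflix" {
  name = var.aws_eks_cluster_auth_cluster_name
}

# The cluster name and its VPC ID could then be handed to the Helm chart:
#   clusterName -> data.aws_eks_cluster.netflix.name
#   vpcId       -> data.aws_eks_cluster.netflix.vpc_config[0].vpc_id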

data.tf

# Datasource: AWS Load Balancer Controller IAM Policy get from aws-load-balancer-controller/ GIT Repo (latest)
data "http" "lbc_iam_policy" {
url = var.data_http_lbc_iam_policy_url # "https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/install/iam_policy.json"

# Optional request headers
request_headers = {
Accept = var.data_http_lbc_iam_policy_request_headers_accept # "application/json"
}
}

data "aws_lb" "eks" {
depends_on = [aws_lb.eks]
name = aws_lb.eks.name
}


data "aws_instance" "eks_netflix" {
filter {
name = "tag:eks:cluster-name"
values = ["eks-netflix-cluster"]
}

filter {
name = "instance-state-name"
values = ["running"]
}


}

data "aws_instance" "bastion" {
filter {
name = "tag:Name"
values = ["bastion-host"]
}

filter {
name = "instance-state-name"
values = ["running"]
}
}

The block below retrieves the IAM policy document from the URL above; it is used for the ALB serving Argocd and Netflix.

data "http" "lbc_iam_policy" {
url = var.data_http_lbc_iam_policy_url # "https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/main/docs/install/iam_policy.json"

# Optional request headers
request_headers = {
Accept = var.data_http_lbc_iam_policy_request_headers_accept # "application/json"
}
}

iam.tf


# Resource: Create AWS Load Balancer Controller IAM Policy
resource "aws_iam_policy" "lbc_iam_policy" {
name = var.aws_iam_policy_lbc_iam_policy_name # "${local.name}-AWSLoadBalancerControllerIAMPolicy"
path = var.aws_iam_policy_lbc_iam_policy_path # "/"
description = var.aws_iam_policy_lbc_iam_policy_description # "AWS Load Balancer Controller IAM Policy"
#policy = data.http.lbc_iam_policy.body
policy = data.http.lbc_iam_policy.response_body
}

# IAM role for ALB Ingress Controller
resource "aws_iam_role" "lbc_iam_role" {
name = "${local.name}-lbc-iam-role-netflix"

# Terraform's "jsonencode" function converts a Terraform expression result to valid JSON syntax.
assume_role_policy = var.aws_iam_role_lbc_iam_role_assume_role_policy

tags = {
tag-key = var.aws_iam_role_lbc_iam_role_tags # "AWSLoadBalancerControllerIAMPolicy"
}
}

# Associate Load Balancer Controller IAM Policy to IAM Role
resource "aws_iam_role_policy_attachment" "lbc_iam_role_policy_attach" {
policy_arn = aws_iam_policy.lbc_iam_policy.arn
role = aws_iam_role.lbc_iam_role.name
}

The script above creates the IAM policy and IAM role, and attaches the policy to the role used for the ALB.

Note: the role ARN created here must also be wired into the controller.tf file, as shown below

 set {
name = "serviceAccount.annotations.eks\\.amazonaws\\.com/role-arn"
value = "${aws_iam_role.lbc_iam_role.arn}"
}

ingress_class.tf

# Resource: Kubernetes Ingress Class
resource "kubernetes_ingress_class_v1" "ingress_class_default" {
depends_on = [helm_release.loadbalancer_controller]
metadata {
name = var.kubernetes_ingress_class_v1_ingress_class_default_metadata_name # "aws-eks-ingress-class"
annotations = {
"${var.kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class}" = var.kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class_value # "ingressclass.kubernetes.io/is-default-class" = "true"
}
}
spec {
controller = var.kubernetes_ingress_class_v1_ingress_class_default_spec_controller_alb # "ingress.k8s.aws/alb"
}
}

The script above creates an ingress class in the EKS cluster. It is not strictly necessary, but it can be used as a customized ingress class when needed.

ingress.tf

resource "kubernetes_ingress_v1" "argocd_ingress" {
metadata {
name = var.kubernetes_ingress_v1_argocd_ingress_metadata_name # "argocd-server"
namespace = var.kubernetes_ingress_v1_argocd_ingress_metadata_namespace # "argocd"
labels = {
"${var.kubernetes_ingress_v1_argocd_ingress_labels_name_argocd_server}" = var.kubernetes_ingress_v1_argocd_ingress_labels_value_argocd_server # "app.kubernetes.io/name" = "argocd-server"
}
annotations = {
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_value # "kubernetes.io/ingress.class" = "alb"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name_value # "alb.ingress.kubernetes.io/load-balancer-name" = "argocd"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol_value # "alb.ingress.kubernetes.io/backend-protocol": "HTTP"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_scheme}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_scheme_value # "alb.ingress.kubernetes.io/scheme" = "internet-facing"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol_value # "alb.ingress.kubernetes.io/healthcheck-protocol" = "HTTP"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port_value # "alb.ingress.kubernetes.io/healthcheck-port" = "traffic-port"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path_value # "alb.ingress.kubernetes.io/healthcheck-path" = "/"
# "alb.ingress.kubernetes.io/target-type" = "ip"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect_value # "alb.ingress.kubernetes.io/force-ssl-redirect" = "false"
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds_value # "alb.ingress.kubernetes.io/healthcheck-interval-seconds" = 15
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds_value # "alb.ingress.kubernetes.io/healthcheck-timeout-seconds" = 5
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_success_codes}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_success_codes_value # "alb.ingress.kubernetes.io/success-codes" = 200
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count_value # "alb.ingress.kubernetes.io/healthy-threshold-count" = 2
"${var.kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count}" = var.kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count_value # "alb.ingress.kubernetes.io/unhealthy-threshold-count" = 2
# "alb.ingress.kubernetes.io/listen-ports" = "[{\"HTTP\":80}]"
# "alb.ingress.kubernetes.io/ssl-passthrough" = "true"
# "alb.ingress.kubernetes.io/subnets" = jsonencode([
# "subnet-0bbe2c7b0fb37f7c3",
# "subnet-01a4685a9d3144fa3",
# ])
}
}

spec {
ingress_class_name = var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_name # "alb" # Ingress Class
default_backend {
service {
name = var.kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_name # "argocd-server"
port {
number = var.kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_port # 80
}
}
}

tls {
secret_name = var.kubernetes_ingress_v1_argocd_ingress_tls_secret_name # "argocd-secret"
}
}
}

There are a few points that need to be highlighted here.

First of all, we need to make sure the alb ingress class is used, since a number of ALB-specific annotations are required.

"kubernetes.io/ingress.class" = "alb"

The ALB created by this ingress uses the alb ingress class, and the default backend is exposed on port 80, as shown in the spec below.

spec {
ingress_class_name = var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_name # "alb" # Ingress Class
default_backend {
service {
name = var.kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_name # "argocd-server"
port {
number = var.kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_port # 80
}
}
}

tls {
secret_name = var.kubernetes_ingress_v1_argocd_ingress_tls_secret_name # "argocd-secret"
}
}

Note: Since we access Argocd over HTTP via the ALB's DNS name rather than a custom URL with TLS, we must update the argocd-server deployment in the installation file.

 containers:
- args:
- /usr/local/bin/argocd-server
- "--insecure"

The lines above come from line 22297 of https://github.com/lightninglife/argo-cd/blob/master/argocd.yaml

The args are the command to be run in the argocd-server container in the EKS cluster.

We need to add --insecure so that HTTP is not constantly redirected to HTTPS, which would make the Argocd UI inaccessible behind the ALB.

local_value.tf

# Define Local Values in Terraform
locals {
owners = "eks"
environment = "netflix"
name = "netflix"
#name = "${local.owners}-${local.environment}"
common_tags = {
owners = local.owners
environment = local.environment
}
}

Here we simply explore the possibility of using locals rather than terraform.tfvars to supply values.
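A quick sketch of the difference in how the two are referenced (illustrative outputs only; var.aws_region is declared in this module's variables.tf):

output "name_from_locals" {
  value = local.name # referenced as local.<name>, defined in the locals block above
}

output "region_from_tfvars" {
  value = var.aws_region # referenced as var.<name>, value supplied in terraform.tfvars
}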

outputs.tf

# Helm Release Outputs
output "lbc_helm_metadata" {
description = "Metadata Block outlining status of the deployed release."
value = helm_release.loadbalancer_controller.metadata
}

output "lbc_iam_role_arn" {
description = "AWS Load Balancer Controller IAM Role ARN"
value = aws_iam_role.lbc_iam_role.arn
}

output "lbc_iam_policy" {
#value = data.http.lbc_iam_policy.body
value = data.http.lbc_iam_policy.response_body
}

output "lbc_iam_policy_arn" {
value = aws_iam_policy.lbc_iam_policy.arn
}

output "alb_dns_name" {
value = data.aws_lb.eks.dns_name
}


output "aws_instance_eks_netflix_private_ip" {
value = data.aws_instance.eks_netflix.private_ip
}

These outputs expose all values needed by the next stage, which deploys Netflix using Argocd.

variables.tf

# controller
variable "helm_release_loadbalancer_controller_set_service_account_create_name" {
description = "Name of the Kubernetes service account to create for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_set_service_account_create_value" {
description = "Value of the Kubernetes service account to create for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_name" {
description = "Name of the Kubernetes service account for Enable Service Mutator Webhook."
type = string
}

variable "helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_value" {
description = "Value of the Kubernetes service account for Enable Service Mutator Webhook."
type = string
}

variable "helm_release_loadbalancer_controller_set_service_account_name" {
description = "Name of the Kubernetes service account for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_set_service_account_value" {
description = "Value of the Kubernetes service account for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_name" {
type = string
description = "The name of the Helm release for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_repository" {
type = string
description = "The repository URL from which to fetch the Helm chart for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_chart" {
type = string
description = "The name of the Helm chart for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_namespace" {
type = string
description = "The Kubernetes namespace in which to install the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_set_image_name" {
type = string
description = "The name of the Helm chart value to set the image repository for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_set_image_value" {
type = string
description = "The value of the image repository for the AWS Load Balancer Controller."
}

variable "vpc_id" {
type = string
description = "The ID of the VPC where the AWS Load Balancer Controller will be deployed."
}

variable "aws_region" {
type = string
description = "The AWS region where the AWS Load Balancer Controller will be deployed."
}

variable "aws_eks_cluster_auth_cluster_name" {
type = string
description = "The name of the AWS EKS cluster where the AWS Load Balancer Controller will be deployed."
}

# iam
variable "aws_iam_policy_lbc_iam_policy_name" {
type = string
description = "The name of the AWS IAM policy for the Load Balancer Controller."
}

variable "aws_iam_policy_lbc_iam_policy_path" {
type = string
description = "The path for the AWS IAM policy for the Load Balancer Controller."
}

variable "aws_iam_policy_lbc_iam_policy_description" {
type = string
description = "The description of the AWS IAM policy for the Load Balancer Controller."
}

variable "aws_iam_role_lbc_iam_role_name" {
type = string
description = "The name of the AWS IAM role for the Load Balancer Controller."
}

# ingress

variable "kubernetes_ingress_class_v1_ingress_class_default_metadata_annotations" {
type = map(string)
description = "Annotations for the Kubernetes Ingress Class metadata."
}

variable "kubernetes_ingress_class_v1_ingress_class_default_spec" {
type = string
description = "The controller specification for the Kubernetes Ingress Class."
}

# # providers
# variable "API_SERVER_ENDPOINT" {
# type = string
# description = "The API server endpoint for the Kubernetes cluster."
# }

# variable "CERTIFICATE_AUTHORITY" {
# type = string
# description = "The base64-encoded certificate authority data for the Kubernetes cluster."
# }

variable "aws_autoscaling_attachment_alb_attachment_autoscaling_group_name" {
type = string
description = "Name of the Auto Scaling Group to attach the ALB to."
# You can add additional validation rules if needed
}

variable "security_group" {
description = "The ID of the security group"
type = string
}

variable "public_subnets" {
description = "A list of subnet IDs"
type = list(string)
}

variable "eks_worker_node_id" {
description = "The id of worker node"
type = list(string)
}

# argocd
variable "argocd_version" {
type = string
}

# variable "env" {
# type = string
# }

variable "fqdn" {
type = string
}

variable "loadbalancer_dns" {
type = string
}

# variable "region" {
# type = string
# }

variable "aws_iam_role_lbc_iam_role_assume_role_policy" {
type = any
description = "IAM role's assume role policy"
}

variable "aws_iam_role_lbc_iam_role_tags" {
description = "Tags for the IAM role used by the load balancer controller"
type = string
}

variable "kubernetes_ingress_class_v1_ingress_class_default_metadata_name" {
description = "Default metadata name for Kubernetes IngressClass"
type = string
}

variable "kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class" {
description = "Default annotations for the default class in Kubernetes IngressClass"
type = string
}

variable "kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class_value" {
description = "Default value for the default class annotation in Kubernetes IngressClass"
type = string
}

variable "kubernetes_ingress_class_v1_ingress_class_default_spec_controller_alb" {
description = "Controller configuration for ALB in Kubernetes IngressClass"
type = string
}

variable "data_http_lbc_iam_policy_url" {
description = "The URL for the IAM policy document for the data HTTP load balancer controller"
type = string
}

variable "data_http_lbc_iam_policy_request_headers_accept" {
description = "The value for the 'Accept' header in the IAM policy document for the data HTTP load balancer controller"
type = string
}

# ingress
variable "kubernetes_ingress_v1_argocd_ingress_metadata_name" {
description = "Name of the Ingress resource for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_metadata_namespace" {
description = "Namespace of the Ingress resource for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_labels_name_argocd_server" {
description = "Name label for ArgoCD server"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_labels_value_argocd_server" {
description = "Value label for ArgoCD server"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class" {
description = "Ingress class annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name" {
description = "Load balancer name annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol" {
description = "Backend protocol annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_scheme" {
description = "Scheme annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol" {
description = "Healthcheck protocol annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port" {
description = "Healthcheck port annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path" {
description = "Healthcheck path annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect" {
description = "Force SSL redirect annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds" {
description = "Healthcheck interval in seconds annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds" {
description = "Healthcheck timeout in seconds annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_success_codes" {
description = "Success codes annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count" {
description = "Healthy threshold count annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count" {
description = "Unhealthy threshold count annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_name" {
description = "Name of the default backend service for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_port" {
description = "Port of the default backend service for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_tls_secret_name" {
description = "Name of the TLS secret for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_value" {
description = "Value for the 'ingress.class' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name_value" {
description = "Value for the 'alb.ingress.kubernetes.io/load-balancer-name' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol_value" {
description = "Value for the 'alb.ingress.kubernetes.io/backend-protocol' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_scheme_value" {
description = "Value for the 'alb.ingress.kubernetes.io/scheme' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-protocol' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-port' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-path' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect_value" {
description = "Value for the 'alb.ingress.kubernetes.io/force-ssl-redirect' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-interval-seconds' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-timeout-seconds' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_success_codes_value" {
description = "Value for the 'alb.ingress.kubernetes.io/success-codes' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthy-threshold-count' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count_value" {
description = "Value for the 'alb.ingress.kubernetes.io/unhealthy-threshold-count' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_name" {
description = "Annotation key for specifying the Ingress class"
type = string
}


# alb & target group - netflix
variable "aws_lb_eks_name" {
description = "Name of the Elastic Load Balancer (ELB) for EKS"
type = string
}

variable "aws_lb_eks_internal_bool" {
description = "Boolean indicating whether the ELB for EKS is internal (true/false)"
type = bool
}

variable "aws_lb_eks_load_balancer_type" {
description = "Type of load balancer for EKS (e.g., application, network)"
type = string
}

variable "aws_lb_target_group_netflix_tg_name" {
description = "Name of the target group for Netflix on the ELB"
type = string
}

variable "aws_lb_target_group_netflix_tg_port" {
description = "Port for the target group for Netflix on the ELB"
type = number
}

variable "aws_lb_target_group_netflix_tg_protocol" {
description = "Protocol for the target group for Netflix on the ELB (e.g., HTTP, HTTPS)"
type = string
}

variable "aws_lb_listener_http_listener_netflix_port" {
description = "Port for the HTTP listener for Netflix on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_netflix_protocol" {
description = "Protocol for the HTTP listener for Netflix on the ELB (e.g., HTTP, HTTPS)"
type = string
}

variable "aws_lb_listener_http_listener_netflix_default_action_type" {
description = "Type of default action for the HTTP listener for Netflix on the ELB"
type = string
}


variable "aws_lb_target_group_netflix_tg_health_check" {
description = "Health check configuration for the target group"
type = object({
path = string
port = string
protocol = string
interval = number
timeout = number
healthy_threshold = number
unhealthy_threshold = number
matcher = string
})
}

# alb & target group - SonarQube
variable "aws_lb_target_group_eks_tg_sonarqube_name" {
description = "Name of the Target Group for SonarQube on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_sonarqube_port" {
description = "Port for the Target Group for SonarQube on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_sonarqube_protocol" {
description = "Protocol for the Target Group for SonarQube on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_sonarqube_health_check" {
description = "Health check configuration for the Target Group for SonarQube on the ELB"
type = object({
path = string
port = string
protocol = string
interval = number
timeout = number
healthy_threshold = number
unhealthy_threshold = number
matcher = string
})
}

variable "aws_lb_listener_http_listener_sonarqube_port" {
description = "Port for the HTTP listener for SonarQube on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_sonarqube_protocol" {
description = "Protocol for the HTTP listener for SonarQube on the ELB"
type = string
}

variable "aws_lb_listener_http_listener_sonarqube_default_action_type" {
description = "Type of default action for the HTTP listener for SonarQube on the ELB"
type = string
}

# alb & target group - Grafana

variable "aws_lb_target_group_eks_tg_grafana_name" {
description = "Name of the Target Group for Grafana on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_grafana_port" {
description = "Port for the Target Group for Grafana on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_grafana_protocol" {
description = "Protocol for the Target Group for Grafana on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_grafana_health_check" {
description = "Health check configuration for the Target Group for Grafana on the ELB"
type = map(string)
}

variable "aws_lb_listener_http_listener_grafana_port" {
description = "Port for the HTTP listener for Grafana on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_grafana_protocol" {
description = "Protocol for the HTTP listener for Grafana on the ELB"
type = string
}

variable "aws_lb_listener_http_listener_grafana_default_action_type" {
description = "Type of default action for the HTTP listener for Grafana on the ELB"
type = string
}

# alb & target group - Prometheus
variable "aws_lb_target_group_eks_tg_prometheus_name" {
description = "Name of the Target Group for Prometheus on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_prometheus_port" {
description = "Port for the Target Group for Prometheus on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_prometheus_protocol" {
description = "Protocol for the Target Group for Prometheus on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_prometheus_health_check" {
description = "Health check configuration for the Target Group for Prometheus on the ELB"
type = map(string)
}

variable "aws_lb_listener_http_listener_prometheus_port" {
description = "Port for the HTTP listener for Prometheus on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_prometheus_protocol" {
description = "Protocol for the HTTP listener for Prometheus on the ELB"
type = string

}

variable "aws_lb_listener_http_listener_prometheus_default_action_type" {
description = "Type of default action for the HTTP listener for Prometheus on the ELB"
type = string
}

# alb & target group - Node Exporter
variable "aws_lb_target_group_eks_tg_node_exporter_name" {
description = "Name of the Target Group for Node Exporter on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_node_exporter_port" {
description = "Port for the Target Group for Node Exporter on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_node_exporter_protocol" {
description = "Protocol for the Target Group for Node Exporter on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_node_exporter_health_check" {
description = "Health check configuration for the Target Group for Node Exporter on the ELB"
type = map(string)
}

variable "aws_lb_listener_http_listener_node_exporter_port" {
description = "Port for the HTTP listener for Node Exporter on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_node_exporter_protocol" {
description = "Protocol for the HTTP listener for Node Exporter on the ELB"
type = string
}

variable "aws_lb_listener_http_listener_node_exporter_default_action_type" {
description = "Type of default action for the HTTP listener for Node Exporter on the ELB"
type = string
}

# argocd credentials
variable "null_resource_get_argocd_admin_password_remote_exec_inline" {
type = string
# Replace the value with your actual command for fetching the ArgoCD admin password using remote-exec provisioner
}

variable "null_resource_get_argocd_admin_password_connection_type" {
type = string
# Specify the connection type (e.g., ssh)
}

variable "null_resource_get_argocd_admin_password_connection_user" {
type = string
# Specify the username used for the connection
}

variable "null_resource_get_argocd_admin_password_connection_private_key" {
type = string
# Specify the path to the private key file used for authentication
}

variable "null_resource_output_argocd_admin_password_remote_exec_inline" {
type = string
# Replace the value with your actual command for outputting the ArgoCD admin password using remote-exec provisioner
}

variable "null_resource_output_argocd_admin_password_connection_type" {
type = string
# Specify the connection type (e.g., ssh)
}

variable "null_resource_output_argocd_admin_password_connection_user" {
type = string
# Specify the username used for the connection
}

variable "null_resource_output_argocd_admin_password_connection_private_key" {
type = string
# Specify the path to the private key file used for authentication
}

Variables required for the eks_alb folder
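As a hint on how these get filled in, here is a partial terraform.tfvars sketch with assumed values (the Helm repository and chart below are the public aws-load-balancer-controller chart; the rest follow this project's naming):

helm_release_loadbalancer_controller_name       = "aws-load-balancer-controller"
helm_release_loadbalancer_controller_repository = "https://aws.github.io/eks-charts"
helm_release_loadbalancer_controller_chart      = "aws-load-balancer-controller"
helm_release_loadbalancer_controller_namespace  = "kube-system"
aws_region                                      = "us-east-1"
aws_eks_cluster_auth_cluster_name               = "eks-netflix-cluster"

The full set of values can be found in the repo's terraform.tfvars.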

argocd_netflix — subfolder

argocd.tf

# Wait for the Argo CD server to become ready
resource "null_resource" "argocd_ready" {
# depends_on = [kubernetes_ingress_v1.argocd_ingress]

provisioner "remote-exec" {
inline = [
"ssh -i \"web-ec2.pem\" ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_argocd_ready_remote_exec_inline}" # "kubectl wait --for=condition=Ready pod -l app.kubernetes.io/name=argocd-server --namespace=argocd"
]
}

connection {
type = var.null_resource_wait_for_argocd_connection_type # "ssh"
user = var.null_resource_wait_for_argocd_connection_user # "your_ssh_user"
private_key = file(var.null_resource_wait_for_argocd_connection_private_key)
host = data.aws_instance.bastion.public_ip
}
}

resource "argocd_application" "netflix" {
depends_on = [null_resource.argocd_ready]
metadata {
name = var.argocd_application_netflix_metadata_name # "netflix"
namespace = var.argocd_application_netflix_metadata_namespace # "argocd"
}

spec {
project = var.argocd_application_netflix_spec_project_name # "default"

source {
repo_url = var.argocd_application_netflix_spec_source["repo_url"] # "https://github.com/lightninglife/DevSecOps-Project.git"
target_revision = var.argocd_application_netflix_spec_source["target_revision"] # "HEAD"
path = var.argocd_application_netflix_spec_source["path"] # "Kubernetes"
directory {
recurse = var.argocd_application_netflix_spec_source_directory["recurse"] # true
}
}

destination {
server = var.argocd_application_netflix_spec_destination["server"] # "https://kubernetes.default.svc"
namespace = var.argocd_application_netflix_spec_destination["namespace"] # ["argocd"]
}

sync_policy {
sync_options = var.argocd_application_netflix_spec_sync_policy["sync_options"] # ["CreateNamespace=true"]

automated {
prune = var.argocd_application_netflix_spec_sync_policy_automated["prune"] # true
self_heal = var.argocd_application_netflix_spec_sync_policy_automated["self_heal"] # true
}
}
}
}

The script below performs a quick check to confirm Argocd is ready before we deploy the Netflix app

# Wait for the Argo CD server to become ready
resource "null_resource" "argocd_ready" {
# depends_on = [kubernetes_ingress_v1.argocd_ingress]

provisioner "remote-exec" {
inline = [
"ssh -i \"web-ec2.pem\" ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_argocd_ready_remote_exec_inline}" # "kubectl wait --for=condition=Ready pod -l app.kubernetes.io/name=argocd-server --namespace=argocd"
]
}

connection {
type = var.null_resource_wait_for_argocd_connection_type # "ssh"
user = var.null_resource_wait_for_argocd_connection_user # "your_ssh_user"
private_key = file(var.null_resource_wait_for_argocd_connection_private_key)
host = data.aws_instance.bastion.public_ip
}
}

The value of command to check

"sudo -u ec2-user /usr/bin/kubectl wait --for=condition=Ready pod -l app.kubernetes.io/name=argocd-server --namespace=argocd\""

As you can see above, we wait on the pod labeled app.kubernetes.io/name=argocd-server in the argocd namespace. Once the Ready condition is met, meaning the pod can serve traffic, the script below runs, since depends_on = [null_resource.argocd_ready] is applied

resource "argocd_application" "netflix" {
depends_on = [null_resource.argocd_ready]
metadata {
name = var.argocd_application_netflix_metadata_name # "netflix"
namespace = var.argocd_application_netflix_metadata_namespace # "argocd"
}

spec {
project = var.argocd_application_netflix_spec_project_name # "default"

source {
repo_url = var.argocd_application_netflix_spec_source["repo_url"] # "https://github.com/lightninglife/DevSecOps-Project.git"
target_revision = var.argocd_application_netflix_spec_source["target_revision"] # "HEAD"
path = var.argocd_application_netflix_spec_source["path"] # "Kubernetes"
directory {
recurse = var.argocd_application_netflix_spec_source_directory["recurse"] # true
}
}

destination {
server = var.argocd_application_netflix_spec_destination["server"] # "https://kubernetes.default.svc"
namespace = var.argocd_application_netflix_spec_destination["namespace"] # ["argocd"]
}

sync_policy {
sync_options = var.argocd_application_netflix_spec_sync_policy["sync_options"] # ["CreateNamespace=true"]

automated {
prune = var.argocd_application_netflix_spec_sync_policy_automated["prune"] # true
self_heal = var.argocd_application_netflix_spec_sync_policy_automated["self_heal"] # true
}
}
}
}

The script above deploys the Netflix app from the GitHub repository, using the Kubernetes folder as the source path.
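For reference, a terraform.tfvars sketch for the maps above, using the values shown in the inline comments:

argocd_application_netflix_spec_source = {
  repo_url        = "https://github.com/lightninglife/DevSecOps-Project.git"
  target_revision = "HEAD"
  path            = "Kubernetes"
}

argocd_application_netflix_spec_source_directory = {
  recurse = true
}

argocd_application_netflix_spec_destination = {
  server    = "https://kubernetes.default.svc"
  namespace = "argocd"
}

argocd_application_netflix_spec_sync_policy = {
  sync_options = ["CreateNamespace=true"]
}

argocd_application_netflix_spec_sync_policy_automated = {
  prune     = true
  self_heal = true
}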

data.tf

data "aws_instance" "eks_netflix" { 
filter {
name = "tag:eks:cluster-name"
values = ["eks-netflix-cluster"]
}

filter {
name = "instance-state-name"
values = ["running"]
}


}

data "aws_instance" "bastion" {
filter {
name = "tag:Name"
values = ["bastion-host"]
}

filter {
name = "instance-state-name"
values = ["running"]
}
}

Data sources retrieve both the bastion host and the Netflix worker node instance for reference

variables.tf

variable "null_resource_argocd_ready_remote_exec_inline" {
type = string
}

variable "null_resource_wait_for_argocd_connection_type" {
type = string
}

variable "null_resource_wait_for_argocd_connection_user" {
type = string
}

variable "null_resource_wait_for_argocd_connection_private_key" {
type = string
}

variable "argocd_application_netflix_metadata_name" {
type = string
}

variable "argocd_application_netflix_metadata_namespace" {
type = string
}

variable "argocd_application_netflix_spec_project_name" {
type = string
}

variable "argocd_application_netflix_spec_source" {
type = any
}

variable "argocd_application_netflix_spec_source_directory" {
type = any
}

variable "argocd_application_netflix_spec_destination" {
type = any
}

variable "argocd_application_netflix_spec_sync_policy" {
type = any
}

# variable "null_resource_get_argocd_admin_password_remote_exec_inline" {
# type = string
# }

variable "argocd_application_netflix_spec_sync_policy_automated" {
type = any
}

Variables used in the argocd.tf file

versions.tf

terraform {
required_version = "~>1.7"

required_providers {
aws = {
source = "hashicorp/aws"
version = "<= 5.38"
}

argocd = {
source = "oboukili/argocd"
version = "6.0.3"
}
}
}

# Note: Replace the version constraints with those appropriate for your project.

Note: Make sure the argocd provider block below is included and configured. Otherwise, Terraform will default to hashicorp/argocd, which doesn't exist.

argocd = {
source = "oboukili/argocd"
version = "6.0.3"
}
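The provider itself also needs connection details. A minimal sketch, assuming the ALB DNS name and admin password are exposed as variables in this project (attribute names taken from the oboukili/argocd provider; adapt to how you pass these values):

provider "argocd" {
  server_addr = var.loadbalancer_dns      # e.g. the "argocd" ALB DNS name, hypothetical variable
  username    = "admin"
  password    = var.argocd_admin_password # hypothetical variable holding the initial admin secret
  insecure    = true                      # skip TLS verification
  plain_text  = true                      # talk plain HTTP, matching the server's --insecure flag
}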

eks_deletion — subfolder

main.tf

resource "null_resource" "patch_ingress" {
provisioner "remote-exec" {
inline = [
"ssh -i \"web-ec2.pem\" ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_patch_ingress_remote_exec_inline}" # "kubectl patch ingress argocd-server -n argocd -p '{\"metadata\":{\"finalizers\":[]}}' --type=merge"
]
}

# You can specify the connection information here.
connection {
type = var.null_resource_patch_ingress_connection_type # "ssh"
user = var.null_resource_patch_ingress_connection_user # "your_ssh_user"
private_key = file(var.null_resource_patch_ingress_connection_private_key)
host = data.aws_instance.bastion.public_ip
}
}

# security groups associated to delete
resource "aws_security_group" "deleted" {
count = length(data.aws_security_groups.filtered.ids) + length(data.aws_security_groups.filtered_shared.ids)
depends_on = [data.aws_lb.argocd]
provisioner "local-exec" {
command = "sleep 60; aws ec2 delete-security-group --group-id ${concat(data.aws_security_groups.filtered.ids, data.aws_security_groups.filtered_shared.ids)[count.index]}"
}
}

The script below works around an issue when deleting the ingress managed by the ALB controller

resource "null_resource" "patch_ingress" {
provisioner "remote-exec" {
inline = [
"ssh -i \"web-ec2.pem\" ec2-user@${data.aws_instance.eks_netflix.private_ip} \"${var.null_resource_patch_ingress_remote_exec_inline}" # "kubectl patch ingress argocd-server -n argocd -p '{\"metadata\":{\"finalizers\":[]}}' --type=merge"
]
}

# You can specify the connection information here.
connection {
type = var.null_resource_patch_ingress_connection_type # "ssh"
user = var.null_resource_patch_ingress_connection_user # "your_ssh_user"
private_key = file(var.null_resource_patch_ingress_connection_private_key)
host = data.aws_instance.bastion.public_ip
}
}

Again, the command below clears the finalizers on the ingress so that it can actually be deleted

"sudo -u ec2-user /usr/bin/kubectl patch ingress argocd-server -n argocd -p '{\\\"metadata\\\":{\\\"finalizers\\\":[]}}' --type=merge\""

The script below deletes the two security groups created by the ALB that the ingress controller provisioned. If they are not cleaned up, the VPC cannot be destroyed when the Terraform pipeline is torn down, due to a dependency issue.

# security groups associated to delete
resource "aws_security_group" "deleted" {
count = length(data.aws_security_groups.filtered.ids) + length(data.aws_security_groups.filtered_shared.ids)
depends_on = [data.aws_lb.argocd]
provisioner "local-exec" {
command = "sleep 60; aws ec2 delete-security-group --group-id ${concat(data.aws_security_groups.filtered.ids, data.aws_security_groups.filtered_shared.ids)[count.index]}"
}
}

When I tested running it alongside the script above, it sometimes failed, so I intentionally sleep for 60 seconds before deleting the associated security groups
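An alternative sketch for the same delay, using the hashicorp/time provider instead of sleep inside local-exec (the provider must be added to required_providers):

resource "time_sleep" "wait_for_alb_cleanup" {
  depends_on      = [data.aws_lb.argocd]
  create_duration = "60s"
}

# reference time_sleep.wait_for_alb_cleanup in depends_on of the deletion resource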

data.aws_security_groups.filtered.ids

data.aws_security_groups.filtered_shared.ids

The two data lookups above refer to the two intended security groups

Then we use concat() to combine their IDs so they are deleted together

concat(data.aws_security_groups.filtered.ids, data.aws_security_groups.filtered_shared.ids)[count.index]
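To make the indexing concrete, a small sketch with hypothetical security group IDs:

locals {
  filtered_ids        = ["sg-0aaa1111bbbb2222c"] # stands in for data.aws_security_groups.filtered.ids
  filtered_shared_ids = ["sg-0ddd3333eeee4444f"] # stands in for data.aws_security_groups.filtered_shared.ids
  all_ids             = concat(local.filtered_ids, local.filtered_shared_ids)
  # all_ids = ["sg-0aaa1111bbbb2222c", "sg-0ddd3333eeee4444f"], so count = 2 and
  # count.index walks the merged list, one delete command per instance
}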

data.tf

data "aws_instance" "eks_netflix" { 
filter {
name = "tag:eks:cluster-name"
values = ["eks-netflix-cluster"]
}

filter {
name = "instance-state-name"
values = ["running"]
}


}

data "aws_instance" "bastion" {
filter {
name = "tag:Name"
values = ["bastion-host"]
}

filter {
name = "instance-state-name"
values = ["running"]
}
}

data "aws_lb" "argocd" {
name = "argocd"
}

data "aws_security_groups" "filtered" {
tags = {
"elbv2.k8s.aws/cluster" = "eks-netflix-cluster"
"ingress.k8s.aws/stack" = "argocd/argocd-server"
"ingress.k8s.aws/resource" = "ManagedLBSecurityGroup"
}
}

data "aws_security_groups" "filtered_shared" {
tags = {
"elbv2.k8s.aws/resource" = "backend-sg"
"elbv2.k8s.aws/cluster" = "eks-netflix-cluster"
}
}

Note: When using data sources to retrieve both security groups, provide as many tags as possible so that the lookup does not come back empty or match the wrong group

variables.tf

variable "null_resource_patch_ingress_remote_exec_inline" {
description = "Inline command to execute remotely using remote-exec provisioner"
type = string
}

variable "null_resource_patch_ingress_connection_type" {
description = "Type of connection for the remote-exec provisioner"
type = string
}

variable "null_resource_patch_ingress_connection_user" {
description = "Username for SSH connection used in the remote-exec provisioner"
type = string
}

variable "null_resource_patch_ingress_connection_private_key" {
description = "Private key file path for SSH connection used in the remote-exec provisioner"
type = string
}

That wraps up our modules. We will now run the deployment in three stages, in the following order:

eks_cluster

eks_alb

argocd_netflix

Stage 1 — eks_cluster

main.tf

module "asg" {
source = ".././modules/asg" # Replace with the actual path to your module directory

# Use the same variable for multiple input arguments
key_pair_name = var.key_pair_name
aws_launch_template_netflix_vpc_security_group_ids = module.sg.security_group_ids
aws_launch_template_netflix_name_prefix = var.aws_launch_template_netflix_name_prefix
aws_launch_template_netflix_image_id = var.aws_launch_template_netflix_image_id
aws_launch_template_netflix_instance_type = var.aws_launch_template_netflix_instance_type
aws_launch_template_netflix_block_device_mappings_device_name = var.aws_launch_template_netflix_block_device_mappings_device_name
aws_launch_template_netflix_block_device_mappings_volume_size = var.aws_launch_template_netflix_block_device_mappings_volume_size
aws_launch_template_netflix_create_before_destroy = var.aws_launch_template_netflix_create_before_destroy
aws_autoscaling_group_netflix_desired_capacity = var.aws_autoscaling_group_netflix_desired_capacity
aws_autoscaling_group_netflix_max_size = var.aws_autoscaling_group_netflix_max_size
aws_autoscaling_group_netflix_min_size = var.aws_autoscaling_group_netflix_min_size
aws_autoscaling_group_netflix_launch_template_version = var.aws_autoscaling_group_netflix_launch_template_version
aws_autoscaling_group_netflix_tag_key = var.aws_autoscaling_group_netflix_tag_key
aws_autoscaling_group_netflix_tag_value = var.aws_autoscaling_group_netflix_tag_value
aws_autoscaling_group_netflix_tag_propagate_at_launch = var.aws_autoscaling_group_netflix_tag_propagate_at_launch
aws_launch_template_netflix_user_data = var.aws_launch_template_netflix_user_data
aws_autoscaling_group_netflix_vpc_zone_identifier = module.vpc.subnet_ids
aws_launch_template_netflix_network_interfaces_security_groups = module.sg.security_group_ids
eks_cluster_netflix_name = var.eks_cluster_netflix_name
aws_eks_node_group_instance_types = var.aws_eks_node_group_instance_types
# kubernetes_network_policy_jenkins_network_policy_spec_ingress_app = var.kubernetes_network_policy_jenkins_network_policy_spec_ingress_app
aws_eks_cluster_netflix_version = var.aws_eks_cluster_netflix_version
}

module "eks" {
source = ".././modules/eks" # Replace with the actual path to your module directory

# Use the same variable for multiple input arguments
eks_cluster_netflix_name = var.eks_cluster_netflix_name
aws_eks_node_group_netflix_name = var.aws_eks_node_group_netflix_name
aws_eks_node_group_instance_types = var.aws_eks_node_group_instance_types
aws_eks_node_group_desired_capacity = var.aws_eks_node_group_desired_capacity
aws_eks_node_group_min_size = var.aws_eks_node_group_min_size
aws_eks_node_group_max_size = var.aws_eks_node_group_max_size
aws_eks_node_group_launch_template_name_prefix = module.asg.launch_template_id_netflix
aws_eks_node_group_launch_template_version = var.aws_eks_node_group_launch_template_version
aws_eks_node_group_device_name = var.aws_eks_node_group_device_name
aws_eks_node_group_volume_size = var.aws_eks_node_group_volume_size
subnets = module.vpc.subnet_ids
# kubernetes_network_policy_jenkins_network_policy_policy_types = var.kubernetes_network_policy_jenkins_network_policy_policy_types
# kubernetes_service_jenkins_master_service_load_balancer_ip = module.alb.jenkins_alb_dns_name
aws_eks_cluster_netflix_role_arn = module.iam.eks_netflix_cluster_iam_role_arn
# kubernetes_horizontal_pod_autoscaler_jenkins_hpa_spec_metric_resource_target_type = var.kubernetes_horizontal_pod_autoscaler_jenkins_hpa_spec_metric_resource_target_type
aws_eks_node_group_netflix_role_arn = module.iam.eks_netflix_nodegroup_iam_role_arn
ec2_ssh_key = "${path.module}/web-ec2.pem"
eks_worker_node_policy_attachment_netflix = module.iam.eks_worker_node_policy_attachment_netflix
eks_cni_policy_attachment_netflix = module.iam.eks_cni_policy_attachment_netflix
eks_ec2_container_registry_readonly_attachment_netflix = module.iam.eks_ec2_container_registry_readonly_attachment_netflix
aws_eks_node_group_launch_template_name_prefix_netflix = module.asg.launch_template_id_netflix
aws_eks_addon_netflix_addon_name = var.aws_eks_addon_netflix_addon_name
aws_eks_addon_netflix_addon_version = var.aws_eks_addon_netflix_addon_version
aws_eks_cluster_netflix_security_group_ids = module.sg.security_group_id_eks_cluster
aws_eks_cluster_netflix_version = var.aws_eks_cluster_netflix_version
aws_instance_eks_cluster_netflix_bastion_host_ami = var.aws_instance_eks_cluster_netflix_bastion_host_ami
aws_instance_eks_cluster_netflix_bastion_host_instance_type = var.aws_instance_eks_cluster_netflix_bastion_host_instance_type
key_pair_name = var.key_pair_name
aws_instance_eks_cluster_netflix_bastion_host_subnet_id = module.vpc.public_subnet
aws_instance_eks_cluster_netflix_bastion_host_security_groups = module.sg.security_group_ids
aws_instance_eks_cluster_netflix_bastion_host_tags = var.aws_instance_eks_cluster_netflix_bastion_host_tags
aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination = var.aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination
aws_instance_eks_cluster_netflix_bastion_host_provisioner_source = "${path.module}/web-ec2.pem"
aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline = var.aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline
# kubernetes_manifest_netflix_manifest = file("${path.module}/retail-store-sample-app-deploy.yaml")
# apply_kubernetes_manifest_netflix_command = var.apply_kubernetes_manifest_netflix_command
# wait_for_deployments_netflix_command = var.wait_for_deployments_netflix_command
aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy = module.iam.eks_AmazonEKSClusterPolicy
aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController = module.iam.eks_AmazonEKSVPCResourceController
aws_eks_cluster_netflix_enabled_cluster_log_types = var.aws_eks_cluster_netflix_enabled_cluster_log_types
# kubernetes_manifest_argo_cd = file("${path.module}/install.yaml")
aws_instance_eks_cluster_netflix_bastion_host_file_type = var.aws_instance_eks_cluster_netflix_bastion_host_file_type
aws_instance_eks_cluster_netflix_bastion_host_file_user = var.aws_instance_eks_cluster_netflix_bastion_host_file_user
}

module "iam" {
source = ".././modules/iam" # Replace with the actual path to your module directory

# Use the same variable for multiple input arguments
aws_iam_role_eks_cluster_netflix_name = var.aws_iam_role_eks_cluster_netflix_name
# aws_iam_role_eks_cluster_assume_role_policy_netflix = templatefile("${path.module}/assume_role_policy.tpl.json", {
# account_id = data.aws_caller_identity.current.account_id,
# oidc_provider_url = module.iam.oidc_provider_url
# })
aws_iam_role_eks_cluster_assume_role_policy_netflix = file("${path.module}/assume_role_policy.json")
aws_iam_role_eks_nodegroup_role_netflix_name = var.aws_iam_role_eks_nodegroup_role_netflix_name
eks_netflix_url = module.eks.eks_cluster_netflix_url
eks_cluster_netflix = module.eks.eks_cluster_netflix
aws_iam_role_eks_cluster_assume_role_policy_netflix_updated = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"eks.amazonaws.com",
"ec2.amazonaws.com"
]
},
"Action": "sts:AssumeRole"
},
{
"Effect": "Allow",
"Principal": {
"Federated": "${data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringLike": {
"${replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, "arn:aws:iam::951507339182:oidc-provider/", "")}:sub": "system:serviceaccount:*"
}
}
}
]
}
EOF
data_http_lbc_iam_policy_url = var.data_http_lbc_iam_policy_url
data_http_lbc_iam_policy_request_headers_accept = var.data_http_lbc_iam_policy_request_headers_accept
aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy = var.aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy
aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController = var.aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController
aws_iam_role_eks_nodegroup_role_netflix_assume_role_policy = var.aws_iam_role_eks_nodegroup_role_netflix_assume_role_policy
aws_iam_policy_attachment_eks_worker_node_policy_name = var.aws_iam_policy_attachment_eks_worker_node_policy_name
aws_iam_policy_attachment_eks_worker_node_policy_policy_arn = var.aws_iam_policy_attachment_eks_worker_node_policy_policy_arn
aws_iam_policy_attachment_eks_cni_policy_name = var.aws_iam_policy_attachment_eks_cni_policy_name
aws_iam_policy_attachment_eks_cni_policy_policy_arn = var.aws_iam_policy_attachment_eks_cni_policy_policy_arn
aws_iam_policy_attachment_eks_ec2_container_registry_readonly_name = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_name
aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn = var.aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn
}

module "sg" {
source = ".././modules/sg" # Replace with the actual path to your module directory

# Use the same variable for multiple input arguments
security_group_name = var.security_group_name
security_group_description = var.security_group_description
security_group_name_eks_cluster = var.security_group_name_eks_cluster
security_group_description_eks_cluster = var.security_group_description_eks_cluster
vpc_id = module.vpc.vpc_id
port_80 = var.port_80
port_443 = var.port_443
port_22 = var.port_22
port_3000 = var.port_3000
port_8080 = var.port_8080
# port_8081 = var.port_8081
port_10250 = var.port_10250
port_30007 = var.port_30007
port_9000 = var.port_9000
port_9090 = var.port_9090
port_9100 = var.port_9100
port_9443 = var.port_9443
port_3306 = var.port_3306
security_group_protocol = var.security_group_protocol
web_cidr = var.web_cidr
private_ip_address = var.private_ip_address
vpc_cidr_block = var.vpc_cidr_block
}

module "vpc" {
source = ".././modules/vpc" # Replace with the actual path to your module directory

# Use the same variable for multiple input arguments
vpc_cidr_block = var.vpc_cidr_block
vpc_name = var.vpc_name
public_subnet_cidr_blocks = var.public_subnet_cidr_blocks
private_subnet_cidr_blocks = var.private_subnet_cidr_blocks
# subnet = var.subnet
igw_name = var.igw_name
web_cidr = var.web_cidr
rt_name = var.rt_name
rt_association = var.rt_association
availability_zones = var.availability_zones
aws_subnet_public_name = var.aws_subnet_public_name
aws_subnet_public_eks_alb = var.aws_subnet_public_eks_alb
aws_subnet_public_eks_alb_value = var.aws_subnet_public_eks_alb_value
aws_subnet_private_name = var.aws_subnet_private_name
aws_subnet_private_eks_alb = var.aws_subnet_private_eks_alb
aws_subnet_private_eks_alb_value = var.aws_subnet_private_eks_alb_value
}

Applying the modules from the modules folder:

asg

eks

iam

sg

vpc

data.tf

data "aws_caller_identity" "current" {

}

data "aws_iam_openid_connect_provider" "eks_cluster_netflix" {
arn = module.iam.oidc_provider_arn
}

data "aws_eks_cluster_auth" "cluster" {
name = module.eks.eks_cluster_netflix_name
}

The data sources above retrieve the account info, the IAM OIDC provider info and the cluster credentials.

assume_role_policy.json

{
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": [
"eks.amazonaws.com",
"ec2.amazonaws.com"
]
}
}
]
}

The policy above is used as the assume role policy for the EKS cluster's IAM role

outputs.tf

output "security_group_name" {
value = var.security_group_name
}

output "security_group_description" {
value = var.security_group_description
}

output "security_group_name_eks_cluster" {
value = var.security_group_name_eks_cluster
}

output "security_group_description_eks_cluster" {
value = var.security_group_description_eks_cluster
}

output "private_ip_address" {
value = var.private_ip_address
}

output "vpc_cidr_block" {
value = var.vpc_cidr_block
}

output "vpc_id" {
value = module.vpc.vpc_id
}

output "vpc_name" {
value = var.vpc_name
}

output "public_subnet_cidr_blocks" {
value = var.public_subnet_cidr_blocks
}

output "private_subnet_cidr_blocks" {
value = var.private_subnet_cidr_blocks
}

output "availability_zones" {
value = var.availability_zones
}

output "igw_name" {
value = var.igw_name
}

output "rt_name" {
value = var.rt_name
}

output "rt_association" {
value = var.rt_association
}

output "eks_cluster_netflix_name" {
value = var.eks_cluster_netflix_name
}

output "aws_eks_node_group_netflix_name" {
value = var.aws_eks_node_group_netflix_name
}

output "aws_eks_node_group_instance_types" {
value = var.aws_eks_node_group_instance_types
}

output "aws_eks_node_group_desired_capacity" {
value = var.aws_eks_node_group_desired_capacity
}

output "aws_eks_node_group_min_size" {
value = var.aws_eks_node_group_min_size
}

output "aws_eks_node_group_max_size" {
value = var.aws_eks_node_group_max_size
}

output "aws_eks_node_group_launch_template_name_prefix" {
value = var.aws_eks_node_group_launch_template_name_prefix
}

output "aws_eks_node_group_launch_template_version" {
value = var.aws_eks_node_group_launch_template_version
}

output "aws_eks_node_group_device_name" {
value = var.aws_eks_node_group_device_name
}

output "aws_eks_node_group_volume_size" {
value = var.aws_eks_node_group_volume_size
}

output "aws_eks_cluster_netflix_version" {
value = var.aws_eks_cluster_netflix_version
}

output "aws_eks_addon_netflix_addon_name" {
value = var.aws_eks_addon_netflix_addon_name
}

output "aws_eks_addon_netflix_addon_version" {
value = var.aws_eks_addon_netflix_addon_version
}

output "key_pair_name" {
value = var.key_pair_name
}

output "aws_launch_template_netflix_name_prefix" {
value = var.aws_launch_template_netflix_name_prefix
}

output "aws_launch_template_netflix_image_id" {
value = var.aws_launch_template_netflix_image_id
}

output "aws_launch_template_netflix_instance_type" {
value = var.aws_launch_template_netflix_instance_type
}

output "eks_asg_name" {
value = module.eks.eks_asg_name
}

output "sg_id" {
value = module.sg.security_group_id
}

output "subnet_ids" {
value = module.vpc.subnet_ids
}

output "oidc_provider_arn" {
value = module.iam.oidc_provider_arn
}

These outputs expose all the information needed by later stages or for visibility.

providers.tf

provider "aws" {
region = "us-east-1"
profile = "default"
}

provider "kubernetes" {
host = module.eks.eks_cluster_netflix.endpoint
cluster_ca_certificate = base64decode(module.eks.eks_cluster_netflix.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
}

The aws provider is configured here, and the kubernetes provider is configured with the host, CA certificate and token taken from the eks module outputs and the aws_eks_cluster_auth data source.
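A variation worth noting: instead of a static token from aws_eks_cluster_auth, the kubernetes provider can fetch a fresh token at plan/apply time via its exec plugin (a sketch, cluster name assumed from this project):

provider "kubernetes" {
  host                   = module.eks.eks_cluster_netflix.endpoint
  cluster_ca_certificate = base64decode(module.eks.eks_cluster_netflix.certificate_authority[0].data)

  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", "eks-netflix-cluster"]
  }
}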

s3.tf

terraform {
backend "s3" {
bucket = "eks-netflix-argocd"
key = "eks-cluster"
region = "us-east-1"
encrypt = true
profile = "default"
}
}

This configures the Terraform state file in an S3 bucket.
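The bucket must already exist before terraform init runs against this backend. A one-off bootstrap sketch (run from a separate configuration; bucket name taken from the backend block above):

resource "aws_s3_bucket" "tf_state" {
  bucket = "eks-netflix-argocd"
}

resource "aws_s3_bucket_versioning" "tf_state" {
  bucket = aws_s3_bucket.tf_state.id
  versioning_configuration {
    status = "Enabled"
  }
}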

terraform.tfvars (values of variables needed)

(you may find it in the repo here)

variables.tf

# security group
variable "security_group_name" {
description = "Name of the AWS security group"
type = string
}

variable "security_group_description" {
description = "Description of the AWS security group"
type = string
}

variable "security_group_name_eks_cluster" {
description = "Name of the AWS security group for eks cluster"
type = string
}

variable "security_group_description_eks_cluster" {
description = "Description of the AWS security group for eks cluster"
type = string
}


# variable "vpc_id" {
# description = "ID of the VPC where the security group will be created"
# type = string
# }

variable "port_80" {
description = "Port for HTTP traffic (e.g., 80)"
type = number
}

variable "port_443" {
description = "Port for HTTPS traffic (e.g., 443)"
type = number
}

variable "port_22" {
description = "Port for SSH access (e.g., 22)"
type = number
}

variable "port_3000" {
description = "Port for HTTP access for Grafana (e.g., 22)"
type = number
}

variable "port_8080" {
description = "Port for HTTP access for Jenkins (e.g., 8080)"
type = number
}

variable "port_30007" {
description = "Port for HTTP access for Netflix (e.g., 30007)"
type = number
}

# variable "port_8081" {
# description = "Port for HTTP access for Netflix (e.g., 8081)"
# type = number
# }


variable "port_10250" {
description = "Port for HTTP access for Argocd (e.g., 10250)"
type = number
}

variable "port_9000" {
description = "Port for HTTP access for Netflix (e.g., 9000)"
type = number
}

variable "port_9090" {
description = "Port for HTTP access for Prometheus (e.g., 9090)"
type = number
}

variable "port_9100" {
description = "Port for HTTP access for Node Exporter (e.g., 9100)"
type = number
}

variable "port_9443" {
description = "Port for HTTP access for Argocd Manifest (e.g., 9443)"
type = number
}

variable "port_3306" {
description = "Port for MySQL access for RDS (e.g., 3306)"
type = number
}

variable "security_group_protocol" {
description = "Protocol for the security group rules (e.g., 'tcp', 'udp', 'icmp', etc.)"
type = string
}

variable "web_cidr" {
description = "CIDR block for incoming HTTP and HTTPS traffic"
type = string
}

variable "private_ip_address" {
description = "CIDR block for private IP addresses (e.g., for SSH, Jenkins, MySQL)"
type = string
}


# VPC variables
variable "vpc_cidr_block" {
description = "CIDR block for the VPC"
type = string
}

variable "vpc_name" {
description = "Name for the VPC"
type = string
}

# Subnet variables
variable "public_subnet_cidr_blocks" {
description = "List of CIDR blocks for public subnets"
type = list(string)
}

variable "private_subnet_cidr_blocks" {
description = "List of CIDR blocks for private subnets"
type = list(string)
}

# variable "subnet" {
# description = "Name of the subnet"
# type = string
# }

# Internet Gateway variables
variable "igw_name" {
description = "Name for the Internet Gateway"
type = string
}

variable "rt_name" {
description = "Name for the Route Table"
type = string
}

# Route Table Association variables
variable "rt_association" {
description = "Name prefix for Route Table Association"
type = string
}

variable "availability_zones" {
type = list(string)
}

variable "aws_subnet_public_name" {
description = "Name of the public subnet"
type = string
}

variable "aws_subnet_public_eks_alb" {
description = "Name of the public subnet associated with the EKS Application Load Balancer (ALB)"
type = string
}

variable "aws_subnet_public_eks_alb_value" {
description = "Value of the public subnet associated with the EKS ALB"
type = number
}

variable "aws_subnet_private_name" {
description = "Name of the private subnet"
type = string
}

variable "aws_subnet_private_eks_alb" {
description = "Name of the private subnet associated with the EKS Application Load Balancer (ALB)"
type = string
}

variable "aws_subnet_private_eks_alb_value" {
description = "Value of the private subnet associated with the EKS ALB"
type = number
}


# eks

variable "aws_eks_cluster_netflix_version" {
description = "The version of netflix to use with AWS EKS cluster"
type = string
# You can set your desired default value here
}

variable "eks_cluster_netflix_name" {
description = "Name of the netflix EKS cluster"
type = string
}

variable "aws_eks_node_group_netflix_name" {
description = "Name of the netflix EKS node group"
type = string
}

variable "aws_eks_node_group_instance_types" {
description = "Instance types for the EKS node group"
type = string
}

variable "aws_eks_node_group_desired_capacity" {
description = "Desired capacity for the EKS node group"
type = number
}

variable "aws_eks_node_group_min_size" {
description = "Minimum size for the EKS node group"
type = number
}

variable "aws_eks_node_group_max_size" {
description = "Maximum size for the EKS node group"
type = number
}

variable "aws_eks_node_group_launch_template_name_prefix" {
description = "Name prefix for the EKS node group launch template"
type = string
}

variable "aws_eks_node_group_launch_template_version" {
description = "Version for the EKS node group launch template"
type = string
}

variable "aws_eks_node_group_device_name" {
description = "Device name for the EKS node group block device mappings"
type = string
}

variable "aws_eks_node_group_volume_size" {
description = "Volume size for the EKS node group block device mappings"
type = number
}

variable "aws_eks_addon_netflix_addon_name" {
description = "Name of the AWS EKS addon for netflix"
type = string
}

variable "aws_eks_addon_netflix_addon_version" {
description = "Version of the AWS EKS addon for netflix"
type = string
}

# variable "kubernetes_manifest_netflix_manifest" {
# type = map(string)
# description = "List of paths to Kubernetes manifest files for the retail store sample app"
# }

# variable "apply_kubernetes_manifest_netflix_command" {
# type = string
# description = "Command for applying the Kubernetes manifest for the retail store sample app"
# }

# variable "wait_for_deployments_netflix_command" {
# type = string
# description = "Command for waiting for deployments to be available for the retail store sample app"
# }

variable "aws_eks_cluster_netflix_enabled_cluster_log_types" {
description = "The log types of Netflix EKS cluster"
type = list(string)
}

variable "aws_instance_eks_cluster_netflix_bastion_host_file_type" {
description = "The file type of the Netflix bastion host file"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_file_user" {
description = "The user for accessing the Netflix bastion host file"
type = string
}

# asg
variable "key_pair_name" {
description = "Name of the AWS Key Pair to associate with EC2 instances"
type = string
# Set a default value if needed
}

variable "aws_launch_template_netflix_name_prefix" {
description = "Name prefix for the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_image_id" {
description = "AMI ID for the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_instance_type" {
description = "Instance type for the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_block_device_mappings_device_name" {
description = "Device name for block device mappings in the AWS launch template"
type = string
}

variable "aws_launch_template_netflix_block_device_mappings_volume_size" {
description = "Volume size for block device mappings in the AWS launch template"
type = number
}

variable "aws_launch_template_netflix_create_before_destroy" {
description = "Lifecycle setting for create_before_destroy in the AWS launch template"
type = bool
}

variable "aws_autoscaling_group_netflix_desired_capacity" {
description = "Desired capacity for the AWS Auto Scaling Group"
type = number
}

variable "aws_autoscaling_group_netflix_max_size" {
description = "Maximum size for the AWS Auto Scaling Group"
type = number
}

variable "aws_autoscaling_group_netflix_min_size" {
description = "Minimum size for the AWS Auto Scaling Group"
type = number
}

variable "aws_autoscaling_group_netflix_launch_template_version" {
description = "Launch template version for the AWS Auto Scaling Group"
type = string
}

variable "aws_autoscaling_group_netflix_tag_key" {
description = "Tag key for the AWS Auto Scaling Group instances"
type = string
}

variable "aws_autoscaling_group_netflix_tag_value" {
description = "Tag value for the AWS Auto Scaling Group instances"
type = string
}

variable "aws_autoscaling_group_netflix_tag_propagate_at_launch" {
description = "Tag propagation setting for the AWS Auto Scaling Group instances"
type = bool
}

variable "aws_launch_template_netflix_user_data" {
description = "Userdata file"
type = string
}

# iam
variable "aws_iam_role_eks_cluster_netflix_name" {
description = "Iam role name for esk cluster netflix"
type = string
}

# variable "aws_iam_role_eks_cluster_assume_role_policy_netflix_updated" {
# description = "Name of the IAM role associated with EKS nodegroups for netflix"
# type = string
# # You can set a default value if needed
# # default = "example-role-name"
# }

variable "aws_iam_role_eks_nodegroup_role_netflix_name" {
type = string
description = "IAM role policy for assuming roles in the EKS cluster for Netflix (updated)"
}

variable "data_http_lbc_iam_policy_url" {
description = "The URL for the IAM policy document for the data HTTP load balancer controller"
type = string
}

variable "data_http_lbc_iam_policy_request_headers_accept" {
description = "The value for the 'Accept' header in the IAM policy document for the data HTTP load balancer controller"
type = string
}

variable "aws_iam_role_policy_attachment_eks_AmazonEKSClusterPolicy" {
description = "ARN of the IAM policy attached to an EKS cluster role allowing control plane to make API requests on your behalf"
type = string
}

variable "aws_iam_role_policy_attachment_eks_AmazonEKSVPCResourceController" {
description = "ARN of the IAM policy attached to an EKS cluster role allowing the VPC resource controller to make API requests on your behalf"
type = string
}

variable "aws_iam_role_eks_nodegroup_role_netflix_assume_role_policy" {
description = "The assume role policy document for the Netflix EKS node group role"
type = any
}

variable "aws_iam_policy_attachment_eks_worker_node_policy_name" {
description = "The name of the IAM policy attachment for the EKS worker node policy"
type = string
}

variable "aws_iam_policy_attachment_eks_worker_node_policy_policy_arn" {
description = "The ARN of the IAM policy attached to EKS worker nodes"
type = string
}

variable "aws_iam_policy_attachment_eks_cni_policy_name" {
description = "The name of the IAM policy attachment for the EKS CNI policy"
type = string
}

variable "aws_iam_policy_attachment_eks_cni_policy_policy_arn" {
description = "The ARN of the IAM policy attached to EKS CNI"
type = string
}

variable "aws_iam_policy_attachment_eks_ec2_container_registry_readonly_name" {
description = "The name of the IAM policy attachment for EC2 Container Registry readonly access"
type = string
}

variable "aws_iam_policy_attachment_eks_ec2_container_registry_readonly_policy_arn" {
description = "The ARN of the IAM policy attached to EC2 Container Registry for readonly access"
type = string
}


# bastion
variable "aws_instance_eks_cluster_netflix_bastion_host_ami" {
description = "The AMI ID for the bastion host"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_instance_type" {
description = "The instance type for the bastion host"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_tags" {
description = "Tags for the bastion host instance"
type = string
}

# variable "aws_instance_eks_cluster_netflix_bastion_host_provisioner_source" {
# description = "Source path of the file to be provisioned to the bastion host"
# type = string
# }

variable "aws_instance_eks_cluster_netflix_bastion_host_provisioner_destination" {
description = "Destination path on the bastion host where the file will be copied"
type = string
}

variable "aws_instance_eks_cluster_netflix_bastion_host_remote_exec_inline" {
description = "Inline script to be executed on the bastion host using remote-exec provisioner"
type = list(string)
}

versions.tf

terraform {
required_version = "~>1.7"

required_providers {
aws = {
source = "hashicorp/aws"
version = "5.39.0"
}
# You can specify additional required providers here.
}
}

web-ec2.pem

-----BEGIN RSA PRIVATE KEY-----
<your private key>
-----END RSA PRIVATE KEY-----

Now let's run the scripts:

terraform init
terraform fmt
terraform validate
terraform plan
terraform apply --auto-approve

This may take about 11 minutes

Now let's check the AWS console.

Two instances were created: the bastion host and the worker node from the EKS cluster.

Two launch templates:

netflix-launch-template20240314012927396500000004, created by our launch template resource in Terraform

eks-f6c71d4d-e685-d7aa-5f3c-ec99c5ffe65f, created automatically by the EKS node group

One Auto Scaling group, created by EKS

The EKS cluster was created

The EKS node group was created

There are two ways to check the EKS cluster's resources:

one is the Resources view inside the EKS cluster console

the other is accessing the worker node via the bastion host, as shown below

Now let’s move on to stage 2

eks_alb

main.tf

module "eks_alb" {
source = ".././modules/eks_alb"

helm_release_loadbalancer_controller_name = var.helm_release_loadbalancer_controller_name
helm_release_loadbalancer_controller_repository = var.helm_release_loadbalancer_controller_repository
helm_release_loadbalancer_controller_chart = var.helm_release_loadbalancer_controller_chart
helm_release_loadbalancer_controller_namespace = var.helm_release_loadbalancer_controller_namespace
helm_release_loadbalancer_controller_set_image_name = var.helm_release_loadbalancer_controller_set_image_name
helm_release_loadbalancer_controller_set_image_value = var.helm_release_loadbalancer_controller_set_image_value
vpc_id = data.aws_vpc.vpc.id
aws_region = var.aws_region
aws_eks_cluster_auth_cluster_name = data.aws_eks_cluster_auth.cluster.id
aws_iam_policy_lbc_iam_policy_name = var.aws_iam_policy_lbc_iam_policy_name
aws_iam_policy_lbc_iam_policy_path = var.aws_iam_policy_lbc_iam_policy_path
aws_iam_policy_lbc_iam_policy_description = var.aws_iam_policy_lbc_iam_policy_description
aws_iam_role_lbc_iam_role_name = var.aws_iam_role_lbc_iam_role_name
kubernetes_ingress_class_v1_ingress_class_default_metadata_name = var.kubernetes_ingress_class_v1_ingress_class_default_metadata_name
kubernetes_ingress_class_v1_ingress_class_default_metadata_annotations = var.kubernetes_ingress_class_v1_ingress_class_default_metadata_annotations
kubernetes_ingress_class_v1_ingress_class_default_spec = var.kubernetes_ingress_class_v1_ingress_class_default_spec
helm_release_loadbalancer_controller_set_service_account_create_name = var.helm_release_loadbalancer_controller_set_service_account_create_name
helm_release_loadbalancer_controller_set_service_account_create_value = var.helm_release_loadbalancer_controller_set_service_account_create_value
helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_name = var.helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_name
helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_value = var.helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_value
helm_release_loadbalancer_controller_set_service_account_name = var.helm_release_loadbalancer_controller_set_service_account_name
helm_release_loadbalancer_controller_set_service_account_value = var.helm_release_loadbalancer_controller_set_service_account_value
aws_autoscaling_attachment_alb_attachment_autoscaling_group_name = data.aws_autoscaling_group.eks.name
security_group = data.aws_security_group.all.id
public_subnets = data.aws_subnets.public.ids
eks_worker_node_id = data.aws_instances.eks_worker_node.ids
# env = local.env
# region = var.aws_region
argocd_version = var.argocd_version # "3.35.4"
loadbalancer_dns = module.eks_alb.alb_dns_name
fqdn = module.eks_alb.alb_dns_name
aws_iam_role_lbc_iam_role_assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": {
"Service": [
"eks.amazonaws.com",
"ec2.amazonaws.com"
]
},
"Action": "sts:AssumeRole"
},
{
"Effect": "Allow",
"Principal": {
"Federated": "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${data.aws_iam_openid_connect_provider.eks_cluster_netflix.url}"
},
"Action": "sts:AssumeRoleWithWebIdentity",
"Condition": {
"StringEquals": {
"${replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, "arn:aws:iam::951507339182:oidc-provider/", "")}:aud": "sts.amazonaws.com",
"${replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, "arn:aws:iam::951507339182:oidc-provider/", "")}:sub": "system:serviceaccount:kube-system:aws-load-balancer-controller"
}
}
}
]
}
EOF
kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class = var.kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class
kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class_value = var.kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class_value
kubernetes_ingress_class_v1_ingress_class_default_spec_controller_alb = var.kubernetes_ingress_class_v1_ingress_class_default_spec_controller_alb
aws_iam_role_lbc_iam_role_tags = var.aws_iam_role_lbc_iam_role_tags
data_http_lbc_iam_policy_url = var.data_http_lbc_iam_policy_url
data_http_lbc_iam_policy_request_headers_accept = var.data_http_lbc_iam_policy_request_headers_accept
kubernetes_ingress_v1_argocd_ingress_metadata_name = var.kubernetes_ingress_v1_argocd_ingress_metadata_name
kubernetes_ingress_v1_argocd_ingress_metadata_namespace = var.kubernetes_ingress_v1_argocd_ingress_metadata_namespace
kubernetes_ingress_v1_argocd_ingress_labels_name_argocd_server = var.kubernetes_ingress_v1_argocd_ingress_labels_name_argocd_server
kubernetes_ingress_v1_argocd_ingress_labels_value_argocd_server = var.kubernetes_ingress_v1_argocd_ingress_labels_value_argocd_server
kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class = var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class
kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name = var.kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name
kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol = var.kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol
kubernetes_ingress_v1_argocd_ingress_annotations_scheme = var.kubernetes_ingress_v1_argocd_ingress_annotations_scheme
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path
kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect = var.kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds
kubernetes_ingress_v1_argocd_ingress_annotations_success_codes = var.kubernetes_ingress_v1_argocd_ingress_annotations_success_codes
kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count
kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count = var.kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count
kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_name = var.kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_name
kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_port = var.kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_port
kubernetes_ingress_v1_argocd_ingress_tls_secret_name = var.kubernetes_ingress_v1_argocd_ingress_tls_secret_name
kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_value
kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name_value
kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol_value
kubernetes_ingress_v1_argocd_ingress_annotations_scheme_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_scheme_value
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol_value
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port_value
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path_value
kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect_value
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds_value
kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds_value
kubernetes_ingress_v1_argocd_ingress_annotations_success_codes_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_success_codes_value
kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count_value
kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count_value = var.kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count_value
aws_lb_eks_name = var.aws_lb_eks_name
aws_lb_eks_internal_bool = var.aws_lb_eks_internal_bool
aws_lb_eks_load_balancer_type = var.aws_lb_eks_load_balancer_type
aws_lb_target_group_netflix_tg_name = var.aws_lb_target_group_netflix_tg_name
aws_lb_target_group_netflix_tg_port = var.aws_lb_target_group_netflix_tg_port
aws_lb_target_group_netflix_tg_protocol = var.aws_lb_target_group_netflix_tg_protocol
aws_lb_listener_http_listener_netflix_port = var.aws_lb_listener_http_listener_netflix_port
aws_lb_listener_http_listener_netflix_protocol = var.aws_lb_listener_http_listener_netflix_protocol
aws_lb_listener_http_listener_netflix_default_action_type = var.aws_lb_listener_http_listener_netflix_default_action_type
aws_lb_target_group_netflix_tg_health_check = var.aws_lb_target_group_netflix_tg_health_check
aws_lb_target_group_eks_tg_sonarqube_name = var.aws_lb_target_group_eks_tg_sonarqube_name
aws_lb_target_group_eks_tg_sonarqube_port = var.aws_lb_target_group_eks_tg_sonarqube_port
aws_lb_target_group_eks_tg_sonarqube_protocol = var.aws_lb_target_group_eks_tg_sonarqube_protocol
aws_lb_target_group_eks_tg_sonarqube_health_check = var.aws_lb_target_group_eks_tg_sonarqube_health_check
aws_lb_listener_http_listener_sonarqube_port = var.aws_lb_listener_http_listener_sonarqube_port
aws_lb_listener_http_listener_sonarqube_protocol = var.aws_lb_listener_http_listener_sonarqube_protocol
aws_lb_listener_http_listener_sonarqube_default_action_type = var.aws_lb_listener_http_listener_sonarqube_default_action_type
aws_lb_target_group_eks_tg_grafana_name = var.aws_lb_target_group_eks_tg_grafana_name
aws_lb_target_group_eks_tg_grafana_port = var.aws_lb_target_group_eks_tg_grafana_port
aws_lb_target_group_eks_tg_grafana_protocol = var.aws_lb_target_group_eks_tg_grafana_protocol
aws_lb_target_group_eks_tg_grafana_health_check = var.aws_lb_target_group_eks_tg_grafana_health_check
aws_lb_listener_http_listener_grafana_port = var.aws_lb_listener_http_listener_grafana_port
aws_lb_listener_http_listener_grafana_protocol = var.aws_lb_listener_http_listener_grafana_protocol
aws_lb_listener_http_listener_grafana_default_action_type = var.aws_lb_listener_http_listener_grafana_default_action_type
aws_lb_target_group_eks_tg_prometheus_name = var.aws_lb_target_group_eks_tg_prometheus_name
aws_lb_target_group_eks_tg_prometheus_port = var.aws_lb_target_group_eks_tg_prometheus_port
aws_lb_target_group_eks_tg_prometheus_protocol = var.aws_lb_target_group_eks_tg_prometheus_protocol
aws_lb_target_group_eks_tg_prometheus_health_check = var.aws_lb_target_group_eks_tg_prometheus_health_check
aws_lb_listener_http_listener_prometheus_port = var.aws_lb_listener_http_listener_prometheus_port
aws_lb_listener_http_listener_prometheus_protocol = var.aws_lb_listener_http_listener_prometheus_protocol
aws_lb_listener_http_listener_prometheus_default_action_type = var.aws_lb_listener_http_listener_prometheus_default_action_type
aws_lb_target_group_eks_tg_node_exporter_name = var.aws_lb_target_group_eks_tg_node_exporter_name
aws_lb_target_group_eks_tg_node_exporter_port = var.aws_lb_target_group_eks_tg_node_exporter_port
aws_lb_target_group_eks_tg_node_exporter_protocol = var.aws_lb_target_group_eks_tg_node_exporter_protocol
aws_lb_target_group_eks_tg_node_exporter_health_check = var.aws_lb_target_group_eks_tg_node_exporter_health_check
aws_lb_listener_http_listener_node_exporter_port = var.aws_lb_listener_http_listener_node_exporter_port
aws_lb_listener_http_listener_node_exporter_protocol = var.aws_lb_listener_http_listener_node_exporter_protocol
aws_lb_listener_http_listener_node_exporter_default_action_type = var.aws_lb_listener_http_listener_node_exporter_default_action_type
kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_name = var.kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_name
null_resource_get_argocd_admin_password_remote_exec_inline = var.null_resource_get_argocd_admin_password_remote_exec_inline
null_resource_get_argocd_admin_password_connection_type = var.null_resource_get_argocd_admin_password_connection_type
null_resource_get_argocd_admin_password_connection_user = var.null_resource_get_argocd_admin_password_connection_user
null_resource_get_argocd_admin_password_connection_private_key = "${path.module}/web-ec2.pem"
null_resource_output_argocd_admin_password_remote_exec_inline = var.null_resource_output_argocd_admin_password_remote_exec_inline
null_resource_output_argocd_admin_password_connection_type = var.null_resource_output_argocd_admin_password_connection_type
null_resource_output_argocd_admin_password_connection_user = var.null_resource_output_argocd_admin_password_connection_user
null_resource_output_argocd_admin_password_connection_private_key = "${path.module}/web-ec2.pem"
}


# locals {
# env = "dev"
# region = "us-east-1"
# }

data.tf

data "terraform_remote_state" "s3" {
backend = "s3" # Set your backend configuration here
config = {
bucket = "eks-netflix-argocd"
key = "eks-cluster"
region = "us-east-1"
}
}


# Datasource: EKS Cluster Auth
data "aws_eks_cluster_auth" "cluster" {
name = data.terraform_remote_state.s3.outputs.eks_cluster_netflix_name
}

data "aws_iam_openid_connect_provider" "eks_cluster_netflix" {
arn = data.terraform_remote_state.s3.outputs.oidc_provider_arn
}

data "external" "api_server_endpoint" {
program = ["bash", "${path.module}/api_server_endpoint.sh"]
}

data "aws_eks_cluster" "certificate_authority" {
name = data.terraform_remote_state.s3.outputs.eks_cluster_netflix_name
}


# vpc

data "aws_vpc" "vpc" {
id = data.terraform_remote_state.s3.outputs.vpc_id
}

# eks asg
data "aws_autoscaling_group" "eks" {
name = data.terraform_remote_state.s3.outputs.eks_asg_name

}

# # eks worker node
# data "aws_instance" "eks" {
# instance_id = data.terraform_remote_state.s3.outputs.eks_asg_instance_ids
# }

# security group
data "aws_security_group" "all" {
id = data.terraform_remote_state.s3.outputs.sg_id
}

# subnet ids
data "aws_subnets" "public" {
filter {
name = "vpc-id"
values = [data.terraform_remote_state.s3.outputs.vpc_id]
}

tags = {
Name = "public_subnets"
}

}

data "aws_instances" "eks_worker_node" {

filter {
name = "tag:aws:eks:cluster-name"
values = ["eks-netflix-cluster"]
}
}


# data "aws_lb" "eks" {
# name = "eks-netflix"

# depends_on = [module.eks_alb.aws_lb]
# }

data "aws_caller_identity" "current" {}


locals {
arn_prefix = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/"
aud_sub_key = replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, local.arn_prefix, "")
}
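
The locals above strip the arn:aws:iam::<account_id>:oidc-provider/ prefix from the OIDC provider ARN. As a hedged suggestion (not what the repo currently does), local.aud_sub_key could replace the hardcoded account ID inside the assume-role policy heredoc in main.tf:

"Condition": {
  "StringEquals": {
    "${local.aud_sub_key}:aud": "sts.amazonaws.com",
    "${local.aud_sub_key}:sub": "system:serviceaccount:kube-system:aws-load-balancer-controller"
  }
}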

The data source below runs a bash script to retrieve the api_server_endpoint of the EKS cluster:

data "external" "api_server_endpoint" {
program = ["bash", "${path.module}/api_server_endpoint.sh"]
}

api_server_endpoint.sh (in the current directory)

#!/bin/bash

# Execute your command to retrieve api server endpoint data
api_endpoint=$(sudo -u ubuntu /usr/local/bin/aws eks describe-cluster --region us-east-1 --name eks-netflix-cluster --query "cluster.endpoint" --output text)

# Construct a JSON object with the api server endpoint data
cat <<EOF
{
"api_endpoint": "$api_endpoint"
}
EOF

The script above runs the AWS CLI locally (as the ubuntu user) to retrieve the cluster endpoint and wraps it in the JSON shape the external data source expects.
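
As an aside, the aws_eks_cluster data source already declared in data.tf also exposes the API server endpoint directly, so the external script could in principle be skipped. A minimal alternative sketch (my suggestion, not what the project uses):

# The endpoint attribute of the EKS cluster data source returns the same value.
output "host_from_data_source" {
  value = data.aws_eks_cluster.certificate_authority.endpoint
}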

outputs.tf

output "cluster_ca_certificate" {
value = base64decode(data.aws_eks_cluster.certificate_authority.certificate_authority[0].data)
}

output "host" {
value = data.external.api_server_endpoint.result["api_endpoint"]
}

# output "eks_asg_instance_ids" {
# value = data.aws_instance.eks_worker_node.id
# }

# output "output_argocd_admin_password" {
# value = module.eks_alb.output_argocd_admin_password
# }

output "aws_instance_eks_netflix_private_ip" {
value = module.eks_alb.aws_instance_eks_netflix_private_ip
}

These outputs expose the values needed from this stage.

providers.tf

provider "aws" {
region = "us-east-1"
profile = "default"
}

# HELM Provider
provider "helm" {
kubernetes {
host = data.external.api_server_endpoint.result["api_endpoint"]
cluster_ca_certificate = base64decode(data.aws_eks_cluster.certificate_authority.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
}
}

# Terraform Kubernetes Provider
provider "kubernetes" {
host = data.external.api_server_endpoint.result["api_endpoint"]
cluster_ca_certificate = base64decode(data.aws_eks_cluster.certificate_authority.certificate_authority[0].data)
token = data.aws_eks_cluster_auth.cluster.token
}

The aws, helm, and kubernetes providers are configured; helm and kubernetes reuse the cluster endpoint, CA certificate, and auth token gathered by the data sources above.
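
If you prefer not to pass a static token, both the helm and kubernetes providers also accept an exec block that shells out to the AWS CLI for short-lived credentials on every run. A hedged variant, assuming the AWS CLI is installed where Terraform runs:

provider "kubernetes" {
  host                   = data.external.api_server_endpoint.result["api_endpoint"]
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.certificate_authority.certificate_authority[0].data)

  # Fetch a fresh token via the AWS CLI instead of embedding one from a data source.
  exec {
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", "eks-netflix-cluster", "--region", "us-east-1"]
  }
}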

s3.tf

terraform {
backend "s3" {
bucket = "eks-netflix-argocd"
key = "eks-alb"
region = "us-east-1"
encrypt = true
profile = "default"
}
}

The state file for this stage is stored in the same S3 bucket under its own key (eks-alb).

terraform.tfvars (values are found in repo here)

variables.tf

### eks alb
# controller
variable "helm_release_loadbalancer_controller_set_service_account_create_name" {
description = "Name of the Kubernetes service account to create for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_set_service_account_create_value" {
description = "Value of the Kubernetes service account to create for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_name" {
description = "Name of the Kubernetes service account for Enable Service Mutator Webhook."
type = string
}

variable "helm_release_loadbalancer_controller_set_enableServiceMutatorWebhook_value" {
description = "Value of the Kubernetes service account for Enable Service Mutator Webhook."
type = string
}

variable "helm_release_loadbalancer_controller_set_service_account_name" {
description = "Name of the Kubernetes service account for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_set_service_account_value" {
description = "Value of the Kubernetes service account for the load balancer controller."
type = string
}

variable "helm_release_loadbalancer_controller_name" {
type = string
description = "The name of the Helm release for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_repository" {
type = string
description = "The repository URL from which to fetch the Helm chart for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_chart" {
type = string
description = "The name of the Helm chart for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_namespace" {
type = string
description = "The Kubernetes namespace in which to install the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_set_image_name" {
type = string
description = "The name of the Helm chart value to set the image repository for the AWS Load Balancer Controller."
}

variable "helm_release_loadbalancer_controller_set_image_value" {
type = string
description = "The value of the image repository for the AWS Load Balancer Controller."
}

variable "aws_region" {
type = string
description = "The AWS region where the AWS Load Balancer Controller will be deployed."
}

variable "argocd_version" {
type = string
}


# iam
variable "aws_iam_policy_lbc_iam_policy_name" {
type = string
description = "The name of the AWS IAM policy for the Load Balancer Controller."
}

variable "aws_iam_policy_lbc_iam_policy_path" {
type = string
description = "The path for the AWS IAM policy for the Load Balancer Controller."
}

variable "aws_iam_policy_lbc_iam_policy_description" {
type = string
description = "The description of the AWS IAM policy for the Load Balancer Controller."
}

variable "aws_iam_role_lbc_iam_role_name" {
type = string
description = "The name of the AWS IAM role for the Load Balancer Controller."
}

variable "aws_iam_role_lbc_iam_role_tags" {
description = "Tags for the IAM role used by the load balancer controller"
type = string
}


# ingress
variable "kubernetes_ingress_class_v1_ingress_class_default_metadata_name" {
type = string
description = "The name of the Kubernetes Ingress Class metadata."
}

variable "kubernetes_ingress_class_v1_ingress_class_default_metadata_annotations" {
type = map(string)
description = "Annotations for the Kubernetes Ingress Class metadata."
}

variable "kubernetes_ingress_class_v1_ingress_class_default_spec" {
type = string
description = "The controller specification for the Kubernetes Ingress Class."
}

variable "kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class" {
description = "Default annotations for the default class in Kubernetes IngressClass"
type = string
}

variable "kubernetes_ingress_class_v1_ingress_class_default_annotations_default_class_value" {
description = "Default value for the default class annotation in Kubernetes IngressClass"
type = string
}

variable "kubernetes_ingress_class_v1_ingress_class_default_spec_controller_alb" {
description = "Controller configuration for ALB in Kubernetes IngressClass"
type = string
}

variable "data_http_lbc_iam_policy_url" {
description = "The URL for the IAM policy document for the data HTTP load balancer controller"
type = string
}

variable "data_http_lbc_iam_policy_request_headers_accept" {
description = "The value for the 'Accept' header in the IAM policy document for the data HTTP load balancer controller"
type = string
}

# ingress
variable "kubernetes_ingress_v1_argocd_ingress_metadata_name" {
description = "Name of the Ingress resource for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_metadata_namespace" {
description = "Namespace of the Ingress resource for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_labels_name_argocd_server" {
description = "Name label for ArgoCD server"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_labels_value_argocd_server" {
description = "Value label for ArgoCD server"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class" {
description = "Ingress class annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name" {
description = "Load balancer name annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol" {
description = "Backend protocol annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_scheme" {
description = "Scheme annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol" {
description = "Healthcheck protocol annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port" {
description = "Healthcheck port annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path" {
description = "Healthcheck path annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect" {
description = "Force SSL redirect annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds" {
description = "Healthcheck interval in seconds annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds" {
description = "Healthcheck timeout in seconds annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_success_codes" {
description = "Success codes annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count" {
description = "Healthy threshold count annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count" {
description = "Unhealthy threshold count annotation for ArgoCD"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_name" {
description = "Name of the default backend service for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_spec_default_backend_service_port" {
description = "Port of the default backend service for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_tls_secret_name" {
description = "Name of the TLS secret for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_value" {
description = "Value for the 'ingress.class' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_load_balancer_name_value" {
description = "Value for the 'alb.ingress.kubernetes.io/load-balancer-name' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_backend_protocol_value" {
description = "Value for the 'alb.ingress.kubernetes.io/backend-protocol' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_scheme_value" {
description = "Value for the 'alb.ingress.kubernetes.io/scheme' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_protocol_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-protocol' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_port_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-port' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_path_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-path' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_force_ssl_redirect_value" {
description = "Value for the 'alb.ingress.kubernetes.io/force-ssl-redirect' annotation for ArgoCD Ingress"
type = string
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_interval_seconds_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-interval-seconds' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthcheck_timeout_seconds_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthcheck-timeout-seconds' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_success_codes_value" {
description = "Value for the 'alb.ingress.kubernetes.io/success-codes' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_healthy_threshold_count_value" {
description = "Value for the 'alb.ingress.kubernetes.io/healthy-threshold-count' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_unhealthy_threshold_count_value" {
description = "Value for the 'alb.ingress.kubernetes.io/unhealthy-threshold-count' annotation for ArgoCD Ingress"
type = number
}

variable "kubernetes_ingress_v1_argocd_ingress_annotations_ingress_class_name" {
description = "Annotation key for specifying the Ingress class"
type = string
}


# alb & target group - netflix
variable "aws_lb_eks_name" {
description = "Name of the Elastic Load Balancer (ELB) for EKS"
type = string
}

variable "aws_lb_eks_internal_bool" {
description = "Boolean indicating whether the ELB for EKS is internal (true/false)"
type = bool
}

variable "aws_lb_eks_load_balancer_type" {
description = "Type of load balancer for EKS (e.g., application, network)"
type = string
}

variable "aws_lb_target_group_netflix_tg_name" {
description = "Name of the target group for Netflix on the ELB"
type = string
}

variable "aws_lb_target_group_netflix_tg_port" {
description = "Port for the target group for Netflix on the ELB"
type = number
}

variable "aws_lb_target_group_netflix_tg_protocol" {
description = "Protocol for the target group for Netflix on the ELB (e.g., HTTP, HTTPS)"
type = string
}

variable "aws_lb_target_group_netflix_tg_health_check" {
description = "Health check configuration for the target group"
type = object({
path = string
port = string
protocol = string
interval = number
timeout = number
healthy_threshold = number
unhealthy_threshold = number
matcher = string
})
}


variable "aws_lb_listener_http_listener_netflix_port" {
description = "Port for the HTTP listener for Netflix on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_netflix_protocol" {
description = "Protocol for the HTTP listener for Netflix on the ELB (e.g., HTTP, HTTPS)"
type = string
}

variable "aws_lb_listener_http_listener_netflix_default_action_type" {
description = "Type of default action for the HTTP listener for Netflix on the ELB"
type = string
}


# SonarQube
variable "aws_lb_target_group_eks_tg_sonarqube_name" {
description = "Name of the Target Group for SonarQube on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_sonarqube_port" {
description = "Port for the Target Group for SonarQube on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_sonarqube_protocol" {
description = "Protocol for the Target Group for SonarQube on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_sonarqube_health_check" {
description = "Health check configuration for the Target Group for SonarQube on the ELB"
type = object({
path = string
port = string
protocol = string
interval = number
timeout = number
healthy_threshold = number
unhealthy_threshold = number
matcher = string
})
}

variable "aws_lb_listener_http_listener_sonarqube_port" {
description = "Port for the HTTP listener for SonarQube on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_sonarqube_protocol" {
description = "Protocol for the HTTP listener for SonarQube on the ELB"
type = string
}

variable "aws_lb_listener_http_listener_sonarqube_default_action_type" {
description = "Type of default action for the HTTP listener for SonarQube on the ELB"
type = string
}

# alb & target group - Grafana

variable "aws_lb_target_group_eks_tg_grafana_name" {
description = "Name of the Target Group for Grafana on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_grafana_port" {
description = "Port for the Target Group for Grafana on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_grafana_protocol" {
description = "Protocol for the Target Group for Grafana on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_grafana_health_check" {
description = "Health check configuration for the Target Group for Grafana on the ELB"
type = map(string)
}

variable "aws_lb_listener_http_listener_grafana_port" {
description = "Port for the HTTP listener for Grafana on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_grafana_protocol" {
description = "Protocol for the HTTP listener for Grafana on the ELB"
type = string
}

variable "aws_lb_listener_http_listener_grafana_default_action_type" {
description = "Type of default action for the HTTP listener for Grafana on the ELB"
type = string
}

# alb & target group - Prometheus
variable "aws_lb_target_group_eks_tg_prometheus_name" {
description = "Name of the Target Group for Prometheus on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_prometheus_port" {
description = "Port for the Target Group for Prometheus on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_prometheus_protocol" {
description = "Protocol for the Target Group for Prometheus on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_prometheus_health_check" {
description = "Health check configuration for the Target Group for Prometheus on the ELB"
type = map(string)
}

variable "aws_lb_listener_http_listener_prometheus_port" {
description = "Port for the HTTP listener for Prometheus on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_prometheus_protocol" {
description = "Protocol for the HTTP listener for Prometheus on the ELB"
type = string

}

variable "aws_lb_listener_http_listener_prometheus_default_action_type" {
description = "Type of default action for the HTTP listener for Prometheus on the ELB"
type = string
}

# alb & target group - Node Exporter
variable "aws_lb_target_group_eks_tg_node_exporter_name" {
description = "Name of the Target Group for Node Exporter on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_node_exporter_port" {
description = "Port for the Target Group for Node Exporter on the ELB"
type = number
}

variable "aws_lb_target_group_eks_tg_node_exporter_protocol" {
description = "Protocol for the Target Group for Node Exporter on the ELB"
type = string
}

variable "aws_lb_target_group_eks_tg_node_exporter_health_check" {
description = "Health check configuration for the Target Group for Node Exporter on the ELB"
type = map(string)
}

variable "aws_lb_listener_http_listener_node_exporter_port" {
description = "Port for the HTTP listener for Node Exporter on the ELB"
type = number
}

variable "aws_lb_listener_http_listener_node_exporter_protocol" {
description = "Protocol for the HTTP listener for Node Exporter on the ELB"
type = string
}

variable "aws_lb_listener_http_listener_node_exporter_default_action_type" {
description = "Type of default action for the HTTP listener for Node Exporter on the ELB"
type = string
}

# argocd credentials
variable "null_resource_get_argocd_admin_password_remote_exec_inline" {
type = string
# Replace the value with your actual command for fetching the ArgoCD admin password using remote-exec provisioner
}

variable "null_resource_get_argocd_admin_password_connection_type" {
type = string
# Specify the connection type (e.g., ssh)
}

variable "null_resource_get_argocd_admin_password_connection_user" {
type = string
# Specify the username used for the connection
}


variable "null_resource_output_argocd_admin_password_remote_exec_inline" {
type = string
# Replace the value with your actual command for outputting the ArgoCD admin password using remote-exec provisioner
}

variable "null_resource_output_argocd_admin_password_connection_type" {
type = string
# Specify the connection type (e.g., ssh)
}

variable "null_resource_output_argocd_admin_password_connection_user" {
type = string
# Specify the username used for the connection
}
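
The real values for the two remote-exec commands above live in the repo's terraform.tfvars. As a hypothetical illustration only (the command text, file path, and the standard argocd-initial-admin-secret are assumptions, not the repo's actual values), they could look like:

# Hypothetical terraform.tfvars values for the ArgoCD credential commands.
null_resource_get_argocd_admin_password_remote_exec_inline    = "kubectl -n argocd get secret argocd-initial-admin-secret -o jsonpath='{.data.password}' | base64 -d > /home/ubuntu/argocd-password.txt"
null_resource_output_argocd_admin_password_remote_exec_inline = "aws s3 cp /home/ubuntu/argocd-password.txt s3://eks-netflix-argocd/secrets.txt"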

versions.tf

terraform {
required_version = "~>1.7"

required_providers {
aws = {
version = "<= 5.38"
}

# You can specify additional required providers here.
}
}

# Note: Replace the version constraints with those appropriate for your project.

Note: be aware of the change here:

aws = {
version = "<= 5.38"
}

There is a bug in AWS provider version 5.39 when it is used alongside the ArgoCD provider, so we pin the version to <= 5.38.

web-ec2.pem (private key)

Run the script

terraform init
terraform fmt
terraform validate
terraform plan
terraform apply --auto-approve

It takes approximately 3 minutes to run the scripts.

Check in AWS console

ALB — eks-netflix

http://eks-netflix-1695979823.us-east-1.elb.amazonaws.com:9090/

http://eks-netflix-1695979823.us-east-1.elb.amazonaws.com:3000/login

http://eks-netflix-1695979823.us-east-1.elb.amazonaws.com:9100/

http://eks-netflix-1695979823.us-east-1.elb.amazonaws.com:9000/

Now we move on to the last stage for netflix deployment using argocd

argocd_netflix folder

main.tf

module "argocd_netflix" {
source = ".././modules/argocd_netflix"

null_resource_argocd_ready_remote_exec_inline = var.null_resource_argocd_ready_remote_exec_inline
null_resource_wait_for_argocd_connection_type = var.null_resource_wait_for_argocd_connection_type
null_resource_wait_for_argocd_connection_user = var.null_resource_wait_for_argocd_connection_user
null_resource_wait_for_argocd_connection_private_key = "${path.module}/web-ec2.pem"
argocd_application_netflix_metadata_name = var.argocd_application_netflix_metadata_name
argocd_application_netflix_metadata_namespace = var.argocd_application_netflix_metadata_namespace
argocd_application_netflix_spec_project_name = var.argocd_application_netflix_spec_project_name
argocd_application_netflix_spec_source = var.argocd_application_netflix_spec_source
argocd_application_netflix_spec_source_directory = var.argocd_application_netflix_spec_source_directory
argocd_application_netflix_spec_destination = var.argocd_application_netflix_spec_destination
argocd_application_netflix_spec_sync_policy = var.argocd_application_netflix_spec_sync_policy
argocd_application_netflix_spec_sync_policy_automated = var.argocd_application_netflix_spec_sync_policy_automated
# null_resource_get_argocd_admin_password_remote_exec_inline = var.null_resource_get_argocd_admin_password_remote_exec_inline
}

data.tf


data "aws_s3_object" "argocd_netfilx" {
bucket = "eks-netflix-argocd"
key = "secrets.txt"
}


data "aws_lb" "argocd" {
name = "argocd"
}

data "terraform_remote_state" "s3" {
backend = "s3" # Set your backend configuration here
config = {
bucket = "eks-netflix-argocd"
key = "eks-cluster"
region = "us-east-1"
}
}


# Datasource: EKS Cluster Auth
data "aws_eks_cluster_auth" "cluster" {
name = data.terraform_remote_state.s3.outputs.eks_cluster_netflix_name
}

data "aws_iam_openid_connect_provider" "eks_cluster_netflix" {
arn = data.terraform_remote_state.s3.outputs.oidc_provider_arn
}

data "external" "api_server_endpoint" {
program = ["bash", "${path.module}/api_server_endpoint.sh"]
}

data "aws_eks_cluster" "certificate_authority" {
name = data.terraform_remote_state.s3.outputs.eks_cluster_netflix_name
}


# vpc

data "aws_vpc" "vpc" {
id = data.terraform_remote_state.s3.outputs.vpc_id
}

# eks asg
data "aws_autoscaling_group" "eks" {
name = data.terraform_remote_state.s3.outputs.eks_asg_name

}


data "aws_security_group" "all" {
id = data.terraform_remote_state.s3.outputs.sg_id
}

# subnet ids
data "aws_subnets" "public" {
filter {
name = "vpc-id"
values = [data.terraform_remote_state.s3.outputs.vpc_id]
}

tags = {
Name = "public_subnets"
}

}

data "aws_instances" "eks_worker_node" {

filter {
name = "tag:aws:eks:cluster-name"
values = ["eks-netflix-cluster"]
}
}


data "aws_caller_identity" "current" {}


locals {
arn_prefix = "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/"
aud_sub_key = replace(data.aws_iam_openid_connect_provider.eks_cluster_netflix.arn, local.arn_prefix, "")
}

The data source below retrieves the ArgoCD admin password (the secrets.txt object) from the S3 bucket:

data "aws_s3_object" "argocd_netfilx" {
bucket = "eks-netflix-argocd"
key = "secrets.txt"
}

outputs.tf

output "argocd_dns_name" {
value = data.aws_lb.argocd.dns_name
}

output "argocd_password" {
value = data.aws_s3_object.argocd_netfilx.body
sensitive = true
}

These outputs expose the ALB's DNS name and the ArgoCD admin password; the password is marked sensitive so it is masked, and it is output only to help troubleshoot the credentials used for ArgoCD.

providers.tf

provider "argocd" {
server_addr = "${data.aws_lb.argocd.dns_name}:80" # Replace with your ArgoCD server address
username = "admin" # Replace with your ArgoCD username
password = data.aws_s3_object.argocd_netfilx.body
# auth_token = "6yTZadKUz/RrV8j7TwIGeh88qzOJ6UARST+KwpPkv8k="
insecure = true

plain_text = true
grpc_web = true

}

Note: make sure you set the options above (insecure, plain_text, and grpc_web) when accessing ArgoCD through the ALB's DNS name over plain HTTP.

If you use a custom domain with an ACM certificate instead, you don't need insecure = true; see the sketch below.
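
For reference, a hedged sketch of the provider block when ArgoCD sits behind a custom domain with an ACM certificate (the domain is a placeholder):

provider "argocd" {
  # Assumes argocd.example.com points at the ALB and TLS terminates with an ACM cert.
  server_addr = "argocd.example.com:443"
  username    = "admin"
  password    = data.aws_s3_object.argocd_netfilx.body
  grpc_web    = true
}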

s3.tf

terraform {
backend "s3" {
bucket = "eks-netflix-argocd"
key = "argocd-netflix"
region = "us-east-1"
encrypt = true
profile = "default"
}
}

terraform.tfvars (values are stored here in the repo)

variables.tf

variable "null_resource_argocd_ready_remote_exec_inline" {
type = string
}

variable "null_resource_wait_for_argocd_connection_type" {
type = string
}

variable "null_resource_wait_for_argocd_connection_user" {
type = string
}

variable "argocd_application_netflix_metadata_name" {
type = string
}

variable "argocd_application_netflix_metadata_namespace" {
type = string
}

variable "argocd_application_netflix_spec_project_name" {
type = string
}

variable "argocd_application_netflix_spec_source" {
type = any
}

variable "argocd_application_netflix_spec_source_directory" {
type = any
}

variable "argocd_application_netflix_spec_destination" {
type = any
}

variable "argocd_application_netflix_spec_sync_policy" {
type = any
}

# variable "null_resource_get_argocd_admin_password_remote_exec_inline" {
# type = string
# }

variable "argocd_application_netflix_spec_sync_policy_automated" {
type = any
}

These are the variables referenced in the main.tf file above; a hypothetical shape for their values follows.
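
As a hypothetical illustration only (the repository URL, path, and namespaces are placeholders, not the repo's actual values), the metadata, source, and destination variables could be shaped like this in terraform.tfvars:

argocd_application_netflix_metadata_name      = "netflix"
argocd_application_netflix_metadata_namespace = "argocd"
argocd_application_netflix_spec_project_name  = "default"

argocd_application_netflix_spec_source = {
  repo_url        = "https://github.com/<your-account>/netflix-kubernetes.git"
  path            = "Kubernetes"
  target_revision = "HEAD"
}

argocd_application_netflix_spec_destination = {
  server    = "https://kubernetes.default.svc"
  namespace = "default"
}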

versions.tf

terraform {
required_version = "~>1.7"

required_providers {
aws = {
source = "hashicorp/aws"
version = "<= 5.38"
}

argocd = {
source = "oboukili/argocd"
version = "6.0.3"
}
}
}

# Note: Replace the version constraints with those appropriate for your project.

Note: make sure the provider requirements below are specified exactly as shown:

required_providers {
aws = {
source = "hashicorp/aws"
version = "<= 5.38"
}

argocd = {
source = "oboukili/argocd"
version = "6.0.3"
}
}

AWS provider version 5.39 may trigger an error with the ArgoCD provider.

web-ec2.pem (private key)

Let’s run it now

terraform init
terraform fmt
terraform validate
terraform plan
terraform apply --auto-approve

The script takes less than 10 seconds to run.

Final check in the AWS console.

ArgoCD is reachable at argocd-450801520.us-east-1.elb.amazonaws.com

username: admin (the default)

password: let's grab it from our S3 bucket (the secrets.txt object)

Lovely, now it’s deployed without manual work!

Last but not least, let's check out the Netflix app:

http://eks-netflix-1695979823.us-east-1.elb.amazonaws.com:30007/

We made it!

Now let's check Node Exporter on the Prometheus targets page:

http://eks-netflix-1695979823.us-east-1.elb.amazonaws.com:9090/targets?search=

The issue here is that the Netflix app's metrics are collected differently. If you're interested, here is the reference for making it work.

Finally, we will destroy all infrastructure created.

From the argocd_netflix folder:

terraform destroy --auto-approve

From the eks_deletion folder (make sure you do not run destroy in the eks_alb folder: there is a bug affecting ingress controller deletion. My workaround sets an empty finalizers list on the ingress and deletes the security groups created by the ALB ingress):

terraform init
terraform fmt
terraform validate
terraform plan
terraform apply --auto-approve

Note: make sure you run terraform apply rather than terraform destroy here, since this stage creates the cleanup resources we want to run rather than destroying anything itself. Also, its state file is configured the same as eks_alb so that the stage 2 resources are deleted at the same time.

It takes around 1 minute, since a 60-second sleep is configured for the command; a hypothetical sketch of such a cleanup resource follows.
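
For context, this is roughly what such a cleanup resource could look like, based on the workaround described above (the ingress name, namespace, and kubeconfig access are assumptions):

# Hypothetical cleanup resource in the eks_deletion stage.
resource "null_resource" "remove_ingress_finalizers" {
  provisioner "local-exec" {
    command = <<-EOT
      # Clear the finalizers so the ingress can be deleted even if the
      # AWS Load Balancer Controller is already gone.
      kubectl -n argocd patch ingress argocd --type=merge -p '{"metadata":{"finalizers":null}}'
      # Give AWS time to release the ALB and its security groups.
      sleep 60
    EOT
  }
}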

Finally, destroy everything from eks_cluster folder

terraform destroy --auto-approve

Destroying the EKS cluster takes roughly 7 minutes.

Conclusion:

This project aims to automate the provisioning of AWS infrastructure, particularly Amazon Elastic Kubernetes Service (EKS), using Terraform and an Amazon Linux-based EKS Optimized Golden AMI created with Packer. The approach involves defining infrastructure as code with Terraform, integrating the Packer-built Golden AMI into the EKS node group launch template, and customizing worker node configurations via userdata. By leveraging Terraform and the Packer-built Golden AMI, this project provides a streamlined and efficient solution for setting up Amazon EKS clusters. It ensures consistency, scalability, and customization while reducing manual intervention. Additionally, the project documentation offers clear guidance for users to replicate and adapt the infrastructure provisioning process for their specific requirements.

Furthermore, the project demonstrates how to deploy a Netflix application using Argo CD directly from an EKS worker node, utilizing Terraform scripts for automation. This showcases the project’s capability to not only automate the infrastructure setup but also the deployment of complex, real-world applications on the provisioned Kubernetes environment. The integration of Argo CD for continuous delivery within the EKS ecosystem highlights the project’s comprehensive approach to automation, offering a practical example of managing application deployment and lifecycle in a cloud-native landscape.

In conclusion, this project empowers users to automate the deployment of Amazon EKS clusters, facilitating the efficient management of containerized applications on Kubernetes in the AWS cloud environment. It bridges the gap between infrastructure provisioning and application deployment, providing a cohesive solution for deploying and managing applications like Netflix in a Kubernetes environment using Terraform and Argo CD.
