Creating a Private GKE Cluster and Bastion VM with Terraform

In this article, we will walk through creating a private Google Kubernetes Engine (GKE) cluster and a bastion VM using Terraform. This setup keeps the GKE control plane and nodes off the public internet while still allowing secure access via the bastion VM. We'll go through each Terraform file and explain what it does.

Step 1: Define Project and Region

First, we define our GCP project and region in the provider_vars.tf file.

provider "google" {
project = var.project_id
region = var.region
}

This sets up the Terraform provider to interact with our specified GCP project and region, establishing the basic configuration needed for the Terraform scripts to run.
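The provider block assumes that credentials are already available to Terraform (for example via Application Default Credentials). If you want reproducible runs, you can also pin the provider version; the following is a minimal sketch, not part of the original files, so adjust the version constraint to whatever you are targeting:

terraform {
  required_version = ">= 1.3"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 5.0"
    }
  }
}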

Step 2: Enable Required APIs

We need to enable the necessary Google Cloud APIs in the enable_apis.tf file.

locals {
  enabled_apis = [
    "serviceusage.googleapis.com",
    "cloudresourcemanager.googleapis.com",
    "compute.googleapis.com",
    "servicenetworking.googleapis.com",
    "container.googleapis.com",
    "gkehub.googleapis.com"
  ]
}

resource "google_project_service" "enabled_apis" {
  for_each           = toset(local.enabled_apis)
  service            = each.value
  disable_on_destroy = false
  project            = var.project_id
}

resource "null_resource" "prep" {
  depends_on = [google_project_service.enabled_apis]
}

This file ensures all the APIs required for our setup are enabled on the project; without them, Terraform cannot provision the networks, subnets, or the GKE cluster. The null_resource.prep resource acts as a single synchronization point: downstream resources depend on it so they are only created after every API has been enabled.
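If you want to confirm which services this configuration manages after an apply, one option (not part of the original files) is an output that iterates over the for_each instances:

output "managed_apis" {
  description = "APIs enabled by this configuration"
  value       = [for api in google_project_service.enabled_apis : api.service]
}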

Step 3: Create the VPC and Subnet

Next, we create a VPC network and a subnet in the network.tf file.

resource "google_compute_network" "vpc_network" {
name = var.vpc_name
auto_create_subnetworks = false
depends_on = [null_resource.prep]
}
resource "google_compute_subnetwork" "subnetwork" {
name = var.subnet_name
ip_cidr_range = var.gke_node_cidr
region = var.region
network = google_compute_network.vpc_network.id
secondary_ip_range {
range_name = "pods-subnet"
ip_cidr_range = var.pods_cidr
}
secondary_ip_range {
range_name = "services-subnet"
ip_cidr_range = var.svc_cidr
}
depends_on = [
google_compute_network.vpc_network,
]
}

This script creates a VPC with a single subnet, including secondary IP ranges for GKE pods and services. The subnet's primary range is used for GKE node IPs, while the two secondary ranges are reserved for pod IPs and service IPs, which is what GKE's VPC-native (alias IP) networking expects.
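For illustration, a small cluster might use ranges like the following; these values are only an example (not part of the original configuration), and the main requirement is that the three ranges do not overlap with each other or with anything else in your VPC:

gke_node_cidr = "10.10.0.0/24" # primary range: node IPs
pods_cidr     = "10.20.0.0/16" # secondary range: pod IPs
svc_cidr      = "10.30.0.0/20" # secondary range: ClusterIP services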

Step 4: Create the GKE Cluster

We define the GKE cluster configuration in the gke.tf file.

resource "google_container_cluster" "primary" {
name = var.cluster_name
location = var.region
initial_node_count = 1
deletion_protection = false
network = google_compute_network.vpc_network.name
subnetwork = google_compute_subnetwork.subnetwork.name
ip_allocation_policy {
cluster_secondary_range_name = "pods-subnet"
services_secondary_range_name = "services-subnet"
}
private_cluster_config {
enable_private_endpoint = true
enable_private_nodes = true
master_ipv4_cidr_block = var.master_ipv4_cidr_block
}
master_authorized_networks_config {
cidr_blocks {
cidr_block = "your-master-cidr-block"
}
}
default_max_pods_per_node = 110
addons_config {
horizontal_pod_autoscaling {
disabled = false
}
http_load_balancing {
disabled = false
}
network_policy_config {
disabled = false
}
}
logging_service = "logging.googleapis.com/kubernetes"
monitoring_service = "monitoring.googleapis.com/kubernetes"
depends_on = [
google_compute_subnetwork.subnetwork,
]
}

This script sets up a private GKE cluster: private_cluster_config gives the nodes internal IPs only and makes the control plane reachable solely through its private endpoint, so neither is exposed to the public internet. The ip_allocation_policy wires the cluster to the secondary ranges defined on the subnet. Because the endpoint is private, master_authorized_networks_config should list internal CIDR blocks, such as the subnet that hosts the bastion VM. Finally, addons_config enables horizontal pod autoscaling, HTTP load balancing, and network policy enforcement.
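The example keeps the cluster's default node pool with initial_node_count = 1. In practice, node pools are often managed as separate resources so they can be resized or replaced independently; the following is a minimal sketch under that assumption (the pool name and machine type are placeholders, not part of the original files), and if you adopt it you would typically also set remove_default_node_pool = true on the cluster:

resource "google_container_node_pool" "primary_nodes" {
  # Hypothetical separate node pool for the private cluster above.
  name       = "primary-node-pool"
  cluster    = google_container_cluster.primary.name
  location   = var.region
  node_count = 1

  node_config {
    machine_type = "e2-standard-4"
    oauth_scopes = ["https://www.googleapis.com/auth/cloud-platform"]
  }
}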

Step 5: Create Bastion VM and Firewall Rules

In the bastion.tf file, we create a bastion VM and set up firewall rules for SSH access.

resource "google_compute_instance" "bastion" {
name = var.bastion_name
machine_type = var.bastion_machine_type
zone = "${var.region}-a"
boot_disk {
initialize_params {
image = var.bastion_image
}
}
network_interface {
network = google_compute_network.vpc_network.self_link
subnetwork = google_compute_subnetwork.subnetwork.self_link
}
metadata_startup_script = var.bastion_startup_script
tags = var.bastion_tags
}

resource "google_compute_firewall" "allow_ssh_bastion" {
name = var.firewall_name
network = google_compute_network.vpc_network.self_link
allow {
protocol = "tcp"
ports = var.firewall_ports
}
source_ranges = var.firewall_source_ranges
target_tags = var.firewall_target_tags
}

resource "google_compute_firewall" "allow_http_https_rdp" {
name = "allow-http-https-rdp"
network = google_compute_network.vpc_network.self_link
allow {
protocol = "tcp"
ports = ["80", "443", "3389"]
}
source_ranges = ["0.0.0.0/0"]
target_tags = ["allow-http-https-rdp"]
}

This script creates the bastion VM with a startup script, an ephemeral external IP, and firewall rules for SSH access. The bastion acts as the entry point to the private GKE cluster: you SSH into it and manage the cluster from inside the VPC. The allow_ssh_bastion rule limits SSH to the CIDR blocks in var.firewall_source_ranges, which you should restrict to trusted addresses. The second rule opens HTTP, HTTPS, and RDP to any instance tagged allow-http-https-rdp; no instance in these files carries that tag, so it only takes effect if you add such instances later.
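If you would rather not expose port 22 to the internet at all, one alternative (my suggestion, not part of the original setup) is to allow SSH only from Google's Identity-Aware Proxy TCP-forwarding range and connect through an IAP tunnel. In vars.tfvars that would look roughly like:

# Assumes you connect via IAP tunneling instead of directly over the internet.
firewall_source_ranges = ["35.235.240.0/20"]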

Step 6: Assign Service Account Roles

In the bastion.tf file, we also assign necessary roles to the service account.

resource "google_service_account" "bastion_sa" {
account_id = "bastion-sa"
display_name = "Bastion Service Account"
}
resource "google_project_iam_member" "bastion_sa_roles" {
for_each = toset(var.service_account_roles)
project = var.project_id
member = "serviceAccount:${google_service_account.bastion_sa.email}"
role = each.value
}

This script grants the service account the IAM roles it needs to interact with other GCP resources. Note that these roles only benefit the bastion VM if the instance actually runs as this service account, as shown in the sketch below.
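The files as written never attach bastion_sa to the instance, so the VM would run under the project's default Compute Engine service account instead. To have the bastion use the dedicated account, you can add a service_account block to the google_compute_instance.bastion resource, roughly like this:

  # Run the bastion VM as the dedicated service account created above.
  service_account {
    email  = google_service_account.bastion_sa.email
    scopes = ["https://www.googleapis.com/auth/cloud-platform"]
  }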

Step 7: Reserve a Static External IP and Configure NAT

In the ip.tf and nat.tf files, we reserve a static external IP address and configure Cloud NAT.

resource "google_compute_address" "global_ip" {
name = var.global_ip_name
address_type = "EXTERNAL"
region = var.region
}
resource "null_resource" "wait_for_ip" {
depends_on = [google_compute_address.global_ip]
provisioner "local-exec" {
command = "sleep 30"
}
}
resource "google_compute_router" "nat_router" {
name = var.nat_router_name
network = google_compute_network.vpc_network.name
region = var.region
}
resource "google_compute_router_nat" "nat" {
name = var.nat_name
router = google_compute_router.nat_router.name
region = var.region
nat_ip_allocate_option = "MANUAL_ONLY"
nat_ips = [google_compute_address.global_ip.self_link]
source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
depends_on = [
google_compute_address.global_ip,
google_compute_router.nat_router,
null_resource.wait_for_ip,
]
}

These scripts reserve a static external IP and configure Cloud NAT on a Cloud Router. NAT lets the private GKE nodes reach the internet when necessary, for example to pull container images, without being assigned external IP addresses of their own.
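If you want visibility into how NAT is being used, google_compute_router_nat also accepts a log_config block; a minimal addition (not in the original files) would be:

  log_config {
    enable = true
    filter = "ERRORS_ONLY"
  }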

Step 8: Define Output Variables

In the output.tf file, we define output variables for easy access to important information.

output "gke_cluster_name" {
description = "The name of the GKE cluster"
value = google_container_cluster.primary.name
}
output "gke_cluster_endpoint" {
description = "The endpoint of the GKE cluster"
value = google_container_cluster.primary.endpoint
}
output "gke_cluster_master_version" {
description = "The master Kubernetes version of the GKE cluster"
value = google_container
_cluster.primary.master_version
}
output "bastion_ip" {
description = "The public IP of the bastion host"
value = google_compute_instance.bastion.network_interface[0].access_config[0].nat_ip
}

This script outputs key information about the GKE cluster and bastion VM. The bastion_ip output reads the external address from the access_config block on the bastion's network interface, making it easy to find the host you need to SSH into, while the cluster outputs help when configuring kubectl or other tooling.
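If you prefer not to print the control-plane endpoint on every apply, Terraform can mark an output as sensitive; this variant of the endpoint output is optional and not part of the original files:

output "gke_cluster_endpoint" {
  description = "The endpoint of the GKE cluster"
  value       = google_container_cluster.primary.endpoint
  sensitive   = true
}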

Step 9: Define Variables

In the variables.tf file, we define all necessary variables for our setup.

variable "project_id" {
description = "Google Cloud Platform project ID"
type = string
}
variable "region" {
description = "Google Cloud region"
type = string
}
variable "gke_options" {
description = "GKE Options"
type = object({
cluster_name = string
node_locations = list(string)
enable_private_nodes = bool
enable_private_endpoint = bool
master_ipv4_cidr_block = string
})
}
variable "network_options" {
description = "Network Options"
type = object({
subnet_name = string
vpc_name = string
subnet_cidr = string
pods_cidr = string
services_cidr = string
nat_name = string
router_name = string
global_ip = string
})
}
variable "bastion_options" {
description = "Bastion Options"
type = object({
bastion_name = string
bastion_machine_type = string
bastion_image = string
bastion_startup_script = string
bastion_tags = list(string)
})
}
variable "firewall_options" {
description = "Firewall Options"
type = object({
firewall_name = string
firewall_ports = list(string)
firewall_source_ranges = list(string)
firewall_target_tags = list(string)
})
}
variable "service_account_roles" {
description = "List of roles to be assigned to the bastion service account"
type = list(string)
}

This file declares every variable referenced in the files above and populated in vars.tfvars below, giving us a single place to see the configuration surface of the setup.
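Terraform also supports per-variable validation blocks. As an optional refinement (not part of the original files), you could reject malformed CIDR values early, for example:

variable "gke_node_cidr" {
  description = "Primary CIDR range for GKE nodes"
  type        = string

  validation {
    # cidrhost() fails on an invalid prefix, so can() turns that into false.
    condition     = can(cidrhost(var.gke_node_cidr, 0))
    error_message = "The gke_node_cidr value must be a valid IPv4 CIDR block."
  }
}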

Step 10: Populate Variables

In the vars.tfvars file, populate the variables with appropriate values.

project_id             = "your-project-id"
region                 = "your-region"
cluster_name           = "your-cluster-name"
vpc_name               = "your-vpc-name"
subnet_name            = "your-subnet-name"
gke_node_cidr          = "your-node-cidr"
pods_cidr              = "your-pods-cidr"
svc_cidr               = "your-svc-cidr"
gke_master_cidr        = "your-gke-master-cidr-block"
nat_router_name        = "your-nat-router-name"
nat_name               = "your-nat-name"
global_ip_name         = "your-global-ip-name"
master_ipv4_cidr_block = "your-master-cidr-block"

bastion_name           = "your-bastion-vm-name"
bastion_machine_type   = "e2-medium"
bastion_image          = "ubuntu-2004-lts"
bastion_startup_script = <<-EOT
  #!/bin/bash
  sudo apt-get update
  sudo apt-get install -yq git
EOT
bastion_tags           = ["bastion"]

firewall_name          = "allow-ssh-bastion"
firewall_ports         = ["22"]
firewall_source_ranges = ["0.0.0.0/0"] # tighten this to your own IP range in production
firewall_target_tags   = ["bastion"]

service_account_roles = [
  "roles/logging.logWriter",
  "roles/monitoring.metricWriter",
  "roles/monitoring.viewer",
  "roles/compute.osLogin",
  "roles/compute.admin",
  "roles/iam.serviceAccountUser",
  "roles/container.admin",
  "roles/container.clusterAdmin",
  "roles/compute.osAdminLogin"
]

This file supplies the actual values for the variables defined above. Replace each placeholder string with values for your environment before running Terraform.

Conclusion

By following these steps, you can create a private GKE cluster and a bastion VM on Google Cloud Platform with Terraform. The cluster's control plane and nodes stay off the public internet, while the bastion VM provides a secure entry point for managing it, giving you a robust starting point that you can extend to fit your own requirements.
