Shared VPC | GKE: Provisioning GKE Cluster in Shared VPC using Terraform

Vijayender R. Soma
Google Cloud - Community
Sep 6, 2023

Organizations often need to separate network administration from the projects that host GKE clusters (or other resources). This is where Shared VPC networks are used.

With the Shared VPC approach, network teams retain control over network resources such as the subnets created for clusters, while cluster teams administer the GKE clusters using approved subnets in the Shared VPC, allowing “least privilege” to be implemented.

Creating a GKE cluster with Shared VPC requires two GCP projects, called the host and service projects.

  1. In the host project, create the network, subnets, secondary IP ranges, firewall rules, etc.
  2. Share the selected subnet (with the secondary IP ranges created for the GKE cluster) from the host project with the service project.
  3. Create the GKE cluster in the service project, using the subnet shared from the host project.

The following diagram shows a GKE cluster using Shared VPC, representing the steps described above.

GKE Cluster using Shared VPC

In this article, we will go through the implementation using Terraform.

To implement the above steps, we will use three sample GCP projects, defined as follows:

Network project: cluster-gke-network

This is the Shared VPC host project, where the VPC, subnets, etc. are created

Cluster project: cluster-gke-myenv

This is the Shared VPC service project, where the GKE cluster is created

Terraform project: platform-build-tf

This is the project where Terraform code is run

The implementation consists of configuring Terraform to create resources as a Service Account and writing the Terraform scripts to execute. Using these scripts, we will provision a Shared VPC, subnet, GKE cluster, and GKE node pool along with other dependent resources, and grant the required IAM permissions.

Configuring Terraform to create resources as a Service Account

It’s a recommended practice to run Terraform code as Service Account(s) with the minimum required privileges, as opposed to using a user account with elevated privileges. This method uses Service Account impersonation, which doesn’t require service account keys.

This requires creating a service account in the Terraform project, granting that service account permissions on the network and cluster projects, and adding Terraform configuration for service account impersonation.

Follow the steps below to configure Terraform to use a Service Account.

1) Create a service account in the Terraform project

gcloud iam service-accounts create sa-platform-tf --description="Terraform Service account" --display-name="Terraform Service Account"  --project=platform-build-tf

The above command creates a service account, sa-platform-tf@platform-build-tf.iam.gserviceaccount.com, in the Terraform project.
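You can optionally confirm that the account exists before moving on:

gcloud iam service-accounts describe sa-platform-tf@platform-build-tf.iam.gserviceaccount.com --project=platform-build-tf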

2) Grant permissions to the Terraform Service Account (created in the previous step)

Grant the Terraform service account the permissions required to create resources in the network and cluster projects.

2.a) Grant Shared VPC Admin role at the Folder/Org level

Assign the Shared VPC Admin role at the folder or organization level to enable Shared VPC. (Refer to Nominate Shared VPC Admins)

Note: This step assumes the network project resides in a folder if you assign the role at the folder level.

The following sample shows assigning the role at the folder level.

gcloud resource-manager folders add-iam-policy-binding FOLDER_ID \
--member='serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com' \
--role="roles/compute.xpnAdmin"
Cloud IAM (demo folder): Grant Shared VPC Admin role at Folder “demo” level

2.b) Grant roles in Network (Shared VPC host) project

#To enable apis
gcloud projects add-iam-policy-binding cluster-gke-network --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/serviceusage.serviceUsageAdmin"

#To create and manage network resources
gcloud projects add-iam-policy-binding cluster-gke-network --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/compute.networkAdmin"

#To create IAM policy bindings
gcloud projects add-iam-policy-binding cluster-gke-network --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/iam.securityAdmin"
Cloud IAM (Network Project): Grant roles in Network (Shared VPC host) project

2.c) Grant roles in Cluster (Shared VPC service) project

# To enable apis
gcloud projects add-iam-policy-binding cluster-gke-myenv --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/serviceusage.serviceUsageAdmin"

# To create SA for GKE
gcloud projects add-iam-policy-binding cluster-gke-myenv --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/iam.serviceAccountAdmin"

# To create GKE
gcloud projects add-iam-policy-binding cluster-gke-myenv --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/container.clusterAdmin"

# To set a service account on nodes
gcloud projects add-iam-policy-binding cluster-gke-myenv --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/iam.serviceAccountUser"

# To read instance group managers (requires the 'compute.instanceGroupManagers.get' permission)
gcloud projects add-iam-policy-binding cluster-gke-myenv --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/compute.viewer"
Cloud IAM (Cluster Project): Grant roles in Cluster (Shared VPC service) project

2.d) Grant role(s) in Terraform project

#To enable apis
gcloud projects add-iam-policy-binding platform-build-tf --member="serviceAccount:sa-platform-tf@platform-build-tf.iam.gserviceaccount.com" --role="roles/serviceusage.serviceUsageAdmin"

3) Grant permissions to a user account (or another SA) on the Terraform Service Account to allow impersonation

Grant the Service Account Token Creator IAM role on the Terraform Service Account’s IAM policy to the account you will be authenticated as when running Terraform commands. This role enables that account to impersonate the service account to access APIs and resources. (You can also use one SA to impersonate another SA.)

gcloud iam service-accounts add-iam-policy-binding sa-platform-tf@platform-build-tf.iam.gserviceaccount.com --member="user:devops-user@example.com" --role="roles/iam.serviceAccountTokenCreator"
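For impersonation to work, the impersonating account itself must be authenticated locally. A minimal sketch, assuming you use Application Default Credentials as the user granted the role above, and then verify that a token can be minted for the Terraform SA:

# Authenticate as the user granted the Token Creator role
gcloud auth application-default login

# Optional check: mint a short-lived access token as the Terraform service account
gcloud auth print-access-token --impersonate-service-account=sa-platform-tf@platform-build-tf.iam.gserviceaccount.com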

Note: After the above steps are performed, the remaining part of configuring Terraform to run as the SA is the Terraform configuration for Service Account impersonation, which can be found in the main.tf file in the next section.

Create Terraform Scripts

Now, let's create the Terraform scripts within a folder in a developer environment, based on the samples given below.

variables.tf

variable "project" {
description = "Terraform project ID (Project where terraform scripts are running)"
default = "platform-build-tf"
}

variable "region" {
description = "Region where cluster is created"
default = "us-central1"
}

variable "net_project" {
description = "Network Project ID (GCP Project acting as Shared VPC host)"
default = "cluster-gke-network"
}

variable "cluster_project" {
description = "Cluster Project ID (GCP Project acting as Shared VPC service)"
default = "cluster-gke-myenv"
}

variable "cluster_project_number" {
description = "Cluster Project Number (GCP Project acting as Shared VPC service)"
default = "654918026407"
}

variable "terraform_service_account" {
description = "Terraform Service Account Name. This SA is created within Terraform project (Project where terraform scripts are running)"
default = "sa-platform-tf@platform-build-tf.iam.gserviceaccount.com"
}

variables.tf: This file contains the variables used in the Terraform scripts, such as the GCP project IDs and the region for the cluster.

(The description field specifies the purpose of each variable)
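The defaults above are sample values; if you adapt these scripts to your own projects, they can also be overridden at plan/apply time without editing the file, for example (with hypothetical values):

terraform plan -var="region=us-east1" -var="cluster_project=my-other-gke-project"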

outputs.tf

output "name" {
description = "The name of the cluster master. This output is used for interpolation with node pools, other modules."
value = google_container_cluster.primary.name
}

output "master_version" {
description = "The Kubernetes master version."
value = google_container_cluster.primary.master_version
}

output "endpoint" {
description = "The IP address of the cluster master."
sensitive = true
value = google_container_cluster.primary.endpoint
}

# The following outputs allow authentication and connectivity to the GKE Cluster.
output "client_certificate" {
description = "Public certificate used by clients to authenticate to the cluster endpoint."
sensitive = true
value = base64decode(google_container_cluster.primary.master_auth[0].client_certificate)
}

output "client_key" {
description = "Private key used by clients to authenticate to the cluster endpoint."
sensitive = true
value = base64decode(google_container_cluster.primary.master_auth[0].client_key)
}

output "cluster_ca_certificate" {
description = "The public certificate that is the root of trust for the cluster."
sensitive = true
value = base64decode(google_container_cluster.primary.master_auth[0].cluster_ca_certificate)
}

outputs.tf: This file contains the Terraform outputs, consisting of values related to the GKE cluster.

(The description field contains information about each output value)
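Once the cluster is provisioned, these values can be read back from the state; for example (sensitive outputs are only printed when queried by name):

terraform output name
terraform output -raw endpoint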

versions.tf

# Terraform and Provider versions
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = "~> 4.78.0"
    }
  }
  required_version = "~> 1.5.4"
}

versions.tf: This file declares the Terraform and provider versions.

main.tf


# Declare vars
locals {
  network_name    = "shared-net"
  subnet_name     = "gke-myenv-subnet"
  pods_range_name = "gke-myenv-subnet-pods"
  svc_range_name  = "gke-myenv-subnet-services"
}

# Configure terraform for Service Account impersonation
# Create a provider that will be used to retrieve an access token for the service account.
provider "google" {
  alias = "impersonation"
  scopes = [
    "https://www.googleapis.com/auth/cloud-platform",
    "https://www.googleapis.com/auth/userinfo.email",
  ]
}

# Declare a data block to retrieve the access token that will be used to authenticate as the service account
data "google_service_account_access_token" "default" {
  provider               = google.impersonation
  target_service_account = var.terraform_service_account
  scopes                 = ["userinfo-email", "cloud-platform"]
  lifetime               = "1200s"
}

# Declare a default "google" provider that will use the access token of your service account.
provider "google" {
  project         = var.project
  access_token    = data.google_service_account_access_token.default.access_token
  request_timeout = "60s"
}

# Enable APIs for Terraform project (Project where terraform scripts are running)
# Declare APIs
variable "tf_project_gcp_services" {
  description = "The list of apis necessary for the project"
  type        = list(string)
  default = [
    "cloudresourcemanager.googleapis.com",
    "iam.googleapis.com",
    "iamcredentials.googleapis.com",
    # "container.googleapis.com",
  ]
}

# Enable APIs
resource "google_project_service" "tf" {
  for_each                   = toset(var.tf_project_gcp_services)
  project                    = var.project
  service                    = each.key
  disable_dependent_services = true
  disable_on_destroy         = false
}

main.tf (Configure Terraform to run as SA): This file contains the script that configures Terraform for Service Account impersonation, enables Google APIs on the Terraform project, etc.

Note that terraform_service_account is provided as the value for target_service_account in data.google_service_account_access_token.default.

With the above configuration in place, the Terraform code runs as the Service Account when acting on resources.

network.tf

# Enable APIs for Network project (GCP Project acting as Shared VPC host)
# Declare APIs
variable "net_project_gcp_services" {
  description = "The list of apis necessary for the project"
  type        = list(string)
  default = [
    "cloudresourcemanager.googleapis.com",
    "container.googleapis.com"
  ]
}

# Enable APIs
resource "google_project_service" "network" {
  for_each                   = toset(var.net_project_gcp_services)
  project                    = var.net_project
  service                    = each.key
  disable_dependent_services = true
  disable_on_destroy         = false
}

# Shared VPC
# Enable Shared VPC on host project
resource "google_compute_shared_vpc_host_project" "network" {
  project = var.net_project
  depends_on = [
    google_project_service.network
  ]
}

# Attach Cluster project as Shared VPC Service project
resource "google_compute_shared_vpc_service_project" "cluster" {
  host_project    = google_compute_shared_vpc_host_project.network.project
  service_project = var.cluster_project
}

# Create VPC (Shared)
resource "google_compute_network" "shared_vpc" {
  name                    = local.network_name
  auto_create_subnetworks = false
  project                 = var.net_project
  depends_on = [
    google_compute_shared_vpc_service_project.cluster
  ]
}

# Create Subnet
resource "google_compute_subnetwork" "cluster_subnet" {
  name                     = local.subnet_name
  ip_cidr_range            = "10.0.4.0/22"
  region                   = var.region
  network                  = google_compute_network.shared_vpc.id
  project                  = var.net_project
  private_ip_google_access = "true"

  secondary_ip_range {
    range_name    = local.pods_range_name
    ip_cidr_range = "10.4.0.0/14"
  }
  secondary_ip_range {
    range_name    = local.svc_range_name
    ip_cidr_range = "10.0.32.0/20"
  }
}

network.tf (Create Shared VPC): This file contains the script that provisions the VPC and subnet, and configures Shared VPC between the Network (host) and Cluster (service) projects.

Note that the Network project is specified as the value for host_project and the Cluster project as the value for service_project within the google_compute_shared_vpc_service_project.cluster resource.
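As an optional verification after apply, the host/service relationship can be checked from the command line:

gcloud compute shared-vpc get-host-project cluster-gke-myenv
gcloud compute shared-vpc associated-projects list cluster-gke-network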

iam.tf

locals {
  # Service Accounts used for granting permissions
  # GKE Service Agent: service-SERVICE_PROJECT_NUM@container-engine-robot.iam.gserviceaccount.com
  # Google APIs Service Agent: SERVICE_PROJECT_NUM@cloudservices.gserviceaccount.com
  cluster_project_gke_svc_agent_sa   = format("serviceAccount:service-%s@container-engine-robot.iam.gserviceaccount.com", var.cluster_project_number)
  cluster_project_gapis_svc_agent_sa = format("serviceAccount:%s@cloudservices.gserviceaccount.com", var.cluster_project_number)
}

# Assign Permissions on Network Project to the following SAs from Cluster Project

# Grant the cluster (service) project's GKE Service Agent SA the Compute Security Admin role
# within the Network (host) project. This is required to create and manage the firewall resources
resource "google_project_iam_member" "net_project_iam_sec_admin" {
  project = var.net_project
  role    = "roles/compute.securityAdmin"
  member  = local.cluster_project_gke_svc_agent_sa
  depends_on = [
    google_project_service.cluster,
  ]
}

# Grant the cluster (service) project's GKE Service Agent SA the Host Service Agent role on the Network (host) project
# This binding allows the service project's GKE Service Agent SA to perform network management operations in the host project.
resource "google_project_iam_member" "net_project_iam_svc_agent_user" {
  project = var.net_project
  role    = "roles/container.hostServiceAgentUser"
  member  = local.cluster_project_gke_svc_agent_sa
  depends_on = [
    google_project_service.cluster,
  ]
}

# Grant the cluster (service) project's GKE Service Agent and Google APIs Service Agent SAs the Compute Network User role
# on shared VPC subnet within the Network (host) project.
# This allows using subnet from Shared VPC
data "google_iam_policy" "subnet_policy_data" {
  binding {
    role = "roles/compute.networkUser"
    members = [
      local.cluster_project_gapis_svc_agent_sa,
      local.cluster_project_gke_svc_agent_sa
    ]
  }
}

resource "google_compute_subnetwork_iam_policy" "subnet_policy" {
  project     = google_compute_subnetwork.cluster_subnet.project
  region      = google_compute_subnetwork.cluster_subnet.region
  subnetwork  = google_compute_subnetwork.cluster_subnet.name
  policy_data = data.google_iam_policy.subnet_policy_data.policy_data
  depends_on = [
    google_project_service.cluster,
  ]
}

iam.tf (Grant permissions): This file contains the script that grants the appropriate IAM roles on the Network (host) project to the GKE Service Agent and Google APIs Service Agent service accounts belonging to the Cluster (service) project.

These permissions are required for creating and managing firewall resources, performing network management operations in the Network (host) project, and using the Shared VPC subnet.
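As an optional check after apply, the subnet-level bindings can be inspected directly:

gcloud compute networks subnets get-iam-policy gke-myenv-subnet --region=us-central1 --project=cluster-gke-network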

After a successful execution, permissions will look as shown in the images below.

Cloud IAM (Network Project): Permissions granted to GKE Service Agent
Shared VPC Subnet Permissions (Network Project): Permission granted to GKE and Google APIs Service Agents

Note: Google Cloud creates the Google APIs Service Agent with edit permissions and the GKE Service Agent SA with the Kubernetes Engine Service Agent role on the Cluster (service) project when the Kubernetes Engine API is enabled on that project. If these Service Accounts or their permissions are altered, cluster creation and management will fail. Learn more about missing edit permissions on account.

cluster.tf

locals {
  cluster_type = "gke-standard-private"
}

# Enable APIs for Cluster project (GCP Project acting as Shared VPC service)
# Declare APIs
variable "cluster_project_gcp_services" {
  description = "The list of apis necessary for the project"
  type        = list(string)
  default = [
    "container.googleapis.com",
  ]
}

# Enable APIs
resource "google_project_service" "cluster" {
  for_each                   = toset(var.cluster_project_gcp_services)
  project                    = var.cluster_project
  service                    = each.key
  disable_dependent_services = true
  disable_on_destroy         = false
}

# Create Service Account to use with NodePools
resource "google_service_account" "default" {
  account_id   = "gke-cluster-sa-id"
  display_name = "Service Account for GKE"
  project      = var.cluster_project
}

# Create GKE cluster in Shared VPC
resource "google_container_cluster" "primary" {
  project    = var.cluster_project
  name       = "${local.cluster_type}-cluster"
  location   = var.region
  network    = google_compute_network.shared_vpc.self_link
  subnetwork = google_compute_subnetwork.cluster_subnet.self_link

  release_channel {
    channel = "REGULAR"
  }

  ip_allocation_policy {
    cluster_secondary_range_name  = local.pods_range_name
    services_secondary_range_name = local.svc_range_name
  }

  private_cluster_config {
    enable_private_endpoint = true
    enable_private_nodes    = true
    master_ipv4_cidr_block  = "172.36.0.0/28"
  }

  master_authorized_networks_config {
  }

  initial_node_count = 1

  node_config {
    machine_type    = "e2-medium"
    service_account = google_service_account.default.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }

  depends_on = [
    google_project_service.cluster,
    google_compute_subnetwork_iam_policy.subnet_policy,
    google_project_iam_member.net_project_iam_svc_agent_user,
    google_project_iam_member.net_project_iam_sec_admin
  ]
}

# Create Node Pool for Standard Cluster
resource "google_container_node_pool" "apps" {
  project    = var.cluster_project
  name       = "apps-node-pool"
  location   = var.region
  cluster    = google_container_cluster.primary.name
  node_count = 1

  node_config {
    machine_type    = "e2-medium"
    service_account = google_service_account.default.email
    oauth_scopes = [
      "https://www.googleapis.com/auth/cloud-platform"
    ]
  }
}

cluster.tf (Create GKE cluster): This file contains the script that provisions the GKE cluster and GKE node pool in the Shared VPC.

Note that the cluster configuration (google_container_cluster.primary) refers to the Shared VPC and subnet from the Network project for the network and subnetwork fields, respectively.

Also, a Service Account (google_service_account.default) is created and set on the cluster nodes (node_config.service_account).

With all of the files above, configured as described, in a single directory, we are done with scripting and ready for execution.
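The working directory then contains the following files:

variables.tf
outputs.tf
versions.tf
main.tf
network.tf
iam.tf
cluster.tf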

Running Terraform Scripts

After the Terraform scripts are created in the previous steps, use the terraform init, terraform plan, and terraform apply commands to initialize Terraform, generate a plan, and provision the resources. Use terraform destroy to delete the resources created by Terraform.
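A typical run from the directory containing the scripts looks like the following:

terraform init
terraform plan
terraform apply

# Run this only when you want to tear everything down
terraform destroy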

Note: When running these commands, make sure you are authenticated as the account granted impersonation rights in Step 3 of the “Configuring Terraform to create resources as a Service Account” section.

After a successful execution of terraform apply, we can find the resources provisioned as shown in the images below.

Shared VPC (Network Project): Displaying host and attached (service) projects
Shared VPC (Network Project): Subnet details
GKE Cluster (Cluster Project): Basic details with region, size etc
GKE Cluster (Cluster Project): Network configuration of cluster with reference to Shared VPC
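To verify connectivity once the cluster is up, you can fetch kubectl credentials as shown below. This is a sketch that assumes you run it from a host that can reach the private control plane endpoint (for example, a VM in the Shared VPC), since the cluster is created with enable_private_endpoint = true.

gcloud container clusters get-credentials gke-standard-private-cluster --region=us-central1 --project=cluster-gke-myenv
kubectl get nodes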

Conclusion & References

This guide described the steps involved in provisioning a Google Kubernetes Engine (GKE) cluster in a Shared VPC using Terraform. For more Terraform GKE samples, visit the terraform-google-kubernetes-engine Git repo.

You can read more about Shared VPC and provisioning GKE in Shared VPC by visiting the following links.
