Creating reusable infrastructure with Terraform on GCP

Zeynep Sanliturk
Aug 7

This blog post covers building the infrastructure of an entire project on Google Cloud Platform with Terraform, including the infrastructure needed to serve the project from Kubernetes.

By the whole project, I mean all of its resources, so in this blog post we will need and use the following:

  • All network needs (google_compute_network, google_compute_address, google_compute_router, google_compute_router_nat, google_compute_network_peering, google_compute_firewall, google_compute_subnetwork),
  • All cluster resource needs (google_container_cluster, google_container_node_pool),
  • All permission needs (google_service_account, google_project_iam_member)

This will shed light on how to manage the new needs that arise during the life cycle of the project with Terraform. I will also explain how to manage different project environments with the same Terraform files, using Terraform's input variables and workspace concepts.

In a Terraform project, even with the simplest structure, it is recommended to have a main.tf and a variables.tf. We'll start with main.tf first.

main.tf

Terraform uses the concept of state to manage your infrastructure. Starting with the first terraform apply command, it stores the latest state of your infrastructure in the backend specified in main.tf.

main.tf should be the primary entry point. For a simple Terraform project, this may be where all the resources are created.

This process is described in Terraform as follows:

terraform {
  backend "gcs" {
    project = "project-id"
    bucket  = "project-tfstate"
    prefix  = "terraform/state"
  }
}

Here we use Google Cloud Storage to store the state, which is why we need a bucket called 'project-tfstate' in GC Storage. We have to create this bucket manually, since the backend bucket must already exist before Terraform can use it.
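If the bucket does not exist yet, one way to create it is with gsutil; this is a quick sketch, and the project ID, location, and bucket name below simply reuse the example values from this post's snippets:

# Create the bucket that will hold the Terraform state
gsutil mb -p project-id -l europe-west1 gs://project-tfstate

# Optional: keep older state versions around
gsutil versioning set on gs://project-tfstate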

In main.tf, we also need to define the following snippet to identify Google Cloud as our provider.

provider "google" {  project = "${var.gcp_project}"  region  = "${var.region}"  zone    = "${var.zone}"}provider "google-beta" {  project = "${var.gcp_project}"  region  = "${var.region}"  zone    = "${var.zone}"}

Here we reference variables with "${var.variable_name}" expressions so the configuration stays reusable and the same Terraform project can be used in different environments.

Note: Among all the .tf files, the terraform {…} block is the only place where values cannot be set externally through variables, and none of the Terraform built-in functions can be used in it, because the terraform {…} block in main.tf must be processed before anything else runs.

We need to create a service account to access services such as the network elements, Kubernetes Engine, etc., with Terraform on GCP.

resource "google_service_account" "sa" {  account_id   = "${var.cluster_name}-gke-sa"  display_name = "${var.cluster_name}-gke-sa"}

We must define the role or roles required for this service account.

resource "google_project_iam_member" "k8s-member" {  count   = "${length(var.iam_roles)}"  project = "${var.gcp_project}"  role    = "${element(values(var.iam_roles), count.index)}"  member  = "serviceAccount:${google_service_account.sa.email}"}

The element(list, index) function is used to access the elements of a list; the index determines which element we reach. Above, combined with count, we used it for loop purposes.
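To make the loop behaviour concrete, suppose iam_roles were set to the map below (the same illustrative values used later in terraform.tfvars). count would then be 2, and since values() returns the map values sorted by key, Terraform would create one google_project_iam_member per role:

iam_roles = {
  role1 = "roles/storage.objectViewer"
  role2 = "roles/logging.logWriter"
}

# With count = 2, Terraform creates:
#   google_project_iam_member.k8s-member[0]  ->  role = "roles/storage.objectViewer"
#   google_project_iam_member.k8s-member[1]  ->  role = "roles/logging.logWriter"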

network.tf

To create the project from scratch, we must first cover its network needs. To do this, we will start by creating a new VPC network.

resource "google_compute_network" "project-network" {
  name                    = "${var.vpc_name}-network"
  auto_create_subnetworks = "false"
  routing_mode            = "REGIONAL"
}

The production and staging environments will share the same VPC network, but two different subnets are used to isolate them from each other. This is not strictly a necessity, but a best practice: if we opened separate networks for production and staging, we would have to create separate network elements whenever we needed to define a firewall rule or tunnel towards other GC projects.

resource "google_compute_subnetwork" "project-subnet" {  name                     = "${var.cluster_name}"  ip_cidr_range            = "${var.subnet_cidr}"  private_ip_google_access = true  network                  = "${google_compute_network.project-network}"}

Here the cluster_name variable will differ between the production and staging environments.

Next, the firewall rules should be defined as follows.

resource "google_compute_firewall" "project-firewall-allow-ssh" {  name    = "${var.vpc_name}-allow-something"  network = "${google_compute_network.project-network.self_link}"  allow {    protocol = "some-protocol #tcp, udp, icmp...    ports    = ["some-port"] #22, 80...  }source_ranges = ["IP/range"] #according to cidr notation
# source_ranges = ["${var.subnet_cidr}", "${var.pod_range}", "${var.service_range}"
}

In some cases, we want to access a PostgreSQL database that lives in the same GCP project but on a different network. In such a case, we can define a firewall rule as follows.

resource "google_compute_firewall" "allow-db" {  name    = "allow-from-${var.cluster_name}-cluster-to-other-project-db"  network = "other-network"  allow {    protocol = "icmp"  }  allow {    protocol = "tcp"    ports    = ["5432"]  }  source_ranges = ["${var.subnet_cidr}", "${var.pod_range}"]  target_tags = ["network-tag-name"]}

I will assume that this database is running on a GC Compute Engine VM instance. In this case, we must set the network-tag-name value as a network tag on that VM instance. Thus, once this rule is created, the database will accept requests from the specified ranges on port 5432.
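If that VM instance were also managed with Terraform, attaching the tag would look roughly like the sketch below; the instance name, machine type, and image are placeholders rather than part of this project's code:

resource "google_compute_instance" "db" {
  name         = "other-project-db"    # placeholder
  machine_type = "n1-standard-1"       # placeholder
  zone         = "${var.zone}"

  # The allow-db firewall rule above matches this network tag.
  tags = ["network-tag-name"]

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-9"  # placeholder image
    }
  }

  network_interface {
    network = "other-network"
  }
}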

All the network elements we have added so far use internal IP addresses. The safest way to reach the outside world from the internal network is to use a NAT gateway. Here we will use GC's Cloud NAT and Cloud Router services, and allocate the public IP addresses first.

resource "google_compute_address" "project-nat-ips" {  count   = "${length(var.cloud_nat_ips)}"  name    = "${element(values(var.cloud_nat_ips), count.index)}"  project = "${var.gcp_project}"  region  = "${var.region}"}

We create a Cloud Router to route the internal-IP-only VPC network called project-network to the Cloud NAT gateway.

resource "google_compute_router" "project-router" {
  name    = "${var.vpc_name}-nat-router"
  network = "${google_compute_network.project-network.self_link}"
}

We connect the Cloud Router and the public IP addresses (google_compute_address) created in the previous two blocks to the Cloud NAT gateway. In other words, we give the relevant internal network a way out to the outside world.

resource "google_compute_router_nat" "project-nat" {
  name                               = "${var.vpc_name}-nat-gw"
  router                             = "${google_compute_router.project-router.name}"
  nat_ip_allocate_option             = "MANUAL_ONLY"
  nat_ips                            = ["${google_compute_address.project-nat-ips.*.self_link}"]
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
  depends_on                         = ["google_compute_address.project-nat-ips"]
}

Finally, the last network element is VPC peering. Using VPC Peering, VPC networks can communicate privately with the VPC networks of your other GC projects; we can think of it as a private tunnel in between. We will create a VPC Peering element so that the relevant VPC network can communicate with the network or networks in another project.

resource "google_compute_network_peering" "vpc_peerings" {  count = "${length(var.vpc_peerings)}"  name         = "${element(keys(var.vpc_peerings), count.index)}"  network      = "${google_compute_network.project-network.self_link}"  peer_network = "${element(values(var.vpc_peerings), count.index)}"}

gke.tf

With main.tf and network.tf, we have done the work needed to create the base resources. We will now look at the GKE blocks, where we create the cluster resources that will host the project.

In gke.tf we will first add a cluster and then a node pool that will autoscale inside the cluster.

resource "google_container_cluster" "primary" {  provider = "google-beta"  name     = "${var.cluster_name}"  zone     = "${var.zone}"  min_master_version       = "${var.gke_version}"  remove_default_node_pool = true  initial_node_count       = 1  master_authorized_networks_config {    cidr_blocks = [    {      cidr_block   = "IP/range"    #according to cidr notation      display_name = "all"    },    ]  }  ip_allocation_policy {    cluster_ipv4_cidr_block  = "${var.pod_range}"    services_ipv4_cidr_block = "${var.service_range}"  }  network      = "${var.network_name}"  subnetwork   = "${var.cluster_name}"  node_version = "${var.gke_version}"  private_cluster_config {    enable_private_nodes   = true    master_ipv4_cidr_block = "${var.master_range}"  }  master_auth {    username = ""    password = ""     client_certificate_config {      issue_client_certificate = false    }  }}

The value of ${var.cluster_name} will change between the two environments of the project, i.e. two different clusters will be created for production and staging. These two clusters will be built on the same network but on different subnets. With ip_allocation_policy, we must specify the internal IP ranges for the cluster's Pods and services.

One important point is the remove_default_node_pool argument. We set it to true because we do not want to keep the default node pool that comes with the cluster. The interesting part is that, even with this setting, Terraform first creates the default node pool and then deletes it. Since that default pool is not at all what we expect, we will add a separately managed node pool that can be autoscaled. The create-then-delete behaviour comes from a GCP constraint: a cluster is always created with a default node pool and its configuration cannot be modified in place, so the pool is created and then removed.

Creating a cluster, either through Terraform or manually with the gcloud client, will always create a default node pool, which you are not able to manage yourself.

If the master_auth block is not provided, GKE will generate a password for you with the username ‘admin’ for HTTP basic authentication when accessing the Kubernetes master endpoint.

The master_authorized_networks_config block is required for accessing the master from IP addresses other than those of the nodes and Pods. It lists the address ranges you have authorized, and it gives the cluster a publicly accessible master endpoint that only those ranges can reach.

resource "google_container_node_pool" "default" {  provider = "google-beta"  name     = "${var.default_pool_name}"  zone     = "${var.zone}"  cluster    = "${google_container_cluster.primary.name}"  node_count = "${var.default_pool_node_number}"  version    = "${var.gke_version}"  autoscaling {    min_node_count = "${var.default_pool_min_node}"    max_node_count = "${var.default_pool_max_node}"  }  node_config {    machine_type     = "${var.machine_type}"    oauth_scopes = [      "https://www.googleapis.com/auth/logging.write", 
"https://www.googleapis.com/auth/monitoring",
] service_account = "${var.service_account}" metadata = { disable-legacy-endpoints = "true" } }}

With google_container_node_pool we add a node pool to the cluster we created and set the initial number of nodes with node_count. For the node pool to autoscale, we must also set the min_node_count and max_node_count values.

In the node_config block, we specify the machine type and the oauth_scopes required for logging and monitoring the machines on GC Kubernetes Engine. With this, everything required for our application to run will be ready.


Evaluation

So far we have created .tf files that cover all of our network, cluster, and service account needs. It's time to look at how to manage input variables. We will use the same variables.tf file for our different environments and will not assign any values to the variables there; instead, we will assign those values in the staging.tfvars and production.tfvars files. Now we will look at variables.tf, and then try to understand Terraform's workspace concept for the staging and production environments.

Input Variables

We use "${var.x}"-style references inside the primary entry points such as main.tf, network.tf, and gke.tf, but we declare these variables in variables.tf.

variable "gcp_project" {}variable "region" {}variable "zone" {}variable "vpc_name" {}...variable "iam_roles" {type = "map"}....

All the input variables you use must be declared in variable blocks. Here we can optionally specify the type and the default value of a variable. You can also add a description to document its purpose, but none of these are required.
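For example, a declaration with a type, a default value, and a description could look like this (the values are only illustrative):

variable "region" {
  description = "GCP region to create resources in"
  type        = "string"
  default     = "europe-west1"
}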

terraform.tfvars

The values of the variables declared in variables.tf are specified in a '.tfvars' file, which is passed on the command line with the -var-file flag. Values in a .tfvars file override the default values in variables.tf.

A terraform.tfvars file matching the example variables.tf from the Input Variables section could look as follows.

region = "europe-west1"zone = "europe-west1-d"gcp_project = "example"vpc_name = "example"...iam_roles = {  role1 = "roles/storage.objectViewer"  role2 = "roles/logging.logWriter"  ...}

Workspace

We discussed Terraform's state, and we configured our files to store it in GC Storage. These states belong to workspaces. If you have only one terraform.tfvars file, you can use the default workspace. To manage the same configuration in multiple environments, you must create different workspaces.

In our case, we need separate workspaces for staging and production. For example, I’ll do the necessary operations for staging.

$ terraform workspace new staging
Created and switched to workspace "staging"!
You're now on a new, empty workspace. Workspaces isolate their state, so if you run "terraform plan" Terraform will not see any existing state for this configuration.

Once we also create a workspace for production, we use the following command to switch between them.

terraform workspace select (staging/production)

Whichever environment we are going to change, we must be working in that environment's workspace.
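You can check which workspace is active at any time; the asterisk marks the selected one (the output shown here is illustrative):

$ terraform workspace list
  default
  production
* staging

$ terraform workspace show
staging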

After making sure that we are in the workspace we created for staging, we create the staging.tfvars file and set the values appropriate for staging in our variables.

subnet_cidr = "10.10.0.0/16"
subnet_name = "example-staging"
cluster_name = "example-staging
...
network_name = "example-network"
pod_range = "10.20.0.0/16"
...
default_pool_node_number = 1
default_pool_min_node = 1
default_pool_max_node = 1
...
default_pool_name = "example"
machine_type = "n1-standard-2"

We run the terraform plan command to see a preview of all the resources that will be created with the current configuration settings and the assigned variable values. One important point is the -out flag: it saves the plan that terraform plan produces during that run to a file, so that exactly those changes can be applied later.

terraform plan -var-file="staging.tfvars" -out=staging.out

Examine the output of the command carefully; the resources to be created and the variable values are displayed in full.

Plan: n to add, 0 to change, 0 to destroy.

If everything looks right, we can create all the resources on GCP with the terraform apply command.

terraform apply "staging.out"

Now our resources are created, and the state of each workspace is kept in the bucket in GC Storage.
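With the gcs backend and the terraform/state prefix from main.tf, each workspace's state becomes its own object in the bucket, roughly like this:

gs://project-tfstate/terraform/state/default.tfstate
gs://project-tfstate/terraform/state/staging.tfstate
gs://project-tfstate/terraform/state/production.tfstate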

From now on, every time we make an addition or edit, we can preview the changes with terraform plan. All the code in this blog post is provided as a sample; you can adapt it to your own infrastructure and use this post as a guide.

NOTE: Although there are no major differences, this post does not cover the changes that come with the latest version (v0.12) of Terraform. For example, Terraform's documentation describes the following change for map-type variables in the latest version:

Important: In Terraform 0.12 and later, variables with map and object values behave the same way as other variables: the last value found overrides the previous values. This is a change from previous versions of Terraform, which would merge map values instead of overriding them.
