Reducing your attack surface in GCP with IAP

Ronan Barrett · Published in Voi Engineering · 6 min read · Nov 5, 2020

by Ronan Barrett and the Voi platform team

Voi is a Scandinavian micromobility company offering electric scooters in partnership with cities and local communities around Europe. At Voi we are always looking for ways to improve the security of our cloud infrastructure. Our workloads mainly run on Google Cloud Platform (GCP) using Google Kubernetes Engine (GKE) and a mix of managed database and messaging solutions. In this blog post we will demonstrate how we have reduced our attack surface by using Google Identity-Aware Proxy (IAP). We will use terraform where possible to define the infrastructure. The terraform code used is available on our open source GitHub repository.

What’s the problem?

Traditionally, running managed services in GCP has meant assigning public IP addresses and protecting the resources using firewall rules and other networking mechanisms. In the last year or so GCP has started providing alternative solutions, such as private Kubernetes clusters, where worker nodes are not directly connected to the internet. CloudSQL instances can also be configured to use private IPs exclusively. Even cloud functions can be kept inaccessible from the internet using the “allow internal traffic only” option.

Whilst keeping all these managed services private is great for reducing the attack surface of our system, it also introduces a problem: what if we need access to these resources from our development machines during an incident? Developers don’t normally require direct access to clusters or databases, as they can use CI/CD pipelines to deploy Kubernetes resources and SQL DDL, whilst terraform is used to modify the cloud infrastructure in reproducible ways.

What’s the solution?

The traditional solution to providing controlled access to private resources is to use a bastion. However, this results in a new attack surface: the bastion itself. This weakness can be alleviated by making the bastion itself a private resource, only accessible via GCP Identity-Aware Proxy (IAP). IAP is accessed over HTTPS and can tunnel traffic of all sorts, including SSH. Authentication is provided by the existing Google Sign-In component. Authorization is provided by Identity & Access Management (IAM).

GCP IAP bastion architecture

User/Group provisioning

We recommend using Google Groups to organize your user accounts, as it makes onboarding, offboarding, and auditing much simpler. We will use groups exclusively in the configuration of our IAP setup. These groups will be assigned permissions via IAM.
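
As a minimal sketch, a project-level IAM binding granting a hypothetical group (the group address and resource name are placeholders) the OS Login role might look as follows:

resource "google_project_iam_member" "bastion_users_oslogin" {
  project = var.project_id
  role    = "roles/compute.osLogin"
  member  = "group:platform-team@yourdomain.com" # placeholder group
}

Later in this post we bind roles at the instance level instead, which grants narrower access than a project-level binding.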

IAP provisioning

The first step in provisioning the IAP is to follow the instructions outlined by GCP. We performed these steps manually, as completing all the tasks via the gcloud SDK or terraform is either overly complex or not yet possible.

Add a firewall rule to allow IAP access to the bastion. The default configuration requires allowing IAP traffic from 35.235.240.0/20 to VMs with the network tag “bastion” on TCP port 22. The terraform code for the firewall might look as follows:

resource "google_compute_firewall" "default" {
name = "iap-bastion"
network = "default"
allow {
protocol = "tcp"
ports = ["22"]
}
target_tags = ["bastion"]
source_ranges = ["35.235.240.0/20"] # This range contains all IP addresses that IAP uses for TCP forwarding.
}

Bastion provisioning

Use the cheapest VM instance available, for example “f1-micro”, as the bastion acts only as a jump host and requires very few resources. You must add the network tag “bastion” so that the firewall rule above applies to the instance.

It is highly recommended to add two GCP Metadata key/value pairs to the VM as follows:

  • enable-oslogin TRUE
  • enable-oslogin-2fa TRUE

The first key/value, “enable-oslogin”, restricts who can log into the bastion, ideally via Google Group membership and the “roles/compute.osLogin” role. Users with the “roles/compute.osLogin” role specifically cannot sudo on the bastion; that privilege is reserved for users with the “roles/compute.osAdminLogin” role. The second key/value, “enable-oslogin-2fa”, forces users connecting to the bastion to use two-factor authentication.

The bastion itself requires no service account or API access. It is important to give the bastion the fewest privileges possible. The bastion should not have a public IP address.

We chose to use a managed instance group (MIG) so that we could use preemptible instances (the MIG restarts preempted instances), but ran into some issues with IAM bindings. The bastion can therefore be provisioned either with or without a managed instance group; either way, you can use the instance template below (a sketch of the MIG itself follows the template).

The terraform code for the VM instance template might look as follows:

resource "google_compute_instance_template" "bastion-template" {
name = "bastion-template"
description = "This template is used to create bastion instances."
tags = ["bastion"]
labels = {
environment = var.environment_name
}
instance_description = "Instance used for bastion"
machine_type = "f1-micro"
scheduling {
automatic_restart = false
on_host_maintenance = "TERMINATE"
preemptible = false
}
// Create a new boot disk from an image
disk {
source_image = "debian-cloud/debian-9"
auto_delete = true
boot = true
}
network_interface {
network = "default"
}
metadata = {
enable-oslogin = "True"
enable-oslogin-2fa = "True"
}
}
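
As a minimal sketch of the MIG mentioned above, assuming the instance template defined here (the MIG name is a placeholder), the terraform might look as follows:

resource "google_compute_instance_group_manager" "bastion" {
  name               = "bastion-mig" # placeholder name
  zone               = var.bastion_zone_id
  base_instance_name = "bastion"
  target_size        = 1 # a single jump host is enough

  version {
    instance_template = google_compute_instance_template.bastion-template.id
  }
}

One caveat: MIG-created instances get generated names, while the instance-level IAM bindings below reference a fixed instance name, which may be the kind of issue we mentioned above.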

To permit connectivity to the bastion VM for authorized users coming via the IAP tunnel, the role “roles/iap.tunnelResourceAccessor” should be bound to the bastion instance for the Google Group using an IAM policy binding. The terraform code for this might look as follows:

resource "google_iap_tunnel_instance_iam_member" "member" {
project = var.project_id
zone = var.bastion_zone_id
instance = var.bastion_name
role = "roles/iap.tunnelResourceAccessor"
member = "group:your_group@yourdomain.com"
}

To permit login access to the bastion VM for authorized users, the role “roles/compute.osLogin” should be bound to the bastion instance for the Google Group using an IAM policy binding.

The terraform code for this might look as follows:

resource "google_compute_instance_iam_member" "member" {
project = var.project_id
zone = var.bastion_zone_id
instance = var.bastion_name
role = "roles/compute.osLogin"
member = "group:your_group@yourdomain.com"
}

The bastion itself can be further hardened by configuring the operating system’s sshd_config and by other mechanisms outside the scope of this article. For some inspiration, take a look at DigitalOcean’s recommendations. You should also ensure the bastion’s OS is kept up to date with patches, using something like GCP’s OS patch management. When using GCP’s OS Config you must provide your bastion with a service account that has the lowest possible permissions.
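
As a minimal sketch, a dedicated service account with no IAM roles granted (the account_id is a placeholder) might look as follows; it can then be attached to the bastion via a service_account block in the instance template:

resource "google_service_account" "bastion" {
  project      = var.project_id
  account_id   = "bastion-os-config" # placeholder name
  display_name = "Minimal service account for bastion OS patching"
}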

Reducing the attack surface

Now that we have IAP set up with connectivity to our bastion, we can start removing public IPs from our managed resources. Some examples are as follows (a terraform sketch follows the list):

  • Use private GKE clusters
  • Use CloudSQL instance with only private IP addresses
  • Use Cloud functions that allow internal traffic only
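
As a minimal sketch of the first example, assuming a VPC-native cluster (the cluster name, location variable, and master CIDR below are placeholders), a private GKE cluster might look as follows:

resource "google_container_cluster" "private" {
  name               = "private-cluster" # placeholder name
  location           = var.cluster_zone_id
  initial_node_count = 1

  # Makes the cluster VPC-native (required for private clusters);
  # an empty block lets GKE pick the secondary ranges.
  ip_allocation_policy {}

  private_cluster_config {
    enable_private_nodes    = true # worker nodes get no public IPs
    enable_private_endpoint = true # management endpoint is private only
    master_ipv4_cidr_block  = "172.16.0.0/28" # placeholder CIDR
  }
}

A similar effect is achieved for CloudSQL by setting ipv4_enabled = false and a private_network in the instance’s ip_configuration block.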

Connecting to GCP resources via IAP

To connect to a resource inside your VPC via the IAP tunnel, it is recommended to use the gcloud SDK. Using the SDK, instead of a regular SSH client, allows us to benefit from the key exchange bootstrapping built into gcloud and GCP Compute Engine when using OS Login.

gcloud compute --project ${PROJECT_ID} ssh --zone ${BASTION_ZONE} ${BASTION_INSTANCE_NAME} --tunnel-through-iap --ssh-flag="-L${LOCAL_PORT}:${IP_ADDRESS}:${REMOTE_PORT}"
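
For example, to reach a hypothetical private CloudSQL instance at 10.10.0.5, you might set LOCAL_PORT and REMOTE_PORT to 5432 and IP_ADDRESS to 10.10.0.5, then point your local SQL client at localhost:5432; the bastion forwards the traffic over the IAP tunnel.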

Connecting to your GKE cluster via IAP

To connect to a GKE cluster via IAP and kubectl you need to complete a few steps as follows:

  • Update your ~/.kube/config to use the IAP tunnel (or use kubectl config set-cluster ${CLUSTER_NAME} --server=https://kubernetes.default:8443). Set server to https://kubernetes.default:8443; the DNS name used must be one of the SAN names in the Kubernetes certificate.
  • Update your hosts file to map 127.0.0.1 to kubernetes.default.
  • Use the gcloud ssh command as above to connect to the cluster’s management endpoint, for example:
gcloud compute --project ${PROJECT_ID} ssh --zone ${BASTION_ZONE} ${BASTION_INSTANCE_NAME} --tunnel-through-iap --ssh-flag="-L8443:${CLUSTER_MANAGEMENT_ENDPOINT}:443"

When connecting to a private GKE cluster you should use the dedicated private management endpoint. When connecting to a non-private GKE cluster you can use the public management endpoint, though you may have to enable Cloud NAT in your VPC, as your bastion has no public IP and might not otherwise be able to reach the public endpoint. It is recommended to use a private GKE cluster with the private endpoint so that Cloud NAT isn’t required for this scenario; you may, however, still require Cloud NAT for accessing external resources from your Kubernetes workloads.
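
If you do need Cloud NAT, a minimal sketch (the router and NAT names are placeholders) might look as follows:

resource "google_compute_router" "nat" {
  name    = "nat-router" # placeholder name
  network = "default"
  region  = var.region
}

resource "google_compute_router_nat" "nat" {
  name                               = "nat-config" # placeholder name
  router                             = google_compute_router.nat.name
  region                             = var.region
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}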

The use of IAP with Kubernetes might be simplified in the future, removing the need for a bastion, if and when this issue is fixed by GCP.

Conclusions

The attack surface of your GCP cloud infrastructure can be considerably reduced by using IAP together with IAM and Google Groups. This, together with the automated key management of OS Login and the simplified onboarding and offboarding of engineers, makes migration to IAP a worthwhile investment. If you would like a cloud-agnostic solution, it might be worth looking at HashiCorp’s Boundary, which has similar goals to GCP’s IAP. Cloud-specific bastion terraform modules for GCP and AWS are also available.
