Google Kubernetes Engine (GKE) Security Best Practices

Published in Google Cloud - Community, Aug 16, 2021

In the cloud provider battle for the best managed Kubernetes service, Google Kubernetes Engine (GKE) is the favorite pick for many. Kubernetes is constantly evolving, and GKE is quick to innovate. After all, Kubernetes was born at Google before being adopted by the Cloud Native Computing Foundation (CNCF).

As Kubernetes evolves, so does its security landscape. For example, the recent release of Kubernetes 1.21 deprecated PodSecurityPolicies (PSP). Given this constant change, this blog post aims to answer the question: what are the security best practices on GKE today?

GKE Security Overview

Security is a complex problem: there is no such thing as perfect security, and there is no one-size-fits-all solution. GKE and the Google Cloud Platform provide many features and services to enhance security at every layer of the stack. You should always apply the principle of least privilege, and apply security best practices wherever they can coexist with your application’s functionality.

Many configurations on GKE are secure by default, but it is important to be aware of what is not, and of the options you have when configuring GKE. The full list of Kubernetes security considerations is beyond the scope of this blog post; instead, it gives a high-level overview of some of the GKE security considerations.

Automate CIS Benchmarking

The Center for Internet Security (CIS) provides benchmarks of security configurations for Kubernetes. It’s important to understand the shared security model of GKE, as some Kubernetes configurations, such as most of the control plane configuration, are managed by GKE. Therefore, it is best to use the GKE-specific CIS GKE Benchmark.

There are many tools out there for automating Kubernetes CIS Benchmark auditing. The GKE documentation references using the open-source kube-bench tool. Kube-bench scanning can be built into your infrastructure-as-code pipelines.
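
As a rough sketch of what a pipeline gate on scan results could look like, the snippet below collects failed checks from a JSON report. The report shape here is a simplified, hypothetical stand-in modeled loosely on kube-bench’s JSON output; the field names (`Controls`, `tests`, `results`, `test_number`, `status`) are assumptions and should be checked against the output of the kube-bench version you run.

```python
import json

# Hypothetical, simplified report (assumed shape; verify against your
# kube-bench version before relying on these field names).
SAMPLE_REPORT = json.dumps({
    "Controls": [
        {
            "tests": [
                {
                    "results": [
                        {"test_number": "4.1.1", "status": "PASS"},
                        {"test_number": "4.2.1", "status": "FAIL"},
                        {"test_number": "4.2.2", "status": "WARN"},
                    ]
                }
            ]
        }
    ]
})


def failed_checks(report_json: str) -> list:
    """Return the IDs of all checks whose status is FAIL."""
    report = json.loads(report_json)
    failed = []
    for control in report.get("Controls", []):
        for group in control.get("tests", []):
            for result in group.get("results", []):
                if result.get("status") == "FAIL":
                    failed.append(result.get("test_number", "unknown"))
    return failed
```

A pipeline step could call `failed_checks` on the scan output and stop the rollout when the returned list is not empty.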

On the Google Cloud Platform, Security Health Analytics can be enabled to monitor and alert on some of the GKE CIS Benchmarks. Failed checks are reported as findings in Security Command Center.

CIS Benchmarks | Kubernetes Engine Documentation | Google Cloud

Authentication and Authorization

Use Google Groups for RBAC. This relatively new feature allows fine-grained permissioning at the Google Workspace level.
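
To make this concrete, here is a minimal sketch that builds a Kubernetes RoleBinding whose subject is a Google Group. The namespace, group address, and role name are hypothetical, and the sketch assumes the Google Groups for RBAC feature has already been enabled on the cluster. The manifest is built as a Python dictionary and serialized to JSON, which Kubernetes accepts alongside YAML.

```python
import json


def group_role_binding(namespace: str, group_email: str, role: str) -> dict:
    """Build a RoleBinding manifest that grants `role` to a Google Group."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role}-binding", "namespace": namespace},
        "subjects": [
            {
                # The subject is the group itself, not individual users.
                "kind": "Group",
                "name": group_email,
                "apiGroup": "rbac.authorization.k8s.io",
            }
        ],
        "roleRef": {
            "kind": "Role",
            "name": role,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }


# Hypothetical names, for illustration only.
BINDING = group_role_binding("payments", "payments-devs@example.com", "pod-reader")
MANIFEST_JSON = json.dumps(BINDING, indent=2)
```

Binding to a group rather than to individual users means that access follows group membership, so onboarding and offboarding happen in one place.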

Use least-privilege service accounts. By default, GKE worker nodes use the Compute Engine default service account, which has Editor-level permissions. Instead, configure a custom service account restricted to the roles needed for logging and monitoring (monitoring.viewer, monitoring.metricWriter, logging.logWriter, and stackdriver.resourceMetadata.writer). Additionally, workloads should be configured to use Workload Identity instead of directly using the worker node’s service account.
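
One way to keep a node service account honest is to compare the roles it actually holds against the minimal set. The sketch below does that for a plain list of role IDs; how you obtain that list (for example, from an IAM policy export) is left out, and the function name is made up for illustration.

```python
# Minimal roles for a GKE node service account, written as full IAM role IDs.
RECOMMENDED_NODE_ROLES = frozenset({
    "roles/logging.logWriter",
    "roles/monitoring.metricWriter",
    "roles/monitoring.viewer",
    "roles/stackdriver.resourceMetadata.writer",
})


def audit_node_roles(granted_roles):
    """Return (missing, excess) roles relative to the recommended set."""
    granted = set(granted_roles)
    missing = sorted(RECOMMENDED_NODE_ROLES - granted)
    excess = sorted(granted - RECOMMENDED_NODE_ROLES)
    return missing, excess
```

Any role in the `excess` list, such as `roles/editor`, is a candidate for removal from the node service account.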

Using Workload Identity | Kubernetes Engine Documentation

Control plane security

While the GKE control plane is managed by Google, a few configuration settings are left to the customer. The Kubernetes API server should be configured for private access only. Authentication to the Kubernetes API should go through Google Cloud IAM, with basic authentication and client certificate authentication disabled.
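
These three settings can be checked mechanically. The following sketch inspects a cluster description given as a dictionary; the field names (`privateClusterConfig.enablePrivateEndpoint`, `masterAuth.username`, `masterAuth.clientCertificateConfig.issueClientCertificate`) follow the GKE API’s cluster resource as I understand it, so treat them as assumptions and confirm them against the API reference.

```python
def control_plane_findings(cluster: dict) -> list:
    """Flag control plane settings that conflict with the advice above."""
    findings = []

    # Private access only: the API server should not have a public endpoint.
    private = cluster.get("privateClusterConfig", {})
    if not private.get("enablePrivateEndpoint", False):
        findings.append("API server endpoint is publicly reachable")

    # A configured username indicates basic authentication is enabled.
    auth = cluster.get("masterAuth", {})
    if auth.get("username"):
        findings.append("basic authentication is enabled")

    # Client certificates should not be issued.
    cert = auth.get("clientCertificateConfig", {})
    if cert.get("issueClientCertificate", False):
        findings.append("client certificate authentication is enabled")

    return findings
```

An empty result means none of the three issues were detected in the description that was passed in; it says nothing about settings the function does not look at.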

Node security

Ensure that GKE worker nodes are using Google’s Container-Optimized OS. Google developed and maintains this OS specifically for running containers, and put security at the forefront of its design. It offers a minimal OS footprint, an immutable root filesystem with verified boot, stateless configuration, a security-hardened kernel, automatic updates, and much more. The full list of security features can be found in the Container-Optimized OS documentation.

Enable Shielded GKE Nodes. This protects against attackers trying to impersonate one of your GKE worker nodes by giving each node a strong, cryptographically verifiable identity and integrity checks.
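
Both node recommendations lend themselves to a quick automated check. This sketch walks a cluster description and reports node pools that are not on a Container-Optimized OS image, plus a cluster without Shielded GKE Nodes. The field names (`shieldedNodes.enabled`, `nodePools[].config.imageType`) and the `COS` image type prefix are assumptions based on the GKE API’s cluster resource and should be verified.

```python
def node_findings(cluster: dict) -> list:
    """Report node-level settings that deviate from the advice above."""
    findings = []

    if not cluster.get("shieldedNodes", {}).get("enabled", False):
        findings.append("Shielded GKE Nodes is not enabled")

    for pool in cluster.get("nodePools", []):
        image_type = pool.get("config", {}).get("imageType", "")
        # Container-Optimized OS image types are assumed to start with "COS".
        if not image_type.upper().startswith("COS"):
            name = pool.get("name", "unknown")
            findings.append(f"node pool {name} does not use Container-Optimized OS")

    return findings
```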

Network security

To reiterate a point from the control plane security section, access to your Kubernetes resources should be restricted to private network access. Nodes in a private GKE cluster have only internal IP addresses, so they are isolated from the internet by default.

Private clusters | Kubernetes Engine Documentation | Google Cloud

In order to access your private GKE cluster, you should have a bastion host. For information on how you can configure a modern cloud bastion host check out my previous blog post.

Use network policy enforcement. Kubernetes network policies enable controlling the traffic flow of pods and entities within the cluster. Should your production service pod backend be able to communicate with a hello-world pod that just spun up unexpectedly? Probably not. Lock it down with a network policy.
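
The backend example can be written down as a NetworkPolicy. The sketch below builds one that allows ingress to the selected pods only from pods carrying a given set of labels; the namespace, policy name, and labels are hypothetical. Keep in mind that the policy only has an effect when network policy enforcement is enabled on the cluster.

```python
def allow_only_from(namespace: str, target_labels: dict, allowed_labels: dict) -> dict:
    """Build a NetworkPolicy that restricts ingress to the target pods."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "restrict-backend-ingress", "namespace": namespace},
        "spec": {
            # Pods this policy applies to.
            "podSelector": {"matchLabels": target_labels},
            "policyTypes": ["Ingress"],
            # Only pods matching these labels may connect; all other
            # ingress to the target pods is dropped.
            "ingress": [
                {"from": [{"podSelector": {"matchLabels": allowed_labels}}]}
            ],
        },
    }


# Hypothetical labels: only frontend pods may reach the backend.
POLICY = allow_only_from("prod", {"app": "backend"}, {"app": "frontend"})
```

With this policy in place, an unexpected hello-world pod without the `app: frontend` label cannot open connections to the backend pods.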

Securing your workloads

Harden workload isolation with GKE Sandbox. GKE Sandbox prevents untrusted code from maliciously affecting the host kernel. Additionally, sandboxed pods are prevented from accessing other Google Cloud services or cluster metadata.
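
Opting a workload into the sandbox is a one-line change in the pod spec. The sketch below builds a pod manifest that requests the `gvisor` runtime class; the pod name and image are hypothetical, and it assumes the cluster has a node pool with GKE Sandbox enabled for the pod to be scheduled on.

```python
def sandboxed_pod(name: str, image: str) -> dict:
    """Build a Pod manifest that runs inside GKE Sandbox."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Requesting the gvisor runtime class places the pod in the sandbox.
            "runtimeClassName": "gvisor",
            "containers": [{"name": name, "image": image}],
        },
    }


# Hypothetical workload, for illustration only.
POD = sandboxed_pod("untrusted-job", "example.com/untrusted:1.0")
```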

Workload Identity should be used for container workload authentication. It lets Kubernetes service accounts authenticate as Google service accounts, so you can create fine-grained identity and authorization for individual workloads while removing the need for static service account credentials.
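
The link between the two kinds of service account has two halves: an annotation on the Kubernetes service account, and an IAM binding on the Google service account. A minimal sketch of both, with hypothetical project and account names:

```python
def workload_identity_binding(project_id: str, namespace: str,
                              ksa_name: str, gsa_name: str):
    """Return the annotated Kubernetes ServiceAccount and the IAM binding
    that together let the KSA act as the Google service account."""
    gsa_email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"

    ksa_manifest = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {
            "name": ksa_name,
            "namespace": namespace,
            # Points the Kubernetes service account at the Google one.
            "annotations": {"iam.gke.io/gcp-service-account": gsa_email},
        },
    }

    # IAM policy binding to add on the Google service account.
    iam_binding = {
        "role": "roles/iam.workloadIdentityUser",
        "members": [
            f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
        ],
    }
    return ksa_manifest, iam_binding
```

Because no key file is involved, there is nothing to rotate or leak; revoking access is a matter of removing the IAM binding.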

Audit logging

By default, Cloud Logging and Cloud Monitoring are enabled. This includes cluster audit logs, worker node logs, and application logs written to STDOUT/STDERR. Logs in the default log bucket are retained for 30 days by default. I recommend sending these logs to a cloud data warehouse, such as Snowflake, for cost-effective long-term storage and efficient querying.

A recent Lacework blog post highlights some interesting dangers of not enabling Kubernetes audit logging.
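
When exporting logs for long-term storage, it helps to select exactly the entries you care about. The sketch below assembles a Cloud Logging filter for one cluster’s Admin Activity audit log, which could be used in the Logs Explorer or as the filter of a log sink. The project and cluster names are hypothetical, and the helper function itself is made up for illustration.

```python
def gke_audit_log_filter(project_id: str, cluster_name: str) -> str:
    """Build a Cloud Logging filter for a cluster's Admin Activity audit log."""
    # The slash in the log ID is URL-encoded as %2F inside logName.
    log_name = f"projects/{project_id}/logs/cloudaudit.googleapis.com%2Factivity"
    clauses = [
        'resource.type="k8s_cluster"',
        f'resource.labels.cluster_name="{cluster_name}"',
        f'logName="{log_name}"',
    ]
    return " AND ".join(clauses)
```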

Finding your GKE logs | Google Cloud Blog

Supply chain security

Enable and implement Binary Authorization. This ensures that only trusted container images can be deployed. Since running compromised container images is a common attack vector, signing and verifying your images are great steps toward securing your supply chain.
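
As an illustration of what such a policy might contain, the sketch below builds a Binary Authorization policy document whose default rule blocks any image lacking an attestation from a named attestor. The attestor name is hypothetical, and the field and enum names reflect the policy format as I understand it, so confirm them against the Binary Authorization reference before importing a policy like this.

```python
def require_attestation_policy(attestor: str) -> dict:
    """Build a policy that only admits images attested by `attestor`."""
    return {
        # Lets Google-maintained system images bypass the rule below.
        "globalPolicyEvaluationMode": "ENABLE",
        "defaultAdmissionRule": {
            "evaluationMode": "REQUIRE_ATTESTATION",
            # Block the deployment and write an audit log entry.
            "enforcementMode": "ENFORCED_BLOCK_AND_AUDIT_LOG",
            "requireAttestationsBy": [attestor],
        },
    }


# Hypothetical attestor resource name.
POLICY_DOC = require_attestation_policy("projects/my-project/attestors/build-attestor")
```

A cautious rollout might start with a dry-run enforcement mode that only logs violations, then switch to blocking once the build pipeline attests every image.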

Binary Authorization | Google Cloud

Scan container images for known vulnerabilities and malware. If you are using Google’s Container Registry, you can enable vulnerability scanning. Not only will this scan the image at upload time, but it will also continuously monitor the image’s metadata for new vulnerabilities.

Container scanning | Container Analysis documentation | Google Cloud

GKE Autopilot

If all of the above is beginning to sound like a daunting amount of work, have no worries: GKE Autopilot can save the day. In this new mode of operation, Google takes care of Kubernetes worker node management. And of course, it does so in a secure-by-default fashion.

Clusters created in the Autopilot mode are already in a hardened configuration.

Autopilot implements GKE hardening guidelines and security best practices such as those mentioned above.

Autopilot overview | Kubernetes Engine Documentation | Google Cloud

Conclusion

Hopefully now it is clear how Google Kubernetes Engine is an industry leader in the security space. With many security features baked in by default, and plenty of additional configurable options, ensuring your Kubernetes cluster is secure becomes much easier on GKE.

Written by Joshua Stuts