Published in Cruise

Container Platform Security at Cruise

Best practices for enterprise-grade Kubernetes security.

Figure: Kubernetes logo in armor

Posts in this series:

  1. Building a Container Platform
  2. Container Platform Security
  3. Container Platform Networking

Security domains covered in this post:

  1. Identity
  2. Authentication
  3. Authorization
  4. Secrets
  5. Encryption

Identity

To understand how the different domains interact with one another, we first need to look at identity. An identity is the representation of a person or program interacting with a system. Identities come in one of two types, users or services, depending on their use case. Both types include a compound unique identifier and a set of credentials made up of multiple factors.

Table describing user identity and service identity with example unique identifiers and credentials.

Identity Management

For identity management, we leverage Okta as our Identity Provider (IdP). Okta enables a Single Sign-On (SSO) experience for users between systems with Multi-Factor Authentication (MFA). Okta isn’t required for GKE or Kubernetes — we could have used another IdP or manually managed users within GCP itself, but Okta provides integration points and management tools that make it easier to secure a wide variety of systems.

Authentication

Authentication is the means by which we confirm that an identity is who it claims to be. Together, identifiers and credentials distinguish one identity from another and establish non-repudiation: high-confidence authenticity, proof of origin, and proof of integrity.

Credentials can draw on several types of factors:

  1. Something you know (knowledge factor)
  2. Something you have (ownership factor)
  3. Something you are (inherence factor; most common with user identities)
  4. Somewhere you are (location factor)

Figure: Multi-factor Authentication (password, secure token smartphone app, fingerprint, map location)
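
As a rough illustration of what "multi-factor" means in practice, the sketch below models the four factor types and checks whether a set of presented credentials spans at least two of them. The `Factor` enum and `is_multi_factor` helper are hypothetical names for this example, not part of any IdP's API.

```python
from enum import Enum


class Factor(Enum):
    KNOWLEDGE = "something you know"
    OWNERSHIP = "something you have"
    INHERENCE = "something you are"
    LOCATION = "somewhere you are"


def is_multi_factor(presented: list[tuple[Factor, str]]) -> bool:
    """Return True when the credentials span at least two distinct factor types."""
    return len({factor for factor, _ in presented}) >= 2


# Two credentials of the same type are still a single factor.
password_only = [(Factor.KNOWLEDGE, "password"), (Factor.KNOWLEDGE, "security question")]
# A password plus a token app covers two distinct factor types.
password_and_token = [(Factor.KNOWLEDGE, "password"), (Factor.OWNERSHIP, "token app")]
```

The point of the check is that stacking two passwords does not add a factor; only a different type of proof does.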

Authentication Protocols

Google has invested heavily in OAuth2, so it may come as no surprise that GCP relies on it for both user and service authentication. For users authenticating to GCP, this means authenticating with a password and a second factor through an associated IdP. Behind the scenes, this does one of two things, depending on whether the user is authenticating manually via a browser or programmatically via GCP’s CLI (gcloud) or API.

  1. Browsers: The browser Single Sign On (SSO) workflow utilizes the SAML protocol. Provided the user has properly authenticated, the SAML assertion is stored for the remainder of the session (or lifetime of the assertion, whichever comes first). Backend services then transparently validate the user’s session on each interaction using the assertion, rather than requiring the user to sign in on every request.
  2. Programs: The newer OIDC protocol is used for programmatic interactions. The user or service identity logs in with its credentials and Google generates a signed access token for use in subsequent interactions. The OIDC access token is the basis for API and CLI authentication, analogous to the SAML assertion stored in the browser flow. For terminal access, most users use the gcloud CLI, which handles the OIDC authentication flow and caches the access token.
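
To make the idea of a signed, time-limited token more concrete, here is a minimal standard-library sketch. It is an assumption-laden stand-in: Google's tokens use Google's own formats and signing keys, whereas this example signs a small JSON payload with a shared HMAC key. The function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(subject: str, key: bytes, ttl_seconds: int, now: float) -> str:
    """Issue a signed token once the identity has logged in."""
    payload = _b64(json.dumps({"sub": subject, "exp": now + ttl_seconds}).encode())
    signature = _b64(hmac.new(key, payload.encode(), hashlib.sha256).digest())
    return f"{payload}.{signature}"


def validate_token(token: str, key: bytes, now: float) -> Optional[str]:
    """Return the subject if the signature checks out and the token is unexpired."""
    payload, _, signature = token.partition(".")
    expected = _b64(hmac.new(key, payload.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(signature, expected):
        return None  # tampered with, or signed with a different key
    claims = json.loads(_unb64(payload))
    return claims["sub"] if claims["exp"] > now else None
```

A backend holding the key can call `validate_token` on every request, which is what lets later interactions proceed without asking the identity to log in again.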

Identity Translation

Once authenticated with the gcloud CLI, GKE users can use it to fetch kubectl credentials, which gives them access to the Cruise PaaS through kubectl, the Kubernetes CLI, provided their identity has the required role bindings. This way, users only have to manage their GCP credentials and can generate Kubernetes credentials on demand.
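
A minimal sketch of that credential fetch, wrapped in Python. It assumes a regional cluster and that gcloud is installed and already authenticated; the cluster, region, and project values are placeholders supplied by the caller, and the helper names are hypothetical.

```python
import subprocess


def get_credentials_command(cluster: str, region: str, project: str) -> list[str]:
    """Build the gcloud invocation that writes kubectl credentials to kubeconfig."""
    return [
        "gcloud", "container", "clusters", "get-credentials", cluster,
        "--region", region,
        "--project", project,
    ]


def fetch_kubectl_credentials(cluster: str, region: str, project: str) -> None:
    """Run the command; raises CalledProcessError if gcloud exits non-zero."""
    subprocess.run(get_credentials_command(cluster, region, project), check=True)
```

After `fetch_kubectl_credentials` succeeds, kubectl commands against that cluster use the caller's GCP identity.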

Workload Identity

Recently, Google introduced GKE Workload Identity, which allows Kubernetes service accounts (SAs) to act as GCP SAs so that pods can authenticate with GCP. This replaces the legacy pattern of using GCE instance metadata, which gave every pod on a node access to the same GCP SA credentials.
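
The link between the two kinds of service account comes down to two strings: an IAM member that names the Kubernetes SA, and an annotation on the Kubernetes SA that names the GCP SA. The sketch below builds both; the project, namespace, and account names are placeholders, and the helper functions are hypothetical.

```python
# Role granted on the GCP SA so the Kubernetes SA may impersonate it.
WORKLOAD_IDENTITY_ROLE = "roles/iam.workloadIdentityUser"


def workload_identity_member(project: str, namespace: str, ksa: str) -> str:
    """IAM member string that identifies a Kubernetes SA to GCP."""
    return f"serviceAccount:{project}.svc.id.goog[{namespace}/{ksa}]"


def ksa_annotation(gsa_name: str, project: str) -> dict[str, str]:
    """Annotation to place on the Kubernetes SA, pointing at the GCP SA."""
    return {
        "iam.gke.io/gcp-service-account": f"{gsa_name}@{project}.iam.gserviceaccount.com"
    }
```

Because the binding is per Kubernetes SA, two pods on the same node can hold different GCP identities, which the node metadata pattern could not offer.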

Authorization

Authorization is the means by which we enforce what an authenticated identity may access. There are many types of access control, but within the context of container platforms, we typically use Role-Based Access Control (RBAC).

Figure: Groups, Permissions, and Role Based Access Control (RBAC)
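
The relationship between users, groups, roles, and permissions can be sketched as a few lookups. The group names, role names, and permissions below are invented for illustration and do not reflect any real configuration.

```python
# Permissions are (resource, verb) pairs, attached to roles.
ROLE_PERMISSIONS = {
    "viewer": {("pods", "get"), ("pods", "list")},
    "deployer": {("deployments", "create"), ("deployments", "update")},
}

# Roles are bound to groups rather than to individual identities.
GROUP_ROLES = {
    "platform-users": {"viewer"},
    "platform-admins": {"viewer", "deployer"},
}

# Group membership would normally come from the IdP.
USER_GROUPS = {
    "alice": {"platform-users"},
    "bob": {"platform-admins"},
}


def is_allowed(user: str, resource: str, verb: str) -> bool:
    """Allow the action if any role bound to any of the user's groups grants it."""
    for group in USER_GROUPS.get(user, set()):
        for role in GROUP_ROLES.get(group, set()):
            if (resource, verb) in ROLE_PERMISSIONS.get(role, set()):
                return True
    return False
```

Unknown users and unbound groups fall through to `False`, so access is denied by default.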

Group Membership

Putting identities into groups makes it easier to bind permissions & roles without repeatedly assigning the same roles & permissions to each individual identity. Groups are generally a resource type provided by an IdP; for integration with GCP and GKE, we use Google groups provided by G Suite. In most authentication flows, group membership is a field located within the credential itself (such as a JWT’s claims), or is a property that’s possible to query against the associated IdP.
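
For the case where group membership travels inside the credential, the sketch below reads the claims out of a JWT's payload segment. It deliberately skips signature verification, so it is only suitable for inspection, and the `groups` claim name is an assumption, since the claim that carries groups varies by IdP.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode a JWT's payload segment WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JWT segment (demo helper)."""
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# An unsigned example token built only for this demonstration.
example_token = ".".join([
    _segment({"alg": "none"}),
    _segment({"sub": "alice@example.com", "groups": ["platform-users"]}),
    "",
])

groups = jwt_claims(example_token).get("groups", [])
```

A real authorizer would verify the token's signature and expiry before trusting any claim in it.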

Roles & Role Bindings

As mentioned earlier, GKE integrates with GCP and G Suite to provide authentication, identity management, group management, and authorization within GCP.

Figure: RBACSync high level workflow and example config
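
One way to picture syncing group membership into Kubernetes RBAC is to expand a group's members into the subjects of a RoleBinding. The sketch below does only that and is not RBACSync's implementation or config format; the function name and example values are hypothetical, while the manifest fields follow the standard Kubernetes RoleBinding schema.

```python
RBAC_API_GROUP = "rbac.authorization.k8s.io"


def role_binding_for_group(name: str, namespace: str, role: str, members: list[str]) -> dict:
    """Build a RoleBinding manifest with one User subject per group member."""
    return {
        "apiVersion": f"{RBAC_API_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": name, "namespace": namespace},
        "roleRef": {"apiGroup": RBAC_API_GROUP, "kind": "Role", "name": role},
        "subjects": [
            {"apiGroup": RBAC_API_GROUP, "kind": "User", "name": member}
            for member in sorted(members)  # stable order keeps re-syncs diff-free
        ],
    }


binding = role_binding_for_group(
    "web-viewers", "web", "viewer", ["bob@example.com", "alice@example.com"]
)
```

Re-running the expansion whenever membership changes keeps the binding in step with the group.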

Secrets

Secrets can be anything you want to keep private, but in the context of container platforms they are mostly credentials: tokens, passwords, certificates, encryption keys, etc. Kubernetes comes with its own secret storage and injection mechanism, which is especially valuable for bootstrapping, but the built-in solution is generally insufficient when platforms span multiple clusters.

A secrets solution for a multi-cluster platform needs:

  1. Authentication methods that leverage existing identity primitives from multiple IdPs (Okta users, GCP service accounts, Kubernetes service accounts, etc.)
  2. Authorization that supports RBAC and group membership.

Secrets injection with Vault and Daytona:

  1. Authenticates to Vault by leveraging the Kubernetes service account bound to the pod
  2. Fetches secrets needed by the workload
  3. Writes the secrets to an in-memory volume (to avoid leaking to persistent storage)
  4. Shares the volume with the workload container
  5. Updates the secret at runtime, when it changes in Vault (optional)
Figure: Secrets injection with Vault and Daytona
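
The first three steps of that flow can be sketched against Vault's HTTP API. This is not Daytona's code: it assumes Vault's Kubernetes auth method is mounted at its default path, that secrets live in a KV version 2 engine, and that `out_dir` is a memory-backed volume mounted into the pod. The function names are hypothetical.

```python
import json
import os
import urllib.request
from typing import Callable, Optional

# Default location of the pod's service account token.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

HttpFn = Callable[[str, str, dict, Optional[dict]], dict]


def http_json(method: str, url: str, headers: dict, body: Optional[dict]) -> dict:
    """Send a JSON request and return the decoded JSON response."""
    data = json.dumps(body).encode() if body is not None else None
    request = urllib.request.Request(url, data=data, headers=headers, method=method)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def inject_secrets(vault_addr: str, role: str, secret_path: str, out_dir: str,
                   sa_jwt: str, http: HttpFn = http_json) -> list[str]:
    # 1. Authenticate to Vault with the pod's Kubernetes service account token.
    login = http("POST", f"{vault_addr}/v1/auth/kubernetes/login", {},
                 {"role": role, "jwt": sa_jwt})
    vault_token = login["auth"]["client_token"]

    # 2. Fetch the secrets (KV v2 layout assumed: <mount>/data/<path>).
    secret = http("GET", f"{vault_addr}/v1/{secret_path}",
                  {"X-Vault-Token": vault_token}, None)

    # 3. Write one owner-readable file per key into the shared volume.
    written = []
    for key, value in secret["data"]["data"].items():
        path = os.path.join(out_dir, key)
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(str(value))
        written.append(path)
    return written
```

In a pod, `sa_jwt` would be read from `SA_TOKEN_PATH`. The `http` parameter exists so the flow can be exercised without a running Vault server.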

Encryption

Encryption is a broad topic, but we can break it down into two categories:

  1. Encryption in Transit
  2. Encryption at Rest
Figure: Encryption in Transit & Encryption at Rest

Encryption in Transit

One of the more challenging parts of securing our PaaS has been ensuring that all of our services communicate securely. This typically means using Transport Layer Security (TLS), including for the following interfaces:

  1. Kubernetes State Storage (etcd) API
  2. Kubernetes API
  3. Kubelet API
  4. Workload Ingress
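
On the client side, "using TLS" mostly means refusing to talk to a server whose certificate cannot be verified. Below is a minimal sketch using Python's standard `ssl` module; the function name and the optional file paths are assumptions for this example, and the client certificate is only needed where the server requires mutual TLS.

```python
import ssl
from typing import Optional


def strict_client_context(ca_file: Optional[str] = None,
                          cert_file: Optional[str] = None,
                          key_file: Optional[str] = None) -> ssl.SSLContext:
    """Client-side TLS context that verifies the server and requires TLS 1.2+."""
    # create_default_context enables certificate and hostname verification.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_file:
        # Present a client certificate for mutual TLS.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = strict_client_context()
```

Passing `ca_file` pins trust to a private certificate authority, which is the usual setup for cluster-internal APIs.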

Encryption at Rest

In transit, we can assume the public internet is not implicitly trustworthy, and if Zero Trust best practices tell us anything, we probably shouldn’t trust our private intranet either. Taking this a step further, implicitly trusting the people with access to our physical hardware (and their virtual cloud analogs) is also undesirable. With this in mind, we know that the following data resides on persistent storage, and as a result, we’d look to encrypt it at rest:

  1. Kubernetes State Storage
  2. Kubernetes Node Disks
  3. Kubernetes Service Account Credentials
  4. Workload Secrets
  5. Workload Volumes
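
For Kubernetes state storage specifically, a self-managed API server can encrypt Secret objects before they reach etcd by loading an EncryptionConfiguration file. The sketch below builds such a configuration as JSON, which is also valid YAML. It assumes a self-managed control plane (managed offerings such as GKE handle this layer differently), a locally generated 32-byte key, and a hypothetical helper name.

```python
import base64
import json
import secrets


def encryption_configuration(key_name: str, key: bytes) -> dict:
    """Build an API server EncryptionConfiguration that encrypts Secrets with AES-CBC."""
    if len(key) != 32:
        raise ValueError("this sketch uses a 32-byte key")
    return {
        "apiVersion": "apiserver.config.k8s.io/v1",
        "kind": "EncryptionConfiguration",
        "resources": [{
            "resources": ["secrets"],
            "providers": [
                # The first provider is used for writes.
                {"aescbc": {"keys": [
                    {"name": key_name, "secret": base64.b64encode(key).decode()},
                ]}},
                # identity lets the server still read data stored unencrypted.
                {"identity": {}},
            ],
        }],
    }


config = encryption_configuration("key1", secrets.token_bytes(32))
config_json = json.dumps(config, indent=2)
```

A key stored in a file next to etcd only moves the problem, so in practice the key itself would be protected by a key management service.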

Everything Else

Security concerns affect everything we do, but this post is already longer than most people will read and we do still have a self-driving car to build…

  • Auditing (Compliance, Threat Detection, Alerting)
  • Platform Hardening (SecurityContext, Node Metadata Protection, Network Policies, Pod Security Policies)
  • Secure Supply Chains (Trusted Image Building, Vulnerability Scanning, Attestation)
  • Patch Management
  • Zero Trust Networking

To Be Continued…

In the next blog post of the series, we will take a look at some of the networking challenges that come with building a platform. Stay tuned for more about observability and deployment after that!

Cruise is building the world’s most advanced self-driving vehicles to safely connect people with the places, things and experiences they care about. Join us in solving the engineering challenge of a generation: https://getcruise.com/careers

Karl Isenberg

Cloud Guy. Anthos Solutions Architect at Google (opinions my own). X-Cruise, X-Mesosphere, & X-Pivotal.