Kubernetes as an Enterprise Platform

Kubernetes has undergone tremendous growth in the last few years and has emerged as a leader among container orchestration tools. It is an open source, modular, and extensible platform with a very vibrant community. Enterprises building modern applications can rely on Kubernetes to provide a cloud-agnostic solution that allows them to leverage the most cost-effective, secure and scalable cloud — or use a hybrid cloud model.

My team has been using Kubernetes since v1.0 was launched in 2015, and is primarily responsible for offering Kubernetes as an enterprise platform for deploying containerized applications. We started with a handful of microservices that my team owned. As we showcased its benefits, other teams became interested, and soon Kubernetes grew into an enterprise-wide platform. For the last 3 years, we have been running Kubernetes as the container platform for hundreds of application teams at Nordstrom. As part of our service, we run large shared multi-tenant clusters across multiple clouds.

If you are considering using Kubernetes for your enterprise, there are two questions you may want to answer before you proceed.

Shared Clusters: Multi-Tenant vs Single-Tenant

A single-tenant cluster is a cluster dedicated to a single team. It’s a much simpler model, since all code and API access comes from one team whose members trust each other.

A multi-tenant cluster, on the other hand, is a shared cluster where tenants (application teams) within the organization share access to the same cluster. Usually, each team or application gets a namespace in the cluster which acts as the security boundary.

If you are a small organization with a handful of teams, letting each team build, own, and manage its own cluster may work out well. However, if you are a medium to large organization with several application teams, a cluster-per-team model may not scale well. A single platform team owning and maintaining a few large multi-tenant clusters may be more efficient in this case.

Here are some of the advantages we’ve experienced running shared multi-tenant clusters over several single-tenant clusters:

  • Less maintenance overhead, as costs are shared
  • Better efficiency and resource utilization since cluster nodes are easier to fill up with pods
  • Consistent version and upgrade schedule which reduces the proliferation of versions
  • Easier integration with existing enterprise tools
  • Single, consistent platform for security policies and add-on features
  • Dedicated support staff
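
The utilization point can be illustrated with some back-of-the-envelope arithmetic. The sketch below is not based on our real numbers: the team names, CPU requests, and 16-core node size are all made up for illustration, and it ignores pod granularity, system overhead, and control plane nodes. It only shows that rounding up to whole nodes once, over combined demand, wastes less capacity than rounding up once per team.

```python
import math

NODE_CPU = 16  # hypothetical node size, in CPU cores

# Hypothetical total CPU requests per team, in cores.
team_requests = {"checkout": 5, "search": 18, "inventory": 9, "payments": 3}


def nodes_needed(cpu_cores, node_cpu=NODE_CPU):
    """Smallest number of nodes whose combined CPU covers the request."""
    return math.ceil(cpu_cores / node_cpu)


# One cluster per team: every team rounds up to whole nodes on its own.
single_tenant_nodes = sum(nodes_needed(cpu) for cpu in team_requests.values())

# One shared cluster: round up once, over the combined demand.
multi_tenant_nodes = nodes_needed(sum(team_requests.values()))

print(single_tenant_nodes, multi_tenant_nodes)  # prints "5 3"
```

With these assumed figures, four dedicated clusters need five nodes between them, while a single shared cluster covers the same 35 cores of demand with three.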

However, multi-tenancy creates additional concerns around access control, security and resource utilization. Here are some measures to consider for multi-tenancy:

  • Implement role-based access control to regulate access to team resources
  • Implement security policies to ensure pods run with appropriate privileges and access only a finite set of resources
  • Implement network policies to enforce how pods communicate with each other and other network endpoints
  • Implement a solution (e.g., kube2iam) to provide IAM credentials to pods for accessing cloud resources
  • Enable resource quotas to ensure that no team uses more than its fair share of resources, or implement cluster autoscaling
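
As a rough illustration of three of these measures, the sketch below builds the per-tenant objects for one namespace: a RoleBinding that grants a group the built-in edit ClusterRole, a ResourceQuota, and a default-deny ingress NetworkPolicy. It is a minimal sketch rather than our actual tooling. The team name, group name, and quota values are hypothetical, and it assumes the cluster's authenticator supplies group membership and that the network plugin enforces network policies.

```python
import json

RBAC_GROUP = "rbac.authorization.k8s.io"


def tenant_manifests(team, group, cpu, memory):
    """Build the namespace-scoped objects for one tenant (illustrative only)."""
    namespace = {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": team},
    }
    # Grant the whole group edit rights inside the team's namespace only.
    role_binding = {
        "apiVersion": f"{RBAC_GROUP}/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{team}-edit", "namespace": team},
        "subjects": [{"kind": "Group", "name": group, "apiGroup": RBAC_GROUP}],
        "roleRef": {"kind": "ClusterRole", "name": "edit", "apiGroup": RBAC_GROUP},
    }
    # Cap the total CPU and memory that pods in the namespace may request.
    quota = {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{team}-quota", "namespace": team},
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }
    # Select every pod and allow no ingress; teams add narrower allow rules.
    default_deny = {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-ingress", "namespace": team},
        "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
    }
    return [namespace, role_binding, quota, default_deny]


# Hypothetical tenant: team "checkout", group "checkout-devs".
manifests = tenant_manifests("checkout", "checkout-devs", "20", "64Gi")

# kubectl accepts JSON as well as YAML, so a v1 List can be piped to it.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Binding a group rather than individual users means the binding itself does not need to change when people join or leave a team.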

Once you have decided on the multi-tenancy aspect, the next big decision you want to make is whether to use managed Kubernetes or run your own.

Managed vs DIY Kubernetes

Managed Kubernetes is Kubernetes delivered as a service, either by your cloud provider (Google GKE, Amazon EKS, Azure AKS) or by another service provider (Red Hat OpenShift, Docker Enterprise, Platform9, etc.).

DIY Kubernetes means downloading the open source orchestration tool, setting it up, and running it yourself, or using one of the open source tools (kubeadm, kops, bootkube) to run it.

My team has experience running both DIY Kubernetes and managed Kubernetes (GKE) over a period of time. Below are some of the issues we faced offering managed Kubernetes as an enterprise platform:

  • No support for custom authentication/single sign-on: GKE and other cloud providers rely on cloud IAM to authenticate with their managed service. While Kubernetes itself supports OpenID Connect (OIDC) for authentication, this functionality is not exposed by managed services. As a result, you cannot have a single sign-on experience for managed Kubernetes without authenticating with the cloud provider.
  • Not fully managed: Most managed Kubernetes offerings provide a managed control plane, but you still own and manage the worker nodes. In certain cases this can become a pain point. For example, we use Terraform to manage node pools for our GKE cluster, and upgrading a node pool version can cause a serious issue because it deletes the entire node pool at once and replaces it. There are hacky workarounds, but they have their own quirks.
  • No support for groups in role-based access control (RBAC): When using namespaces as a security boundary for multi-tenant clusters, you want to rely on groups instead of individual users to grant access to teams. Team membership changes often as users join and leave, and tracking these changes and updating RBAC policies across all your clusters can be difficult. Unfortunately, GKE does not support groups in RBAC today. This is a specific limitation of GKE, and Google is currently working to resolve it.
  • No domain name for managed apiserver endpoint: For a GKE cluster, the apiserver endpoint is a raw IP and doesn’t have a domain name. At Nordstrom, like most enterprises, we use a web proxy, and all network traffic from the data center passes through it to reach the public internet. Google uses a self-signed certificate for its API server, and for security reasons the proxy blocks access to raw IPs that present a self-signed certificate.

Deciding between managed and DIY Kubernetes can be tricky and depends on your use case. Managed Kubernetes is a great way to get started and takes away much of the complexity of implementation and operations. DIY Kubernetes, by contrast, can be difficult and time-consuming, but offers the most flexibility. When evaluating which solution works best for you, some things to consider are security, multi-tenancy, and integration with existing tools.

Once you have settled on answers to the two questions above, you have set yourself up for long-term success and are ready to begin your adventures in the exciting world of Kubernetes and containers. There are plenty of helpful resources for getting started and many community support options.

Here are a few of the things that made us successful running Kubernetes as an enterprise platform over the past few years:

  • Automated, repeatable deploys of clusters
  • Centralized collection of logs and metrics
  • Unified credentials for all clusters
  • Backup and recovery strategy
  • Automated provisioning of namespaces across clusters
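
To make the last item concrete, here is a minimal sketch of what provisioning namespaces across clusters could look like. It is not our actual tooling: the kubeconfig context names, team names, and the team label are hypothetical. The function only plans the kubectl invocations (one per cluster and team) and executes nothing, which keeps the plan easy to review or test before it is applied.

```python
import json


def plan_namespace_rollout(contexts, teams):
    """Return (argv, stdin) pairs, one per cluster and team. Nothing is run."""
    plan = []
    for context in contexts:
        for team in teams:
            manifest = {
                "apiVersion": "v1",
                "kind": "Namespace",
                "metadata": {"name": team, "labels": {"team": team}},
            }
            # "apply -f -" reads the manifest from stdin; kubectl accepts JSON.
            argv = ["kubectl", "--context", context, "apply", "-f", "-"]
            plan.append((argv, json.dumps(manifest)))
    return plan


# Hypothetical kubeconfig contexts and team names.
plan = plan_namespace_rollout(["prod-us-west", "prod-us-east"], ["checkout", "search"])

for argv, stdin in plan:
    print(" ".join(argv), "<<<", stdin)
```

Each pair could then be handed to something like subprocess.run(argv, input=stdin, text=True, check=True). Because kubectl apply is declarative, re-running the same plan against a cluster that already has the namespace leaves it unchanged.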