Managing fleets across our teams with GKE Enterprise

Marcelo Madariaga
Globant
Dec 27, 2023

In the realm of application management and container orchestration, Anthos and Google Kubernetes Engine (GKE) Enterprise combine their capabilities to offer a single pane of glass, unifying the cloud operating model with consistent operations, security, and governance across our clusters.

To address the multi-cluster management challenge, Anthos defines the concept of a fleet: a logical grouping of Kubernetes clusters that makes it easier to manage configuration and deployments across them.

Typically, organizations must comply with regulations related to their industry, business needs, or internal guidelines, which sometimes results in the implementation of multi-cluster approaches. Some of the most common use cases for multi-cluster management are environment isolation, scalability, regulations, tenant separation, backup and disaster recovery, and low access latency with cross-regional services, among others.

At this point, it’s worth mentioning that these approaches are not a silver bullet for every technical challenge and business objective at the enterprise level. On the contrary, it is considered good practice to use as few clusters as possible, for example, a single multi-tenant cluster.

Considering fleets as logical groupings of Kubernetes clusters, the next step is to determine how these clusters will be grouped to align with our operational model, Software Development Life Cycle (SDLC), solution architecture, or business requirements. It therefore becomes imperative to make critical decisions about whether these clusters share relationships, have distinct resource owners, exhibit sameness, and so forth.

A fundamental concept in fleets is “sameness”, especially when Anthos Service Mesh or multi-cluster Ingress is part of the solution. This concept applies to our fleet at the namespace, service, and identity levels and can help us determine whether we need more segregation within it. In that case, the fleet team management feature offered by GKE Enterprise allows us to manage fleets in a more granular way and implement the multi-tenant concept across our fleet.

The fleet team management feature allows teams to manage and monitor workloads across their dedicated infrastructure within the fleet, such as clusters and namespaces. Therefore, it’s important to know these key concepts on which it is based:

  • Scopes: Mechanism that allows grouping a subset of clusters within a fleet.
  • Team scopes: Mechanism that allows grouping a subset of fleet resources on a per-team basis. A cluster can be associated with one or more team scopes.
  • Fleet namespace: Mechanism to control who has access to a specific namespace within our fleet.

Fleet team management use case

As the diagram shows, the non-prod-fleet fleet, hosted in the prj-np-clusters project on Google Cloud Platform, is composed of the dev-k8s on-premises cluster and the non-prod-gke-us-west1 GKE cluster, which lives in the same GCP fleet host project. At this level, two team scopes and their fleet namespaces are created for these teams:

  • A team scope for the dev team, with a microservices fleet namespace within that scope, is configured to enable this team to run their workloads in the microservices namespace of the non-prod-gke-us-west1 GKE cluster.
  • A team scope for the DevOps team, with two fleet namespaces (monitoring and automation) within that scope, is configured to enable this team to run their workloads in both the dev-k8s on-premises cluster and the non-prod-gke-us-west1 GKE cluster.

Note: The recommended practice for granting access to team scopes is to use Google Groups and RBAC, although it’s possible to grant access to individual users.

Procedure

The following example illustrates how to configure fleet team management with the Google Cloud CLI. It outlines the essential prerequisites for this task and harnesses the capabilities of the Fleet API, which is used to create the fleet that oversees the related Kubernetes clusters and to set up access so that development team members can perform their activities and run their workloads within their designated namespaces across the fleet.

Prerequisites

  1. Set up the Google Cloud CLI.
  2. Install kubectl. It’s recommended to install kubectl through the Google Cloud CLI.
  3. A fleet host project and IAM permissions.
    - If you have the roles/owner IAM role on this project, no additional access permissions are required to perform this process.
    - If not, the roles/gkehub.admin IAM role is required to register clusters and create and configure team scopes and namespaces.
  4. Enable the following APIs in the fleet host project (see the example after this list):
    - anthos.googleapis.com
    - iam.googleapis.com
    - cloudresourcemanager.googleapis.com
    - container.googleapis.com
    - connectgateway.googleapis.com
    - gkeconnect.googleapis.com
    - gkehub.googleapis.com (Fleet API)
  5. Grant the Kubernetes RBAC cluster-admin role on the clusters to be registered, whether they are GKE clusters on Google Cloud or clusters outside Google Cloud.
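
As a minimal sketch of steps 2 and 4, assuming the gcloud CLI is already installed and authenticated against the prj-np-clusters fleet host project, kubectl can be installed as a gcloud component and the required APIs enabled in a single call:

gcloud components install kubectl

gcloud services enable \
anthos.googleapis.com \
iam.googleapis.com \
cloudresourcemanager.googleapis.com \
container.googleapis.com \
connectgateway.googleapis.com \
gkeconnect.googleapis.com \
gkehub.googleapis.com \
--project=prj-np-clusters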

Create a fleet

A fleet can be created by registering a cluster in a project that doesn’t already have a fleet defined, by creating an empty fleet, or by upgrading to the GKE Enterprise edition.

For this example, the process of creating an empty fleet is put into practice:

gcloud container fleet create \
--display-name=non-prod-fleet \
--project=prj-np-clusters

Register clusters in your fleet

There are different approaches to registering a cluster in a specific fleet, depending on where the cluster is located:

  1. GKE clusters on Google Cloud: GKE clusters must be added explicitly to a fleet using the Google Cloud console, the Google Cloud CLI, Terraform, or Config Connector. Please review the Register a cluster on Google Cloud to your fleet documentation to learn more about it (a hedged example follows this list).
  2. Clusters outside Google Cloud: on-premises and attached clusters are registered to the fleet as memberships; see the documentation on registering clusters outside Google Cloud for the supported options.
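
For example, the non-prod-gke-us-west1 GKE cluster from the use case could be registered with the Google Cloud CLI as sketched below; the us-west1 cluster location is an assumption for illustration, and the resulting membership can then be verified by listing the fleet memberships:

gcloud container fleet memberships register non-prod-gke-us-west1 \
--gke-cluster=us-west1/non-prod-gke-us-west1 \
--project=prj-np-clusters

gcloud container fleet memberships list \
--project=prj-np-clusters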

Configure access control for clusters using Google Groups

The practice of managing fleet membership access based on Google Groups simplifies operational tasks such as policy management and auditing. This approach eliminates the need to manually add or remove users from the cluster fleet when an individual leaves the team or moves to another. Once access control to the clusters in a fleet is configured, it’s recommended that users connect to their authorized clusters through the Connect Gateway.

The following documentation provides instructions for configuring access control either for GKE clusters on Google Cloud or clusters outside Google Cloud.
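
As an illustrative sketch, assuming Google Groups for RBAC is already enabled on the GKE cluster and dev-team@org.com is nested under the organization’s gke-security-groups group, a namespace-level binding granting the group edit rights could be created directly with kubectl (the binding name dev-team-edit is hypothetical):

kubectl create rolebinding dev-team-edit \
--clusterrole=edit \
--group=dev-team@org.com \
--namespace=microservices

In the rest of this article, the equivalent access is granted at the fleet level through team scope RBAC role bindings, which apply to every cluster in the scope.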

Controlling team members’ access to our fleet

To ensure that our teams have access to our fleet based on the aforementioned practices, the following commands let team members access the Google Cloud console, view all clusters within their fleet, and use the Connect Gateway to authenticate with fleet member clusters, utilizing Google Groups-based authorization:

gcloud projects add-iam-policy-binding prj-np-clusters \
--member=group:dev-team@org.com \
--role=roles/gkehub.viewer

gcloud projects add-iam-policy-binding prj-np-clusters \
--member=group:dev-team@org.com \
--role=roles/gkehub.gatewayEditor

Create a team scope

Up to this point, we have reviewed how to manage our clusters and members at the fleet level. The next step is to define the desired multi-tenant approach, restricting access to specific subsets of fleet resources for certain tenants or teams.

Continuing with the example of the dev team, they must only have access to the non-prod-gke-us-west1 GKE cluster within the fleet. This is why it’s necessary to establish a defined team scope, outlining those clear boundaries:

gcloud container fleet scopes create dev-team

gcloud container fleet memberships bindings create non-prod-gke-us-west1-dev-team \
--membership non-prod-gke-us-west1 \
--scope dev-team \
--location global
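
To double-check the association, the membership bindings for the cluster can be listed; this assumes the bindings list command is available in your gcloud CLI version:

gcloud container fleet memberships bindings list \
--membership non-prod-gke-us-west1 \
--location global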

Create a namespace in the scope

Now, we should define the namespace where the team members will deploy their workloads. In this case, the dev team must have access to the microservices namespace. The following command creates a Kubernetes namespace of the same name in each cluster that makes up the scope:

gcloud container fleet scopes namespaces create microservices \
--scope dev-team

Granting scope access with RBAC

Finally, the team members need to be granted access to their scope using RBAC, following the best practices mentioned in this article by using Google Groups (dev-team@org.com):

gcloud container fleet scopes rbacrolebindings create dev-team-editors \
--scope dev-team \
--role=editor \
--group=dev-team@org.com

Access fleet namespace using Connect Gateway

Now, members of the dev team can access the namespace configured within their scope by getting the specific cluster credentials through the Connect Gateway:

gcloud container fleet memberships get-credentials non-prod-gke-us-west1

Just as gcloud container clusters get-credentials is used to obtain credentials for a GKE Standard edition cluster, the Connect Gateway provides them through the command above. Executing it generates the kubeconfig entry for the desired cluster, allowing interaction with it through either kubectl or client-go:

kubectl get pods --namespace=microservices
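
For instance, a dev team member could deploy a workload into their namespace using the kubeconfig obtained above; the deployment name and sample image below are illustrative:

kubectl create deployment hello-app \
--image=us-docker.pkg.dev/google-samples/containers/gke/hello-app:1.0 \
--namespace=microservices

kubectl get deployments --namespace=microservices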

Conclusion

Organizations often need to solve complex scenarios in which isolation, location, and scale make a multi-cluster approach relevant. Such scenarios can include segregating their environments and organizing their services into teams and tiers while also meeting business regulations and maintaining consistent interconnection, authentication, configurations, and policies. This article offers a comprehensive approach to setting up and managing fleets in GKE Enterprise, ensuring secure, organized, and granular control over cluster resources while aligning with best practices for access control and management. Making strategic decisions about cluster relationships, resource ownership, and the concept of “sameness” becomes fundamental when designing our fleets to achieve our technical and business objectives.

Once we have grouped our clusters into a fleet, we must continue deciding which teams, and which members of those teams, should have access to specific namespaces within the subset of clusters that make up our fleet. This granular administration of our infrastructure and management of identities and access is possible through the concepts of “team scope” and “fleet namespace” that GKE Enterprise offers, and this article provides a good starting point for using them.
