Cluster Federation and Global Load Balancing on Kubernetes and Google Cloud — Part 2

I recently gave a talk at Google Cloud Next about using Kubernetes to deploy and manage microservices around the world.

I’ve already blogged about how you can set up MongoDB in a StatefulSet, so now I’m going to dive deep into how you can set up cluster federation and deploy your services around the world!

We are going to look at three things specifically: Cluster Federation, Federated Ingress, and cross cluster service discovery. I’m going to split these topics into three blog posts, so look out for the next ones soon!

I’m going to assume you know the basics of creating a project in Google Cloud and have installed and set up the Google Cloud SDK. If not, please check out my previous blog posts.

Part 1: Cluster Federation

Cluster Federation is what you can use to manage multiple Kubernetes clusters as if they were one. This means you can make clusters in multiple datacenters (and multiple clouds), and use federation to control them all at once!

Part 2: Federated Ingress

Federated Ingress is super sweet, and is currently only supported on Google Cloud. This allows you to spin up a single load balancer with a single global IP address that will dynamically send traffic to the nearest cluster automatically!

Part 3: Cross Cluster Communication

Cross cluster service discovery is a concept that lets services in one Kubernetes cluster find services in other clusters automatically. We will also look at Google Cloud Pub/Sub to enable asynchronous microservices!

Part 2: Federated Ingress

Now that you have a federated cluster, what can you do with it? One cool thing you can do is Federated Ingress. Let’s dive deeper into how it works.

One of the most powerful components on Google Cloud Platform is the network. This is something people commonly forget about when looking at the Cloud. Google has built one of the largest private networks in the world, which allows us to do some amazing things.

When you spin up a HTTP(S) Load Balancer on Google Cloud, you are given a single IP address. This IP address is anycast from over 100 points of presence around the world, and users automatically connect to the nearest one. Their traffic is then sent over Google’s private backbone network to the closest datacenter hosting your application!

This is all automatic. There is no DNS trickery, pre-warming, or anything complicated involved. Just one IP address. The best part is, Federated Ingress allows Kubernetes to take advantage of this without needing to perform any manual setup!

Note: If you haven’t completed Part 1, please do that first. Otherwise, make sure you know what you are doing!

Step 1: Create Global IP address

The first step is to create a global static IP address for our ingress to use.

gcloud compute addresses create k-ingress --global

This creates a new global IP address called “k-ingress”.
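To confirm the address was created and see its actual value (you’ll want it later to test the ingress), you can describe it:

gcloud compute addresses describe k-ingress --global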

Step 2: Create Federated Deployment and Service

Before you can create the ingress, you need to create a deployment and service to back it.

Let’s create a simple Nginx deployment and scale it to have 4 replicas:

kubectl --context=kfed create deployment nginx --image=nginx && \
kubectl --context=kfed scale deployment nginx --replicas=4

These commands create a deployment with 4 nginx replicas in the federated context. Because we have two clusters, each cluster will get 2 replicas! The federation control plane automatically distributes the pods evenly across the clusters.
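If you want to verify the spread yourself, you can query each underlying cluster directly. Here “cluster-east” and “cluster-west” are placeholder context names for the clusters you federated in Part 1; substitute whatever yours are called:

# Each of these should show 2 of the 4 replicas
kubectl --context=cluster-east get pods -l app=nginx
kubectl --context=cluster-west get pods -l app=nginx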

Now create a service to expose this deployment:

kubectl --context=kfed create service nodeport nginx \
--tcp=80:80 --node-port=30036

In YAML form, the service definition looks like this:

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
    nodePort: 30036
  selector:
    app: nginx

This will create a service exposing our nginx deployment via NodePort.

You might be used to exposing a service using a LoadBalancer, but that will create a load balancer in each datacenter with its own IP address, which we don’t want.

Instead, the NodePort directly exposes the service on all the VMs in the cluster (in this case on port 30036, but you can choose any valid port). The ingress load balancer will then route traffic to this port on the VMs.
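If you want to sanity check the NodePort before the ingress exists, you can hit a node directly. This is a sketch only, assuming you have a firewall rule allowing external traffic to port 30036 (GKE blocks it by default), and <NODE_EXTERNAL_IP> is a placeholder:

# Find a node's external IP in one of the clusters
kubectl --context=cluster-east get nodes -o wide

# Then hit the NodePort on that node
curl http://<NODE_EXTERNAL_IP>:30036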

Step 3: Create Ingress

The final step is to create the ingress load balancer!

Optional: I would recommend creating a firewall rule that lets the load balancer send traffic to the VMs. While this is not strictly necessary, as Kubernetes will create its own rules, it can help prevent firewall-related issues. Run this command to create the rule:

gcloud compute firewall-rules create \
federated-ingress-firewall-rule \
--source-ranges 130.211.0.0/22,35.191.0.0/16 \
--allow tcp:30036 --network default
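The source ranges above are Google’s documented load balancer and health check ranges, so only the load balancing infrastructure can reach the NodePort through this rule. You can verify the rule was created with:

gcloud compute firewall-rules describe federated-ingress-firewall-rule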

Save the following to a file called ingress.yaml:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "k-ingress"
spec:
  backend:
    serviceName: nginx
    servicePort: 80

And create the ingress object:

kubectl --context=kfed create -f ingress.yaml

The kubernetes.io/ingress.global-static-ip-name annotation in the metadata ensures that the load balancer is created with the static IP address you reserved earlier, and that only one global load balancer is created. The rest looks like a standard ingress YAML: we want traffic from all paths to go to the nginx service on port 80. You can customize this and use all the normal features ingress provides, such as path-based routing and session affinity, but now it’s federated!
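As a sketch of that customization, here is a hypothetical path-based variant. The /static path and the static-assets service are made up for illustration; this demo only needs the default backend shown above:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx
  annotations:
    kubernetes.io/ingress.global-static-ip-name: "k-ingress"
spec:
  backend:
    serviceName: nginx
    servicePort: 80
  rules:
  - http:
      paths:
      - path: /static
        backend:
          serviceName: static-assets
          servicePort: 80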

If you want HTTPS support, I would highly recommend using kube-lego, which will automatically add Let’s Encrypt certificates to your ingress load balancer.

The ingress load balancer will take a few minutes to spin up while the backend health checks come online and the firewall rules are created.
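While you wait, you can watch progress from the federated context; the events in the output should show the load balancer and health checks being created:

kubectl --context=kfed describe ingress nginx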

Step 4: Try it out!

If you visit the Load Balancer section in the Google Cloud console, you should see an HTTP(S) load balancer created for you. Here I’ve expanded the details on the load balancer:

You can see that there are two instance groups that are backing this load balancer, one in us-east and one in us-west. These are the two Kubernetes clusters. You can also see that all three VMs in each cluster are healthy, which means the load balancer is up and running!
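If you prefer the command line to the console, roughly the same information is available through gcloud. These list the backend services (pointing at the per-cluster instance groups) and the global forwarding rule that holds your static IP:

gcloud compute backend-services list
gcloud compute forwarding-rules list --global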

To get the IP address of the service, run:

kubectl --context=kfed get ingress

When you visit the IP address, you should see the default welcome page from Nginx!
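If you’d rather test from the terminal, this one-liner pulls the static IP you reserved in Step 1 and curls it:

curl "http://$(gcloud compute addresses describe k-ingress --global --format='value(address)')/"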

Depending on where you are located, your traffic will be routed from the closest datacenter to you. You can see an example of this in my demo on YouTube.


Though we only used two clusters in this demo, the ingress load balancer can easily scale to many more. You can create huge clusters in each Google Cloud data center, and a single ingress load balancer can distribute your HTTP(S) traffic between all of them automatically. Two federated clusters or 20, all the steps are the same!

Of course, the anycast magic used by Federated Ingress requires you to run all your clusters on Google Cloud Platform. Currently, there is no support for other environments. If you are running a hybrid cloud, make sure your web frontends backed by the ingress load balancer all run on Google Cloud Platform.

In Part 3, I’ll show you how you can use cross cluster service discovery to access services that are not present in local clusters, which is perfect for these hybrid deployments!