Experimenting with Cross Cloud Kubernetes Cluster Federation

Samuel Cozannet
Feb 10, 2017 · 18 min read

What / why?

In a previous post, I presented a way to deploy a Kubernetes cluster in an existing AWS environment, in answer to questions I received about integration with existing infrastructure.

Preliminary words about Federation

What is a Kubernetes Federation? To answer this question, let’s start from the observation that spanning a single cluster across the whole world is, to say the least, impractical. That’s why clouds have regions, AZs, cells, racks… You have to split your infrastructure so that specific geographic areas become the unit of construction for the bigger solution.

The Plan

In this blog, we are going to do the following things:

  1. Create a Google Cloud DNS Zone with a domain we control
  2. Install a federation control plane in GKE
  3. Test what works and what doesn’t work out of the box

Requirements

For what follows, it is important that:

  • You understand k8s Federation concepts
  • You have admin credentials for AWS, GCP and Azure
  • You are familiar with tooling available to deploy Kubernetes
  • You have a basic knowledge of GKE and Google Cloud DNS (or Route 53, or Azure DNS)

Foreplay

  • Make sure you have Juju installed.
sudo apt-add-repository ppa:juju/stable
sudo apt update && sudo apt upgrade -yqq
sudo apt install -yqq juju
  • Finally, clone the repo to have access to all the sources
git clone https://github.com/madeden/blogposts ./
cd blogposts/k8s-federation
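
Juju also needs credentials for the clouds we are about to bootstrap on; if you have not registered them yet, the interactive helper takes care of it:

# Register Azure and AWS credentials with Juju (interactive prompts)
juju add-credential azure
juju add-credential aws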

Deploying

In this section we will

  • Install tools for the federation
  • Deploy the Federation

Microsoft Azure

Let’s spawn a k8s cluster in Azure first:

# Bootstrap the Juju Controller
juju bootstrap azure/westeurope azure \
--bootstrap-constraints "root-disk=64G mem=8G" \
--bootstrap-series xenial
# Deploy Canonical Distribution of Kubernetes
juju deploy src/bundle/k8s-azure.yaml

Amazon AWS

Now the same course of action on AWS:

# Bootstrap the Juju Controller
juju bootstrap aws/us-west-2 aws \
--bootstrap-constraints "root-disk=64G mem=8G" \
--bootstrap-series xenial
# Deploy Canonical Distribution of Kubernetes
juju deploy src/bundle/k8s-aws.yaml
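
Both models take a while to converge. In another terminal you can watch the progress (the same trick works for either controller):

# Watch the deployment converge
juju switch aws
watch -c juju status --color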

GKE

Here we deploy a DNS zone and a small GKE cluster:

# Spin up the DNS Zone (named demo-madeden, as referenced later for the record sets and tear down)
gcloud dns managed-zones create demo-madeden \
--description "Kubernetes federation testing" \
--dns-name demo.madeden.net
# Spin up a GKE cluster
gcloud container clusters create gke \
--zone=us-east1-b \
--scopes "cloud-platform,storage-ro,service-control,service-management,https://www.googleapis.com/auth/ndev.clouddns.readwrite" \
--num-nodes=2
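
Before moving on, it does not hurt to confirm that both the zone and the cluster exist:

# Double check the DNS zone and the GKE cluster
gcloud dns managed-zones describe demo-madeden
gcloud container clusters list --zone=us-east1-b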

Federation

Installing kubefed

Since 1.5, Kubernetes ships with a tool called kubefed, which manages the lifecycle of federations.

curl -O https://storage.googleapis.com/kubernetes-release/release/v1.5.2/kubernetes-client-linux-amd64.tar.gz
tar -xzvf kubernetes-client-linux-amd64.tar.gz
sudo cp kubernetes/client/bin/kubefed /usr/local/bin
sudo chmod +x /usr/local/bin/kubefed
sudo cp kubernetes/client/bin/kubectl /usr/local/bin
sudo chmod +x /usr/local/bin/kubectl
mkdir -p ~/.kube
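
A quick sanity check that both binaries are in your PATH:

# kubectl should report v1.5.2 on the client side; kubefed should print its usage
kubectl version --client
kubefed --help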

Configuring kubectl

On Azure, check that your cluster is now up & running:

# Switch Juju to the Azure cluster
juju switch azure
# Get status
juju status
# Which gets you (if finished)
Model Controller Cloud/Region Version
default azure azure/westeurope 2.1-beta5
App Version Status Scale Charm Store Rev OS Notes
easyrsa 3.0.1 active 1 easyrsa jujucharms 6 ubuntu
etcd 2.2.5 active 3 etcd jujucharms 23 ubuntu
flannel 0.7.0 active 4 flannel jujucharms 10 ubuntu
kubernetes-master 1.5.2 active 1 kubernetes-master jujucharms 11 ubuntu exposed
kubernetes-worker 1.5.2 active 3 kubernetes-worker jujucharms 13 ubuntu exposed
Unit Workload Agent Machine Public address Ports Message
easyrsa/0* active idle 0 40.114.244.142 Certificate Authority connected.
etcd/0 active idle 1 40.114.247.142 2379/tcp Healthy with 3 known peers.
etcd/1* active idle 2 104.47.167.187 2379/tcp Healthy with 3 known peers.
etcd/2 active idle 3 104.47.163.137 2379/tcp Healthy with 3 known peers.
kubernetes-master/0* active idle 4 40.114.243.251 6443/tcp Kubernetes master running.
flannel/2 active idle 40.114.243.251 Flannel subnet 10.1.96.1/24
kubernetes-worker/0 active idle 5 104.47.162.134 80/tcp,443/tcp Kubernetes worker running.
flannel/1 active idle 104.47.162.134 Flannel subnet 10.1.94.1/24
kubernetes-worker/1* active idle 6 104.47.162.82 80/tcp,443/tcp Kubernetes worker running.
flannel/0* active idle 104.47.162.82 Flannel subnet 10.1.58.1/24
kubernetes-worker/2 active idle 7 104.47.160.138 80/tcp,443/tcp Kubernetes worker running.
flannel/3 active idle 104.47.160.138 Flannel subnet 10.1.43.1/24
Machine State DNS Inst id Series AZ
0 started 40.114.244.142 machine-0 xenial
1 started 40.114.247.142 machine-1 xenial
2 started 104.47.167.187 machine-2 xenial
3 started 104.47.163.137 machine-3 xenial
4 started 40.114.243.251 machine-4 xenial
5 started 104.47.162.134 machine-5 xenial
6 started 104.47.162.82 machine-6 xenial
7 started 104.47.160.138 machine-7 xenial
Relation Provides Consumes Type
certificates easyrsa etcd regular
certificates easyrsa kubernetes-master regular
certificates easyrsa kubernetes-worker regular
cluster etcd etcd peer
etcd etcd flannel regular
etcd etcd kubernetes-master regular
cni flannel kubernetes-master regular
cni flannel kubernetes-worker regular
cni kubernetes-master flannel subordinate
kube-dns kubernetes-master kubernetes-worker regular
cni kubernetes-worker flannel subordinate
# Copy the kubeconfig file locally
juju scp kubernetes-master/0:/home/ubuntu/config ./config-azure
# Now the same on the AWS cluster
juju switch aws
juju status
# Which gets you (once deployment has finished)
Model Controller Cloud/Region Version
default aws aws/us-west-2 2.1-beta5
App Version Status Scale Charm Store Rev OS Notes
easyrsa 3.0.1 active 1 easyrsa jujucharms 6 ubuntu
etcd 2.2.5 active 3 etcd jujucharms 23 ubuntu
flannel 0.7.0 active 4 flannel jujucharms 10 ubuntu
kubernetes-master 1.5.2 active 1 kubernetes-master jujucharms 11 ubuntu exposed
kubernetes-worker 1.5.2 active 3 kubernetes-worker jujucharms 13 ubuntu exposed
Unit Workload Agent Machine Public address Ports Message
easyrsa/0* active idle 2 10.0.251.198 Certificate Authority connected.
etcd/0* active idle 1 10.0.252.237 2379/tcp Healthy with 3 known peers.
etcd/1 active idle 6 10.0.251.143 2379/tcp Healthy with 3 known peers.
etcd/2 active idle 7 10.0.251.31 2379/tcp Healthy with 3 known peers.
kubernetes-master/0* active idle 0 35.164.145.16 6443/tcp Kubernetes master running.
flannel/0* active idle 35.164.145.16 Flannel subnet 10.1.37.1/24
kubernetes-worker/0* active idle 3 52.27.16.150 80/tcp,443/tcp Kubernetes worker running.
flannel/3 active idle 52.27.16.150 Flannel subnet 10.1.11.1/24
kubernetes-worker/1 active idle 4 52.10.62.234 80/tcp,443/tcp Kubernetes worker running.
flannel/1 active idle 52.10.62.234 Flannel subnet 10.1.43.1/24
kubernetes-worker/2 active idle 5 52.27.1.171 80/tcp,443/tcp Kubernetes worker running.
flannel/2 active idle 52.27.1.171 Flannel subnet 10.1.68.1/24
Machine State DNS Inst id Series AZ
0 started 35.164.145.16 i-0a3fdb3ce9590cb7e xenial us-west-2a
1 started 10.0.252.237 i-0dcbd977bee04563b xenial us-west-2b
2 started 10.0.251.198 i-04cedb17e22064212 xenial us-west-2a
3 started 52.27.16.150 i-0f44e7e27f776aebf xenial us-west-2b
4 started 52.10.62.234 i-02ff8041a61550802 xenial us-west-2a
5 started 52.27.1.171 i-0a4505185421bbdaf xenial us-west-2a
6 started 10.0.251.143 i-05a855d5c0c6f847d xenial us-west-2a
7 started 10.0.251.31 i-03f1aafe15d163a34 xenial us-west-2a
Relation Provides Consumes Type
certificates easyrsa etcd regular
certificates easyrsa kubernetes-master regular
certificates easyrsa kubernetes-worker regular
cluster etcd etcd peer
etcd etcd flannel regular
etcd etcd kubernetes-master regular
cni flannel kubernetes-master regular
cni flannel kubernetes-worker regular
cni kubernetes-master flannel subordinate
kube-dns kubernetes-master kubernetes-worker regular
cni kubernetes-worker flannel subordinate
# Copy the kubeconfig file locally
juju scp kubernetes-master/0:/home/ubuntu/config ./config-aws
# Get credentials for the GKE cluster
gcloud container clusters get-credentials gke --zone=us-east1-b
# Identify the cluster name
LONG_NAME=$(kubectl config view -o jsonpath='{.contexts[*].name}')
# Replace it in kubeconfig
sed -i "s/$LONG_NAME/gke/g" ~/.kube/config
# Install a small helper to merge kubeconfig files
sudo npm install -g load-kubeconfig
# Replace the username and context name with our cloud names in both files and combine
for cloud in aws azure
do
sed -i -e "s/juju-cluster/${cloud}/g" \
-e "s/juju-context/${cloud}/g" \
-e "s/ubuntu/${cloud}/g" \
./config-${cloud}
load-kubeconfig ./config-${cloud}
done
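
Once the loop has run, ~/.kube/config should expose one context per cluster; a quick check:

# Expect aws, azure and gke in the list
kubectl config get-contexts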

Labelling Nodes

One of the goals of deploying Kubernetes federations is to ensure multi-region HA for the applications. Within a region, you would also want HA between AZs. As a result, you should consider deploying one cluster per AZ.

# AWS
kubectl --context=aws get nodes --show-labels
NAME STATUS AGE LABELS
ip-10-0-1-54 Ready 1d beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=ip-10-0-1-54
ip-10-0-1-95 Ready 1d beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=ip-10-0-1-95
ip-10-0-2-43 Ready 1d beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=ip-10-0-2-43
# Azure
kubectl --context=azure get nodes --show-labels
NAME STATUS AGE LABELS
machine-5 Ready 2h beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=machine-5
machine-6 Ready 2h beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=machine-6
machine-7 Ready 2h beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,kubernetes.io/hostname=machine-7
# Labelling the nodes of AWS for US West 2a (random pick)
for node in $(kubectl --context=aws get nodes -o json | jq --raw-output '.items[].metadata.name')
do
kubectl --context=aws label nodes \
${node} \
failure-domain.beta.kubernetes.io/region=us-west-2
kubectl --context=aws label nodes \
${node} \
failure-domain.beta.kubernetes.io/zone=us-west-2a
done
# Labelling the nodes of Azure for EU West 2a (random pick)
for node in $(kubectl --context=azure get nodes -o json | jq --raw-output '.items[].metadata.name')
do
kubectl --context=azure label nodes \
${node} \
failure-domain.beta.kubernetes.io/region=eu-west-2
kubectl --context=azure label nodes \
${node} \
failure-domain.beta.kubernetes.io/zone=eu-west-2a
done
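
You can verify that the labels landed where expected by printing them as columns:

# Show the region/zone labels per node on both clusters
kubectl --context=aws get nodes -L failure-domain.beta.kubernetes.io/region,failure-domain.beta.kubernetes.io/zone
kubectl --context=azure get nodes -L failure-domain.beta.kubernetes.io/region,failure-domain.beta.kubernetes.io/zone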

Making sure clusters share the same GLBC

The Kubernetes documentation on federated ingresses clearly states that the GLBC (the L7 ingress controller) must be exactly the same across all clusters. As we deployed two clusters with Juju and one with GKE, that is not the case at this point. It is almost the case though, since Canonical’s CDK is 100% upstream: only a few labels and tags differ.

for cloud in aws azure
do
# Delete old ones
kubectl --context ${cloud} delete \
rc/default-http-backend \
rc/nginx-ingress-controller \
svc/default-http-backend
# Replace by new ones taken from GKE
kubectl --context ${cloud} create -f \
src/manifests/l7-svc.yaml
kubectl --context ${cloud} create -f \
src/manifests/l7-deployment.yaml
done

Initializing the Federation

Now we are ready to federate our clusters, which essentially means we are adding a cross-cluster control plane, itself hosted in a third Kubernetes cluster. Among other things, it gives us:

  • Failover of services between zones
  • Single point of service definition
# Initialize the federation control plane, named magicring, hosted in the GKE cluster
kubefed init magicring \
--host-cluster-context=gke \
--dns-zone-name="demo.madeden.net."
Federation API server is running at: 130.211.62.225

Behind the scenes, kubefed init created:

  • a new API server for the Federation
  • a new Controller Manager for the Federation
  • a new context in your kubeconfig file to interact specifically with this uber layer of Kubernetes, named after the Federation name
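
You can check that these control plane pods are indeed running in the GKE host cluster; they live in a dedicated federation-system namespace (the same one we will delete at tear down):

kubectl --context=gke get pods --namespace=federation-system

Now switch to the magicring context and join the two other clusters to the federation: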
kubectl config use-context magicring
Switched to context “magicring”.
# add AWS
kubefed join aws \
--host-cluster-context=gke
cluster "aws" created
# Now Azure
kubefed join azure \
--host-cluster-context=gke
cluster "azure" created
kubectl get clusters
NAME STATUS AGE
aws Ready 1m
azure Ready 1m

Deploying a multi cloud application

Now that the federation is running, let us try to deploy all the primitives that Federations are supposed to manage:

  • ConfigMaps and Secrets to share data
  • Deployments / ReplicaSets / Replication Controllers for apps
  • Services / Ingresses for exposing the apps

Namespaces

Let us deploy a test Namespace:

# Creation
kubectl --context=magicring create -f src/manifests/test-ns.yaml
namespace "test-ns" created
# Check AWS
kubectl --context=aws get ns
NAME STATUS AGE
ns/default Active 3d
ns/kube-system Active 3d
ns/test-ns Active 50s
# Check Azure
kubectl --context=azure get ns
NAME STATUS AGE
ns/default Active 2d
ns/kube-system Active 2d
ns/test-ns Active 1m
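
For reference, the manifest behind this is as minimal as it gets; without the repo you could pipe an equivalent one straight into the federation context (the file content is assumed here, but a Namespace really only needs a name):

# Equivalent to src/manifests/test-ns.yaml (assumed content)
cat <<EOF | kubectl --context=magicring create -f -
apiVersion: v1
kind: Namespace
metadata:
  name: test-ns
EOF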

Config Maps / Secrets

Push a test ConfigMap to the federation to check that it propagates:

# Publish
kubectl --context magicring create -f src/manifests/test-configmap.yaml
configmap "test-configmap" created
# Check AWS
kubectl --context aws get cm
NAME DATA AGE
test-configmap 1 55s
# Check Azure
kubectl --context azure get cm
NAME DATA AGE
test-configmap 1 1m
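
Again, an equivalent inline version gives an idea of what is in the repo file; only the name and the single data entry are confirmed by the output above, the actual key and value in src/manifests/test-configmap.yaml are an assumption:

# Equivalent ConfigMap with one data key (key name and value assumed)
cat <<EOF | kubectl --context=magicring create -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: test-configmap
data:
  test.conf: "hello from the federation"
EOF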

Deployments / ReplicaSets / DaemonSets

First of all, deploy 10 replicas of the microbots (demo app for CDK):

# Note we are still in the Magic Ring context…
kubectl create -f src/manifests/microbots-deployment.yaml
deployment "microbot" created
# Querying the Federation control plane for pods does not work
kubectl get pods -o wide
the server doesn't have a resource type "pods"
# Querying AWS cluster directly
kubectl --context=aws get pods
NAME READY STATUS RESTARTS AGE
default-http-backend-wqrmm 1/1 Running 0 1d
microbot-1855935831-6n08n 1/1 Running 0 1m
microbot-1855935831-fvd7q 1/1 Running 0 1m
microbot-1855935831-gg5ql 1/1 Running 0 1m
microbot-1855935831-kltf0 1/1 Running 0 1m
microbot-1855935831-z7zp1 1/1 Running 0 1m
# Now querying Azure directly
kubectl --context=azure get pods
NAME READY STATUS RESTARTS AGE
default-http-backend-04njk 1/1 Running 0 1h
microbot-1855935831-19m1p 1/1 Running 0 1m
microbot-1855935831-2gwjt 1/1 Running 0 1m
microbot-1855935831-8k3hc 1/1 Running 0 1m
microbot-1855935831-fgrn0 1/1 Running 0 1m
microbot-1855935831-ggvvf 1/1 Running 0 1m
# Meanwhile, the Federation Controller Manager logs show transient errors like these
E0210 10:35:53.691358 1 deploymentcontroller.go:516] Failed to ensure delete object from underlying clusters finalizer in deployment microbot: failed to add finalizer orphan to deployment : Operation cannot be fulfilled on deployments.extensions "microbot": the object has been modified; please apply your changes to the latest version and try again
E0210 10:35:53.691566 1 deploymentcontroller.go:396] Error syncing cluster controller: failed to add finalizer orphan to deployment : Operation cannot be fulfilled on deployments.extensions "microbot": the object has been modified; please apply your changes to the latest version and try again
# A DaemonSet propagates just as well
kubectl --context=magicring create -f src/manifests/microbots-ds.yaml
daemonset "microbot-ds" created
kubectl --context aws get po -n test-ns
NAME READY STATUS RESTARTS AGE
microbot-ds-5c25n 1/1 Running 0 48s
microbot-ds-cmvtj 1/1 Running 0 48s
microbot-ds-lp0j0 1/1 Running 0 48s
kubectl --context azure get po -n test-ns
NAME READY STATUS RESTARTS AGE
microbot-ds-bkj34 1/1 Running 0 53s
microbot-ds-r85z4 1/1 Running 0 53s
microbot-ds-w8kxg 1/1 Running 0 53s
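
Note that, while pods are not a federated resource (hence the error above), the Deployment and DaemonSet objects themselves do exist at the federation level and can be listed from the magicring context:

# The federated objects live in the control plane; the pods only live in the member clusters
kubectl --context=magicring get deployments
kubectl --context=magicring get ds --namespace=test-ns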

Services

Now let us create the service:

# Service creation...
kubectl --context=magicring create -f src/manifests/microbots-svc.yaml
service "microbot" created
# On AWS
$ kubectl --context=aws get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default-http-backend 10.152.183.100 <none> 80/TCP 3d
kubernetes 10.152.183.1 <none> 443/TCP 3d
microbot 10.152.183.173 <none> 80/TCP 1m
# On Azure
$ kubectl --context=azure get svc
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default-http-backend 10.152.183.103 <none> 80/TCP 2d
kubernetes 10.152.183.1 <none> 443/TCP 2d
microbot 10.152.183.153 <none> 80/TCP 1m
# The Federation also created records in the Cloud DNS zone
gcloud dns record-sets list --zone demo-madeden
NAME TYPE TTL DATA
demo.madeden.net. NS 21600 ns-cloud-a1.googledomains.com.,ns-cloud-a2.googledomains.com.,ns-cloud-a3.googledomains.com.,ns-cloud-a4.googledomains.com.
demo.madeden.net. SOA 21600 ns-cloud-a1.googledomains.com. cloud-dns-hostmaster.google.com. 2 21600 3600 259200 300
microbot.default.magicring.svc.eu-west-2a.eu-west-2.demo.madeden.net. CNAME 180 microbot.default.magicring.svc.eu-west-2.demo.madeden.net.
microbot.default.magicring.svc.eu-west-2.demo.madeden.net. CNAME 180 microbot.default.magicring.svc.demo.madeden.net.
microbot.default.magicring.svc.us-west-2.demo.madeden.net. CNAME 180 microbot.default.magicring.svc.demo.madeden.net.
microbot.default.magicring.svc.us-west-2a.us-west-2.demo.madeden.net. CNAME 180 microbot.default.magicring.svc.us-west-2.demo.madeden.net.
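
Assuming demo.madeden.net is properly delegated to Google Cloud DNS, you can follow that CNAME chain from your workstation (it currently ends on nothing, since the services have no external IP yet):

# Walk the federated service DNS chain created above
dig +short CNAME microbot.default.magicring.svc.us-west-2a.us-west-2.demo.madeden.net
dig +short CNAME microbot.default.magicring.svc.us-west-2.demo.madeden.net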

Ingresses

Unfortunately this is not going to go as well…

# deploying Ingress...
kubectl --context=magicring create -f \
src/manifests/microbots-ing.yaml
ingress "microbot-ingress" created
# Querying ing on AWS
kubectl --context=aws get ing
NAME HOSTS ADDRESS PORTS AGE
microbot-ingress microbots.demo.madeden.net 10.0.1.95,10.... 80 1d
# On Azure
kubectl --context=azure get ing
No resources found.
# Oops!! Nothing on Azure. Let's look at the federated ingress
kubectl --context=magicring get ing
NAME HOSTS ADDRESS PORTS AGE
microbot-ingress microbots.demo.madeden.net 10.0.1.95,10.... 80 1d
kubectl --context=magicring describe ing microbot-ingress
Name: microbot-ingress
Namespace: default
Address: 10.0.1.95,10.0.2.43,10.0.2.43
Default backend: default-http-backend:80 (<none>)
Rules:
Host Path Backends
---- ---- --------
microbots.demo.madeden.net
/ microbot:80 (<none>)
Annotations:
first-cluster: aws
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1d 3m 2 {federated-ingress-controller } Normal CreateInCluster Creating ingress in cluster azure
1d 1m 1 {federated-ingress-controller } Normal UpdateInCluster Updating ingress in cluster azure
1d 1m 6 {federated-ingress-controller } Normal CreateInCluster Creating ingress in cluster aws
# Logs from the Federation Controller Manager
## specifically around the ingress creation
E0210 08:54:08.464928 1 ingress_controller.go:725] Failed to ensure delete object from underlying clusters finalizer in ingress microbot-ingress: failed to add finalizer orphan to ingress : Operation cannot be fulfilled on ingresses.extensions "microbot-ingress": the object has been modified; please apply your changes to the latest version and try again
E0210 08:54:08.472338 1 ingress_controller.go:672] Failed to update annotation ingress.federation.kubernetes.io/first-cluster:aws on federated ingress "default/microbot-ingress", will try again later: Operation cannot be fulfilled on ingresses.extensions "microbot-ingress": the object has been modified; please apply your changes to the latest version and try again
# Workaround: create the ingress directly in each cluster
for cloud in aws azure
do
kubectl --context=${cloud} create -f src/manifests/microbots-ing.yaml
done
# Identify the public addresses of the workers
juju switch aws
AWS_INSTANCES="$(juju show-status kubernetes-worker --format json | jq --raw-output '.applications."kubernetes-worker".units[]."public-address"' | tr '\n' ' ')"
juju switch azure
AZURE_INSTANCES="$(juju show-status kubernetes-worker --format json | jq --raw-output '.applications."kubernetes-worker".units[]."public-address"' | tr '\n' ' ')"
# Create the Zone File
touch /tmp/zone.list
for instance in ${AWS_INSTANCES} ${AZURE_INSTANCES};
do
echo "microbots.demo.madeden.net. IN A ${instance}" | tee -a /tmp/zone.list
done
# Add the A records to the zone
gcloud dns record-sets import -z demo-madeden \
--zone-file-format \
/tmp/zone.list
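
Once the records propagate, a quick loop over the worker public addresses (reusing the variables defined above) checks that the ingress answers on both clouds; this is just a sketch, the exact HTTP code depends on the state of the ingress controllers:

# Poke every worker with the microbots Host header
for instance in ${AWS_INSTANCES} ${AZURE_INSTANCES}
do
curl -s -o /dev/null -w "${instance}: %{http_code}\n" -H "Host: microbots.demo.madeden.net" http://${instance}/
done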

Tear Down

Before moving to the conclusion, let’s tear down the clusters.

kubectl --context=magicring delete clusters \
aws azure
kubectl --context=gke delete namespace \
federation-system
gcloud dns managed-zones delete demo-madeden
gcloud container clusters delete gke --zone=us-east1-b
# AWS
juju destroy-controller aws --destroy-all-models
WARNING! This command will destroy the "aws" controller.
This includes all machines, applications, data and other resources.
Continue? (y/N):y
Destroying controller
Waiting for hosted model resources to be reclaimed
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines
Waiting on 1 model, 8 machines
Waiting on 1 model, 1 machine
Waiting on 1 model, 1 machine
Waiting on 1 model
Waiting on 1 model
All hosted models reclaimed, cleaning up controller machines
# Azure
juju destroy-controller azure --destroy-all-models
WARNING! This command will destroy the "azure" controller.
This includes all machines, applications, data and other resources.
Continue? (y/N):y
Destroying controller
Waiting for hosted model resources to be reclaimed
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines, 5 applications
Waiting on 1 model, 8 machines
Waiting on 1 model, 8 machines
Waiting on 1 model, 1 machine
Waiting on 1 model, 1 machine
Waiting on 1 model
Waiting on 1 model
All hosted models reclaimed, cleaning up controller machines

Conclusion

Is it worth engaging with Federation right now for multi-cloud / world-scale solutions?
Definitely yes. This part of Kubernetes is the one every enterprise is looking at right now, and understanding its behavior and architecture is definitely a plus.
