Multi Cluster Support for Service Mesh with IBM Cloud Private

Guangya Liu
IBM Cloud
Jun 1, 2018

Background

With the latest release of Istio 0.8, there is a great alpha feature: Multi Cluster Support for Service Mesh. With a service mesh that spans multiple clusters, you can control traffic across clusters as well as manage security. See the Istio 0.8 release notes for more details.

Detailed steps are documented here so that you can enable multi cluster support for service mesh with Kubernetes. In this blog, I will discuss the following:

  1. How to set up a service mesh across multiple clusters with IBM Cloud Private 2.1.0.3 and Istio 0.8
  2. A demo case for traffic control across two different IBM Cloud Private Clusters.

Before we dive into the details of this article, I would like to take this opportunity to thank Steven Dake and Serguei Bezverkhi from Cisco, Costin Manolache from Google, and Lin Sun from IBM, who hosted meetings about the design, issues, and future work for multi cluster support in Istio!

In this article, the overall deployment topology has two IBM Cloud Private Clusters with BookInfo deployed across them, as shown in the following image:

Configure IBM Cloud Private Clusters

Set Up Two IBM Cloud Private Clusters

You can get detailed installation steps from IBM Cloud Private Knowledge Center. I already have two IBM Cloud Private clusters running.

Note: Ensure that each cluster has a unique Pod Classless Inter-Domain Routing (CIDR) range, because multi cluster support requires pod communication across the clusters.

In IBM Cloud Private, you can configure CIDR in cluster/config.yaml as follows:

## Network in IPv4 CIDR format
network_cidr: 20.1.0.0/16

Here I have two clusters and configured a different network_cidr for each, as follows:

For Cluster 1, I have three nodes.

9.111.255.21 gyliu-icp-1
9.111.255.129 gyliu-icp-2
9.111.255.29 gyliu-icp-3

See the configuration of network_cidr as follows:

## Network in IPv4 CIDR format
network_cidr: 10.1.0.0/16

Cluster 1 is up and running:

[root@gyliu-icp-1 ~]# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
9.111.255.129 Ready <none> 3d v1.10.0+icp-ee beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.111.255.129
9.111.255.21 Ready <none> 3d v1.10.0+icp-ee beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,etcd=true,gpu/nvidia=NA,kubernetes.io/hostname=9.111.255.21,management=true,master=true,proxy=true,role=master
9.111.255.29 Ready <none> 3d v1.10.0+icp-ee beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.111.255.29

For Cluster 2, I also have three nodes.

9.111.255.152 gyliu-ubuntu-3
9.111.255.155 gyliu-ubuntu-2
9.111.255.77 gyliu-ubuntu-1

See the configuration of network_cidr as follows; note that it is different from Cluster 1's.

## Network in IPv4 CIDR format
network_cidr: 20.1.0.0/16

Cluster 2 is running as follows:

root@gyliu-ubuntu-1:~# kubectl get nodes --show-labels
NAME STATUS ROLES AGE VERSION LABELS
9.111.255.152 Ready <none> 3d v1.10.0+icp-ee beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.111.255.152
9.111.255.155 Ready <none> 3d v1.10.0+icp-ee beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,gpu/nvidia=NA,kubernetes.io/hostname=9.111.255.155
9.111.255.77 Ready <none> 3d v1.10.0+icp-ee beta.kubernetes.io/arch=amd64,beta.kubernetes.io/os=linux,etcd=true,gpu/nvidia=NA,kubernetes.io/hostname=9.111.255.77,management=true,master=true,proxy=true,role=master

Configure Pod Communication Across IBM Cloud Private Clusters

Multi cluster support for service mesh requires that the pod CIDRs in every cluster be routable to each other. Here I will show how to enable the two IBM Cloud Private Clusters to communicate with each other.

IBM Cloud Private uses the Calico node-to-node mesh by default to manage the container network. The BGP client on each node distributes IP route information to all other nodes.

To make sure pods can communicate across the clusters, you need to configure IP routes on all nodes in both clusters: add the routes for Cluster 2 to Cluster 1, and add the routes for Cluster 1 to Cluster 2.

Here is how to add the IP routes from Cluster 1 to Cluster 2. In node-to-node mesh mode, each node has IP routes pointing to its peer nodes in the cluster.

First, get all of the ip route entries on the nodes in Cluster 1 with the command ip route | grep bird. See the following example:

[root@gyliu-icp-1 mc]# ip route | grep bird
10.1.43.0/26 via 9.111.255.29 dev tunl0 proto bird onlink
10.1.158.192/26 via 9.111.255.129 dev tunl0 proto bird onlink
blackhole 10.1.198.128/26 proto bird
root@gyliu-icp-2:~# ip route | grep bird
10.1.43.0/26 via 9.111.255.29 dev tunl0 proto bird onlink
blackhole 10.1.158.192/26 proto bird
10.1.198.128/26 via 9.111.255.21 dev tunl0 proto bird onlink
root@gyliu-icp-3:~# ip route | grep bird
blackhole 10.1.43.0/26 proto bird
10.1.158.192/26 via 9.111.255.129 dev tunl0 proto bird onlink
10.1.198.128/26 via 9.111.255.21 dev tunl0 proto bird onlink

You can see there are three pod CIDR routes in total for the three nodes in Cluster 1.

10.1.158.192/26 via 9.111.255.129 dev tunl0  proto bird onlink
10.1.198.128/26 via 9.111.255.21 dev tunl0 proto bird onlink
10.1.43.0/26 via 9.111.255.29 dev tunl0 proto bird onlink

Then, add those three routes on every node in Cluster 2 with the following commands:

$ ip route add 10.1.158.192/26 via 9.111.255.129 
$ ip route add 10.1.198.128/26 via 9.111.255.21
$ ip route add 10.1.43.0/26 via 9.111.255.29

You can use the same steps to add all of the routes from Cluster 2 to Cluster 1 as well; a scripted version of that direction is sketched below. Once the configuration is finished, pods in the two clusters can communicate with each other.
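If you have many nodes, adding the routes by hand gets tedious. Here is a minimal sketch of how the reverse direction (Cluster 2 routes into Cluster 1) could be scripted; the pod CIDR blocks below are placeholders (use the actual ip route | grep bird output from your Cluster 2 nodes), and passwordless SSH to the Cluster 1 nodes as root is assumed.

# Placeholder Cluster 2 pod routes -- replace with the real entries
# collected via `ip route | grep bird` on the Cluster 2 nodes.
CLUSTER2_ROUTES="20.1.x.0/26 via 9.111.255.77
20.1.y.0/26 via 9.111.255.152
20.1.z.0/26 via 9.111.255.155"

# Cluster 1 nodes that need the routes.
CLUSTER1_NODES="9.111.255.21 9.111.255.129 9.111.255.29"

# Add every Cluster 2 route on every Cluster 1 node.
while read -r route; do
  [ -z "$route" ] && continue
  for node in $CLUSTER1_NODES; do
    ssh root@"$node" "ip route add $route"
  done
done <<< "$CLUSTER2_ROUTES"

Note that routes added this way are not persistent; you would need to re-add them after a node reboot.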

You can verify this by pinging a pod IP in Cluster 2 from Cluster 1.

The following is a pod from Cluster 2 with pod IP 20.1.47.150.

root@gyliu-ubuntu-1:~/cluster# kubectl get pods -owide  -n kube-system | grep platform-ui
platform-ui-lqccp 1/1 Running 0 3d 20.1.47.150 9.111.255.77

From one of the Cluster 1 nodes, ping this pod IP; it should succeed.

[root@gyliu-icp-1 mc]# ping 20.1.47.150
PING 20.1.47.150 (20.1.47.150) 56(84) bytes of data.
64 bytes from 20.1.47.150: icmp_seq=1 ttl=63 time=0.759 ms

The above steps configure a full IP route mesh across all nodes in the two IBM Cloud Private Clusters, which enables pod communication across clusters.

Deploy the Istio Control Plane

OK, you have finished the configuration for IBM Cloud Private! Now you can deploy Istio 0.8 for multi cluster based on the steps here; the instructions are simple and straightforward.

In the following sections, I will treat Cluster 1 as the Istio Local Control Plane Cluster and Cluster 2 as the Istio Remote Control Plane Cluster.

For this demo, I am using istio-demo.yaml to deploy the Istio Local Control Plane.
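For reference, a minimal sketch of that step, assuming you have unpacked the Istio 0.8 release package on the Cluster 1 boot node (istio-demo.yaml ships under install/kubernetes in the release):

cd istio-0.8.0
kubectl apply -f install/kubernetes/istio-demo.yaml
# Watch until every pod in istio-system is Running or Completed.
kubectl get pods -n istio-system -w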

After the Istio Local Control Plane is deployed, you can check its status with `kubectl` as follows; you can also check it from the IBM Cloud Private dashboard.

[root@gyliu-icp-1 istio-0.8.0]# kubectl get pods -n istio-system
NAME READY STATUS RESTARTS AGE
grafana-6f6dff9986-nqv4j 1/1 Running 0 5m
istio-citadel-7bdc7775c7-xlqp7 1/1 Running 0 5m
istio-cleanup-old-ca-9mprd 0/1 Completed 0 5m
istio-egressgateway-795fc9b47-kcvcm 1/1 Running 0 5m
istio-ingressgateway-7d89dbf85f-k4n7w 1/1 Running 0 5m
istio-mixer-post-install-kxc8k 0/1 Completed 0 5m
istio-pilot-66f4dd866c-k47jk 2/2 Running 0 5m
istio-policy-76c8896799-ntsff 2/2 Running 0 5m
istio-sidecar-injector-645c89bc64-r2ljb 1/1 Running 0 5m
istio-statsd-prom-bridge-949999c4c-pps8g 1/1 Running 0 5m
istio-telemetry-6554768879-4p96n 2/2 Running 0 5m
istio-tracing-754cdfd695-pxvqd 1/1 Running 0 5m
prometheus-86cb6dd77c-4qcdt 1/1 Running 0 5m
servicegraph-5849b7d696-8tt94 1/1 Running 0 5m
[root@gyliu-icp-1 istio-0.8.0]# kubectl get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana ClusterIP 10.0.0.24 <none> 3000/TCP 6m
istio-citadel ClusterIP 10.0.0.152 <none> 8060/TCP,9093/TCP 6m
istio-egressgateway ClusterIP 10.0.0.30 <none> 80/TCP,443/TCP 6m
istio-ingressgateway LoadBalancer 10.0.0.134 9.111.255.8 12345:31380/TCP,443:31390/TCP,31400:31400/TCP 6m
istio-pilot ClusterIP 10.0.0.79 <none> 15003/TCP,15005/TCP,15007/TCP,15010/TCP,15011/TCP,8080/TCP,9093/TCP 6m
istio-policy ClusterIP 10.0.0.67 <none> 9091/TCP,15004/TCP,9093/TCP 6m
istio-sidecar-injector ClusterIP 10.0.0.113 <none> 443/TCP 6m
istio-statsd-prom-bridge ClusterIP 10.0.0.213 <none> 9102/TCP,9125/UDP 6m
istio-telemetry ClusterIP 10.0.0.132 <none> 9091/TCP,15004/TCP,9093/TCP,42422/TCP 6m
prometheus ClusterIP 10.0.0.214 <none> 9090/TCP 6m
servicegraph ClusterIP 10.0.0.53 <none> 8088/TCP 6m
tracing LoadBalancer 10.0.0.122 9.111.255.9 12346:31134/TCP 6m
zipkin ClusterIP 10.0.0.21 <none> 9411/TCP 6m

You can also check the Istio Remote Control Plane as follows. From the output, you can see that the Istio Remote Control Plane actually connects back to the Istio Local Control Plane using the Pilot, Policy, and Statsd Pod IPs.

root@gyliu-ubuntu-1:~# kubectl get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-citadel ClusterIP 20.0.0.156 <none> 8060/TCP,9093/TCP 34s
istio-pilot ClusterIP None <none> 15003/TCP,15005/TCP,15007/TCP,15010/TCP,15011/TCP,8080/TCP,9093/TCP 34s
istio-policy ClusterIP None <none> 9091/TCP,15004/TCP,9093/TCP,9094/TCP,9102/TCP,9125/UDP,42422/TCP 34s
istio-statsd-prom-bridge ClusterIP None <none> 9102/TCP,9125/UDP 34s
root@gyliu-ubuntu-1:~# kubectl get ep -n istio-system
NAME ENDPOINTS AGE
istio-citadel 20.1.35.198:9093,20.1.35.198:8060 43s
istio-pilot 10.1.158.234:9093,10.1.158.234:15010,10.1.158.234:15007 + 4 more... 43s
istio-policy 10.1.158.230:42422,10.1.158.230:9093,10.1.158.230:9094 + 4 more... 43s
istio-statsd-prom-bridge 10.1.43.45:9125,10.1.43.45:9102 43s
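For context, here is a sketch of how those Pod IPs can be collected on Cluster 1 and fed into the istio-remote manifest before applying it in Cluster 2. The label selectors and the chart values (pilotEndpoint, policyEndpoint, statsdEndpoint) reflect my reading of the Istio 0.8 multi cluster install guide, so double check them against the linked instructions.

# Pod IPs of the local control plane components the remote cluster must reach
# (label selectors assumed from the 0.8 istio-demo manifest).
export PILOT_POD_IP=$(kubectl -n istio-system get pod -l istio=pilot \
  -o jsonpath='{.items[0].status.podIP}')
export POLICY_POD_IP=$(kubectl -n istio-system get pod -l istio-mixer-type=policy \
  -o jsonpath='{.items[0].status.podIP}')
export STATSD_POD_IP=$(kubectl -n istio-system get pod -l istio=statsd-prom-bridge \
  -o jsonpath='{.items[0].status.podIP}')

# Render the remote manifest from the istio-remote Helm chart and apply it
# against Cluster 2 (the context name "cluster-2" is only an example).
helm template install/kubernetes/helm/istio-remote --namespace istio-system \
  --set global.pilotEndpoint=${PILOT_POD_IP} \
  --set global.policyEndpoint=${POLICY_POD_IP} \
  --set global.statsdEndpoint=${STATSD_POD_IP} > istio-remote.yaml
kubectl apply -f istio-remote.yaml --context=cluster-2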

Test BookInfo Across IBM Cloud Private Clusters

Now you can test traffic control across the two IBM Cloud Private Clusters with BookInfo.

The demo case deploys reviews-v3 in the Istio Remote Control Plane Cluster and the other services in the Istio Local Control Plane Cluster. The following diagram shows the deployment topology for BookInfo.

Deploy BookInfo in the Istio Local Control Plane Cluster

You can deploy BookInfo as follows based on the guidance here.

[root@gyliu-icp-1 istio-0.8.0]# istioctl kube-inject -f samples/bookinfo/kube/bookinfo.yaml -o inject.yaml
[root@gyliu-icp-1 istio-0.8.0]# kubectl apply -f inject.yaml
service "details" created
deployment.extensions "details-v1" created
service "ratings" created
deployment.extensions "ratings-v1" created
service "reviews" created
deployment.extensions "reviews-v1" created
deployment.extensions "reviews-v2" created
deployment.extensions "reviews-v3" created
service "productpage" created
deployment.extensions "productpage-v1" created

Create the gateway and virtual services for BookInfo.

[root@gyliu-icp-1 istio-0.8.0]# istioctl create -f samples/bookinfo/routing/bookinfo-gateway.yaml
Created config gateway/default/bookinfo-gateway at revision 667003
Created config virtual-service/default/bookinfo at revision 667004

Apply the route rules in route-rule-all-v1.yaml; this makes sure that all requests are directed to the v1 version of each service, so in particular all reviews traffic goes to reviews-v1.

[root@gyliu-icp-1 istio-0.8.0]# istioctl create -f samples/bookinfo/routing/route-rule-all-v1.yaml
Created config virtual-service/default/productpage at revision 667055
Created config virtual-service/default/reviews at revision 667056
Created config virtual-service/default/ratings at revision 667057
Created config virtual-service/default/details at revision 667058
Created config destination-rule/default/productpage at revision 667059
Created config destination-rule/default/reviews at revision 667060
Created config destination-rule/default/ratings at revision 667061
Created config destination-rule/default/details at revision 667062

You want to run reviews-v3 in the Istio Remote Control Plane Cluster, so delete the reviews-v3 deployment here and re-deploy it in the Istio Remote Control Plane Cluster.

[root@gyliu-icp-1 istio-0.8.0]# kubectl delete deploy reviews-v3
deployment.extensions "reviews-v3" deleted

After you delete reviews-v3, you only have reviews-v1 and reviews-v2 in the Istio Local Control Plane Cluster.

[root@gyliu-icp-1 istio-0.8.0]# kubectl get pods -owide
NAME READY STATUS RESTARTS AGE IP NODE
details-v1-7f4b9b7775-cn4fr 2/2 Running 0 2m 10.1.43.51 9.111.255.29
productpage-v1-586c4486b7-qpqgn 2/2 Running 0 2m 10.1.158.236 9.111.255.129
ratings-v1-7bc49f5779-cgfkr 2/2 Running 0 2m 10.1.43.52 9.111.255.29
reviews-v1-b44bd5769-sc59q 2/2 Running 0 2m 10.1.158.235 9.111.255.129
reviews-v2-6d87c8c5-d7vjx 2/2 Running 0 2m 10.1.43.53 9.111.255.29

Now you can access the BookInfo dashboard. You will see only reviews-v1 (no stars), because the virtual services you defined direct all requests to reviews-v1.
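As a quick check from the command line, you can also hit the product page through the ingress gateway; a small sketch, assuming the istio-ingressgateway external IP and HTTP port from the earlier service listing (9.111.255.8 and 12345 in my environment):

# External IP and HTTP port of istio-ingressgateway, taken from
# `kubectl get svc -n istio-system`; substitute your own values.
export GATEWAY_URL=9.111.255.8:12345
# A healthy deployment should return 200.
curl -o /dev/null -s -w "%{http_code}\n" http://${GATEWAY_URL}/productpage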

Deploy BookInfo reviews-v3 in the Istio Remote Control Plane Cluster

The reviews-v3.yaml is derived from the BookInfo manifest (samples/bookinfo/kube/bookinfo.yaml): you only need to put the ratings service, the reviews service, and the reviews-v3 deployment into reviews-v3.yaml. You can also get the whole file from my GitHub, which already has the sidecar injected.
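If you would rather assemble the file yourself than pull my pre-injected copy, the raw content is roughly the following trimmed sketch (the Deployment apiVersion matches the 0.8-era bookinfo.yaml, the image tag is only illustrative, and you still need to inject the sidecar before applying it):

cat <<EOF > reviews-v3-raw.yaml
apiVersion: v1
kind: Service
metadata:
  name: ratings
  labels:
    app: ratings
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: ratings
---
apiVersion: v1
kind: Service
metadata:
  name: reviews
  labels:
    app: reviews
spec:
  ports:
  - port: 9080
    name: http
  selector:
    app: reviews
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: reviews-v3
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: reviews
        version: v3
    spec:
      containers:
      - name: reviews
        image: istio/examples-bookinfo-reviews-v3:1.5.0  # tag is illustrative
        ports:
        - containerPort: 9080
EOF
# Inject the sidecar, then copy the result to the remote cluster.
istioctl kube-inject -f reviews-v3-raw.yaml -o reviews-v3.yaml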

Deploy reviews-v3.yaml in the Istio Remote Control Plane Cluster as follows:

root@gyliu-ubuntu-1:~# kubectl apply -f review-v3.yaml
service "ratings" created
service "reviews" created
deployment.extensions "reviews-v3" created
root@gyliu-ubuntu-1:~# kubectl get pods -owide
NAME READY STATUS RESTARTS AGE IP NODE
reviews-v3-6dd497f5db-g8r5t 2/2 Running 0 24s 20.1.31.148 9.111.255.152

Create VirtualServices to Direct Requests to reviews-v3

[root@gyliu-icp-1 istio-0.8.0]# istioctl replace -f  samples/bookinfo/routing/route-rule-reviews-v3.yaml
Updated config virtual-service/default/reviews to revision 667817

Now you will see red stars, which means your requests have been directed to reviews-v3 in Cluster 2, the Istio Remote Control Plane Cluster.

You can try more route rules for BookInfo under samples/bookinfo/routing.

Enable Internal Load Balancer (ILB) for Multi Cluster

You may notice that there is actually a limitation in the current multi cluster support in Istio: the endpoints pilotEndpoint, policyEndpoint, and statsdEndpoint are set to the Pod IPs of Pilot, Policy, and Statsd. The problem here is that once one of those Pods restarts, you need to re-install the istio-remote control plane. Check out the Istio documentation here for more details on the configuration parameters.

To address this, I have built a demo that uses an internal load balancer (ILB) to resolve the Pod restart issue. Refer to here for more discussion. With ILB enabled, the remote control plane uses the ILB addresses instead of the Pod IP addresses. This ensures that even if an Istio local control plane pod restarts, there is no need to re-deploy the Istio remote control plane.

If you do not have a cloud provider that supplies load balancers, you can use the Keepalived-based load balancer from my GitHub to enable ILB in your cluster.

I also uploaded the load balancer service file here, so you can download it and apply it in your local IBM Cloud Private Cluster. Be sure to update the loadBalancerIP values in the YAML template to your own IP addresses.
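To give an idea of what that template contains, here is a minimal sketch of the istio-pilot-ilb entry only; the loadBalancerIP is the address I used (replace it with one your ILB can own), the ports mirror the istio-pilot service from the earlier listing, the port names are arbitrary, and the selector is assumed to match the Pilot pod label from the 0.8 demo manifest.

kubectl apply -n istio-system -f - <<EOF
apiVersion: v1
kind: Service
metadata:
  name: istio-pilot-ilb
spec:
  type: LoadBalancer
  loadBalancerIP: 9.111.255.5   # replace with an address your ILB can own
  selector:
    istio: pilot                # assumed Pilot pod label
  ports:
  - name: p15003
    port: 15003
  - name: p15005
    port: 15005
  - name: p15007
    port: 15007
  - name: p15010
    port: 15010
  - name: p15011
    port: 15011
  - name: p8080
    port: 8080
  - name: p9093
    port: 9093
EOF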

After the load balancer services are created, you can see the new LoadBalancer services, such as istio-pilot-ilb, istio-policy-ilb, istio-statsd-prom-bridge-ilb, and so on:

[root@gyliu-icp-1 mc]# kubectl get svc -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana ClusterIP 10.0.0.24 <none> 3000/TCP 1h
istio-citadel LoadBalancer 10.0.0.152 9.111.255.10 8060:32551/TCP,9093:31229/TCP 1h
istio-egressgateway ClusterIP 10.0.0.30 <none> 80/TCP,443/TCP 1h
istio-ingressgateway LoadBalancer 10.0.0.134 9.111.255.8 12345:31380/TCP,443:31390/TCP,31400:31400/TCP 1h
istio-pilot ClusterIP 10.0.0.79 <none> 15003/TCP,15005/TCP,15007/TCP,15010/TCP,15011/TCP,8080/TCP,9093/TCP 1h
istio-pilot-ilb LoadBalancer 10.0.0.222 9.111.255.5 15003:30346/TCP,15005:31112/TCP,15007:31162/TCP,15010:30701/TCP,15011:32398/TCP,8080:31718/TCP,9093:30401/TCP 15s
istio-policy ClusterIP 10.0.0.67 <none> 9091/TCP,15004/TCP,9093/TCP 1h
istio-policy-ilb LoadBalancer 10.0.0.192 9.111.255.7 9091:32276/TCP,15004:30535/TCP,9093:32740/TCP 15s
istio-sidecar-injector ClusterIP 10.0.0.113 <none> 443/TCP 1h
istio-statsd-prom-bridge ClusterIP 10.0.0.213 <none> 9102/TCP,9125/UDP 1h
istio-statsd-prom-bridge-ilb LoadBalancer 10.0.0.169 9.111.255.4 9125:30281/UDP 15s
istio-telemetry ClusterIP 10.0.0.132 <none> 9091/TCP,15004/TCP,9093/TCP,42422/TCP 1h
prometheus ClusterIP 10.0.0.214 <none> 9090/TCP 1h
servicegraph ClusterIP 10.0.0.53 <none> 8088/TCP 1h
tracing LoadBalancer 10.0.0.122 9.111.255.9 12346:31134/TCP 1h
zipkin ClusterIP 10.0.0.21 <none> 9411/TCP 1h
zipkin-ilb LoadBalancer 10.0.0.2 9.111.255.3 9411:30316/TCP 15s

You can then re-generate istio-remote.yaml based on the steps here, using the load balancer IPs, and deploy it in your remote IBM Cloud Private Cluster.
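Concretely, this is the same generation step as before, but with the endpoints pointed at the ILB external IPs instead of Pod IPs (again, verify the exact chart values against the linked instructions):

# ILB external IPs taken from the service listing above.
helm template install/kubernetes/helm/istio-remote --namespace istio-system \
  --set global.pilotEndpoint=9.111.255.5 \
  --set global.policyEndpoint=9.111.255.7 \
  --set global.statsdEndpoint=9.111.255.4 > istio-remote.yaml
# Apply the regenerated manifest in the remote (Cluster 2) cluster.
kubectl apply -f istio-remote.yaml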

After re-deploying it in my remote IBM Cloud Private Cluster, you can see that all of the endpoints are using the new load balancer IPs.

root@gyliu-ubuntu-1:~# kubectl get ep -n istio-system
NAME ENDPOINTS AGE
istio-citadel 20.1.35.198:9093,20.1.35.198:8060 1h
istio-pilot 9.111.255.5:9093,9.111.255.5:15010,9.111.255.5:15007 + 4 more... 1h
istio-policy 9.111.255.7:42422,9.111.255.7:9093,9.111.255.7:9094 + 4 more... 1h
istio-statsd-prom-bridge 9.111.255.4:9125,9.111.255.4:9102 1h

After this finishes, you can restart your Istio local control plane. Once it restarts, you can still access reviews-v3 in the remote cluster without re-deploying the remote control plane.

Future Work

As you can see, multi cluster support for service mesh is still an alpha feature, and there is still major work to complete to improve its quality and usability:

1) Enable ILB support, as I mentioned previously.

2) The current solution requires that Pods in different clusters be able to communicate with each other, which is also a limitation. We are now working on a solution that uses istio-ingressgateway and istio-egressgateway to remove this limitation. There is a prototype here.

3) More work is tracked in the Istio GitHub issues.

Meanwhile, IBM Cloud Private will keep integrating with Istio so users can use Istio for service mesh with ease!


Guangya Liu
IBM Cloud

STSM@IBM, Member - IBM Academy of Technology, Kubernetes Member, Istio Maintainer, Apache Mesos Committer & PMC Member.