Using Cilium to load balance gRPC requests on Kubernetes

Victor Muchiaroni
May 8, 2024

Some years ago, I was deploying my first gRPC service on Kubernetes and found myself reading this blog article: https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears/ . Since this was the first of many services to come, we decided to adopt Linkerd as our service mesh to correctly load balance gRPC requests inside the cluster. As you may know, in order to achieve L7 load balancing, Linkerd adds a sidecar container called linkerd-proxy to each pod, which is responsible for proxying the connections to our application, just as described in the Kubernetes article above.

Linkerd Proxy in Action

I know Linkerd is a pretty lightweight service mesh compared to others like Istio, and you can do much more with it than L7 load balancing, but running it in production takes significant effort: you may need multiple replicas of control plane components like the proxy injector, destination, and identity, and you may need to spread those replicas across different nodes for high availability.

That’s when I found out about eBPF and Cilium and decided to give them a chance. Running these components at the kernel level seemed like a good idea, and the performance benchmarks on AKS clusters (where we were running our workloads) were really exciting.

For this article, I will use a local Kind cluster as an example and focus on the networking components: the Cilium CNI itself together with the Envoy proxy (another CNCF project that ships with Cilium). You can use it as a reference for cloud providers, since we are using Helm to install the necessary components, plus a sample gRPC client and server to test our scenarios.

Note: If you are deploying this to Azure Kubernetes Service (AKS), make sure you create the cluster in Bring Your Own CNI mode: https://learn.microsoft.com/en-us/azure/aks/use-byo-cni
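For reference, a rough sketch of what creating such a cluster looks like with the Azure CLI; the resource group and cluster names below are placeholders:

#Create an AKS cluster with no CNI installed (BYO CNI)
$ az aks create --resource-group my-rg --name my-cluster --network-plugin none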

The gRPC client and server applications and the Kubernetes manifests can be found here: https://github.com/vimuchiaroni/cilium-kind.

For both scenarios, we will run 1 replica of the grpc-client service and 3 replicas of the grpc-server service. This makes it possible to demonstrate how gRPC's long-lived connections behave with and without Cilium and the Envoy proxy.

Scenario 1: Kind Cluster without Cilium.

By default, Kind ships a simple networking implementation (“kindnetd”) based around standard CNI plugins.

1. Create our Kind cluster:

#This will create a default cluster with kindnetd
$ kind create cluster

2. Deploy our gRPC server and client applications:

$ git clone git@github.com:vimuchiaroni/cilium-kind.git

$ cd cilium-kind

$ kubectl apply -f kubernetes/grpc-server.yaml

$ kubectl apply -f kubernetes/grpc-client.yaml

$ k get po -n grpc-server
NAME READY STATUS RESTARTS AGE
grpc-server-59f76c48cf-d6j5v 1/1 Running 0 9m25s
grpc-server-59f76c48cf-k5cn5 1/1 Running 0 9m25s
grpc-server-59f76c48cf-lshpt 1/1 Running 0 9m25s

$ k get po -n grpc-client
NAME READY STATUS RESTARTS AGE
grpc-client-77cfdf7664-xqflq 1/1 Running 0 13s

3. Check the client logs to see which pods we are connecting to:

$ k logs -n grpc-client grpc-client-77cfdf7664-xqflq
call: 0
2024-05-08 14:47:00,683;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt
call: 1
2024-05-08 14:47:01,693;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt
call: 2
2024-05-08 14:47:02,700;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt
call: 3
2024-05-08 14:47:03,712;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt
call: 4
2024-05-08 14:47:04,732;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt
call: 5
2024-05-08 14:47:05,754;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt
call: 6
2024-05-08 14:47:06,779;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-lshpt

Notice in the logs that we are always hitting the grpc-server-59f76c48cf-lshpt pod. Because gRPC multiplexes every call over a single long-lived HTTP/2 connection, kube-proxy's connection-level (L4) load balancing picks a backend only once, when the connection is established, and every subsequent request sticks to that same pod.

Only one server is getting the requests
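If you want to double-check that the Service does have three healthy backends even though only one receives the traffic, you can list its endpoints (assuming the Service is named grpc-server in the grpc-server namespace, which matches the target address printed in the client logs later on):

$ kubectl get endpoints -n grpc-server grpc-server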

Scenario 2: Kind Cluster with Cilium.

Now we are not using Kind's default CNI anymore. Instead, we create the cluster from a config that disables the default CNI and then install Cilium together with the Envoy proxy.
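The repository ships the Kind config at kind/kind_cilium.yaml; the important part is that it disables Kind's default CNI so Cilium can take over. A sketch of what such a config looks like (the node layout below is an assumption based on the four Cilium agent pods in the output further down):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
networking:
  disableDefaultCNI: true #do not install kindnetd; Cilium will be the CNI
nodes:
  - role: control-plane
  - role: worker
  - role: worker
  - role: worker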

#Create the cluster
$ git clone git@github.com:vimuchiaroni/cilium-kind.git

$ cd cilium-kind

$ kind create cluster --config kind/kind_cilium.yaml

#Install Cilium
$ helm repo add cilium https://helm.cilium.io

$ helm install cilium cilium/cilium --version 1.15.4 \
--set envoy.enabled=true \
--set loadBalancer.l7.backend=envoy \
--namespace kube-system

$ k get po -n kube-system |grep -i cilium
cilium-envoy-gsh9p 1/1 Running 0 2m39s
cilium-envoy-lkzfp 1/1 Running 0 2m39s
cilium-envoy-npfd4 1/1 Running 0 2m39s
cilium-envoy-vrm8q 1/1 Running 0 2m39s
cilium-g2mtw 1/1 Running 0 2m39s
cilium-h4ltg 1/1 Running 0 2m39s
cilium-lfgtv 1/1 Running 0 2m39s
cilium-operator-756d954b64-6lpp5 1/1 Running 0 2m39s
cilium-operator-756d954b64-df9rb 1/1 Running 0 2m39s
cilium-sxx2s 1/1 Running 0 2m39s

Now that our CNI is installed, we can deploy our applications and check the logs:

$ kubectl apply -f kubernetes/grpc-server.yaml

$ kubectl apply -f kubernetes/grpc-client.yaml

$ k get po -n grpc-server
NAME READY STATUS RESTARTS AGE
grpc-server-59f76c48cf-8k5cg 1/1 Running 0 74s
grpc-server-59f76c48cf-d5z76 1/1 Running 0 74s
grpc-server-59f76c48cf-dsp59 1/1 Running 0 74s

$ k get po -n grpc-client
NAME READY STATUS RESTARTS AGE
grpc-client-77cfdf7664-tvbnm 1/1 Running 0 108s



$ k logs -n grpc-client grpc-client-77cfdf7664-tvbnm
2024-05-08 15:01:56,983;INFO;grpc-server.grpc-server:50051
call: 0
2024-05-08 15:01:58,890;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-d5z76
call: 1
2024-05-08 15:02:00,582;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-8k5cg
call: 2
2024-05-08 15:02:01,998;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-dsp59
call: 3
2024-05-08 15:02:03,043;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-d5z76
call: 4
2024-05-08 15:02:04,103;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-8k5cg
call: 5
2024-05-08 15:02:05,387;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-dsp59
call: 6
2024-05-08 15:02:06,437;INFO;Greeter client received: Hello from grpc-server-59f76c48cf-d5z76

We can see that our gRPC requests are now being routed across the 3 replicas :)

L7 load balancing in place

Note: In order for the L7 load balancing to work, you need to add the service.cilium.io/lb-l7: enabled annotation to your gRPC server Service, as you can see in the example below.
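A minimal sketch of what the annotated Service might look like; the selector label is an assumption, and the real manifest lives in kubernetes/grpc-server.yaml in the repository:

apiVersion: v1
kind: Service
metadata:
  name: grpc-server
  namespace: grpc-server
  annotations:
    service.cilium.io/lb-l7: "enabled" #ask Cilium to load balance this Service at L7 through Envoy
spec:
  selector:
    app: grpc-server #assumed label, check the repo manifest
  ports:
    - port: 50051 #the gRPC port seen in the client logs
      targetPort: 50051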

Conclusion

Cilium seems to be a much simpler, faster, and better way to achieve L7 load balancing than running a service mesh only for that purpose (as it was in my case). If your needs are centered on networking performance and routing, you should consider it.

The Envoy proxy runs as a DaemonSet, one pod per node, so there is no need for a sidecar proxy in each application pod. If you are running thousands of pods in your cluster, this solution should save you a fair amount of resources.
