In this series on gRPC with Spring Boot, we have reviewed a quick introduction to gRPC and built a couple of components with Gradle and Java 17 that use gRPC as the communication protocol between them. Now we’re going to dockerize and deploy those components with Kubernetes, using minikube to test everything locally, and we’ll verify that the communication is correctly balanced between the replicated services
gRPC is one of the most popular modern RPC frameworks for communication between components. It’s a great choice because it avoids the laborious process of defining a REST layer, saves network bandwidth, avoids the CPU-intensive cost of parsing JSON, takes advantage of HTTP/2, and has many other benefits
Kubernetes, in turn, is one of the most popular ways to deploy a microservices application
Kubernetes lets us deploy our components, replicate many instances of the same component, and provide an interface that ensures that when one component communicates with a replicated one, the traffic is balanced across all the replicas. This is called load balancing, and Kubernetes services such as ClusterIP provide a load-balanced IP address
Load Balancing
Load balancing distributes the load from clients across the available servers
Load balancing has many benefits, among them:
- Tolerance of failures: if one of your replicas fails, then other servers can serve the request.
- Increased Scalability: you can distribute user traffic across many servers increasing the scalability.
- Improved throughput: you can improve the throughput of the application by distributing traffic across various backend servers.
- No-downtime deployment: you can achieve zero-downtime deployments using rolling deployment techniques.
There are many other benefits of load balancing. You can read more about load balancers here.
So, let’s recall something about HTTP/1.1 and HTTP/2
In HTTP/1.1, a connection is opened and closed for each request; in HTTP/2, we have a single connection and we send and receive everything we need over that same TCP connection
The default load balancing in Kubernetes happens at the connection level, so it won’t work with gRPC: Kubernetes can’t manage the traffic inside what is effectively a single, long-lived connection that is never closed
Load Balancing options in gRPC
There are two types of load balancing options available in gRPC — proxy and client-side.
Proxy load balancing
In proxy load balancing, the client issues RPCs to a load balancer (LB) proxy. The LB distributes the RPC call to one of the available backend servers that implement the actual logic for serving the call. The LB keeps track of the load on each backend and implements algorithms for distributing load fairly. The clients themselves do not know about the backend servers, and they can be untrusted. This architecture is typically used for user-facing services where clients from the open internet can connect to the servers
Client side load balancing
In client-side load balancing, the client is aware of many backend servers and chooses one to use for each RPC. If it wishes, the client can implement load balancing algorithms based on load reports from the server. For simple deployments, clients can round-robin requests among the available servers.
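As an illustration of client-side round-robin (this is not the article’s actual code), a grpc-java channel can be told to resolve every address behind a DNS name and rotate between them; the target name below is a placeholder:

```java
import io.grpc.ManagedChannel;
import io.grpc.ManagedChannelBuilder;

public class RoundRobinChannel {
    public static void main(String[] args) {
        // "dns:///" makes grpc-java use its DNS resolver, which returns
        // every A record behind the name (e.g. a Kubernetes headless service)
        ManagedChannel channel = ManagedChannelBuilder
                .forTarget("dns:///grpcserver-product-rpc:9090") // placeholder target
                .defaultLoadBalancingPolicy("round_robin")       // rotate backends per RPC
                .usePlaintext()
                .build();
        // ... create stubs from this channel as usual ...
        channel.shutdown();
    }
}
```

This sketch assumes the grpc-netty (or grpc-netty-shaded) dependency is on the classpath; without `round_robin`, grpc-java defaults to `pick_first` and would pin all RPCs to a single backend.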
For more information about the gRPC load balancing option, you can check the article gRPC Load Balancing.
What are we going to do to prove what I’ve been saying?
- Dockerize
- Create namespace, deployment and services for kubernetes
- Deploy in minikube
- Test it
Dockerize
We have two microservices, the client and the server, the ones we built in the previous article
Building the project generates the .jar file we need inside ./build/libs
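The build command itself isn’t shown here; with a standard Gradle wrapper setup it would be something like the following (a sketch, the exact task names may differ):

```shell
# Build the project; the resulting .jar lands in ./build/libs
./gradlew clean build
ls build/libs
```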
Then, take the .jar file and place it inside of the folder ~/k8s/k8s-grpcserver-product/ci
The structure that we’re going to follow for the two projects is
Then, we should create the Dockerfile
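The Dockerfile isn’t reproduced here; based on the docker ps output later in the article (a `/bin/sh -c 'java -jar …'` entrypoint and ports 8080/9090), a minimal sketch could look like this. The base image and jar name are assumptions:

```dockerfile
# Assumed base image for Java 17; the article may use a different one
FROM eclipse-temurin:17-jre
# The jar placed in the ci folder; the actual file name may differ
COPY *.jar app.jar
# 8080 for REST, 9090 for gRPC
EXPOSE 8080 9090
ENTRYPOINT ["/bin/sh", "-c", "java -jar app.jar"]
```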
We’re also going to create a docker-compose.yml file; by the way, the image entry could be replaced by build: . so that the image is rebuilt each time
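A sketch of what that docker-compose.yml could contain, using the image name and port mappings visible in the docker ps output below; the service name is an assumption:

```yaml
services:
  grpcserver-product:
    image: anderjvila/grpcserver-product:v1.0.0-SNAPSHOT
    # Alternatively: build: .  (rebuilds the image on each run)
    ports:
      - "8080:8080"   # REST
      - "9090:9090"   # gRPC
```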
For simplicity, I created a Makefile that helps us remember the scripts and lets us execute them with custom commands
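The actual Makefile is in the repository; a minimal sketch of what the build-and-deploy-docker target could do under the assumptions above (the recipe is illustrative, not the article’s exact one):

```makefile
.PHONY: build-and-deploy-docker
build-and-deploy-docker:
	docker compose build
	docker compose up -d
```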
Now, let’s execute make build-and-deploy-docker in the folder ~/k8s/k8s-grpcserver-product and the result should be something like
We have to do the same with the client; all the files and config can be found here
Then, after doing the same for the client and executing docker ps, we should have containers like
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
fae8b841e35a anderjvila/grpcclient-product:v1.0.0-SNAPSHOT "/bin/sh -c 'java -j…" 2 minutes ago Up 2 minutes 9090/tcp, 0.0.0.0:8081->8080/tcp ci-grpcclient-product-1
13f96a02c591 anderjvila/grpcserver-product:v1.0.0-SNAPSHOT "/bin/sh -c 'java -j…" 8 minutes ago Up 8 minutes 0.0.0.0:8080->8080/tcp, 0.0.0.0:9090->9090/tcp ci-grpcserver-product-1
Let’s test it, directly against the grpcserver-product
And through the client grpcclient-product
Create namespace, deployment and services for kubernetes
Now, let’s create the files for Kubernetes; these files should be placed under the minikube folder
Also, all the steps that we’re going to do now, should be replicated to the client, files for the client can also be found here
First, let’s create the namespace
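The namespace manifest isn’t shown; inferring the name from the DNS records later in the article (the FQDNs use n-anderjvila-learning), it would be roughly:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  # Name inferred from the FQDNs in the nslookup output below
  name: n-anderjvila-learning
```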
Then, we have to create the deployment.yaml
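A sketch of what deployment.yaml could contain, assuming three replicas (three server pods appear in the logs later) and the image from the docker ps output; the labels are illustrative:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grpcserver-product
  namespace: n-anderjvila-learning
spec:
  replicas: 3
  selector:
    matchLabels:
      app: grpcserver-product
  template:
    metadata:
      labels:
        app: grpcserver-product
    spec:
      containers:
        - name: grpcserver-product
          image: anderjvila/grpcserver-product:v1.0.0-SNAPSHOT
          ports:
            - containerPort: 8080   # REST
            - containerPort: 9090   # gRPC
```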
Now, we have to create the service.yaml, but we’re going to split it into two files
The first one is service-grpc.yaml, and in this one we’re going to specify clusterIP: None; this creates a Headless service
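A sketch of service-grpc.yaml, using the service name that appears later in the deploy output (grpcserver-product-rpc); the selector is an assumption matching the deployment:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: grpcserver-product-rpc
  namespace: n-anderjvila-learning
spec:
  clusterIP: None   # Headless: DNS resolves directly to the pod IPs
  selector:
    app: grpcserver-product
  ports:
    - name: grpc
      port: 9090
      targetPort: 9090
```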
Why a Headless service? As we saw, in HTTP/1.1 a connection is opened and closed for each request, but in HTTP/2 we have a single connection and we send and receive everything we need over that same TCP connection
So the default load balancing of Kubernetes won’t work with gRPC, because it can’t manage the traffic inside a single connection that is never closed
What is a Headless Service?
Luckily, Kubernetes allows clients to discover pod IPs through DNS lookups. Usually, when you perform a DNS lookup for a service, the DNS server returns a single IP — the service’s cluster IP. But if you tell Kubernetes you don’t need a cluster IP for your service (you do this by setting the clusterIP field to None in the service specification), the DNS server will return the pod IPs instead of the single service IP. Instead of returning a single DNS A record, the DNS server will return multiple A records for the service, each pointing to the IP of an individual pod backing the service at that moment. Clients can therefore do a simple DNS A record lookup and get the IPs of all the pods that are part of the service. The client can then use that information to connect to one, many, or all of them.
Setting the clusterIP field in a service spec to None makes the service headless, as Kubernetes won’t assign it a cluster IP through which clients could connect to the pods backing it.
Kubernetes in Action — Marko Lukša
You can do client-side round-robin load-balancing using Kubernetes headless service. This simple load balancing works out of the box with gRPC. The downside is that it does not take into account the load on the server.
Then we’re going to create two services in order to prove what we’ve been talking about
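Of the two, the one that matters for this experiment is grpcserver-product-ip-service, a regular (non-headless) ClusterIP service; a sketch, with the selector assumed to match the deployment:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: grpcserver-product-ip-service
  namespace: n-anderjvila-learning
spec:
  type: ClusterIP   # single cluster IP: connection-level balancing only
  selector:
    app: grpcserver-product
  ports:
    - name: grpc
      port: 9090
      targetPort: 9090
```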
Now, for simplicity, I created another Makefile, placed in the minikube folder, to handle building the image and deploying it inside minikube
and in the folder ~/k8s/k8s-grpcserver-product we’re going to create another Makefile to unify both the ci and minikube folders
Deploy in minikube
Now, we have to create the minikube profile where we’re going to deploy our services
In the folder ~/k8s we’re going to create a Makefile and a custom script that will help us create this profile
and the custom script can be found here
Now, let’s execute make minikube-create in order to create the minikube profile
make minikube-create
Then the result should be something like
And if we execute minikube profile list
Excellent, now we have to deploy into our minikube profile
In the folder ~/k8s/k8s-grpcserver-product we create a custom script that deletes any deployments and services the namespace may already have and re-deploys them.
Now, let’s execute make build-and-deploy in the folder ~/k8s/k8s-grpcserver-product. This generates a Docker image inside the profile’s Docker context and executes the custom script; the result should contain something like this
deployment.apps/grpcserver-product created
service/grpcserver-product-rpc created
service/grpcserver-product-service created
service/grpcserver-product-ip-service created
====================================
Then, we should do the same for the client; as a reminder, all the files for both server and client can be found here
Now, in the folder ~/k8s, if we execute make services we should be able to see all these services running
Let’s test it, WDYT?
Test
In order to test this properly, we’re going to inspect the logs of each grpcserver-product instance
To do that, let’s list all the pods with kubectl get pods and then watch the logs of each one with kubectl logs {pod-name} -f
kubectl logs grpcserver-product-6d885fc689-b894f -f
kubectl logs grpcserver-product-6d885fc689-kdk6n -f
kubectl logs grpcserver-product-6d885fc689-sjv7c -f
Now, we have three terminals watching the logs
Now, we’re going to expose locally the grpcserver-product-ip-service service, which has spec.type: ClusterIP, the default Kubernetes load balancer
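The exposing step isn’t shown; with minikube it would typically look like this (the namespace flag is an assumption based on the setup above):

```shell
# Prints one local URL per service port; the second maps to :9090 (gRPC)
minikube service grpcserver-product-ip-service -n n-anderjvila-learning
```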
The second endpoint it gave us is the one that points to port :9090. We’re going to execute some gRPC requests and watch the logs
As you can see, only one terminal (the middle one) receives the requests, which means they aren’t being balanced correctly
Great, this proves our point 💪
Now, we’re going to expose the client service, grpcclient-product-service , but first let’s analyze the deployment file of the client
In lines 32 and 33, this deployment declares an environment variable, GRPCSERVER_PRODUCT_ADDRESS, which points to the service grpcserver-product-rpc:9090, i.e. the headless service
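That fragment of the client deployment would look roughly like this (surrounding fields elided):

```yaml
# Inside the client container spec of the deployment
env:
  - name: GRPCSERVER_PRODUCT_ADDRESS
    value: "grpcserver-product-rpc:9090"   # the headless service
```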
And then we’re going to execute some requests, using the REST operation we created in the previous article, against the endpoint 127.0.0.1:33633
Now we can see that the three instances are receiving traffic, which proves that with the headless service the traffic is balanced
Verifying DNS
To confirm the DNS behavior of the headless service, create a pod with the image tutum/dnsutils:
kubectl run dnsutils --image=tutum/dnsutils --command -- sleep infinity
and then run the command
kubectl exec dnsutils -- nslookup grpcserver-product-rpc
This returns the FQDN of the headless service:
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: grpcserver-product-rpc.n-anderjvila-learning.svc.cluster.local
Address: 172.17.0.2
Name: grpcserver-product-rpc.n-anderjvila-learning.svc.cluster.local
Address: 172.17.0.4
Name: grpcserver-product-rpc.n-anderjvila-learning.svc.cluster.local
Address: 172.17.0.7
As you can see, the headless service resolves to the IP addresses of all the pods backing the service. Contrast this with the output returned for the non-headless service:
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: grpcserver-product-ip-service.n-anderjvila-learning.svc.cluster.local
Address: 10.98.131.116
Code example
The working code example for this article is available on GitHub
Summary
There are two kinds of load balancing options available in gRPC — proxy and client-side. As gRPC connections are long-lived, the default connection-level load balancing of Kubernetes does not work with gRPC. Kubernetes headless service is one mechanism through which load balancing can be achieved. A Kubernetes headless service DNS resolves to the IP of the backing pods.
