To mesh or not to mesh
Service Mesh
I was told that a Service Mesh such as Linkerd, Consul or Istio adds a lot of overhead to a cluster. With that in mind, a Service Mesh is not suitable for a small deployment; instead, you should consider one when your client is big enough to deserve it.
But how big must a client be to deserve a Service Mesh?
And, more importantly, how much overhead does a Service Mesh add to a cluster?
The answer is: I don’t know.
That’s why I’m starting this POC: to answer these questions.
Resources
Files here: https://gitlab.com/post-repos/to-mesh-or-not-to-mesh
Requirements
To run this test you will need:
- a k8s cluster (we will use GCP)
- kubectl
- locust
- docker (or any container engine)
- git
We will use Linkerd, so you will need to download the CLI.
BFF
A simple Python app that exposes a simple API and calls the BACKEND. Once it gets the BACKEND’s response, it enriches it and sends it to the client.
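The BFF’s behavior can be sketched roughly like this (a minimal illustration, not the actual repo code; the backend URL and field names here are assumptions):

```python
import json
from urllib.request import urlopen

BACKEND_URL = "http://backend:8080"  # assumed in-cluster service address

def enrich(request_id, backend_response):
    # Wrap the BACKEND's plain answer with the request id before returning it.
    return {"id": request_id, "response": backend_response}

def handle(request_id):
    # Call the BACKEND, then enrich its response for the client.
    backend_response = urlopen(BACKEND_URL).read().decode()
    return json.dumps(enrich(request_id, backend_response))
```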
BACKEND
The BACKEND just answers the request with its version number.
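Its contract, sketched in Python for brevity (the real BACKEND is the Go app built later in this post; the message text is taken from the sample curl output below):

```python
VERSION = "1.0"  # each backend image bakes in its own version

def respond(request_id):
    # Echo the request id together with the running version.
    return {
        "id": request_id,
        "response": f"Congratulations! Version {VERSION} of your application is running on Kubernetes.",
    }
```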
Service Mesh
I will use Linkerd for this test.
What will we measure?
We will measure two items:
- WEB response times with Locust
- K8s resources usage
Then we’ll compare these metrics in two scenarios:
- using the Service Mesh
- using the raw k8s cluster.
Set up the environment
Cluster
This terraform template can be used: https://gitlab.com/templates14/terraform-templates/-/tree/master/gke
Then log in to your cluster, e.g. in this case:
gcloud container clusters get-credentials kungfoo-test --region us-central1 --project speedy-league-274700
The tests
We’ll have two tests: one with and one without the service mesh.
No Service Mesh
Set env
Create the namespace to deploy the app into:
kubectl create ns kungfootest
Set the app and run tests
Go to Set up the app and then to Run the tests. After that, come back here.
Clean up
Delete deployments:
kubectl delete -n kungfootest -f deploy-backend.yaml -f deploy-bff.yaml -f ingress.yaml
Delete the namespace so we’re clean:
kubectl delete ns kungfootest
Service Mesh
Linkerd
First, the Linkerd CLI must be installed on your system: download the binary and add it to your PATH.
Since we will be using GKE, we need to run these extra steps: https://linkerd.io/2/reference/cluster-configuration/#private-clusters
Check the cluster is ready for Linkerd:
linkerd check --pre
I got:
pre-kubernetes-capability
-------------------------
!! has NET_ADMIN capability
    found 1 PodSecurityPolicies, but none provide NET_ADMIN, proxy injection will fail if the PSP admission controller is running
    see https://linkerd.io/checks/#pre-k8s-cluster-net-admin for hints
!! has NET_RAW capability
    found 1 PodSecurityPolicies, but none provide NET_RAW, proxy injection will fail if the PSP admission controller is running
    see https://linkerd.io/checks/#pre-k8s-cluster-net-raw for hints
The cluster lacks these capabilities, but they will probably be provided when Linkerd itself is installed (https://github.com/linkerd/linkerd2/issues/3494).
Then install it:
linkerd install | kubectl apply -f -
…and wait until it’s installed:
linkerd check
Set env
Create the namespace to deploy the app into; this time we’ll need an annotation for Linkerd:
kubectl create ns kungfootest
kubectl edit ns kungfootest
and then add the annotation:
  annotations:
    linkerd.io/inject: enabled
This will allow Linkerd to automagically inject the proxy into the namespace’s pods.
Set the app and run tests
Now, go to Set up the app and then to Run the tests. After that, come back here.
Note this time the pods will have two containers, since Linkerd is injecting the proxy.
Compare the test results
For my tests:
No Mesh
Total average requested resources: 33% CPU, 8% memory.
Total average real usage: 12% CPU, 26% memory.
Avg response time: 204ms
Mesh
Total average requested resources: 35% CPU, 9% memory.
Total average real usage: 25% CPU, 38% memory.
Avg response time: 206ms
Conclusion
The mesh configuration we used is very basic, but it adds interesting services to our deployment with no need to modify code (e.g. secure internal connections, metrics…).
From the client’s point of view, response time was only 1% higher in the meshed version.
From the server side, we’re using roughly 100% more CPU and 46% more memory.
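Those server-side percentages come straight from the measured usage averages above:

```python
no_mesh = {"cpu": 12, "mem": 26}  # total average real usage, no mesh (%)
mesh    = {"cpu": 25, "mem": 38}  # total average real usage, with mesh (%)

# Relative increase of each resource once the mesh is in place.
cpu_extra = (mesh["cpu"] - no_mesh["cpu"]) / no_mesh["cpu"] * 100  # ~108%, i.e. roughly double
mem_extra = (mesh["mem"] - no_mesh["mem"]) / no_mesh["mem"] * 100  # ~46%
print(round(cpu_extra), round(mem_extra))  # 108 46
```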
Is it worth it?
As usual, it depends. Can you afford the increase in CPU and memory usage? Then you get all the Service Mesh pros at almost no cost on the client side. In any case, it deserves more testing if you are considering one.
But let me read your opinions on this: drop a message here.
Set up the app
Under the source directory there are two subdirs: one for the BFF and one for the BACKEND (inside the latter you will find two more dirs, versions 1 and 2, a.k.a. stable and canary; for now we will use just the stable version).
Build the app
Backend
In both cases you proceed the same way, varying only the version number.
In the source/backend directory you will see the Dockerfile and the two version directories.
CD into your source/backend directory and run:
cd source/backend/1.0 && \
GOOS=linux GOARCH=amd64 go build -tags netgo -o app && \
docker build -t backendapp:1.0 . && \
cd ..
…and:
cd source/backend/2.0 && \
GOOS=linux GOARCH=amd64 go build -tags netgo -o app && \
docker build -t backendapp:2.0 . && \
cd ..
Bff
CD into the source/bff directory and run:
docker build -t bffapp .
Push them all
Ok, now you have the images: push them all to a repository of your choice and note their names so we can set them in the k8s manifests.
Or use these already built images:
docker.io/juanmatias/canary-app:1.0
docker.io/juanmatias/canary-app:2.0
docker.io/juanmatias/canary-app:bff-1.0
Deploy the app
We will deploy all the elements into the kungfootest namespace.
CD into the root project directory and then:
cd manifests
Deploy the backend apps:
kubectl apply -f deploy-backend.yaml -n kungfootest
Deploy the bff:
kubectl apply -f deploy-bff.yaml -n kungfootest
Deploy the ingress:
kubectl apply -f ingress.yaml -n kungfootest
Test the app
Get the public IP:
kubectl get ing -n kungfootest
You can test your app with this command:
curl http://$PUBLICIP/kungfutest/mytest
You should have an output like this one:
{"id": "mytest", "response": "Congratulations! Version 1.0 of your application is running on Kubernetes."}
Run the tests
We’ll run two tests: Locust to measure response times, and a resources script to measure resource usage.
Locust
From the project root dir:
cd locust
If this is the first time, create a virtual env and install Locust:
pip install locust
Now, run the locust server:
locust -f kungfootest.py
This will start the Locust server listening on localhost:8089; open it with your browser.
There, you must set the host (e.g. http://$PUBLICIP), the max number of users and the user spawn rate. Then begin your tests.
I’ll test with 100 users and a spawn rate of 10, and let the test run for 2 minutes.
Resources
While the Locust test is running, run the script resources.sh. When it finishes, just type CTRL+C and it will show the average memory and CPU.
NOTE: keep in mind that this script gets the resources requested at node level, and the real usage only for the kungfootest namespace.
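If you prefer, the averaging that resources.sh does can be sketched in Python: sample `kubectl top nodes` periodically and average the CPU%/MEMORY% columns. This is a hypothetical equivalent, not the repo script; the column layout is assumed from kubectl’s default output.

```python
import re

def parse_top_nodes(output):
    """Return (avg_cpu_percent, avg_mem_percent) from `kubectl top nodes` text."""
    cpu, mem = [], []
    for line in output.splitlines()[1:]:  # skip the header row
        pcts = re.findall(r"(\d+)%", line)  # CPU% and MEMORY% columns
        if len(pcts) >= 2:
            cpu.append(int(pcts[0]))
            mem.append(int(pcts[1]))
    return sum(cpu) / len(cpu), sum(mem) / len(mem)

sample = """NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-a   120m         12%    806Mi           26%
node-b   260m         25%    1177Mi          38%"""
print(parse_top_nodes(sample))  # (18.5, 32.0)
```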
References
Monitoring Kubernetes cluster utilization and capacity (the poor man’s way) | Jeff Geerling
First version of this post was published in my blog here.