Optimizing inter-zone egress cost with compression and zone-aware traffic routing

Sivadeep Nallana
Apr 8, 2023


Recently, the organization I work in faced a very interesting cost optimization problem. On top of the usual infrastructure cost, we were also seeing a huge inter-zone egress cost (our infra is hosted on GCP). This is how we went about solving it.

  1. gRPC compression
    The first thing we tried was gRPC compression. Most of our microservices use gRPC, so we expected that compressing payloads on the sending side and decompressing them on the receiving side would cut a lot of the data we send over the network. We had read that gzip can shrink data by as much as 95% (the actual ratio depends on the payload), and guess what! Network egress between the components that enabled compression dropped by 90%.
    This alone reduced our actual cost by almost 40%.
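
To get a feel for why structured payloads compress so well, here is a minimal sketch using only Python's standard `gzip` module. The payload is made up for illustration (repeated field names, similar to what many serialized messages look like), and the ratio it prints says nothing about our real traffic.

```python
import gzip
import json


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(gzip.compress(payload, compresslevel=level)) / len(payload)


# Hypothetical payload: many records that share the same field names and values.
records = [
    {"user_id": i, "status": "active", "region": "us-central1"}
    for i in range(1000)
]
payload = json.dumps(records).encode("utf-8")

ratio = compression_ratio(payload)
print(f"original={len(payload)} bytes, compressed/original={ratio:.2f}")
```

How compression is switched on depends on the gRPC library in use. In Python's `grpcio`, for instance, passing `compression=grpc.Compression.Gzip` when creating a channel enables it for calls made over that channel; other gRPC libraries have their own equivalent.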

Although this change alone gave us a large saving, there are places where we couldn't enable compression, such as when our services talk to Memcached. The network egress there was as large as what we had found between the other services.

So to fix this, we tried same-zone Memcached deployments. Until now, our Memcached instances were spread across different zones, and there were no rules about which instance a service reads from or writes to. The plan was:

  1. Have 3 separate Memcached deployments, one in each of 3 zones
  2. Identify the zone where each service pod is running
  3. Configure each service to talk to the Memcached deployment in its own zone

Although there is no direct way to get the zone information inside the pod, we created a simple Docker image (https://github.com/sivadeepN/k8s-pod-zone) that fetches the zone information and adds it as a label to the pod. This image can be used in an init container:

initContainers:
  - name: get-pod-zone
    image: docker.io/sivadeep/k8s-pod-zone:latest
    command: ['/script.sh']
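
The script itself lives in the repository linked above. As a rough sketch of the idea only (this is not the actual script), the zone can be read from the labels of the node the pod is scheduled on and turned into a patch that sets a `zone` label on the pod. The label keys below are the well-known Kubernetes zone labels; the function names and the example node are hypothetical. Sending the patch to the API server is left out, and doing so would presumably need a service account that is allowed to read nodes and patch pods.

```python
import json

# Well-known Kubernetes node labels that carry the zone; the second one is the
# deprecated predecessor that older clusters may still set.
ZONE_LABEL_KEYS = (
    "topology.kubernetes.io/zone",
    "failure-domain.beta.kubernetes.io/zone",
)


def zone_from_node_labels(node_labels: dict) -> str:
    """Return the zone recorded in a node's labels, or raise if there is none."""
    for key in ZONE_LABEL_KEYS:
        zone = node_labels.get(key)
        if zone:
            return zone
    raise KeyError("node has no zone label")


def pod_label_patch(zone: str) -> str:
    """Build a JSON merge patch body that sets the pod's `zone` label."""
    return json.dumps({"metadata": {"labels": {"zone": zone}}})


# Hypothetical node labels, for illustration only.
example_node_labels = {
    "kubernetes.io/hostname": "node-1",
    "topology.kubernetes.io/zone": "us-central1-a",
}
patch_body = pod_label_patch(zone_from_node_labels(example_node_labels))
print(patch_body)
```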

Later, the zone set on the label can be exposed as an environment variable and used to configure the service inside the container:

env:
  - name: POD_ZONE
    valueFrom:
      fieldRef:
        fieldPath: metadata.labels['zone']
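
Inside the service, `POD_ZONE` can then drive the choice of Memcached endpoint. The following is a minimal sketch, assuming one Memcached deployment per zone; the zone names, hostnames, and the fallback are placeholders rather than our real configuration.

```python
import os

# Hypothetical zone-to-endpoint mapping; zones and hostnames are placeholders.
MEMCACHED_BY_ZONE = {
    "us-central1-a": "memcached-a:11211",
    "us-central1-b": "memcached-b:11211",
    "us-central1-c": "memcached-c:11211",
}
# Used when POD_ZONE is unset or names a zone without its own deployment.
FALLBACK_ENDPOINT = "memcached-a:11211"


def memcached_endpoint(zone, endpoints=MEMCACHED_BY_ZONE, fallback=FALLBACK_ENDPOINT):
    """Return the Memcached endpoint in the given zone, or the fallback."""
    if not zone:
        return fallback
    return endpoints.get(zone, fallback)


endpoint = memcached_endpoint(os.environ.get("POD_ZONE"))
print(f"using Memcached at {endpoint}")
```

Falling back to a fixed endpoint keeps the service working if the label is missing, at the price of cross-zone traffic for that pod. Failing fast instead would make a misconfiguration easier to notice.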

Currently, we are trying this out on our testing servers, and we are already seeing a large reduction in inter-zone egress. We estimate that this might cut the cost by at least another 30%.

PS: Compression is not free. CPU usage goes up, and you pay for the extra CPU. In our case, CPU usage increased by 1.5 times, but the reduction in egress cost was far larger, so the trade-off was fine. The techniques above only help systems with high network egress; they are not a general cost optimization solution.
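
A quick way to sanity-check this trade-off before enabling compression is to compare the egress saved with the extra CPU spend. The sketch below uses made-up monthly figures and assumes CPU spend grows to 1.5 times its previous value; none of these numbers are our actual bills.

```python
def net_monthly_savings(egress_cost, egress_reduction, cpu_cost, cpu_multiplier):
    """Return egress savings minus the extra CPU spend, in the same currency unit."""
    egress_saved = egress_cost * egress_reduction
    extra_cpu = cpu_cost * (cpu_multiplier - 1.0)
    return egress_saved - extra_cpu


# Hypothetical figures: egress drops by 90%, CPU spend grows to 1.5x.
savings = net_monthly_savings(
    egress_cost=1000.0,
    egress_reduction=0.9,
    cpu_cost=200.0,
    cpu_multiplier=1.5,
)
print(f"net monthly savings: {savings:.2f}")
```

If the result is negative, the extra CPU costs more than the egress it saves, and compression is not worth enabling for that service.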
