Better Cost Control in Kubernetes

Although their pricing models are easy to work with, Google Cloud (and probably every cloud provider) has a lot of hidden costs. We need to put real effort into optimising our deployments and reviewing our costs to make sure we use resources efficiently.

Prediction of costs:

With pay-as-you-go, you are billed only when you consume a service, but that necessarily leaves you open to cost spikes when something unexpected happens. That is why it is really important to monitor your daily spend with reports. Google Cloud offers detailed billing analysis, and you can export your billing data into BigQuery and build visualizations on top of it.
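As a rough illustration of watching daily spend, the sketch below flags days whose cost jumps well above the average of the preceding week. The function name, the 7-day window, the 1.5x threshold and the sample figures are all assumptions made for this example; in practice the daily totals would come from your own billing reports or export.

```python
from statistics import mean


def find_cost_spikes(daily_costs, window=7, threshold=1.5):
    """Return the indexes of days that cost more than `threshold` times
    the mean of the previous `window` days."""
    spikes = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > threshold * baseline:
            spikes.append(i)
    return spikes


# Hypothetical daily totals in USD, one value per day.
costs = [100, 102, 98, 101, 99, 100, 103, 180, 104]
print(find_cost_spikes(costs))  # prints [7]
```

A check like this is deliberately simple: it only compares each day with its own recent history, so it catches sudden jumps but not slow growth.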

Deployment of Hundreds of Services:

When using Kubernetes with a microservices architecture, where each service has multiple types of resources (deployments, statefulsets, load balancers, …), it is really tricky to estimate the cost of the ecosystem. Compare that with a monolithic system, where you can roughly estimate the cost from the number of components you have.

Building and owning a service is not just a bunch of lines of code that you put inside Docker containers and ship to the cloud! Owning a service in a microservices architecture means staying on top of it in all aspects: design, code, resource management, monitoring metrics, and costs.

  • Service Essentials: you'll be running many services, and many can mean tens, hundreds or even thousands. Each service encapsulates a piece of business capability in an independent package. Services should be focused on a single purpose, yet big enough to minimize interactions.
  • Configuration Management:

Does each service need its own data server?

Not necessarily. Databases can be colocated on a shared data server. The key point is that services should have no knowledge of each other's underlying database. This way, you can start out with a shared data server and separate things out in the future with just a config change.

Resources to use: Requests and Limits?

K8S lets you declare the resources your containers need:

For each resource, a container can specify a request and a limit, where 0 <= request <= Node Allocatable and request <= limit <= Infinity. If a pod is successfully scheduled, the container is guaranteed the amount of resources it requested. Scheduling is based on requests, not limits. The pod and its containers will not be allowed to exceed the specified limit.
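The two inequalities above can be checked mechanically. The following is a minimal sketch with hypothetical helper names (`validate_resources`, `fits_on_node`); values are in millicores of CPU, and `None` stands for "no limit". It is not how the Kubernetes scheduler is implemented, only an illustration of the rule that scheduling looks at requests and ignores limits.

```python
def validate_resources(request, limit, node_allocatable):
    """Check 0 <= request <= node_allocatable and request <= limit.
    A limit of None means no limit (infinity)."""
    if request < 0 or request > node_allocatable:
        return False
    if limit is not None and limit < request:
        return False
    return True


def fits_on_node(request, node_allocatable, already_requested):
    """Scheduling considers requests only: the pod fits if the sum of
    requests on the node stays within what the node can allocate."""
    return already_requested + request <= node_allocatable


# Hypothetical node with 3920 millicores allocatable.
print(validate_resources(request=250, limit=500, node_allocatable=3920))
print(fits_on_node(request=250, node_allocatable=3920, already_requested=3800))
```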

— Start Small —

Make sure you request the minimum resources needed to start your service under its lowest load! Requested resources are reserved for your deployment even if they are not used, and you will be billed for them. Then use HPA or VPA to scale up and down when needed.


The Horizontal Pod Autoscaler (HPA) automatically scales the number of pods based on CPU or on custom metrics tied to the business value of your services (number of requests, consumer lag in message handlers, …). It is really important to have an autoscaler, so that you do not request the maximum resources forever.
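To make the scaling behaviour concrete, here is a small sketch of the core formula described in the Kubernetes HPA documentation: desired replicas = ceil(current replicas x current metric / target metric). The function name and the min/max defaults are assumptions for this example, and the sketch leaves out the tolerance band and stabilisation behaviour that the real controller applies.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Simplified HPA calculation, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# 4 pods at 90% average CPU with a 60% target: scale up.
print(desired_replicas(4, 90, 60))  # prints 6
# 4 pods at 20% average CPU with a 60% target: scale down.
print(desired_replicas(4, 20, 60))  # prints 2
```

The same formula applies to a custom metric such as consumer lag: only the meaning of `current_metric` and `target_metric` changes.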


The Vertical Pod Autoscaler (VPA) frees users from having to keep the resource requests of the containers in their pods up to date. Once configured, it sets the requests for you. The drawback of VPA is that whenever it updates a pod's resources, the pod is recreated, which restarts all of its running containers. Also, it should not be used together with HPA on CPU or memory.

I would prefer to do VPA's job manually: always watch resource usage in your monitoring tool, then update the resource requests and limits yourself!
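One possible way to turn monitored usage into a request value is to take a high percentile of the observed samples and add some headroom. The sketch below does exactly that; the function name, the 95th percentile and the 20% headroom are assumptions chosen for illustration, not a recommendation from Kubernetes or from VPA itself.

```python
import math


def suggest_request(usage_samples, percentile=95, headroom_percent=20):
    """Nearest-rank percentile of the samples, plus headroom, rounded up."""
    ordered = sorted(usage_samples)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return math.ceil(ordered[rank - 1] * (100 + headroom_percent) / 100)


# Hypothetical CPU usage samples in millicores, read from a monitoring tool.
samples = [120, 130, 110, 150, 140, 135, 125, 160, 145, 155]
print(suggest_request(samples))  # prints 192
```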


Storage Type: When using volumes, it is really important to understand the disk utilisation (IO) of your StatefulSets! Not all of your StatefulSets need an SSD!

K8S makes it easy to create a PVC (PersistentVolumeClaim). In addition, the storage type you choose has a big impact on your cloud bill.

Size: The size of your volume is another factor that can inflate your bill. E.g. a Redis cache should not need more than 2 GB. But for your main database you will need good projections!
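A quick back-of-the-envelope calculation shows how storage type and size combine. The prices in this sketch are placeholders, not actual Google Cloud prices, and the function name is hypothetical; substitute the figures from your provider's current price list.

```python
# Hypothetical prices in USD per GB per month (placeholders, not real quotes).
PRICE_PER_GB_MONTH = {"standard": 0.04, "ssd": 0.17}


def monthly_volume_cost(size_gb, storage_type, replicas=1):
    """Monthly cost of one volume per replica of a StatefulSet."""
    return size_gb * PRICE_PER_GB_MONTH[storage_type] * replicas


# The same 100 GB volume for a 3-replica StatefulSet, on each storage type.
for storage_type in ("standard", "ssd"):
    cost = monthly_volume_cost(100, storage_type, replicas=3)
    print(f"{storage_type}: {cost:.2f} USD/month")
```

Because the cost is linear in both size and replica count, an oversized SSD volume is multiplied by every replica in the StatefulSet.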

Expanding Persistent Volumes: expanding PersistentVolumeClaims (PVCs) is enabled by default from Kubernetes 1.11. So when you choose a size, request the minimum possible: as long as the StorageClass allows volume expansion, you can extend the volume later with no data migration and, in most cases, without downtime.

The reclaim policy in K8S tells the cluster what to do with a volume after it has been released from its claim. Currently, volumes can either be Retained or Deleted.

  • Retained: even if you delete your PVC, the disk is still there in your cloud project. You will need to clean it up yourself, otherwise you keep paying for it.
  • Deleted: deletion removes both the PersistentVolume object from Kubernetes and the associated storage asset. This is not recommended in production.
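Retained volumes that nobody cleans up are an easy source of silent cost. As a minimal sketch, the function below filters a list of PersistentVolume objects (plain dictionaries shaped like the items of `kubectl get pv -o json`) for volumes that are in the Released phase with a Retain policy. The function name and the sample data are made up for this example.

```python
def find_orphaned_volumes(pv_items):
    """Return the names of PVs that were released by their claim but are
    kept around (and still billed) because their reclaim policy is Retain."""
    orphaned = []
    for pv in pv_items:
        policy = pv.get("spec", {}).get("persistentVolumeReclaimPolicy")
        phase = pv.get("status", {}).get("phase")
        if policy == "Retain" and phase == "Released":
            orphaned.append(pv["metadata"]["name"])
    return orphaned


# Hypothetical sample data.
pvs = [
    {"metadata": {"name": "pv-mongo-0"},
     "spec": {"persistentVolumeReclaimPolicy": "Retain"},
     "status": {"phase": "Bound"}},
    {"metadata": {"name": "pv-old-redis"},
     "spec": {"persistentVolumeReclaimPolicy": "Retain"},
     "status": {"phase": "Released"}},
]
print(find_orphaned_volumes(pvs))  # prints ['pv-old-redis']
```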

Service Scheduler :

  • Node Selection:

Choosing the right node when scheduling your pods is a really important factor in optimising resource allocation! Google Cloud provides multiple machine types: n1-standard-1, n1-highmem-2, n1-highcpu-2, ...

A better distribution of pods across our nodes increases the efficiency of resource allocation in our cloud.

E.g. MongoDB uses more memory than CPU, so n1-highmem-2 could be a good fit for a MongoDB StatefulSet.
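The MongoDB example boils down to matching a workload's memory-to-CPU ratio against the machine type's ratio. The sketch below illustrates that idea with the three machine types named above; the vCPU and memory figures should be checked against Google's current machine type documentation, the function name is hypothetical, and the sketch ignores the fact that a node's allocatable resources are lower than its raw capacity.

```python
# Machine shapes as listed for the N1 family; verify before relying on them.
MACHINE_TYPES = {
    "n1-standard-1": {"vcpus": 1, "memory_gb": 3.75},
    "n1-highmem-2": {"vcpus": 2, "memory_gb": 13.0},
    "n1-highcpu-2": {"vcpus": 2, "memory_gb": 1.8},
}


def best_fit_machine(cpu_request, memory_gb_request):
    """Among machine types large enough for the pod, pick the one whose
    memory-per-vCPU ratio is closest to the pod's own ratio."""
    wanted = memory_gb_request / cpu_request
    candidates = [
        name for name, spec in MACHINE_TYPES.items()
        if spec["vcpus"] >= cpu_request and spec["memory_gb"] >= memory_gb_request
    ]
    if not candidates:
        return None

    def ratio_gap(name):
        spec = MACHINE_TYPES[name]
        return abs(spec["memory_gb"] / spec["vcpus"] - wanted)

    return min(candidates, key=ratio_gap)


# A memory-heavy pod (1 vCPU, 8 GB) versus a CPU-heavy one (2 vCPU, 1.5 GB).
print(best_fit_machine(1, 8))
print(best_fit_machine(2, 1.5))
```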

Always monitor the resource utilisation of your nodes with a monitoring tool.

Inter-zone networking :

We use multiple zones to ensure redundancy and reliability, but inter-zone networking can introduce surprisingly high costs, particularly for noisy components like Apache Kafka/Storm/… To avoid that, you will need to review your fetch and pull strategy! Also review network utilisation in your system over time.

Better pod scheduling might also help to reduce inter-zone traffic: group pods that interact heavily in the same zone.
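To see why grouping chatty pods matters, the sketch below compares the billable cross-zone traffic of two placements. The service names, traffic volumes, zone names and the price per GB are all invented for this illustration; use your own network metrics and your provider's current inter-zone rate.

```python
# Hypothetical inter-zone price in USD per GB (placeholder, not a real quote).
PRICE_PER_GB = 0.01


def cross_zone_gb(flows, placement):
    """Sum the traffic (in GB) of flows whose endpoints sit in different zones."""
    return sum(gb for src, dst, gb in flows if placement[src] != placement[dst])


# Monthly traffic between services: (source, destination, GB).
flows = [("producer", "kafka", 500), ("kafka", "consumer", 500), ("api", "db", 50)]

spread = {"producer": "zone-a", "kafka": "zone-b", "consumer": "zone-c",
          "api": "zone-a", "db": "zone-a"}
grouped = {"producer": "zone-a", "kafka": "zone-a", "consumer": "zone-a",
           "api": "zone-b", "db": "zone-b"}

for label, placement in (("spread", spread), ("grouped", grouped)):
    gb = cross_zone_gb(flows, placement)
    print(f"{label}: {gb} GB cross-zone, about {gb * PRICE_PER_GB:.2f} USD")
```

Note the trade-off: placing every replica of a component in a single zone reduces traffic cost but also reduces the redundancy that multiple zones were meant to provide.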

Search for the discounts:

Google offers some discounts automatically, while others require a commitment contract.

Price Calculator :