How to save money fast with Kubernetes — Do FinOps
At ADEO, my team is responsible for managing large Kubernetes clusters, mostly in Cloud environments. Now that our apps are deployed, it is time to optimize consumption and cost!
Cloud operating costs are not always the major preoccupation of app developers. Many factors come into play, making cost evaluation as opaque as the far side of the moon. We propose a set of tools supporting the FinOps methodology. The key is to monitor each app’s operational requirements and fine-tune the infrastructure to avoid over-provisioning.
In this paper, we describe how to save money on the cloud and share tips on cloud infrastructure optimization.
Step 1: Measure and log your resource consumption
To understand what you gain or lose, you must measure how much each app costs, how many resources it consumes, and who consumes those services.
For each app, the operations team works out the set of metrics to measure its usage and the cost of running it. On GCP, this comes down to CPU and memory consumption in our current simplified model. Over the next year, we will work on more precise metrics.
We created a tracking system to monitor detailed resource consumption hour by hour, using custom scripts deployed as CronJobs inside each cluster. We then aggregate these values into a BigQuery dataset, enriching the data with details on Business Units, platforms, domains, products, etc. Month after month, our tool logs consumption and proposes recommendations based on actual consumption versus reservation.
Of course, everything can be filtered and we now have several years of history.
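As an illustration, such an hourly export can be deployed as a Kubernetes CronJob. This is only a sketch: the namespace, image, and BigQuery table names below are hypothetical placeholders, not our actual tooling.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: consumption-exporter   # hypothetical name
  namespace: finops            # hypothetical namespace
spec:
  schedule: "0 * * * *"        # run at the top of every hour
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: exporter
              image: registry.example.com/finops/exporter:latest  # placeholder image
              # The script would collect per-pod CPU/memory usage and requests,
              # then push the rows to a BigQuery table (placeholder name).
              args: ["--bq-table", "finops.consumption_hourly"]
```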
Step 2: Optimize pod configuration based on resource requests and limits
The main optimization is related to pod configuration: resource requests and limits.
In addition to the invoice view presented above, we offer an operational dashboard implemented with Datadog, allowing us to centralize all Kubernetes metrics. These metrics can also be collected with open-source tools like Prometheus and Grafana, albeit with more setup effort.
Pods are configured with CPU and memory requests. We take the example of a rather classic Java app managed by an ADEO team:
```yaml
resources:
  requests:
    cpu: "2"
    memory: 2Gi
```

After deploying this pod, the operational dashboard shows how much Kubernetes actually reserves in terms of CPU and memory:
The graph on the left shows CPU reservation vs usage. On average, the pods are using only 5.52% of the CPU reserved. The graph on the right shows Memory reservation vs usage. On average, the pods are using 46.08% of the amount of memory reserved.
Clearly, we are over-reserving resources and wasting money on unused capacity in the cloud. We work with the Product teams to reduce each app's resource reservation and find a good balance between reservation and effective average usage across different workloads. For example, we changed the previous application's values to:
```yaml
resources:
  requests:
    cpu: 200m
    memory: 1100Mi
```

By decreasing the resource reservation in the pod specification (the blue graph) from 1 core to 0.6 cores on average, reservation and effective average usage are now close enough, while still keeping a reasonable safety margin. Finding the right values requires testing, measuring, and more testing under different loads.
Step 3: Repeat for all apps
We evangelize teams with dashboards and key metrics, so they can immediately see the gains. We also organize meetings, set up notifications, and so on. Our goal is to make users aware of the savings achievable by fine-tuning resource parameters.
In addition to the resource graphs, our complete Datadog dashboard includes two tables.
The thresholds we have chosen are:
- 0 -> 40% = RED: you are over-provisioning resources. Consider reducing the requests on your deployment; pod requests are too high compared to actual usage.
- 40 -> 100% = GREEN: your configuration is optimized; usage and requests are close enough.
- 100% and beyond = YELLow: you consume more resources than you reserved for your pods. Consider increasing the requests; otherwise your pods may suffer from Kubernetes scheduling Quality of Service decisions (throttling or eviction under pressure).
We know that applications evolve over time: their consumption changes between day and night. The proposed 0–40%, 40–100%, and +100% thresholds are arbitrary and serve only as a guide for the teams.
Going further
FinOps doesn’t just stop at fine-tuning pod resources on Kubernetes. To go further, you may consider:
- A more accurate “internal invoicing” dashboard. This dashboard is sent to the product teams directly, so they know the costs hour by hour.
- Decrease pod provisioning at night and on weekends, depending on your real business requirements. By keeping apps online only from 8:00 a.m. to 7:00 p.m., Monday to Friday, instead of 24/7, we save 70% on the total bill.
- Use Kubernetes autoscaling. The HPA (Horizontal Pod Autoscaler) can start with only 1 replica and scale up as the load increases. You might consider external metrics better suited to your needs. We also started using the WPA (Watermark Pod Autoscaler, from Datadog) as an alternative to the HPA for some use cases.
- Set up LimitRange and ResourceQuota objects on namespaces, which lets you cap the resources and the number of pods allowed within a namespace.
- Check whether or not the “safe-to-evict” option is activated on the cluster. It may impact your applications, so be sure to read the documentation carefully.
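As a sketch of the HPA approach mentioned above (the deployment name and targets are illustrative assumptions, not values from our clusters):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-java-app        # hypothetical deployment name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-java-app
  minReplicas: 1           # start small, as described above
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80  # scale up when average CPU usage exceeds 80% of requests
```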
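Similarly, a minimal LimitRange and ResourceQuota sketch for a namespace (the namespace name and all values are illustrative):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: team-a        # hypothetical namespace
spec:
  limits:
    - type: Container
      defaultRequest:      # applied when a container omits requests
        cpu: 100m
        memory: 256Mi
      default:             # applied when a container omits limits
        cpu: 500m
        memory: 512Mi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    pods: "50"             # caps the number of pods in the namespace
    requests.cpu: "20"
    requests.memory: 40Gi
```

The LimitRange gives sane defaults to containers that forget to declare requests, while the ResourceQuota bounds the namespace as a whole.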
I hope these tips will be useful for you! I’d love to talk with you about this: don’t hesitate to contact me on LinkedIn!

