Cost models in cost management for OpenShift
I am working on a new series of articles about this topic on LinkedIn. Please go there for updates.
From charging a flat fee to applying complex cross-product discounts on a telco bill for a tier-priced service, modeling the real costs of running a service in the cloud is a challenge.
Let’s start by stating that this article is not about creating an invoice or providing a carrier-grade billing service, but about a good way of modeling your costs (or your team’s costs) for chargeback and showback. Many considerations needed for a real B2B billing engine, such as the regulatory and legal requirements of each country where your users can be billed, won’t be covered here.
OpenShift includes access to a new cost management tool at no additional cost, and we have been working hard to make sense of all of those little things, so it will be great to get your feedback at firstname.lastname@example.org
A cost model that is simple enough to work and complex enough to be accurate should…
- … support different sources of information, like metrics, costs from providers, labels, etc.
- … be capable of adapting to different environments without needing to code the logic
- … be as simple as possible so it is easy to understand
- … be capable of modeling most of the scenarios (Pareto principle)
But, what do I mean when I say cost model?
Let’s think of a cost model as a framework that defines how to go from metrics and cloud costs to something closer to the real costs you want to charge your customers.
In cost management, our cost model considers three types of inputs:
1. Raw costs provided by the cloud. All cloud providers give you access to the itemized bill, plus a lot of information about what composes it (basically one line item per hour or day for each resource being used).
2. Inventory items. OpenShift provides you with an inventory of the components that are part of an installation, including the parameters needed to size the cluster (e.g. the number of CPUs or GB of memory per node). These are used for capacity-based charging.
3. Metrics. Internal metrics allow usage-based charging; we gather them through the cost management operator (in its current form, the Koku metrics operator).
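To make the three inputs concrete, here is a minimal sketch of how they might be represented. The field names are illustrative assumptions, not the actual cost management schema:

```python
from dataclasses import dataclass

@dataclass
class RawCostLine:
    """One itemized line from the cloud bill (hypothetical shape)."""
    resource_id: str
    hour: str          # billing period, e.g. "2024-01-01T00"
    cost_usd: float

@dataclass
class InventoryItem:
    """A node and its capacity, used for capacity-based charging."""
    node: str
    cpu_cores: int
    memory_gb: int

@dataclass
class MetricSample:
    """A usage measurement, used for usage-based charging."""
    project: str
    metric: str        # e.g. "cpu_core_hours"
    value: float
```

Each input answers a different question: what the cloud charged you, what capacity you own, and how much of it each project actually used.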
Step 1: Gather input data
We need to gather all the data from the different sources and ingest it into the tool for processing. Every source is different, but the result is the same: all of that line-item data is uploaded into our database so you can see what you need.
OpenShift. OpenShift data is gathered through the OpenShift operator that can be optionally installed in your OpenShift cluster.
Cloud. We want to support as many clouds as possible, and currently we support AWS and Azure. In order to reduce the permissions needed and facilitate processing, we basically ask you to give us read-only access to the Cost and Usage Report files in AWS, or the equivalent.
Next step: calculations
Calculating charges then depends on the type of source you use for infrastructure and how accurate you want to be. Clouds provide a lot of information, but they can’t show you every single cost you have. Even if you know the direct and operational costs of your cloud assets, you also need to take some indirect costs into account (like the cost of the systems you use to automate your data center, or the staff responsible for keeping everything running).
So we’ve taken a blunt approach and modeled these different scenarios in a simple but effective way:
- Where you are comfortable charging per usage (CPU, memory, storage), we allow you to define a price list that uses the metered data.
- If you prefer to charge by big numbers, you can define in your price list a price per node (and soon you will be able to do it per project and cluster).
- If you don’t really want to go into detail and maintain the lifecycle of your price list, we support adding a markup to the cloud cost.
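The three approaches above boil down to simple arithmetic. Here is a minimal sketch; the rates and usage figures are made up for illustration:

```python
def usage_charge(cpu_core_hours, gb_mem_hours, price_list):
    """Usage-based: meter CPU and memory against a price list."""
    return (cpu_core_hours * price_list["cpu_core_hour"]
            + gb_mem_hours * price_list["gb_mem_hour"])

def node_charge(node_count, price_per_node_month):
    """Capacity-based: a flat monthly price per node."""
    return node_count * price_per_node_month

def markup_charge(cloud_bill, markup_pct):
    """Markup-based: pass the cloud bill through with a percentage markup."""
    return cloud_bill * (1 + markup_pct / 100)

price_list = {"cpu_core_hour": 0.02, "gb_mem_hour": 0.005}
print(usage_charge(10_000, 40_000, price_list))   # 400.0
print(node_charge(6, 250.0))                      # 1500.0
print(markup_charge(1_200.0, 100))                # 2400.0
```

The trade-off runs from most accurate but highest maintenance (a full price list) to least granular but nearly maintenance-free (a markup on the cloud bill).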
And as we want to improve the model to support new scenarios, we have also added the capability to adapt your rates to your labels:
- Create a label: “storage-tier” and associate it to your storage
- Create a rate that depends on the label “storage-tier”, and define different rates for “SSD”, “HD”, etc.
- Mark one rate as the default to use when no label is found.
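The steps above can be sketched as a rate lookup with a fallback. The "storage-tier" label comes from the example; the per-GB-month rates are made-up values:

```python
# Price per GB-month by storage tier, keyed on the "storage-tier" label.
STORAGE_RATES = {"SSD": 0.10, "HD": 0.04}
DEFAULT_RATE = 0.06  # applied when no "storage-tier" label is found

def storage_charge(gb_months, labels):
    """Charge metered storage at the rate selected by its label."""
    tier = labels.get("storage-tier")
    rate = STORAGE_RATES.get(tier, DEFAULT_RATE)
    return gb_months * rate

print(storage_charge(500, {"storage-tier": "SSD"}))  # 50.0
print(storage_charge(500, {}))                       # 30.0
```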
Example 1. You are comfortable with pricing based on metering, and you can select a proper price per CPU and memory for all your OpenShift projects. Just define a price list and associate the rates to your cluster.
Example 2. You don’t know how much a CPU costs, but you know how much your cluster is costing you. Put a monthly price on the cluster and see how cost management distributes the cost into projects and nodes.
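One plausible way to distribute a flat cluster price, as in Example 2, is proportionally to each project’s measured usage. This is an illustrative sketch, not the exact distribution logic cost management uses:

```python
def distribute(cluster_price, usage_by_project):
    """Split a flat price across projects in proportion to their usage."""
    total = sum(usage_by_project.values())
    return {project: cluster_price * usage / total
            for project, usage in usage_by_project.items()}

# Hypothetical CPU usage shares for three projects on one cluster.
usage = {"shop": 60.0, "billing": 30.0, "dev": 10.0}
print(distribute(1_000.0, usage))
# {'shop': 600.0, 'billing': 300.0, 'dev': 100.0}
```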
Example 3. You believe your real AWS costs are roughly double the bill you receive from Amazon. Add a cost model with a markup of 100% and let cost management distribute the costs.
Getting the real costs of running Kubernetes in the cloud can be complicated, but there are tools that can help you get as close as you need to a real number to be sure that you can make business decisions.
Cost management for OpenShift provides different components that you can use to create a full cost model, based on actual bill data, metrics, or inventory items, and it is capable of adapting the rates using labels to reflect different scenarios.
OpenShift includes cost management at no additional cost from version 4.3 onward (for supported versions): a tool that can help you understand and manage your costs.
You can start using it today to understand your costs when running OpenShift, get a better picture of how those costs distribute across your business elements, and collaborate with us to make it the best open source cost management tool for Kubernetes.