CostOps vs FinOps

Tim Prentice
Cloud Financial Management for Kiwi’s
Apr 21, 2023

I first started working with Public Cloud in 2014. The role was with the first-ever AWS distributor trying to bring Public Cloud to a curious but worried set of Managed Service Partners. Back then, I spent a lot of time explaining I was not asking teams to sell books on the internet. Instead, I was tasked with educating people on what could be a huge opportunity for them and their customers. In just eight short years and many roles later, those initially tentative conversations have come to dominate infrastructure strategy for most businesses. Now more than just a big opportunity, there is a genuine sense of urgency for many companies trying to capitalise on the new capabilities that define modern IT delivery.

Throughout all this change, I have also seen “FinOps” grow from obscure references in niche blogs to well-defined roles in large organisations. The resulting community has been working hard to unlock the power of consumption economics and augment the technical capacity of Public Cloud with meaningful business process improvements. However, while the progress so far has been exciting, most of the hard yards are still ahead for the FinOps community. One of the most pervasive blockers is the confusion between FinOps practices and what, for the purposes of this blog, we will call CostOps. A colleague of mine coined this term during a session where we were trying to define our most significant challenges, and it fit the bill nicely.

CostOps

One of the most significant barriers to Cloud financial maturity is the illusion of progress. The simplest success metric for Infrastructure teams and MSPs alike is cost reduction. If your team is skilled, it may even be a reduction in spend without a reduction in capability, i.e., pure cost-out. Whenever I get access to a new public cloud environment, it is a safe bet there are at least some inefficiencies to be hunted down and cleaned up. This experience has created a bunch of very talented CostOps practitioners who can look at Cloud usage data and output a set of actions that will rapidly bring down any wastage. If your leadership is motivated by cost at the time, it looks like a great outcome when engineers step in and make the changes.

The only problem is, what about the time before the cost review? All that time when inefficiency was incurring costs unchecked. What about the time between the end of the current review and the next one? Will the costs spike back up, waiting to be cleaned up again the next time you engage an expert?

You do not have to look too far to find examples of a better approach. By integrating cost awareness directly into both development and operational teams, waste can be identified and remediated before it has a chance to incur significant costs. In fact, most born-in-the-cloud companies do this reflexively. DevCostOps, if you will.

DevCostOps, as with all the stacks of DevXXXOps, is as much a cultural shift as it is a process and tooling one. Costs are tracked not at a project or department level but integrated into the engineering teams’ design, deploy, and manage processes. This level of integration allocates the time and resources required to review inefficiencies as utilisation data becomes available rather than post hoc. This type of engagement integrates the efficiency of Public Cloud use into a company’s definition of quality engineering. Teams that do this well post metrics on efficiency publicly and compare them the same way a DevOps team might look at Velocity, Uptime or Performance metrics.

The problem?

It’s pretty hard to prioritise and fund the work needed to upgrade a team’s practices to allow this kind of approach. It takes skills and experience that are rare and expensive to hire. This is often made worse because most businesses are unaware of this issue until they have their first significant budget blowout, which creates an urgent need for cost-out. Time to climb back onto the CostOps roller coaster.

FinOps

In its best iteration, FinOps includes all the components of DevCostOps and is an umbrella term for any of the practices and roles mentioned here. That said, there is one aspect of this space that many FinOps consulting partners and infrastructure teams have yet to engage with: reliably integrating Cloud Operational insights into Finance processes.

I will freely admit that before my time working inside a large enterprise, I was ignorant of this issue myself. I would regularly be heard complaining, “I can save them $XX if only I could get their team to pay attention to this issue”. It took stepping into the world of enterprise finances to realise the real problem. It was not a lack of interest from the engineers or ops teams but that the business had no conduit to get the information to the right people. Outside the quarterly or annual budget process, most operations teams did not interact with finance that much. The demands of the rest of the business were much louder. Then, when a period closed, finance reconciled budgets, and the reality of the costs became apparent. With this data in hand, they exerted their often potent influence. It’s time to cost out, in a hurry!

So, what does a healthy FinOps progression look like?

There is no one answer. Every organisation we have worked with has needed a different kind of help. That said, I can share a few insights from the common patterns we are seeing:

Insight 1:

Engage with finance directly and early before trying to establish any of the following ideas. Upcoming migrations can provide an excellent opportunity to explain how cloud financials work and the commercial levers available to measure and mitigate Public Cloud costs. Most finance teams have a regular rhythm for collecting information from the business. These processes usually focus on budget changes and shifting funding from one area to another, but this rarely works at the cadence that suits Public Cloud spend. Discuss the opportunity for a more granular rhythm for updating the forecasted spend on Public Cloud. The process for oversight and approval comes later; this part focuses on information gathering. Useful things to know from your finance teams:

· What format should data come in for finance to integrate it into their workflow?

· Are there CAPEX or OPEX concerns? (Speak to your Public Cloud provider if you want more insight into how best to cater to these.)

· What are their KPIs around forecast and budget accuracy? How can being more transparent and granular help them achieve their goals?

Insight 2:

Get to understand the approval and oversight mechanisms for forecasting and budget allocations. This is the heartbeat of the business’s resource flow, and it is also the source of pain when it comes to misaligned spend. To integrate cost awareness without stifling innovation, teams need a way to quickly request more funding or be recognised for returning planned spend to the business.

Do not underestimate the value of celebrating a reduction in planned costs! This should have a forum where the message can be received loudly and widely. If you do not celebrate it and make returning an allocation easy, teams will not prioritise the activity outside of large-scale (and therefore very distracting) pushes for reduced costs.

Insight 3:

Establish granular ownership and accountability of cloud costs. We have discovered that the most logical grouping is often per Application. We recommend designing your resource bucket structure to match. This makes it easy to hold teams accountable for the cost they incur. That said, it is not always that simple. Many companies have already grouped resources in less transparent ways. Getting the required clarity in these situations might take some rather sophisticated tagging.

However you go about it, getting insight from the people closest to how the cost is incurred is critical. This can only be achieved by holding the engineer or engineering team lead who generates a cost accountable for it.

The ideal is both a “Technical” and “Business” owner for each resource bucket. The Technical owner dictates who has access to resources and can incur cost. The Business owner owns the forecast of the spend in that resource bucket and any requirements for asking for more money when needed. They work together to communicate the expected changes to cost over time and own the investigation process when costs are not as expected.
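As a rough illustration of the per-Application grouping, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `ResourceBucket` fields, the `application` tag key, the shape of the billing line items and the owner addresses are all hypothetical, not the export format of any particular provider.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceBucket:
    """One accountable grouping of cloud resources (hypothetical: one per application)."""
    application: str
    technical_owner: str  # decides who has access and can incur cost
    business_owner: str   # owns the forecast and any request for more funding


def allocate_costs(line_items, buckets, tag_key="application"):
    """Sum cost per known bucket and collect items that cannot be attributed."""
    totals = defaultdict(float)
    unallocated = []
    for item in line_items:
        app = item.get("tags", {}).get(tag_key)
        if app in buckets:
            totals[app] += item["cost"]
        else:
            # Missing or unknown tag: nobody is accountable for this spend yet.
            unallocated.append(item)
    return dict(totals), unallocated


# Illustrative data only.
buckets = {
    "billing-api": ResourceBucket(
        "billing-api", "tech.lead@example.com", "product.owner@example.com"
    ),
}
line_items = [
    {"resource_id": "vm-1", "cost": 120.0, "tags": {"application": "billing-api"}},
    {"resource_id": "db-1", "cost": 80.0, "tags": {"application": "billing-api"}},
    {"resource_id": "vm-2", "cost": 45.5, "tags": {}},
]

totals, unallocated = allocate_costs(line_items, buckets)
for app, cost in totals.items():
    owner = buckets[app].business_owner
    print(f"{app}: {cost:.2f} (forecast owner: {owner})")
print("Unallocated:", [item["resource_id"] for item in unallocated])
```

The unallocated list is the useful part in practice: it shows how much spend still has no owner, which is a reasonable measure of how much tagging work is left.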

Insight 4:

The most effective glue to hold this process together is a monthly public cloud forecast. It is where the technical team’s insights meet finance’s need for predictability and transparency. The effort required is minimal when a team has an environment with little variation over time; they do not necessarily need to participate every month. Usually, a 15-minute conversation every quarter is enough to ensure the long-term outlook remains the same. However, where teams experience a lot of growth or volatility, they can proactively check in to ensure they have the resources they need or hand back funding to other teams or projects that could benefit. One unexpected benefit of our first true implementation of this approach was that it also made cost issues very easy to spot. If something was way off forecast, we could investigate and remediate it quickly. It is amazing how much a rogue CloudWatch metric can cost when left unattended.
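The "way off forecast" check can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the process we actually ran: the `off_forecast` function, the 20% tolerance and the bucket names and figures are all made up for the example.

```python
def off_forecast(forecast, actual, tolerance=0.20):
    """Return buckets whose actual spend deviates from forecast by more than `tolerance`.

    Values are relative variances: positive means overspend, negative means
    underspend. A bucket with spend but no forecast is flagged with None.
    """
    flagged = {}
    for bucket in sorted(set(forecast) | set(actual)):
        planned = forecast.get(bucket, 0.0)
        spent = actual.get(bucket, 0.0)
        if planned == 0.0:
            if spent > 0.0:
                flagged[bucket] = None  # spend that nobody forecast
            continue
        variance = (spent - planned) / planned
        if abs(variance) > tolerance:
            flagged[bucket] = round(variance, 2)
    return flagged


# Illustrative monthly figures per resource bucket.
forecast = {"billing-api": 1000.0, "reporting": 400.0, "sandbox": 100.0}
actual = {"billing-api": 1050.0, "reporting": 280.0, "sandbox": 340.0}

flagged = off_forecast(forecast, actual)
for bucket, variance in flagged.items():
    if variance is None:
        print(f"{bucket}: spend with no forecast, find an owner")
    elif variance > 0:
        print(f"{bucket}: {variance:+.0%} over forecast, investigate")
    else:
        print(f"{bucket}: {variance:+.0%} under forecast, candidate to hand funding back")
```

Flagging underspend as well as overspend is deliberate. It surfaces the teams that could return an allocation, which is the behaviour Insight 2 suggests celebrating.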

This is not to say the process is for everyone, but it does highlight a few ways to make good financial hygiene a “win-win” for both tech and finance teams.

Bringing it all together

Every business has a team that cares about cost and financial waste, and for good reason. There is no better way to increase profitability than to cut operating costs. Unfortunately, the problem for many companies today is not that no one cares. It’s that the issue must get so large and noisy before it can be prioritised. What’s worse, when the budget overrun does bite, the requirement from the business to cut costs can be hugely disruptive and damage the credibility of tech teams.

In cases where we got the new, more granular level of communication in place with finance, we never had another budget blowout. For example, I saw first-hand the power of a financially aware technical team when we had to tackle the massive downsizing of our estate because of COVID-19. We never once had a conversation involving broad-brush shutdowns of services or demands for arbitrary reductions in cost. This was because Finance could see that massive efforts were underway, and they could see it in almost real-time. That won us a lot of trust when setting the next round of budgets under new financial constraints.

FinOps can be defined as better communication between Finance and Ops, and it is how you generate the “why” teams need to engage in CostOps. In return for a relatively modest effort, finance teams can comfortably manage the variable spend of Public Cloud, and technical teams gain substantially more freedom to innovate.
