Open your eyes and start cleaning: The first method of the FinOps strategy.

Anton Grishko
9 min read · Sep 26, 2022


Hi, I’m Anton Grishko, Cloud Architect at Profisea Labs. We continue to talk about FinOps, the cultural model of cloud financial management. With FinOps, businesses can maximize profits as their teams (technology, engineering, finance, etc.) make spending decisions based on precise analytics.

I’ve decided to write a series of articles about cloud management to help readers find the answer to perhaps the most popular question I regularly receive from cloud infrastructure owners. Faced with cloud budget overruns and inflated cloud service bills daily, they wonder what specific steps to take to reduce these costs. They need clear answers. Today, continuing the topic of effective FinOps methods started in the previous article, I want to talk about the first method of FinOps strategy.

The first method of the FinOps strategy is to provide full-scale visualization of, and visibility into, the cloud infrastructure. This enables full-fledged cloud waste management and delivers the first (often significant) reduction in cloud costs.

This information will be helpful for cloud infrastructure owners, IT managers (CTO, CFO, CIO), and engineers (DevOps, developers, and others) who care about the problem of rapidly growing cloud costs and are looking for ways to optimize cloud costs.

FinOps is a living model of cloud-based cost management

Today, as companies consider their IT infrastructure options, whether to stay on-premises or move to the cloud, FinOps clears up many doubts.

The most important reason any company decides to move to the cloud is the desire to take advantage of the cloud's innovation, scalability, and speed, reducing time to market and increasing competitiveness by expanding the user base. Gartner predicts that by 2025, 80% of enterprises will move to the cloud.

Adoption is well above the median, but some organizations still hesitate, the most common concerns being security and cost. So let's focus on the problem of the high price of the cloud, because the fear of high costs is not unfounded, as you know from previous articles.

Organizations continue to increase spending on cloud technologies, although the rate of growth is slowing. Flexera respondents reported that their average public cloud spending was 13% over budget. Additionally, they expect cloud spending to grow by 29% over the next twelve months.

Source: Flexera

No one wants to pay more than they have to. That’s where FinOps comes in, revolutionizing the culture of financial reporting in a cloud variable cost model and enabling distributed engineering and business teams to balance speed, cost, and quality in their cloud environment decisions.

When moving to the cloud, everyone who uses cloud resources must share responsibility for costs, but existing software asset management processes, designed for on-premises technologies, do not account for the cloud's heterogeneous nature. As a result, a new strategy is needed to ensure that all users are accountable for their cloud spending.

And while creating a mature FinOps strategy takes time, effort, and collaboration, it will pay dividends in the future. A sound FinOps strategy helps maximize every dollar spent in the cloud by creating transparent accountability processes as teams carry out their cloud transformation.

Stop moving in the fog — shed some light

Providing full visualization of, and visibility into, your infrastructure is critical at the start of your FinOps journey. Although it doesn’t take much effort to open an AWS account and start spinning up instances, professionals often skip the initial preparation needed to keep costs from spiraling out of control. This is further complicated by the fact that many departments and individuals often work in the cloud at once. As a result, a lack of understanding of what is happening inside the infrastructure leads to a cost sprawl problem.

Data analysis and visualization from cloud providers

AWS, Google Cloud, and Microsoft Azure help their customers keep track of changes and costs in cloud infrastructure and offer the following solutions:

  • Amazon QuickSight makes it easy to develop and host unified dashboards and get instant responses to natural language queries.
  • Amazon CloudWatch supports custom metrics and can graph and alert on billing-related information.
  • Azure Data Explorer is a highly scalable data analytics service used to build complex data processing solutions; it also integrates with visualization tools.
  • Azure Monitor monitors Azure services and resources, providing detailed cloud infrastructure monitoring for deep insight.
  • Data Studio from Google Cloud can help make sense of data and aid in interactive analysis. In addition, Data Studio helps keep key trends on your radar, react quickly, and leverage data forecasting.
  • Google Operations (formerly Stackdriver) provides metrics, logs, trace support, and visibility into Google Cloud platform audit logs.
Source: cloud.google.com
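To make the billing data behind these dashboards concrete, here is a minimal Python sketch that aggregates per-service costs from a response shaped like AWS Cost Explorer's `GetCostAndUsage` output. The response here is abbreviated and hard-coded for illustration; in practice it would come from an authenticated API call (for example, via boto3), and the exact structure may vary.

```python
from collections import defaultdict

# A simplified, hard-coded response modeled on AWS Cost Explorer's
# GetCostAndUsage output (daily granularity, grouped by service).
# In practice this would come from an authenticated API call.
response = {
    "ResultsByTime": [
        {"TimePeriod": {"Start": "2022-09-01"},
         "Groups": [
             {"Keys": ["AmazonEC2"], "Metrics": {"UnblendedCost": {"Amount": "41.50"}}},
             {"Keys": ["AmazonS3"],  "Metrics": {"UnblendedCost": {"Amount": "3.20"}}},
         ]},
        {"TimePeriod": {"Start": "2022-09-02"},
         "Groups": [
             {"Keys": ["AmazonEC2"], "Metrics": {"UnblendedCost": {"Amount": "39.10"}}},
             {"Keys": ["AmazonS3"],  "Metrics": {"UnblendedCost": {"Amount": "3.25"}}},
         ]},
    ]
}

def cost_per_service(response):
    """Sum the cost of each service across all time periods."""
    totals = defaultdict(float)
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            totals[service] += float(group["Metrics"]["UnblendedCost"]["Amount"])
    # Round to cents for a readable report.
    return {service: round(total, 2) for service, total in totals.items()}

print(cost_per_service(response))  # {'AmazonEC2': 80.6, 'AmazonS3': 6.45}
```

Even this toy aggregation shows why per-service and per-day grouping matters: without it, a single invoice total hides which workload is actually driving the bill.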

Even though cloud providers equip their users with these visualization tools, most of the tools have several significant disadvantages:

  • No real-time statistics. Most reports are updated every 24 hours or less often, leaving you without the data needed to see the real-time picture.
Source: Profisealabs.com
  • Too complicated for non-specialists. In most cases, organizations need to turn to cloud experts to customize and automate visualization dashboards and, more importantly, to translate the data out of expert jargon.
  • No single dashboard for all statistics, so there are many reports from different datasets. It takes a lot of time and energy to find the necessary information and keep track of all the updates.
  • No complete, visual view of your cloud environment in the form of lists and diagrams, so that experts can see all the resources in use, understand their place in the infrastructure, and catch all the dependencies between machines.
  • No per-instance drill-down that provides one-click access to all machine details and data management right on the charts, including PDF or Excel exports.
  • Limited sharing and reporting across environments. Business users spend much of their time translating and rebuilding reports, and cannot push notifications from one domain to another for quick and straightforward interpretation. Sharing information with a user who has not been added to IAM is also overly complicated.
  • No immediate cloud management feature. Just imagine how cool it would be to have a diagram of your cloud infrastructure where you can view and manage everything right there: deactivate mismanaged instances, resize them, or convert them to spot instances while studying all the details in the interface. This advanced functionality is rare among existing cloud visualization tools.
  • No possibility of monitoring several cloud environments simultaneously. Imagine that you have purchased services from several providers; setting up tracking for each infrastructure separately will be, to put it mildly, very time-consuming.
  • Lack of functionality to track system changes, including details of who did what and when.
  • Lack of cloud security threat monitoring with automatic advice for resolving and predicting threats.
  • No advice function. Modern cloud visualization tools should provide both accurate information about your cloud environment and actionable advice on optimizing your cloud infrastructure, upgrading it, and saving money and time.

DevOps professionals have relied for years on monitoring stacks such as Prometheus (metrics storage) and Grafana (data visualization). Although these tools work well, planning, installing, configuring, and maintaining such monitoring pipelines requires considerable DevOps expertise and experience. They also lack some or all of the functions above, along with the ability to correct infrastructure deficiencies on the spot based on the resulting analytics.

Everything is clear. Let’s start cleaning!

Like any other waste, cloud waste accumulates when an organization acquires more capacity than it needs and when health checks and regular cleanups do not occur.

Cloud waste occurs when:

  • Resources are unmanaged. Testing and development workloads are common culprits: they accumulate debris and leave instances running around the clock, which is entirely unnecessary.
  • Resources are oversized, or there are too many of them. IT leaders tend to think “the bigger, the better” when predicting how many and what type of cloud instances they need. As a result, they often choose a larger instance than they need.
  • Resources are lost. Resources become homeless and wander the infrastructure alone. This happens when a virtual machine is terminated, but the resources attached to that instance continue to run and cost money. The main problem is that it is difficult for cloud infrastructure owners to detect these volumes and shut them down.
  • Neglecting savings options. Cloud providers offer different pricing options that can save significant money when used correctly. These options should be used no matter what they are called, whether Reserved Instances in AWS or subscription-based pricing models in Alibaba Cloud.
  • Neglecting regular health checks and cleaning. When was the last time you cleaned cloud waste? And the problem isn’t even that cloud infrastructure owners don’t want to clean up. They can’t see the whole picture, don’t identify the missing elements, and end up with colossal cloud bills that require specially trained experts to read.
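As a rough illustration of how “lost” resources can be detected once you have an inventory, here is a short Python sketch that flags unattached volumes. The field names mimic the shape of an EC2 describe-volumes response, but the data is hard-coded and illustrative; it is not a real API call.

```python
# Hypothetical inventory entries, shaped loosely like an EC2
# describe-volumes response: each volume has a state and a list of
# attachments. Field names here are illustrative, not a real API.
volumes = [
    {"VolumeId": "vol-001", "State": "in-use",    "Attachments": [{"InstanceId": "i-1"}]},
    {"VolumeId": "vol-002", "State": "available", "Attachments": []},  # orphaned
    {"VolumeId": "vol-003", "State": "available", "Attachments": []},  # orphaned
]

def find_orphaned_volumes(volumes):
    """Return the IDs of volumes not attached to any instance ("lost" resources)."""
    return [v["VolumeId"] for v in volumes
            if v["State"] == "available" and not v["Attachments"]]

print(find_orphaned_volumes(volumes))  # ['vol-002', 'vol-003']
```

The same filter-over-inventory pattern applies to unattached elastic IPs, stale snapshots, and idle load balancers; the hard part in practice is keeping the inventory complete and current, which is exactly what visibility tooling is for.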

Therefore, by setting up proper visualization and visibility in the cloud environment with the help of modern tools and platforms, we can see, analyze, and detect unmanaged and lost resources.

What should you do when cloud waste is discovered? Health checks and cloud waste cleaning, eliminating old volumes, snapshots, and machine images, should become a regular procedure. Of course, this is quite a process that requires time and effort from a team of cloud experts. However, the solution can be a platform that visualizes the cloud infrastructure and lets you manage cloud junk right there, without leaving the interface, in one or two clicks.

The Waste Manager detects unattached, duplicated, and incorrectly sized resources and checks their creation time. If a resource is older than n days, the Waste Manager marks it as waste. All such resources are listed in the Waste Manager with full details, including where to find each instance.

The situation is different when the Waste Manager handles actual backups. For example, suppose we have two AMIs for one EC2 instance: one is 14 days old, and the other is 34 days old. The Waste Manager will mark only the 34-day-old AMI as waste. However, it will always keep one backup, no matter how old, if it is the only backup of the instance. In addition, incorrectly sized instances are also displayed in the Waste Manager so that the user can quickly spot them and apply the right-sizing recommendations.
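The retention rule described above can be sketched in a few lines of Python. This is my own illustrative reconstruction of the logic, not the actual Waste Manager implementation: AMIs older than a threshold are marked as waste, except that the newest backup of each instance is always kept, however old it is.

```python
from datetime import date, timedelta

def mark_waste(amis, today, max_age_days=30):
    """Mark AMIs older than max_age_days as waste, but always keep the
    newest backup of each instance, no matter how old it is."""
    waste = []
    # Group AMIs by the instance they back up.
    by_instance = {}
    for ami in amis:
        by_instance.setdefault(ami["instance_id"], []).append(ami)
    for backups in by_instance.values():
        # Sort newest first; the newest backup is always kept.
        backups = sorted(backups, key=lambda a: a["created"], reverse=True)
        for ami in backups[1:]:
            if (today - ami["created"]).days > max_age_days:
                waste.append(ami["ami_id"])
    return waste

# The article's example: two AMIs for one instance (14 and 34 days old),
# plus a second instance whose only backup is 90 days old.
today = date(2022, 9, 26)
amis = [
    {"ami_id": "ami-new",  "instance_id": "i-1", "created": today - timedelta(days=14)},
    {"ami_id": "ami-old",  "instance_id": "i-1", "created": today - timedelta(days=34)},
    {"ami_id": "ami-only", "instance_id": "i-2", "created": today - timedelta(days=90)},
]
print(mark_waste(amis, today))  # ['ami-old']: i-2's sole backup is kept despite its age
```

Only the 34-day-old duplicate is flagged; the 90-day-old AMI survives because deleting an instance's last backup would trade a few cents of storage for real recovery risk.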


And the best part is that users don’t need to do health checks manually. Instead, the Waste Manager automatically collects all cloud waste and waits for the cloud owner to come and clean it up in minutes, saving you hundreds or even thousands of dollars.

To be continued

Today, I shared with you the first practical method of the FinOps strategy: provide modern visualization of, and visibility into, your cloud infrastructure and manage cloud waste, which will help reduce cloud costs significantly.

In the following article, I will reveal the second FinOps method — how to choose the right family, size, and type of machines that will not lead to a sharp increase in cloud computing bills.

I hope this information was helpful. Let’s meet again in a couple of weeks!
