Azure Cloud Cost Optimization

Abdelhalim Addad
OCP digital factory
7 min read · Nov 22, 2022

If you’re in the DevOps business and working on a project that deploys to the cloud, chances are you’ve asked yourself: “Why should I care about cloud cost? People in finance should handle it. It’s not my job, right?” Wrong! The finance team most likely has no clue about that 16-core VM you tested and forgot to shut down or delete. They may be accountable for paying the bill, but you’re responsible for keeping that bill reasonable.

We rely more and more on the cloud because it is convenient and simple to use, and because not every company has the technical or financial means to run its own datacenter. As companies grow, they create more projects, deploy more workloads, build new prototypes and hand out access to new hires and interns. Without a proper resource optimization strategy, costs will eventually skyrocket.

In this article I’ll share some ways to optimize your cloud consumption. Although the tips here target Azure, most of them also apply to other cloud providers.

Identify unused resources

The most obvious way to decrease your cloud consumption is to find resources that were created but are not being used. This often happens when you try something out and forget to clean up afterwards.

Many engineers provision high-compute VMs to perform a task or run a batch job and, as soon as they are happy with the result, forget about the VM. Most of the time the cloud provider won’t warn you about running high-compute VMs, so this usually results in higher bills for resources that were used only once.

A good cost optimization strategy regularly looks for resources of this type and stops them or removes them completely. It is also recommended to set quotas or Azure policies that prevent users from provisioning high-compute resources unless the cloud administrators approve them.
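The recurring check described above can be sketched in a few lines. This is only an illustration, assuming you have already exported utilization data for your VMs: the `VmUsage` fields, the thresholds and the sample inventory are all hypothetical and do not come from any Azure API.

```python
from dataclasses import dataclass


@dataclass
class VmUsage:
    """One row of utilization data; all fields are hypothetical."""
    name: str
    cores: int
    avg_cpu_percent: float      # average CPU over the review window
    days_since_last_login: int


def find_idle_vms(vms, cpu_threshold=5.0, idle_days=14):
    """Return VMs that look unused, largest (most expensive) first."""
    idle = [
        vm for vm in vms
        if vm.avg_cpu_percent < cpu_threshold
        and vm.days_since_last_login >= idle_days
    ]
    return sorted(idle, key=lambda vm: vm.cores, reverse=True)


# Made-up inventory, for illustration only.
inventory = [
    VmUsage("batch-test-vm", 16, 0.8, 45),
    VmUsage("web-frontend", 4, 37.5, 0),
    VmUsage("old-prototype", 2, 1.2, 90),
]

for vm in find_idle_vms(inventory):
    print(f"{vm.name}: {vm.cores} cores, {vm.avg_cpu_percent}% CPU")
```

Sorting by core count puts the forgotten 16-core VM at the top of the list, which is usually where the biggest savings are.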

Rightsize your workloads

When we first move an application to production, we want it to be available, to handle heavy loads, to serve all concurrent connections from our users and never to crash. This often leads to oversized resources, especially if the application is open to the public, because we don’t have a clear estimate of the incoming traffic or the expected number of users. Yet we don’t want more resources than required, because they incur additional cost.

One way to handle this situation is autoscaling. Autoscaling lets us provision and free up resources on demand: we start with a desired number of instances (virtual machines, web apps, Kubernetes nodes, etc.) of a minimum size, and as traffic increases, more instances are spun up. On the other hand, when the load decreases, those instances are freed. We can also set a maximum number of instances to ensure we stay on budget.

Many Azure services come with autoscaling capabilities: we can use Virtual Machine Scale Sets for VMs, and for other resources (AKS, for example) we can enable the autoscaling option.
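To make the scale-out, scale-in and maximum-instances logic concrete, here is a minimal sketch of the kind of rule an autoscaler applies. It is not how Azure implements autoscaling; the function name, the CPU thresholds and the instance limits are assumptions chosen for the example.

```python
def desired_instances(current, cpu_percent, min_instances=2, max_instances=10,
                      scale_out_above=70.0, scale_in_below=30.0):
    """Return the instance count for the next interval.

    Adds one instance under high load, removes one under low load, and
    always stays within [min_instances, max_instances].
    """
    if cpu_percent > scale_out_above:
        target = current + 1
    elif cpu_percent < scale_in_below:
        target = current - 1
    else:
        target = current
    return max(min_instances, min(max_instances, target))


print(desired_instances(current=2, cpu_percent=85))   # scale out
print(desired_instances(current=10, cpu_percent=95))  # capped at the maximum
print(desired_instances(current=2, cpu_percent=10))   # held at the minimum
```

The `max_instances` cap is what keeps a traffic spike from turning into a budget spike.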

In addition to autoscaling, we have to understand what our application needs in terms of resources: is it compute intensive? Memory intensive? The answer lets us choose the right minimum size (in Azure, for virtual machines, this means choosing which VM family to use). There is also always room to optimize the code so that it uses fewer resources.

Use Dev/Test subscriptions

Many of the Azure resources we use aren’t for production, so they don’t require the same level of availability and reliability. For that purpose, Azure offers Dev/Test subscriptions for provisioning those resources.

Dev/Test subscriptions, as the name implies, are designed for developing and testing applications. You can use them to test new prototypes or build dev environments; available resources include virtual machines, managed databases, web apps, etc.

Dev/Test subscriptions can save up to 57% of the cost of a normal subscription, which is a significant reduction in your cloud bill. On the other hand, there is no uptime guarantee or SLA, so you can’t risk using them for production workloads.

Use reserved instances

If you are committed to Azure for the long term, you should probably consider investing in reserved instances. These offer larger discounts in exchange for an upfront payment and a time commitment. Reserved instance savings can reach up to 75%, so this is a must for cloud cost optimization.

Reserved instances are available for many Azure resources. Take the Azure D3v2 Windows VM as an example: reserving it for one year saves 46% compared to the pay-as-you-go plan, and reserving it for three years saves 64%.

Before going for reserved instances, make sure you are committed to staying on Azure and that your architecture is not going to change. If you plan to move from virtual machines to containers soon, you probably shouldn’t use reserved instances.
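A quick way to reason about that commitment is to compute the break-even point: how many months of pay-as-you-go usage cost as much as the whole reservation. The sketch below uses the 46% and 64% figures quoted above and assumes, for simplicity, that the VM runs continuously while in use and that the reservation is paid in full whether or not you keep using it. The function names are made up for this example.

```python
def reservation_break_even_months(discount, term_months):
    """Months of pay-as-you-go usage that cost the same as the full reservation.

    discount is the fraction saved versus pay-as-you-go (0.46 for 46%).
    """
    return (1 - discount) * term_months


def cheaper_to_reserve(discount, term_months, expected_months_of_use):
    """True if the reservation costs less than paying as you go."""
    return expected_months_of_use > reservation_break_even_months(discount, term_months)


for discount, term in [(0.46, 12), (0.64, 36)]:
    months = reservation_break_even_months(discount, term)
    print(f"{term}-month term at {discount:.0%} off: break-even after {months:.1f} months")
```

Under these assumptions, a one-year reservation only pays off if you keep the VM for roughly six and a half months or more, and a three-year reservation needs about thirteen months of use.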

Merge resources when possible

This is important if you have many projects in the development phase. Cloud providers charge for the resources you provision even when they are underused or not used at all. You can optimize costs by identifying these resources and merging them.

For example, suppose you have three applications in ongoing development and you dedicate one Kubernetes cluster to each. You might notice that CPU utilization is under 10% for each cluster, yet the provider charges for 100%. You are wasting a significant amount of computing resources, and you can save this money by provisioning one cluster and using namespaces to logically separate the projects. Another example is App Service: you can use a single App Service plan to host many apps that run in your DEV environment and are not under heavy load. The same approach can be applied to managed databases.

Combining this strategy with Dev/Test subscriptions will reduce your costs even further.
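The three-cluster example can be turned into a rough back-of-the-envelope calculation. The sketch below assumes all clusters are the same size and that you want to keep some headroom for spikes; the function name and the 60% target utilization are assumptions, not a recommendation.

```python
import math


def clusters_needed(utilizations, target_utilization=0.6):
    """Number of equally sized clusters needed to host all workloads.

    utilizations holds each existing cluster's CPU use as a fraction of
    one cluster's capacity; target_utilization leaves headroom for spikes.
    """
    total_load = sum(utilizations)
    return max(1, math.ceil(total_load / target_utilization))


current = [0.10, 0.10, 0.10]  # three clusters, each about 10% busy
needed = clusters_needed(current)
saved = 1 - needed / len(current)
print(f"{len(current)} clusters -> {needed}, about {saved:.0%} less compute to pay for")
```

With the numbers from the example, the combined load fits comfortably in a single cluster, so two of the three clusters are pure overhead.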

Use Azure Cost Analysis

Before you can control and optimize your cloud cost, it is important to understand where costs originate within your organization in the first place. Cost Analysis is a great tool for exploring and analyzing your organizational costs. You can view aggregated costs or break them down to understand where costs occur over time and to identify spending trends.

Views can be grouped by service type, subscription, resource group and many other criteria to better understand your spending. Once you spot the costly services, you can set budgets to get notified when the cost exceeds specific thresholds.
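The grouping and budget ideas above are simple enough to reproduce on exported cost data. The sketch below is not the Cost Analysis tool itself: the record layout, the amounts and the budget values are hypothetical and only illustrate the principle.

```python
from collections import defaultdict


def cost_by(records, key):
    """Sum the 'cost' field of each record, grouped by the given key."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["cost"]
    return dict(totals)


def over_budget(totals, budgets):
    """Return the groups whose spend exceeds their budget threshold."""
    return sorted(name for name, spent in totals.items()
                  if name in budgets and spent > budgets[name])


# Hypothetical export rows; real cost data has many more columns.
records = [
    {"service": "Application Gateway", "resource_group": "prod", "cost": 310.0},
    {"service": "Virtual Machines", "resource_group": "dev", "cost": 120.0},
    {"service": "Application Gateway", "resource_group": "dev", "cost": 95.0},
]
budgets = {"Application Gateway": 300.0, "Virtual Machines": 200.0}

by_service = cost_by(records, "service")
print(by_service)
print(over_budget(by_service, budgets))
```

Changing the key from `"service"` to `"resource_group"` gives the same breakdown per environment, which mirrors switching the grouping in a cost view.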

In my experience, Cost Analysis was a big help in understanding an expensive bill caused by a few Azure services, notably Application Gateway and Azure Front Door. That insight led us to make architectural decisions in favor of a multi-cloud architecture, which I’ll discuss in the next tip.

Consider multi-cloud

A company may find itself locked into a certain cloud provider. Vendor lock-in can become an issue in cloud computing because some resources are very difficult to move once they are set up. It also occurs when third-party software is incorporated into a business’s processes, making the business dependent on that software.

Cost-wise, vendor lock-in is undesirable because vendors may impose massive price increases, knowing that their clients are locked in. Even if the quality of service declines, or never meets the desired threshold to begin with, the client is stuck paying a high price.

To avoid those risks, evaluate cloud services carefully before making a commitment, ideally with a proof-of-concept deployment. Migration to another cloud provider should also remain possible, by ensuring that data can be moved easily and that backups are taken regularly.

The best way to achieve cost-effectiveness while ensuring a decent quality of service for your applications is to use multiple cloud providers, take the best offer from each and stay open to changing it over time.

In the case of Azure, we can replace expensive resources like Application Gateway and Front Door with our own gateway implementation or a Kubernetes ingress. For CDN and WAF we can use Cloudflare, for example, which is known for its large content delivery network, improved performance, reduced load times and low cost, down to no cost on the free plan.

Wrap-up

In my opinion, cloud cost optimization does not have to be complicated, but it does require a disciplined approach: establish good rightsizing habits, continuously check for leaks, drive insights and action through analytics, and avoid vendor lock-in to lower your cloud bill.
