Shedding the Pounds: Azure Cost Optimisation

Steven Munsey
Published in Version 1 · 8 min read · Mar 25, 2021
Photo by Paxson Woelber on Unsplash

Day to day, as a Microsoft Azure Consultant at Version 1, the questions I tend to get asked are largely about technical scenarios in Azure: How would I implement this? What features would this provide? Why is this not working as expected?

Increasingly, though, the questions are focused on cost consumption and optimisation. For example: Why is this costing me a lot of money? Is there a cheaper alternative with similar functionality? If we do this, how would that affect my bill for this month, or over the next year?

Organisations are becoming more aware of the potential cost savings that are available by moving to cloud-based infrastructure, and cloud costs are starting to become a large part of IT budgets. For those who have already started on their journey to the cloud, cost savings may have been the main driver for them to start the move in the first place.

But moving to the cloud and expecting continued cost savings, without continually assessing applications and infrastructure for optimisation opportunities, can lead to an expensive bill. Part of my role at Version 1 is to advise on the opportunities that can help organisations “shed the pounds” from their Azure bill.

Here are 12 optimisations and best practices I have applied across various projects. Some may be obvious and some less so; some take minimal effort, while others require detailed planning to implement. It is the combination of these ideas (and others beyond this list) that will help drive down costs and develop a culture of continual optimisation.

1. Shut Down Unused Resources

A simple one, you might think, but on multiple occasions I have come across Virtual Machines (VMs) that were created for testing and then left running for days, or even weeks, before users came back to complete that testing.

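You are billed for every minute of uptime in Azure, so this scenario can cause the bill to creep up. If you are not using a VM, and it is not a workload that needs to be running, shut it down to stop the meter running. Make sure the VM ends up deallocated: a VM that is only powed off from inside the operating system keeps its compute allocation and continues to be billed.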

Even better, automate this task with scheduled start-up and shutdown runbooks. Turning VMs off between 6pm and 8am on weekdays, with a full shutdown over the weekend from Friday 6pm to Monday 8am, leaves them running for 50 of the 168 hours in a week. That is a reduction in compute cost of around 70% compared with leaving the VM powered on 24/7.
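The arithmetic behind a schedule like this is easy to check. The following is a minimal Python sketch, not the runbook itself: the function names and constants are illustrative assumptions, and it only models the weekday 8am to 6pm window described above.

```python
from datetime import datetime, timedelta

# Assumed working window from the schedule described in the text.
WORK_START_HOUR = 8   # 8am
WORK_END_HOUR = 18    # 6pm
HOURS_PER_WEEK = 7 * 24


def should_be_running(moment: datetime) -> bool:
    """Return True if the VM should be up: Monday to Friday, 08:00 to 18:00."""
    is_weekday = moment.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and WORK_START_HOUR <= moment.hour < WORK_END_HOUR


def weekly_uptime_hours() -> int:
    """Count the hours in one week during which the schedule keeps the VM on."""
    monday = datetime(2021, 3, 22)  # any Monday at midnight works here
    return sum(
        should_be_running(monday + timedelta(hours=offset))
        for offset in range(HOURS_PER_WEEK)
    )


def compute_reduction(uptime_hours: float) -> float:
    """Fraction of 24/7 compute hours avoided by the schedule."""
    return 1 - uptime_hours / HOURS_PER_WEEK


hours_on = weekly_uptime_hours()
print(f"{hours_on} of {HOURS_PER_WEEK} hours on, "
      f"{compute_reduction(hours_on):.0%} fewer compute hours")
# prints "50 of 168 hours on, 70% fewer compute hours"
```

The same `should_be_running` check could be reused by whatever scheduler you choose to decide whether a VM needs starting or stopping at a given time.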

Startup and Shutdown Runbook Example

2. Removing Unused Resources

In an on-premises environment, you can leave resources switched off at no real extra cost; unless you need the capacity, you can keep them. You may not want to do this in Azure. If you shut down a VM, the compute meter stops and so does that charge, but you are still charged for the storage attached to the VM. So, if you are keeping that machine for a rainy day but looking for cost optimisations, delete those resources and re-provision them when they are required.

When thinking about cleaning up and deleting resources, also think about applications in your IT estate that can be decommissioned; that decommissioning often gets pushed to the bottom of the priority list. Regularly auditing infrastructure and removing resources that are no longer required can lead to great savings.
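A regular audit can start from a simple inventory export. The sketch below is a hedged illustration only: the `VmRecord` fields, the 30-day idle threshold, and the sample figures are all hypothetical, and in practice the data would come from your own Azure inventory or cost reports.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VmRecord:
    """One row of a hypothetical inventory export."""
    name: str
    power_state: str          # e.g. "running" or "deallocated"
    days_in_state: int
    disk_monthly_cost: float  # made-up figure, in your billing currency


def cleanup_candidates(vms: List[VmRecord], min_idle_days: int = 30) -> List[VmRecord]:
    """Return deallocated VMs idle for at least min_idle_days, costliest disks first."""
    idle = [
        vm for vm in vms
        if vm.power_state == "deallocated" and vm.days_in_state >= min_idle_days
    ]
    return sorted(idle, key=lambda vm: vm.disk_monthly_cost, reverse=True)


# Sample data, invented for illustration.
inventory = [
    VmRecord("test-vm-01", "deallocated", 45, 18.0),
    VmRecord("prod-web-01", "running", 200, 60.0),
    VmRecord("test-vm-02", "deallocated", 5, 18.0),
    VmRecord("old-app-01", "deallocated", 120, 95.0),
]

for vm in cleanup_candidates(inventory):
    print(f"{vm.name}: idle {vm.days_in_state} days, disks still cost {vm.disk_monthly_cost:.2f}/month")
```

Sorting by disk cost puts the machines whose storage is still costing the most at the top of the review list.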

3. Right-Size Virtual Machines

The more resources a Virtual Machine has, the more it costs. Performing a regular assessment of VM usage, and considering the VM sizes recommended by application vendors, helps make sure that you only pay for what you are using.

Here at Version 1, we make great use of native tools in our discovery and assessment phase: Azure Migrate assessments if an organisation is looking to migrate to Azure, and continual monitoring of Azure Advisor for performance and cost optimisations.
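The core of a right-sizing decision can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how Azure Advisor or Azure Migrate work internally: the size names, costs, and the 20% headroom are hypothetical placeholders, not real Azure SKUs or prices.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    memory_gib: float
    hourly_cost: float  # made-up relative cost, not an Azure price


# Hypothetical catalogue for illustration only.
CATALOGUE = [
    VmSize("small", 2, 8, 1.0),
    VmSize("medium", 4, 16, 2.0),
    VmSize("large", 8, 32, 4.0),
]


def right_size(
    sizes: Sequence[VmSize],
    peak_vcpus_used: float,
    peak_memory_gib_used: float,
    headroom: float = 0.2,
) -> Optional[VmSize]:
    """Pick the cheapest size that covers observed peak usage plus headroom."""
    needed_vcpus = peak_vcpus_used * (1 + headroom)
    needed_memory = peak_memory_gib_used * (1 + headroom)
    fitting = [
        size for size in sizes
        if size.vcpus >= needed_vcpus and size.memory_gib >= needed_memory
    ]
    # Returns None when no size in the catalogue is big enough.
    return min(fitting, key=lambda size: size.hourly_cost, default=None)


# A VM provisioned as "large" that peaks at 1.5 vCPUs and 5 GiB of memory.
suggestion = right_size(CATALOGUE, peak_vcpus_used=1.5, peak_memory_gib_used=5.0)
print(suggestion.name if suggestion else "no fitting size")  # prints "small"
```

Any vendor-recommended minimum size should still be applied on top of a usage-based suggestion like this one.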

4. Azure Spot Instances

A relatively new feature is Azure Spot Instances, which make use of unused capacity in Azure at a discount of up to 90% compared to pay-as-you-go instances. However, these instances can be evicted at any time if Azure needs the capacity back, so they are only suitable for workloads that can tolerate interruption, such as low-priority test and development servers.

Spot instance configuration in the Azure Portal

5. Enterprise Dev/Test Pricing

If you have an Enterprise Agreement, you can make use of Enterprise Dev/Test offer subscriptions. Virtual Machines deployed in these subscriptions can run Windows and SQL Server with no charge for the Microsoft software. The price is the same as for a Linux machine, which can save around 50% of the overall compute cost.

If you do not have an Enterprise Agreement, Visual Studio subscribers can sign up for the Pay-As-You-Go Dev/Test offer.

6. Selecting the Correct Storage Options

Azure Premium Storage (SSD) is the best-performing disk type currently widely available in Azure. However, this performance comes at a high price compared to standard (SSD or HDD) storage.

Premium storage is recommended for production workloads and applications where disk performance is key, but development and test workloads should initially use standard storage for lower costs. A disk can always be upgraded to premium storage if required.

A recent assessment I carried out showed that 65% of an organisation’s spend in Azure was storage, and 78% of that storage was virtual machine disks allocated to premium storage. Using premium storage only where required is essential for future cost optimisation.
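To put those two percentages together, here is a small Python sketch. It assumes that both figures are shares of spend (the assessment could equally have measured the 78% by capacity), so treat the result as indicative only.

```python
def share_of_total(parent_share: float, child_share_of_parent: float) -> float:
    """Convert a share of a share into a share of the overall total."""
    return parent_share * child_share_of_parent


STORAGE_SHARE_OF_SPEND = 0.65          # storage as a share of total Azure spend
PREMIUM_DISK_SHARE_OF_STORAGE = 0.78   # premium VM disks as a share of storage (assumed to be by spend)

premium_disk_share = share_of_total(STORAGE_SHARE_OF_SPEND, PREMIUM_DISK_SHARE_OF_STORAGE)
print(f"Premium VM disks: {premium_disk_share:.1%} of total spend")
# prints "Premium VM disks: 50.7% of total spend"
```

Under that assumption, roughly half of the entire Azure bill came from premium VM disks alone, which is why this category is worth reviewing first.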

7. Azure Hybrid Benefit

Azure Hybrid Benefit (AHB) allows you to apply licenses with active Software Assurance that you have already purchased to Azure Virtual Machines. The Azure VM is then charged at the base price without any additional licensing cost, which is usually around a 40% saving compared to pay-as-you-go compute prices.

Azure Hybrid Benefit has also been extended to SQL workloads, whether the investment has been in SQL Server Standard licenses or SQL Server Enterprise licenses. The table below shows how many vCore/vCPU licenses you get for each SQL Server core license:

Qualified licenses to Azure vCores table

8. Reserved Instances

A reserved instance (RI) is a commitment to use a Virtual Machine in Azure for either 1 or 3 years in exchange for a discount on the cost of that machine. It is important to note that an Azure RI only “reserves” the compute portion of the virtual machine. You can also reserve some managed disk storage, but this comes under a separate reservation.

Microsoft has also extended its reserved offerings to include Azure Database for MySQL, SQL Database, SQL Data Warehouse, PostgreSQL, MariaDB, and Cosmos DB.

Azure RIs can be paid monthly or upfront, and there is no contract lock-in. If required, an Azure RI can be moved to another VM or cancelled altogether; however, cancellation incurs a 12% fee. In my experience, RIs fit best with production applications that organisations know are long-standing and are not going to be upgraded or replaced any time soon.

Using Azure RIs can reduce compute costs by up to 72% over a 3-year period, and when combined with Azure Hybrid Benefit this can increase to up to 80%.
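A reservation is paid for whether or not the VM is running, so it is worth checking it against how much the machine is actually used. The Python sketch below compares the two options for a normalised hourly rate. The 72% figure is the "up to" maximum quoted above and the function names are illustrative; your real discount depends on VM size, region, and term.

```python
def payg_cost(hourly_rate: float, uptime_fraction: float) -> float:
    """Average hourly cost when you only pay while the VM is running."""
    return hourly_rate * uptime_fraction


def reserved_cost(hourly_rate: float, discount: float) -> float:
    """Average hourly cost of a reservation, which is charged around the clock."""
    return hourly_rate * (1 - discount)


def break_even_uptime(discount: float) -> float:
    """Uptime fraction above which the reservation becomes the cheaper option."""
    return 1 - discount


def cheaper_option(uptime_fraction: float, discount: float, hourly_rate: float = 1.0) -> str:
    """Name the cheaper pricing model for a given usage pattern."""
    if reserved_cost(hourly_rate, discount) <= payg_cost(hourly_rate, uptime_fraction):
        return "reserved"
    return "pay-as-you-go"


ASSUMED_DISCOUNT = 0.72  # best-case 3-year discount quoted in the text

print(f"Break-even uptime: {break_even_uptime(ASSUMED_DISCOUNT):.0%}")    # prints "Break-even uptime: 28%"
print(cheaper_option(uptime_fraction=1.0, discount=ASSUMED_DISCOUNT))     # prints "reserved"
print(cheaper_option(uptime_fraction=0.25, discount=ASSUMED_DISCOUNT))    # prints "pay-as-you-go"
```

With a smaller discount the break-even point rises, which is one reason reservations suit always-on production workloads better than machines that are shut down outside working hours.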

9. Budgets

Budgets are essential for planning spend and driving optimisation. With budgets, you can account for the Azure services you consume or subscribe to during a specific period. They help you inform others about their spending so that costs are managed proactively, and let you monitor how spending progresses over time.

When the budget thresholds you have created are exceeded, notifications can be triggered to the relevant stakeholders, but note that none of your resources are affected and your Azure consumption will not be stopped. If you are constantly going over your thresholds, then budgets can help you track where the excessive costs are coming from.
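The threshold behaviour described above can be modelled in a few lines. This Python sketch is an illustration of the idea only, not the Azure budgets API: the function name, the 50%/80%/100% thresholds, and the sample figures are hypothetical.

```python
from typing import List, Sequence


def crossed_thresholds(
    budget: float,
    spend_to_date: float,
    thresholds: Sequence[float] = (0.5, 0.8, 1.0),
) -> List[float]:
    """Return every threshold (as a fraction of budget) that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spend_to_date / budget
    return [threshold for threshold in thresholds if used >= threshold]


# Hypothetical monthly budget of 1000 with 850 spent so far.
for threshold in crossed_thresholds(budget=1000, spend_to_date=850):
    # A notification would be sent here; nothing is stopped or deleted.
    print(f"Alert: spend has reached {threshold:.0%} of budget")
```

As in the text, crossing a threshold only produces a notification. Any action, such as shutting resources down, has to be decided and implemented separately.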

10. Use of Backup and Azure Site Recovery

This may be slightly controversial, but not everything needs to be backed up or covered by a disaster recovery plan in case an organisation experiences a site outage. Yes, Azure Backup and Azure Site Recovery should be set up for all production machines as a best practice, but remember that these are both chargeable services. Setting these services up on non-production workloads that are rarely used or can easily be re-created will cost money that could be saved and used elsewhere.

11. Taking Advantage of PaaS Services

Platform as a Service (PaaS) offerings are not always the answer; many variables determine whether a PaaS solution is right for your workload. When it is, though, the cost savings can be dramatic compared to the Infrastructure as a Service (IaaS) or on-premises equivalent.

Two examples of this are Azure SQL and Azure Files. Both services can take away the most expensive elements of the solution: the Virtual Machine and its license. You move to a model where you pay only for what you use, for example Azure SQL database resources or Azure Storage, rather than having a Virtual Machine running 24x7 in case the resources are required.

12. Making use of SKUs with Temporary Storage

The temporary disk is located on the hypervisor where the Azure VM is hosted, and data on this disk can be lost at any time due to activities such as maintenance, VM resizes, and redeployments to a different host. Therefore, temporary disks should not be used to store persistent data.

While working together on a migration of around 450 Virtual Machines to Azure, my colleague Chris Marshall (thanks Chris!) and I noticed, after a conversation, that each of these machines had a disk dedicated to the page file. By removing these disks and using the temporary disk provided on most VM SKUs for the page file instead, we will remove 450 P3 disks from the estate, equating to a saving of around £1000 a month.
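As a quick sanity check on those figures, the Python sketch below derives the implied saving per disk and the annualised total. Both inputs are the approximate numbers quoted above, so the outputs are estimates, not list prices.

```python
DISKS_REMOVED = 450
MONTHLY_SAVING_GBP = 1000.0  # approximate figure from the migration described above


def per_disk_monthly_saving(total_monthly_saving: float, disk_count: int) -> float:
    """Implied monthly saving for each disk removed."""
    return total_monthly_saving / disk_count


def annual_saving(total_monthly_saving: float) -> float:
    """Monthly saving projected over twelve months."""
    return total_monthly_saving * 12


print(f"Per disk: about £{per_disk_monthly_saving(MONTHLY_SAVING_GBP, DISKS_REMOVED):.2f} a month")
# prints "Per disk: about £2.22 a month"
print(f"Per year: about £{annual_saving(MONTHLY_SAVING_GBP):,.0f}")
# prints "Per year: about £12,000"
```

A couple of pounds per disk looks trivial in isolation, which is exactly why this kind of saving tends to go unnoticed until it is multiplied across a large estate.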

Another potential use of the temporary disk is for the TempDB on SQL Server. For SQL Server on D and G series VMs, the temporary disk is SSD-based, and an application that makes heavy use of the TempDB could benefit from higher throughput and lower latency if the TempDB is moved to the temporary disk.

About the Author

Steven Munsey is a Microsoft Azure Consultant, currently working in Version 1’s UK Digital Data & Cloud practice. Follow Version 1 and Steven for more blogs around Microsoft Azure.
