Sequencing Cloud Migration to Reduce Cost: What to Migrate and When
Overview
Is your enterprise cloud migration costing more than you expected from your business case? Is your TCO not coming down as you expected? Is it going to take years before you’re into positive return-on-investment?
If so, this guide will give you the secret sauce you need to reduce your TCO and start seeing a return much sooner.
A Really Obvious Statement
You should establish your Cloud Strategy BEFORE you undertake a large scale migration of your enterprise’s workloads to public cloud.
And Yet…
Many organisations manage to get years into their cloud migration journey before realising they haven’t actually got a clear cloud strategy! What does this look like?
There are common symptoms:
- Migrations are not going as quickly as they had hoped.
- The migration is not achieving originally anticipated TCO reductions.
- The current on-prem + cloud estate is turning out to be pretty expensive, with many workloads creating double run costs.
- A number of “problem” workloads have been discovered, which are costing a lot and proving difficult to migrate. And they don’t know what to do with them.
- And even where legacy on-premises applications are being successfully decommissioned, these don’t seem to be converting to proportional on-prem cost elimination.
Does any of this conversation sound familiar?
And it’s at this point that the organisation realises that their original cloud strategy — and specifically, their path to cloud — was not sound.
The good news: if this is your organisation, it’s not too late to fix it. But to do so, you need to reevaluate your cloud migration purpose, scope and objectives. You may need to reset your approach. And if you ignored the related strategic considerations at the start — such as data centres, mainframe exit, commercial hosting licenses, and embracing open source — then you need to make sure you don’t ignore them the second time around!
Why Are You Migrating?
Let’s do a brief recap of the rationale for doing cloud migration in the first place. At the very beginning of my Cloud Adoption Series, I wrote an article on the topic of How to Create Your Cloud Strategy. In that article I summarised the value of cloud with these images:
These benefits can be organised into a few categories:
- Business value — i.e. where we can: get an application to market faster; improve reliability, stability, and availability; improve the customer experience; use business data to drive better insights, make better decisions in real time, as well as tailor customer experience.
- Leverage cloud infrastructure and cloud scale — i.e. where we can scale beyond what was feasible on-premises; where we can use elasticity to perform just-in-time provisioning of the right amount of infrastructure to handle current demand; the ability to eliminate operational overheads with managed services (and serverless infrastructure).
- Improved security, i.e. where you can leverage the integrated, out-of-the-box security capabilities of cloud, including (but not limited to): Google’s secured data centres and secure hardware; Google-managed or customer-managed encryption; secrets management; zero-trust access paradigm; strong identity and access management; organisational policies; DDoS protection; firewalling and microsegmentation; web application firewalls; access and audit logging; and data exfiltration prevention.
- Total Cost of Ownership (TCO) reduction.
How Does Cloud Improve Value and Reduce TCO?
Let me focus on TCO. Those folks that say migrating to Cloud is not about cost are probably not leveraging cloud correctly. Everything you do as a business is about maximising value. Value is about return on investment. You want to do more, whilst spending less.
Cloud enables us to improve value — i.e. improve ROI — in many ways:
- Agility — Getting to market quicker obviously gives competitive advantage.
- Automation through Infrastructure-as-Code (IaC) — this contributes to our agility, but also massively reduces our operational overheads. If we can reduce operational overhead, then we can reduce our reliance on operational engineers, or redeploy these engineers to more value-adding activities. But ultimately, reduced operational overhead means reduced workforce cost.
- The same automation and IaC contributes to repeatable, consistent environments. This translates to reduced configuration drift, fewer infrastructure-related defects, and ultimately: improved availability with reduced operational overhead to achieve this availability.
- Improved availability results in happier customers, and improved conversion for new customers. You’re seeing the virtuous circle, right?
- Cloud provider (e.g. Google) managed services also massively reduce operational overheads. We no longer need to pay our workforce to install operating systems, middleware and databases; and we don’t need to pay the workforce to continue to secure, patch and upgrade these platforms. (This is a huge pain-in-the-butt for most enterprises that still operate out of on-premises data centres.)
- Cloud elasticity, coupled with Pay-as-you-Go (PAYG) services, means we can significantly reduce our overall like-for-like infrastructure spend. We don’t have to provision huge amounts of infrastructure in advance; infrastructure that only gets close to full utilisation when we hit our occasional peaks.
- And because we don’t need to provision all this infrastructure in advance, we no longer need to pay our workforce to install and maintain hardware in our data centres.
- And because we’re no longer installing hardware in our data centres, we can potentially eliminate our data centres! (Or, at the very least, downsize private data centres to the point that we can leverage colo facilities instead.)
- Commercial license cost elimination. THIS IS PERHAPS THE BIGGEST SINGLE COST REDUCTION OPPORTUNITY.
But You Can’t Benefit From Most of These If…
If you fail to eliminate your existing commercially licensed enterprise platforms, then many of the above benefits will be sacrificed.
I’ll recap why:
Commercially Licensed Products Make Up Most of Your Cost
Commercially licensed enterprise hosting products (like Oracle Database, Microsoft SQL Server, IBM Db2, Oracle WebLogic, IBM WebSphere, Red Hat JBoss, Red Hat OpenShift and VMware) typically represent somewhere in the region of 75% of your total on-prem hosting spend.
If you add up ALL your costs for things like data centre occupancy and facilities, infrastructure engineers (OS, middleware, databases), physical hardware and associated maintenance, those still only represent a small fraction of what you’re spending on the aforementioned software products.
And guess what? If you lift-and-shift those same software products to cloud then you’re doing nothing to eliminate the bulk of your costs. In fact, there’s a good chance they will go up, and this article explains why.
You are Unlikely to Benefit from Pay-as-you-Go Pricing
This is because many organisations purchase these software licenses on an enterprise contract, in order to get a discount. But these enterprise licenses are then typically sized to your organisation’s peak requirement. So you’ve effectively paid upfront, based on the maximum that you think you’ll need.
Sure, you can still pay for your cloud infrastructure using PAYG, but this only represents a very small proportion of your cost, relative to the licensing.
Some of these commercially licensed products ARE available in cloud using PAYG models. But these PAYG models are typically very expensive on a per-unit basis, versus what organisations were paying on-prem in their enterprise agreements.
You Cannot Leverage Cloud Elasticity
This links to the previous point. If you’re already committed to paying for your peak levels of consumption, then this eliminates most of the benefit of elastic use of cloud.
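To put a rough number on this, here is a minimal sketch. It assumes licenses are 75% of spend (the ballpark quoted earlier) and stay fixed at peak, while only the infrastructure share scales with demand. The function name and the 30% average utilisation are illustrative assumptions, and the model deliberately ignores PAYG license pricing.

```python
def elastic_saving(license_share: float, avg_utilisation: float) -> float:
    """Fraction of total spend saved when only infrastructure scales with demand.

    license_share:   portion of total spend locked into peak-sized licenses (0..1)
    avg_utilisation: average demand as a fraction of peak demand (0..1)
    """
    infra_share = 1.0 - license_share
    # Licenses stay fixed at peak; infrastructure is paid only for what is used.
    return infra_share * (1.0 - avg_utilisation)


if __name__ == "__main__":
    # Assumed inputs: 75% license share, average demand at 30% of peak.
    with_licenses = elastic_saving(license_share=0.75, avg_utilisation=0.30)
    without_licenses = elastic_saving(license_share=0.0, avg_utilisation=0.30)
    print(f"Saving with peak-sized licenses: {with_licenses:.1%}")
    print(f"Saving with no fixed licenses:   {without_licenses:.1%}")
```

Under these assumptions the same elasticity is worth a quarter as much while the licenses remain, which is the point of this section.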
You Minimise Your Ability to Reduce Operational Overheads
This is because most managed and native offerings in the cloud are based on open source. Consider these examples in the Google Cloud ecosystem:
- Google Kubernetes Engine (GKE) is managed Kubernetes
- Cloud Dataproc is a managed Apache Hadoop ecosystem
- Cloud Dataflow is managed Apache Beam
- Cloud Composer is managed Apache Airflow
- AlloyDB is a managed, 100% Postgres-compatible database
- Cloud SQL provides managed Postgres and MySQL offerings
- Cloud Run is built on Knative
Kubernetes, Hadoop, Beam, Airflow, Postgres, MySQL and Knative are all mature open source solutions.
But if you choose to run your existing commercially licensed software in the cloud, then you’re typically not going to be able to leverage a fully-managed (or serverless, or native) solution. You will have to install this software, manage it, patch it, and routinely upgrade it. This level of DIY severely limits the value of cloud. In many cases, all you’re going to get from Cloud for these products is IaaS — i.e. the cloud provider managing your underlying VMs.
Conclusions So Far
Nothing surprising! We need to eliminate commercially licensed software as part of our cloud migration. Many organisations understand this.
Yet many organisations are still finding that their TCO isn’t coming down. Why is this?
A Couple of Reasons
Assuming you’re not making the WORST mistake of running commercially licensed hosting products in the cloud, then one reason for poor ROI is simply: inefficient use of cloud. This can be attributable to many factors such as:
- Running multiple under-utilised GKE clusters, rather than hosting multiple tenants on shared clusters.
- Not leveraging spot VMs.
- Not leveraging newer machine types.
- Not leveraging CDN to reduce egress costs and reduce compute demand.
- Not implementing storage lifecycle management (even though the cloud will do it for you automatically).
- Creating too much logging, and not using exclusion filters.
- Oversizing workloads.
- Running hot standby environments rather than leveraging IaC for on-demand DR.
- Not implementing a clear labelling strategy to allow attribution of consumption to application owners.
- Not having a FinOps function that analyses, reviews and acts on cloud consumption data.
I’m going to cover all of these in this series, in an article packed full of FinOps best practices. So I won’t cover them in any more detail now.
But the other big reason: double-run costs!
Double-Run Costs
The issue is that you’re paying for your existing on-prem workloads in parallel with cloud. You were expecting to decommission on-prem as you migrated to cloud, but you’re finding you can’t.
Perhaps you have a 5 year cloud migration program, and you were expecting to see your costs change like this:
In the graph above we’re migrating our workloads from on-prem to cloud at a linear rate over 5 years. We’ve made some assumptions:
- Our target run costs will be 60% of our existing on-prem run costs.
- With each application we migrate, we are able to remove 100% of that application’s on-prem costs.
- We are eliminating 100% of our on-prem workloads.
- We are not decommissioning as part of our migration.
- Migration costs are fixed at 5% of current run cost per year.
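The assumptions above can be turned into a small model. This is only a sketch: it takes the on-prem run cost as 100 units, treats the fraction migrated in each year as the mid-year average of a linear ramp, and stops charging migration cost once the program ends. Those three choices, and the function name, are my assumptions rather than figures from the graph.

```python
def business_case_tco(years: int = 7, program_years: int = 5,
                      on_prem_run: float = 100.0, target_ratio: float = 0.60,
                      migration_rate: float = 0.05) -> list[float]:
    """Annual TCO per year for a linear migration (illustrative model)."""
    tco = []
    for year in range(1, years + 1):
        # Average fraction of the estate in cloud during this year (linear ramp).
        migrated = min((year - 0.5) / program_years, 1.0)
        on_prem = on_prem_run * (1.0 - migrated)         # 100% removed as we migrate
        cloud = on_prem_run * target_ratio * migrated    # cloud runs at 60% of on-prem
        migration = on_prem_run * migration_rate if year <= program_years else 0.0
        tco.append(on_prem + cloud + migration)
    return tco


if __name__ == "__main__":
    for year, cost in enumerate(business_case_tco(), start=1):
        print(f"Year {year}: {cost:.0f}")
```

With these inputs the annual cost starts just above the do-nothing level of 100 and then falls steadily towards 60, which is the shape the business case promised.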
But what we actually see is typically more like this:
We can see that annual TCO actually goes up and doesn’t break even until year 5. Why is this? There are a number of reasons:
- In the first year or two of our migration program, enterprises are often still building foundational capability in the cloud.
- In the absence of strong foundational capability and good practices for migrating at pace, initial migration pace is slow. So we find that the migration to cloud is not linear. It accelerates over time.
But here’s the biggest reason:
Our existing on-prem workloads are sat on top of big commercially licensed platforms. These are effectively multi-tenant platforms, i.e. multiple applications sat on the same platform. Something like this:
But unlike multi-tenant platforms in cloud, we cannot typically get any savings by simply removing some of our on-prem consumers. This is because:
- The on-prem platforms have sunk infrastructure costs. The infrastructure has already been bought, paid for, and costs typically depreciated over 5 years.
- The on-prem commercially licensed software is typically provided as part of an enterprise agreement with a fixed term. It is not usually possible to reduce the size of the on-prem footprint, and then simply pay a proportionally lower license cost to the vendor.
So in our picture above: removing application A does almost nothing for our sunk infrastructure and license costs. Same for B. It’s not until we move all three that we can decommission the platform.
And even when we can decommission the platform, we will not save immediate costs. WE ONLY AVOID FUTURE COST, in the form of:
- Hardware renewal we don’t need to do.
- Maintenance contracts we don’t need to renew.
- Enterprise licenses and support contracts that we no longer need to renew.
This cost avoidance cannot happen until our contracts are terminated, or renewed at much lower volumes. This could happen after the end of our hypothetical 5 year migration program.
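A minimal sketch of this effect, assuming a shared platform whose contract can only lapse at a renewal date and is otherwise renewed for a full term. The `Platform` class, its field names and the numbers used below are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    annual_cost: float                 # license, support and maintenance per year
    renewal_year: int                  # year at the end of which the current term expires
    term_years: int                    # length of each renewed term
    tenant_exit_year: dict[str, int]   # application -> year its migration completes


def last_paid_year(p: Platform) -> int:
    """Last year in which the platform cost is still incurred."""
    last_exit = max(p.tenant_exit_year.values())
    if last_exit <= p.renewal_year:
        return p.renewal_year
    # Tenants stayed past the renewal date, so the contract had to be renewed:
    # count how many extra terms are needed to cover the last tenant's exit.
    extra_terms = math.ceil((last_exit - p.renewal_year) / p.term_years)
    return p.renewal_year + extra_terms * p.term_years


def cost_avoided(p: Platform, horizon_years: int) -> float:
    """Future cost avoided within the planning horizon."""
    return p.annual_cost * max(horizon_years - last_paid_year(p), 0)


if __name__ == "__main__":
    late = Platform("shared-platform", 10.0, 3, 3, {"A": 1, "B": 2, "C": 4})
    early = Platform("shared-platform", 10.0, 3, 3, {"A": 1, "B": 2, "C": 3})
    print(last_paid_year(late), cost_avoided(late, horizon_years=7))
    print(last_paid_year(early), cost_avoided(early, horizon_years=7))
```

In this made-up case, moving the last tenant one year after the renewal date means paying for a whole extra term, even though two of the three applications left early.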
What’s the Solution?
It’s actually fairly simple!
- First, gather all your enterprise hosting contracts. Things like Oracle, IBM WebSphere, Microsoft SQL Server, Red Hat OpenShift, Red Hat JBoss Application Server, Broadcom VMware, Red Hat Enterprise Linux, Microsoft Windows Server. Order them in terms of overall cost, highest to lowest. This forms our prioritised list of software we want to eliminate as we migrate to cloud. There are two or three in this list that will probably be much bigger than the others!
- Next, observe the dependencies. For example, using my picture above: you can’t start with VMware, because there’s too much running on it. But you can start with your database products, your application server products, and your container hosting products.
- Determine how many of our applications are dependent on these hosting products. If you have a reasonable CMDB, you should be able to use your application dependency mapping and application-to-infrastructure mapping to do this. See my guide here on this topic.
- Finally, determine the contract expiry dates. Let’s face it: if you do nothing, these are just inevitable contract renewal dates. (And you likely have no leverage.) If you have any renewals in the next 12 months — and more than one or two applications sat on top of them — then you’ve already missed the boat for these contracts. There’s too much risk in trying to move off these quickly. But everything else is fair game.
At this point you have a prioritised list of contracts to eliminate, and milestones by which you need to do this.
Here is an extremely simplified, fictitious example of what this might look like, with some made-up numbers:
In this fictitious example, we would probably start by prioritising Red Hat OpenShift, IBM WebSphere, and SQL Server. The Oracle contract renewal is too near, and VMware is out for three reasons: 1) contract renewal is too near, 2) the dependency stack, 3) the sheer number of applications we would have to deal with.
We also now have a list of the applications that are dependent on these contracts. THESE ARE THE APPLICATIONS YOU MUST MIGRATE!
I can’t stress this enough: this ordered list is the biggest single factor that should be driving your migration sequencing, in order to reduce on-prem costs and drive down your overall TCO.
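The prioritisation steps can be sketched in a few lines. Everything here is an assumption for illustration: the `Contract` fields, the 12-month and two-application thresholds taken loosely from the steps above, and especially the numbers, which I have invented for this sketch and are not the figures from the example table.

```python
from dataclasses import dataclass, field


@dataclass
class Contract:
    product: str
    annual_cost: float        # arbitrary cost units, invented
    months_to_renewal: int
    dependent_apps: int
    hosts: list[str] = field(default_factory=list)  # contracted products running on top


def prioritise(contracts: list[Contract], min_months: int = 12,
               small_app_count: int = 2) -> list[Contract]:
    """Return the contracts worth targeting first, most expensive first."""
    candidates = []
    for c in contracts:
        # Renewal too close to move more than one or two applications safely.
        too_near = c.months_to_renewal <= min_months and c.dependent_apps > small_app_count
        # Base layers with other contracted products on top have to wait.
        is_base_layer = bool(c.hosts)
        if not too_near and not is_base_layer:
            candidates.append(c)
    return sorted(candidates, key=lambda c: c.annual_cost, reverse=True)


# Invented numbers, for illustration only.
EXAMPLE_CONTRACTS = [
    Contract("Oracle Database", 30.0, 8, 40),
    Contract("VMware", 25.0, 10, 300,
             hosts=["Oracle Database", "Microsoft SQL Server",
                    "IBM WebSphere", "Red Hat OpenShift"]),
    Contract("Red Hat OpenShift", 20.0, 30, 60),
    Contract("IBM WebSphere", 15.0, 24, 50),
    Contract("Microsoft SQL Server", 10.0, 20, 35),
]

if __name__ == "__main__":
    for c in prioritise(EXAMPLE_CONTRACTS):
        print(f"{c.product}: {c.dependent_apps} applications to migrate "
              f"within {c.months_to_renewal} months")
```

The applications attached to each surviving contract then become the migration backlog, with the renewal date as the deadline.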
And we end up with a TCO chart that looks something like this:
If we compare the cumulative TCO over 10 years for our fictitious 5-year migration program, we get a graph like this:
This graph shows our cumulative TCO relative to the DO NOTHING option — i.e. the option where we chose not to migrate.
- In our initial business case we expected to break even around year 2, with increasingly positive ROI from then onwards.
- In our reality, we don’t see positive ROI until around year 10!
- But when we optimise our sequencing as per the guidance I’ve given you, we end up with a line that approximates our original business case. We break even around year 3, and see increasingly positive ROI from then onwards. By year 10 we’ve reduced our cumulative TCO by around 25%, including all migration costs.
Of course, you need to plug in your own assumptions and costs for your own organisation. But you get the idea!
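If you want to plug in your own numbers, a sketch like the following computes cumulative TCO relative to the do-nothing option and finds the break-even year. The two annual cost series are made up for illustration (do-nothing is taken as 100 units per year); they are not the data behind the graph.

```python
from itertools import accumulate
from typing import Optional


def cumulative_vs_do_nothing(annual_tco: list[float],
                             do_nothing_annual: float) -> list[float]:
    """Running total of (our cost minus do-nothing cost); negative means we are ahead."""
    return list(accumulate(cost - do_nothing_annual for cost in annual_tco))


def break_even_year(annual_tco: list[float],
                    do_nothing_annual: float) -> Optional[int]:
    """First year (1-based) where cumulative cost is no worse than doing nothing."""
    for year, total in enumerate(
            cumulative_vs_do_nothing(annual_tco, do_nothing_annual), start=1):
        if total <= 0:
            return year
    return None  # no break-even within the years supplied


if __name__ == "__main__":
    # Made-up series: a tidy linear plan, and one weighed down by double-run costs.
    planned = [101, 93, 85, 77, 69, 60, 60]
    double_run = [105, 108, 108, 105, 100, 90, 80]
    print("Planned break-even year:   ", break_even_year(planned, 100))
    print("Double-run break-even year:", break_even_year(double_run, 100))
```

A `None` result simply means the series you supplied never catches up with doing nothing, which is a useful warning sign in its own right.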
Overall Migration Filtering Approach
Of course there are many other considerations, like close-coupling of applications, data gravity, latency, and application complexity. Also, a good cloud strategy will usually see a number of existing legacy applications replaced by SaaS alternatives. And there are usually a LOT of redundant applications, redundant environments and duplicates discovered during migration. So the reality is that we don’t end up migrating everything.
If we’re designing our migrations properly from the beginning, we end up with a filtering process that looks a bit like this:
Note that “Retain on-prem” is our last resort, and depending on how much we have, we may have the option to exit dedicated data centres and simply rent some space in a colo facility. (Have a look at my article here, which shows how we can evaluate the tipping point for when we should leverage colo.)
And after filtering, we might typically end up with proportions like this:
When we’re prioritising our migrations based on hosting contracts we can ignore 3rd party and SaaS applications. All the other categories are simply methods by which we can eliminate our dependency on these contracts.
Other Prioritisation Considerations
You Need a Data Centre Strategy
You can’t make cloud migration scope decisions in the absence of a clear data centre strategy.
For example: if your intent is to maintain some on-prem workloads (whether in dedicated data centres or colos) in the long term, then a strategy of “migrate everything to cloud” is nonsense. If you’re retaining some data centre hosting, then you’ve presumably identified clear rationale for doing so. And if you’re keeping DC presence, then there will be some workloads that will always remain cheaper on-prem.
Create Your Factories
In order to exit the commercially licensed hosting products we identified earlier, you will need to:
- Identify your target products in the cloud. Check out my guide here which will help you map the existing on-prem technologies to ideal technologies in the cloud.
- Where you have many applications that need to follow this migration route, establish migration factories to achieve the migration in a streamlined and repeatable way. See my guide: Accelerating Cloud Migration with Migration Factories and Migration Tools.
Summary
If your cloud migration program is dragging its heels, burning cash, and falling short of the ROI you expected, you’re not alone. But it doesn’t have to stay that way. The biggest lever you can pull to fix this isn’t technical — it’s strategic. By sequencing your migrations based on contract expiry and platform dependency, you can shift the curve on your TCO and start delivering value much sooner.
Focus your efforts on the applications that are sitting on top of your most expensive, inflexible, commercially licensed platforms — and sequence them around contract renewal timelines. This approach isn’t just about cost avoidance. It’s about creating options, unlocking savings, and bringing your cloud strategy back in line with your business case. Plug in your own numbers, map your own dependencies, and you’ll find your path to meaningful, early ROI.
References and Useful Links
- Reducing license costs in cloud with open source
- How to Accelerate Cloud Migrations at Scale
- Discovering and Mapping Your As-Is to Cloud
- Data Centre Exit, Mainframe Exit, and Colo Tipping Point
- Mapping As-Is to Cloud Target: OS, Application Servers and DBs
- Accelerating Cloud Migration with Migration Factories and Migration Tools