Cloud Native, or cloud hosted?

John Ogden
7 min read · Apr 4, 2022


We hear the phrase “cloud native” all the time. It’s almost as ubiquitous as “agile”, and just like with agile, tacking the phrase onto the end of a project does not make it so.

So what does cloud native actually mean, why would somebody want it, and what is the baggage?

The TL;DR is that cloud native should mean elastically scalable, event-driven services that are charged based on resources consumed rather than capacity provisioned, that support high-frequency changes, and that have drastically lower operational overheads such as design, testing and patching.

That is to say, cloud native is a little more onerous than whacking a traditional application onto some IaaS VMs in the cloud. That’s just “Cloud Hosted” & we can all spot an impostor.

It’s old-man COTS from the on-premises data centre!

This post is going to look at what cloud hosted involves, and the problems it presents. An alternative is then presented and justified, built using cloud-provider cloud native services. A wrench is then chucked in the gears, by mentioning the CNCF.

Cloud Hosted

Most companies have a lot of applications that have been running on-premises for the last decade or so, using a mix of virtual and physical machines, COTS products, and large application binaries. They’ve also often had someone run the AWS Server Migration Service against them, and now those applications run in the cloud.

Is there any benefit to moving applications into the cloud in this way? Yes, there is some:

  • It is a relatively cheap and easy step in the right direction. In many cases it’s an urgent step when on-premises facilities must be exited due to age & closures
  • It is considerably more environmentally friendly than legacy data centres, with better energy-efficiency and consuming renewable power
  • It is more elastic than on-premises, with some ability to scale services down or turn them off when not in use
  • It is more resilient to outages, as replacement instances can easily be created, and the cloud provider handles cross-availability-zone failover and ensures the underlying infrastructure is operable / refreshed
  • It moves IT service provision from legacy or proprietary infrastructure (mainframes, mid-range, HP-UX, Solaris, AIX, AS/400, etc.) to more modern, commoditised and portable infrastructure: x86 and ARM running Windows and Linux

However it doesn’t give the really big benefits of the cloud:

  • The applications are likely to remain large, complex and tightly coupled, making them difficult and expensive to understand, monitor, scale, and change. Thus they are likely to be tied into the once or twice per year “enterprise release”, which carries a vast risk of failure and delays releasing business value for months or years
  • Levels of automation are likely to be low, with error-prone manual testing, release and deployment processes slowing change, and huge amounts of effort wasted troubleshooting “environment issues”
  • Support teams will have to invest a large amount of time “feeding and watering” the infrastructure: designing commodity services like databases and message queues; building monitoring, backup and recovery, and security services; securing the infrastructure; and patching/refreshing a wide range of COTS products — or perhaps more likely, not patching, and thus leaving the service more vulnerable to attack
  • The applications are likely to be very long-lived, operating at a steady state 24x7, with the “state-of-the-art” ambition being to turn off test environments when not used over the weekend. This means the run cost is frequently high, especially where applications have numerous test environments requiring live-like infrastructures.
  • Moving the applications to another cloud provider is very difficult, as all the bespoke services (monitoring, backup/recovery, CI/CD) would first need to be recreated
  • High Availability and Fault Tolerance are likely to be achieved in a traditional manner, using N+1 services and an ambition to fail over to a second full-sized Disaster-Recovery site, though the actual failover process is unlikely to be tested frequently and the DR site is unlikely to be cost-efficient.
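
To put a rough number on the weekend shut-down ambition, here is a minimal sketch in Python. The environment count and hourly rate are made up purely for illustration, and the function name is hypothetical; the only real input is that a weekend is 48 of the 168 hours in a week.

```python
HOURS_PER_WEEK = 7 * 24  # 168


def weekly_run_cost(environments: int, hourly_rate: float,
                    hours_on: float = HOURS_PER_WEEK) -> float:
    """Cost of keeping a number of identical environments up for hours_on hours a week."""
    return environments * hourly_rate * hours_on


# Illustrative assumption: one live environment plus four live-like test
# environments, each costing 10.0 (any currency) per hour.
always_on = weekly_run_cost(5, 10.0)

# Live stays up 24x7; the four test environments are off for the 48 weekend hours.
weekend_off = weekly_run_cost(1, 10.0) + weekly_run_cost(4, 10.0, HOURS_PER_WEEK - 48)

saving = 1 - weekend_off / always_on
print(f"{saving:.1%}")  # 22.9%
```

Under these assumed figures, even a perfectly executed weekend shut-down trims less than a quarter off the bill, because everything still runs at full size for the other five days whether or not anyone is using it.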

Cloud Native

A truly cloud native solution would address these limitations by:

  • Decomposing the large applications into micro-services designed to be changed and scaled independently, and shifting over to high-frequency releases, deploying many (hundreds of) small releases annually, each of which is lower risk and easier to troubleshoot or back out. This may also be associated with agile transformation, and moving from project teams to product mode.
  • Building and deploying the micro-services using highly automated CI/CD pipelines, ensuring builds happen quickly and repeatably, and that pervasive automated testing validates that applications are behaving as expected
  • Making use of pre-packaged PaaS services wherever possible, freeing the support teams from much of their feeding and watering duties, and avoiding effort configuring COTS products to support the required backup/recovery, scalability and security regimes. Whenever an application is ported to an alternative cloud provider, equivalent pre-packaged PaaS services can be consumed there.
  • Refactoring the applications to be event driven, and taking advantage of serverless solutions that keep only the minimum required infrastructure provisioned at any time. If end users are not actively interacting with the system, only minimal costs are generated, yet the system can still scale very quickly to internet-scale volumes
  • Achieving fault tolerance and failover to a second region using in-built facilities of the PaaS products, which are designed and proven to be effective by the cloud providers, who operate at the scale where such testing can be pervasive.
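
As a rough illustration of what “event driven” means at the code level, here is a minimal Python sketch. Everything in it is hypothetical: the event shape, the handler name, and the in-memory store standing in for a managed PaaS datastore. It is not any provider's API, just the general shape: a small stateless function that runs only when an event arrives, and that tolerates the same event being delivered twice.

```python
from dataclasses import dataclass, field


@dataclass
class OrderPlaced:
    """Hypothetical event; a real platform would typically deliver it as JSON."""
    event_id: str
    order_id: str
    amount: float


@dataclass
class InvoiceStore:
    """Stand-in for a managed datastore; an in-memory dict for this sketch."""
    invoices: dict = field(default_factory=dict)

    def put_if_absent(self, key: str, value: dict) -> bool:
        """Write value under key unless the key already exists."""
        if key in self.invoices:
            return False
        self.invoices[key] = value
        return True


def handle_order_placed(event: OrderPlaced, store: InvoiceStore) -> bool:
    """Stateless handler: all input comes from the event, all state lives in the store.

    Returns True if a new invoice was written, False for a duplicate delivery.
    """
    invoice = {"order_id": event.order_id, "amount": event.amount}
    return store.put_if_absent(event.event_id, invoice)


store = InvoiceStore()
event = OrderPlaced(event_id="evt-1", order_id="order-42", amount=19.99)
first = handle_order_placed(event, store)   # writes the invoice
second = handle_order_placed(event, store)  # same event again: ignored
```

Because the handler holds no state of its own, the platform is free to run zero copies when nothing is happening and many copies in parallel under load, which is where the pay-for-what-you-use and fast scale-out properties come from.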

That sounds amazing, I’ll take two!

Unfortunately, there is quite a lot of work in getting to the promised land of truly cloud native applications, and before attempting the journey a few questions should be asked:

  1. Do we really need this application? Could we replace it with a SaaS service that somebody else maintains on our behalf, freeing us from worrying about it further?
  2. Do we really need everything this application does — can we make it smaller, simpler and easier to live with?
  3. Do we have the right skills and risk appetite to transform the application ourselves, or should we get an expert team to do it for us?
  4. Do we have a bulletproof business case for the transformation? Are we sure the return will be worth the costs? Systems that change very infrequently, that have very steady loads, and that are air-gapped from the internet might not return such big payoffs. The business case needs to cover the full costs. At first glance PaaS services tend to look expensive compared to roll-your-own services; it’s only when you tot up what it really costs to roll your own that PaaS looks attractive. That includes all the stuff you never quite get round to: patching & tech-refresh, testing disaster recovery, and ensuring encryption at rest is in place with keys rotated annually, etc.
  5. Are we sure this is the most urgent fire to put out? There may be other applications returning bigger payoffs for cloud-native transformation, or indeed for other kinds of transformation. So an accurate and timely catalogue of applications and their live-with costs is a very useful input when considering whether to transform.
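
The “tot up what it really costs” step in question 4 can be sketched as a few lines of Python. All of the figures below are invented for illustration and the cost categories are only examples taken from the list above; a real business case would plug in its own numbers and may well come out the other way.

```python
from typing import Mapping


def annual_cost(headline: float, hidden: Mapping[str, float]) -> float:
    """Full annual cost: the visible price plus the items that rarely make the first estimate."""
    return headline + sum(hidden.values())


# Illustrative assumption: self-managed looks cheap until the neglected work is priced in.
roll_your_own = annual_cost(
    headline=20_000,
    hidden={
        "patching and tech-refresh": 15_000,
        "disaster recovery testing": 10_000,
        "encryption at rest and key rotation": 5_000,
        "monitoring and backup": 10_000,
    },
)

# Illustrative assumption: the PaaS price is higher up front, with fewer extras to fund.
paas = annual_cost(headline=45_000, hidden={"provider-specific integration": 5_000})

print(roll_your_own, paas)  # 60000 50000
```

With these made-up numbers the headline comparison (20,000 against 45,000) flips once the hidden items are included, which is the point of insisting on full costs.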

And another thing…

So far we’ve only talked about cloud-provider, cloud native things. There is also the Cloud Native stuff the Cloud Native Computing Foundation are working on, and they have a list of products…

The CNCF Cloud Native Landscape

…which isn’t getting any shorter; nor does it cover all the products the cloud providers offer. Hence, before a cloud-native journey is started, the strategy should be sorted and a horse backed: is it going to be a cloud-provider view of cloud native, consuming PaaS and building quickly; or a CNCF view, deploying & managing a larger range of products across several providers?

There is never going to be a right answer here; it’s horses for courses, so I’ll do the usual architect thing of saying “it depends” on how you want to optimise: to avoid lock-in to a provider, look towards CNCF; to build faster & optimise around your chosen provider, look to that provider’s PaaS.

Either way, there will be plenty of opportunity to use both sets of products, and the two shouldn’t be seen as contradictory or competing. They’re both building products that help with the journey towards high-frequency changes being made to highly elastic, event-driven, micro-service architectures, and it’s likely that even a fully cloud-provider PaaS view of the world will use a lot of the products from the CNCF landscape. Just don’t use them all!

Going forward

Being realistic, we should also recognise that Cloud-Native is the latest in a very long line of “the latest trends” that everyone must immediately follow.

There is probably another trend right around the corner that will make cloud-native look like cloud-hosted. That future might be web3, the metaverse, quantum computing, artificial intelligence or a host of other concerns. It might have new value-adds, or might just make the current set of products simpler and easier to manage.

All we can say is that at least half an eye should be kept open for it, and a healthy dose of scepticism applied to its inevitable claims.
