Multicloud: Taming the Rookery

Don’t let your frogenator get you down.

David Feuer
APIs and Digital Transformation
7 min read · Dec 5, 2017

%^@&*!!!

It was late. I was with my old platform team in a previous job, working with a client development group on a design for a fairly straightforward migration from a local application to a cloud service, and we couldn’t come up with an elegant design the client would agree to.

There were just too many systems, too much spaghetti code; it seemed like a wholesale cutover (along with the associated significant downtime) was the best way to do it. The planned downtime would be bad for business and bad for the customer experience, but there just didn’t seem to be another way. The client architects and developers felt that it was simply not feasible to “lift & shift” the backend system(s) without taking time for the “lift”; while the data was in transit, so to speak, there was no way to take traffic. Even though we spelled out the inevitable consequences, this was the required path, and the “architecture review board” approved it.

So, that was the path we took. It was painful: the expected downtime of eight hours turned into almost 24 hours because of configuration issues with downstream systems (why didn’t anyone document IP whitelisting in the modulated frogenator?*). By the time the problem was solved, customers were frustrated, downstream consumers weren’t getting the alerts they had signed up for, and stores weren’t getting the foot traffic and were doubtless looking for alternatives.

If I had to write a Root Cause Analysis in a post mortem for this event, it would be a one-liner: “The hairy backend systems were just too hairy.”

When we think of all of our backend systems, some of which may span hundreds or even thousands of services, the image of a whole flock of albatrosses comes to mind. When I Googled the term for a group of albatrosses, I landed on the word: it’s called a “rookery.”

Revised Root Cause Analysis: “too many albatrosses in the rookery.”

Sometimes it seems like the multitude of interdependencies that have been built in monolithic backend systems — frequently depicted as an onion of layers, from the network core to various networking layers at the edge — is too great to assess and fully document. That sort of project can sometimes take person-years of effort, is prone to human error, and may still result in service interruption. The ever-pressing need to move to the cloud and deliver cloud-based platforms that support platform business models cannot wait person-years for a documentation effort.

There has to be a better way…

Wrangling Complexity: The Future of Business IT

In our last article, we talked about the homogeneous backend systems that make IT horror stories like the one I just shared so common: why older systems are homogeneous silos that exist as “stacks” from large enterprise software vendors, and why they are so frequently tightly coupled.

As computing moves to the cloud, many enterprises are wary of tying themselves to single approaches or solutions that could produce a new generation of tightly coupled, inflexible, enormously complex systems. Surveys suggest the vast majority of enterprises have either already adopted hybrid architectures or plan to do so within the next few years. For many businesses, the approach is neither purely on-premises nor purely cloud; it’s aspects of both.

Beyond the challenges of assessing all the options in cloud hosting and associated infrastructure (reliability, features, security, geographical distribution, etc.), there’s also the matter of application design. Is it possible to design and deliver applications to be multi-cloud native while also providing business value beyond simply making your whole stack compatible with more than one cloud vendor?

In other words: rather than just mitigating the usual sources of supply chain risk (capacity risk, quality risk, financial risk, market risk), is it possible to mix and match cloud software components to create a “killer stack” for a given application?

The API Façade in Multi-Cloud Environments

Looking at a given application, understanding the levers — specifically, what areas of flexibility are the most important to preserve — is key to architecting a backend that reflects the needs of the developer and the company exposing the application. No single “one size fits all” solution will exist for every company’s every need.

But by leveraging the API façade design pattern, enterprises can introduce an abstraction layer in front of storage, compute, and networking resources, whether those resources are in a company data center, Microsoft Azure, Google Cloud Platform, Amazon Web Services, or another cloud provider. By introducing an opaque layer between applications and the underlying cloud providers, companies can position themselves to move workloads intelligently based on strengths and opportunities of the providers, in a manner that is frictionless to services and applications, whether they are internal or external.
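
To make the pattern concrete, here is a minimal sketch in TypeScript. The ObjectStore interface, the adapter classes, and the routing rule are hypothetical names invented for illustration; the adapter bodies are placeholders rather than real provider SDK calls.

    // Hypothetical façade over storage: application code depends only on this
    // interface, never on a specific provider's SDK.
    interface ObjectStore {
      get(key: string): Promise<Uint8Array>;
      put(key: string, data: Uint8Array): Promise<void>;
    }

    // One adapter per backend. The bodies are stubs; a real implementation
    // would wrap the data center's storage API or a cloud provider's client.
    class DataCenterStore implements ObjectStore {
      async get(key: string): Promise<Uint8Array> {
        throw new Error("wrap the in-house storage API here");
      }
      async put(key: string, data: Uint8Array): Promise<void> {
        throw new Error("wrap the in-house storage API here");
      }
    }

    class CloudStore implements ObjectStore {
      constructor(private provider: "gcp" | "aws" | "azure") {}
      async get(key: string): Promise<Uint8Array> {
        throw new Error(`wrap the ${this.provider} storage client here`);
      }
      async put(key: string, data: Uint8Array): Promise<void> {
        throw new Error(`wrap the ${this.provider} storage client here`);
      }
    }

    // Which backend serves a given workload becomes a routing decision inside
    // the façade, invisible to the applications calling it.
    function storeFor(workload: string): ObjectStore {
      return workload === "archive" ? new CloudStore("gcp") : new DataCenterStore();
    }

Because callers only ever see the façade’s interface, shifting a workload from the data center to a cloud provider (or between providers) is a change inside the façade, not in every application.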

In other words, beyond taming the rookery of legacy infrastructure, APIs also provide an opportunity to build a best-in-breed platform using the best cloud components on the Internet. Not only are we moving away from aging and stoic systems of record (and the processes associated with updating and maintaining them); we’re also shifting the producer/consumer power dynamic: an enterprise can now engage multiple providers and choose whichever ones make the most sense for its applications, services, and development teams.

Don’t Get Rustled

Some customers may find embracing a single provider to be the best solution. But as products and services proliferate, so do use cases, which increase backend technical complexity. Whether organically or by design, many organizations will find themselves dealing with multi-cloud scenarios. The best solution may not always be in the same provider’s stack. Introducing even a single component of new technology may then cause friction as additional time and cost are invested in interoperability.

A multi-cloud strategy, therefore, should be designed API-first and developer-first, from the ground up. A core tenet of every modern infrastructure solution, whether cloud-based or on-premises, should be designing and developing solutions to be leveraged by developers. This does not necessarily mean recreating the cloud providers’ existing service interfaces, but rather expressing the application experiences themselves as APIs. If those APIs are hosted in a façade, an opportunity arises: that façade can intelligently broker API transactions between diverse backend systems, compute and storage resources, and other cloud services.

If my /customers resource is a thin veneer over CRM data, and I am having trouble obtaining and integrating other customer data, the answer is not a forced choice between moving all that data out of the CRM and putting it all into the CRM. By introducing an API façade, I can take requests to the /customers resource and then make two requests: one to a CRM resource and one to other local infrastructure. As I move my local infrastructure to the cloud, the API façade simply requests that same data from the cloud resource instead, with no interruption to the developer. The developer just gets the customer record requested, but this time the enterprise can breathe a deep sigh of relief that the 25-year-old customer relationship management system written in Delphi (with a Paradox backend, I’m sure) can be retired to the museum where it likely belongs.
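
As a rough sketch of that /customers scenario, the façade might fan out to the CRM and to the second customer data source and merge the results, so that repointing the second source at the cloud is purely a façade configuration change. The endpoint URLs and environment variables below are hypothetical; the sketch uses TypeScript and the global fetch API.

    // Hypothetical backend endpoints. Repointing PROFILE_URL from the local
    // system to its cloud replacement migrates the data source without
    // changing anything a developer calling /customers can observe.
    const CRM_URL = process.env.CRM_URL ?? "https://crm.example.internal/customers";
    const PROFILE_URL = process.env.PROFILE_URL ?? "https://profiles.example.internal/customers";

    // Handler behind the façade's /customers/{id} resource.
    async function getCustomer(id: string): Promise<Record<string, unknown>> {
      // Fan out: one request to the CRM, one to the other customer data source.
      const [crmRes, profileRes] = await Promise.all([
        fetch(`${CRM_URL}/${encodeURIComponent(id)}`),
        fetch(`${PROFILE_URL}/${encodeURIComponent(id)}`),
      ]);

      // Merge the two partial records into the single representation callers expect.
      const crm = (await crmRes.json()) as Record<string, unknown>;
      const profile = (await profileRes.json()) as Record<string, unknown>;
      return { ...crm, ...profile };
    }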

The API façade is one of the most elegant ways to wrangle all of these backend systems. This is a common and ideal scenario in which an API platform can provide direct economic and functional benefit. The platform can initially be configured simply to pass traffic through without transforming it, and even that offers some very significant advantages:

  • it becomes a silent disintermediator, able to receive and absorb requests and, in the future, respond to them differently than the stoic backend systems were designed to
  • it can collect analytics to identify the sources of potential problems, assess higher-risk and higher-value requests and transactions, identify which systems to move first, and suggest how best to prepare for the migration (a minimal pass-through sketch follows this list).
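
Here is a minimal sketch of that pass-through mode in TypeScript; the backend URL and the in-memory record list are stand-ins for whatever analytics pipeline the API platform actually provides.

    // Each forwarded call produces a small analytics record; over time these
    // show which resources are highest-risk, highest-value, and best to
    // migrate first.
    interface CallRecord {
      path: string;
      status: number;
      latencyMs: number;
    }

    const records: CallRecord[] = []; // stand-in for a real analytics sink

    // Pass-through mode: forward the request to the existing backend untouched,
    // record what happened, and return the response as-is.
    async function passThrough(path: string, backendBase: string): Promise<Response> {
      const start = Date.now();
      const res = await fetch(`${backendBase}${path}`);
      records.push({ path, status: res.status, latencyMs: Date.now() - start });
      return res;
    }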

Once that façade is in place, the existing rookery of infrastructure can more easily be both assessed (using API analytics) and tamed (using cloud computing components to aggregate and enrich data in ways that offer the most value to the business).

An API façade can offer additional benefits to a multi-cloud strategy. By leveraging the façade design pattern and an API-first development culture, organizations can design and develop the solutions of the future, creating a significant source of business and technology leverage. The business leverage comes from supply chain independence: seamless transitioning of workloads to the highest-value providers, and elastic growth without reliance on a single source. This supply-chain risk mitigation both creates bargaining power for the organization and hedges against provider regression or degradation. The technology leverage comes from moving workloads to providers that are performing the best (lowest latency, closest to customers, highest performance, etc.) or that have the most compelling features.

Use APIs to Avoid the Rookery

It can be an overwhelming task to inherit on-premises infrastructure and its associated projects. By adopting an API façade, businesses can create leverage, gain independence from lock-in scenarios, migrate to multiple clouds, and focus on what really matters: creating market experiences that have an impact. Don’t let your rookery take over!

*Modulated Frogenator: A completely made-up term to represent some legacy backend system that was over-engineered, and is now simultaneously needlessly complex and understaffed
