Thoughts on The Future of Hybrid Cloud IT

Part 1: Key considerations for IT organizations, architectures, and technology

Zorawar Biri Singh
May 7, 2019

Hybrid Cloud IT in 2019 is well on its way to being one of the most significant architectural and operating-model developments in enterprise/B2B markets over the past 50 years. I think of it as a fundamental building block of the Sixth Era of IT evolution, what I call the Age of Shared Cognition (amongst humans, machines and autonomous systems)[link].

Figure 1: Six Eras of IT

This progression builds on a decade of massive technological innovation, capacity, and capability buildout by hyperscale cloud platforms. This arms race has quickly moved past providing infinite, cheap computing and storage to processing massive data pipelines and building out scalable machine learning (ML) and AI inferencing capabilities. Consider that in less than four years, the cumulative CAPEX spending of Amazon, Apple, Baidu, Facebook, Google, Microsoft, and Tencent has exceeded $212 B, growing at close to 30% y/y!

The concurrent digital transformation ‘boom’, seemingly occurring everywhere, reflects a real, self-aware business desire to benefit from faster product development cycles and to get to the data-driven, real-time applications of tomorrow. Today we see almost every business, in every industry, shaping clear, boardroom-backed strategies to adopt mobile-first, AI/data-driven, and cloud operating models to accelerate its own digital journey.

In response, enterprise/B2B IT teams are growing the sophistication and complexity of their cloud deployments across multiple cloud services, and re-balancing their on-premises and off-premises workload architectures to account for global, remote employees and partners, working seamlessly across their core and ‘edge’ networks with 24/7 access.

Looking back at 2015–18, we observed a much wider adoption of open source software (OSS) at all levels of cloud infrastructure, data processing, and apps, as IT orgs. sought to serve LOB and their developers with better operational agility and increased developer velocity. This trend is best exemplified by the adoption curves of projects like RHEL (Red Hat), Android, GitHub, Kubernetes, and TensorFlow, and we will continue to see meaningful growth in other OSS projects across all aspects of contemporary technology infrastructure.

We also saw cloud innovators raise the bar in the simplicity of their products and messaging to IT, particularly in the fields of machine learning/AI silicon, machine-learning as-a-service (ML-aaS), and data-driven developer tools and frameworks.

In parallel, fundamental data center components like compute fabrics, data stores, data processing, security, and networking started undergoing significant transformations:

Figure 2: Evolution of Computing from Bare Metal to Serverless

The decade-long disaggregation of traditional VM-centric data centers into smaller, modular units of computing really took shape with the maturity around containers and Kubernetes as well as the recent, rising popularity of serverless computing. Today, it’s fair to assume that most enterprise IT organizations are purposefully refactoring and migrating meaningful portions of their workloads into containerized and cloud-native microservice architectures.

In fact, while C-suite security concerns around public clouds have largely dissipated over the past decade, ‘data-gravity’ risk and vendor ‘lock-in’ on a particular cloud platform are now significant concerns. As their own choices and experience with SaaS, managed services, and infrastructure services evolve, most enterprises recognize the value and necessity of a hybrid cloud IT strategy.

Defining Hybrid Cloud, and/or Multi-Cloud.

I first came across ‘hybrid cloud’ and ‘hybrid IT’ discussions back in early 2009 while developing cloud strategy at IBM. Even though it was early days, there was a sense among enterprise CIOs and Chief Architects that sooner rather than later, they would be managing some combination of private (on-premises or managed/hosted) and public clouds in unison with their traditional IT workloads and data centers. Indeed, since then, decades-old centralized IT operating models of ‘command and control’ have been giving way to popular cloud-enabled, ‘bring-your-own-device’ (BYOD) self-service models amongst lines-of-business (LOB) and developers.

Figure 3: Evolution of On-premises and Off-premises vs. Core IT and Shadow IT

Terms like ‘hybrid cloud,’ and ‘hybrid IT’ were generally accepted in enterprise IT by 2014 (link), and the term ‘multi-cloud’ emerged in uniform usage around 2015-16, largely as a function of the fact that Microsoft Azure, Google Cloud, Salesforce, etc., had emerged to share the hyperscale/public cloud stage with AWS. While today hybrid and multi-cloud may seem interchangeable, those terms carry with them subtle and important distinctions worth noting:

Hybrid cloud: There is no ‘hybrid cloud’ per se. Enterprises don't deploy hybrid clouds; instead, they manage and operate complex, heterogeneous ‘hybrid cloud environments’ across several locations. These environments arise wherever CIOs and core IT teams seek to deliver + manage legacy resources and cloud-native services across on-premises and off-premises consumption models for a wide variety of workloads and apps.

In other words, hybrid cloud environments are best recognized where IT teams manage not only the infrastructure/apps/services they run themselves but *also* those run by someone else.

Pay-as-you-go or elastic consumption of whatever service is increasingly a given assumption, and a host of managed services and on-premises tools will close the gap between legacy and cloud-first workloads.

Multi-cloud: recently described as using multiple public cloud platforms, either to move workloads between clouds or to provision compute, data/data pipelines, and networking substrates and their workflows for use across clouds and/or geographies (quite difficult today because of networking and data constraints).

My observation here is that cloud-first and data-driven orgs., increasingly developer-centric, tend to gravitate towards the term ‘multi-cloud’ when describing their workflow, whereas more traditional IT teams will use the term ‘hybrid cloud’ or ‘hybrid IT’ for the inherent heterogeneity and history of their practice areas.

For purposes of this discussion, I use the term ‘multi-cloud’ as a sub-part of ‘hybrid cloud environment’, not as an interchangeable term.
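To make the distinction above concrete, here is a minimal Python sketch of how one might label an IT estate. Everything in it is an illustrative assumption rather than an industry-standard definition: the `Deployment` model, the field names, and the rule that any multi-cloud estate also counts as a hybrid cloud environment (my reading of ‘multi-cloud’ as a sub-part of hybrid).

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Deployment:
    """One workload placement (hypothetical model for illustration only)."""
    name: str
    operator: str       # "self" (run by core IT) or "provider" (run by someone else)
    platform: str       # e.g. "private-dc", "aws", "azure", "salesforce"
    public_cloud: bool  # True if the platform is a public cloud or SaaS service


def public_platforms(estate: List[Deployment]) -> Set[str]:
    """Distinct public cloud platforms in use across the estate."""
    return {d.platform for d in estate if d.public_cloud}


def is_multi_cloud(estate: List[Deployment]) -> bool:
    """Multi-cloud: two or more distinct public cloud platforms."""
    return len(public_platforms(estate)) >= 2


def is_hybrid_environment(estate: List[Deployment]) -> bool:
    """Hybrid: IT manages both what it runs itself and what others run for it.

    Assumption: multi-cloud is treated as a sub-part of hybrid, so any
    multi-cloud estate also qualifies as a hybrid cloud environment.
    """
    operators = {d.operator for d in estate}
    mixes_operators = {"self", "provider"} <= operators
    return mixes_operators or is_multi_cloud(estate)


estate = [
    Deployment("erp", operator="self", platform="private-dc", public_cloud=False),
    Deployment("crm", operator="provider", platform="salesforce", public_cloud=True),
    Deployment("analytics", operator="provider", platform="aws", public_cloud=True),
]
print(is_hybrid_environment(estate), is_multi_cloud(estate))
```

In this toy model, an estate consisting only of a self-run private data center is neither hybrid nor multi-cloud, while adding a single public service makes it hybrid without making it multi-cloud.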

So, what do Hybrid Cloud IT environments look like?

Figure 4: Summary of Key Elements for Hybrid Cloud IT

The ‘Holy Grail’ for Hybrid Cloud IT orgs. = increase their business agility + developer productivity across multiple cloud platforms while consistently, and securely, improving the automation *and* repeatability of their core IT operations.

Figure 4 above is a high-level illustration of what I consider the eight (8) key, deterministic elements of future hybrid cloud environments:

[1] IT Roles & Org,

[2] On-premises + off-premises infrastructure ‘stacks’ and services,

[3] Container, Kubernetes, and server-less platforms,

[4] Three different yet converging, software-defined ‘control planes’ for Security, Automation and Orchestration/mgmt.,

[5] Infrastructure specific ‘substrates’ around data, networking and compute,

[6] Different types of workloads,

[7] Edge and IoT networks and,

[8] Machine Learning (ML) and AI apps. and services, as well as their necessary data pipeline architecture and infrastructures.

So, with that high-level introduction, let’s dig into a bit more detail below. I pulled together a comprehensive view to better frame the discussion; and yes, it’s quite the spectacle! Figure 5 below…the ‘Battlestar Galactica’ version of Hybrid Cloud IT futures!

Figure 5: Hybrid Cloud IT in Detail

[1] IT Roles & Org: For business IT teams coping with the myriad of hybrid cloud environments, it comes down to providing clear business value and efficiency in running/maintaining all types of workloads while accelerating mobile + data streams + cloud-first innovation. As the past decade of mobile-first & cloud-first buildout demonstrated, IT teams are increasingly valued for their ability to drive agility over any other cost factor, including high utilization. Today, business agility equals ‘cloud + software agility,’ i.e. how efficiently can a business adapt its IT resources to new business needs in fast-changing, disruptive markets. And hybrid cloud IT is fundamental to accelerating cloud + software agility. For these organizations to succeed and scale, their key roles and practice disciplines must be prioritized by CxOs with access + visibility as follows:

a> IT Ops,

b> Cloud Architects,

c> Security (DevSecOps),

d> Developers and,

e> ML/AI mission-based, data-centric architecture, platform, and apps teams.

[2] On-premises + off-premises infrastructure ‘stacks’ and services: from legacy/traditional IT vendors, as well as from public cloud and SaaS platforms. The evolving ‘stack wars’ between the Big 3 public clouds’ own gateways (Anthos, Azure Stack, and Outposts), the on-premises heritage of VMware and Red Hat/IBM, and other projects like OpenStack/Canonical promise to have very consequential outcomes.

[3] Container, Kubernetes, and server-less platforms: as well as a host of related, managed services (independent of on- or off-premises) and CI/CD and DevOps best practices. Example: how are DevOps teams managing continuous integration/continuous deployment (CI/CD) workflows and code repositories with Kubernetes as a core orchestration framework across ‘private’ and public cloud services?

[4] Three different yet converging, software-defined ‘control planes’ for Security, Automation and Orchestration/mgmt. (also includes cloud/SaaS gateway and/or proxy services): Flattening out business networks with software-defined networking (SDN) and SD-WAN, as well as abstracting away data center complexity through software code + ‘fabrics’, are some of the most critical hurdles for enterprises and their technology partners to solve. For these control planes to pull together the orchestration and automation substrates for Hybrid Cloud environments securely, they will require much more robust APIs and ‘infrastructure-as-code’ interfaces that are platform agnostic and that extend central policies to span and reinforce trust and immutability across the entire hybrid environment. How AI/ML will aid in simplifying and automating the complexity of these multi-domain architectures is already a meaningful topic in 2019.

[5] Infrastructure specific ‘substrates’ around data, networking and compute: the three control planes mentioned above are closely coupled and interdependent with these substrates. Specifically, these infrastructure ‘fabrics’ are a combination of various silicon, appliances, devices, and software-defined ‘meshes’ designed to enable and orchestrate ‘portable’ compute to the edge and across vital, encrypted data stores. Also, given the finite limits to Moore’s Law and Dennard scaling encountered over the past decade, there has been an extraordinary renaissance in hardware to take compute beyond x86 processors into specialized ASICs, GPUs, FPGAs, TPUs and other custom silicon. Elements [4] and [5] are two areas I’m currently quite enthused about.

[6] Different types of workloads: traditional VM-based, pure SaaS, cloud-native, data-driven, etc. I think the important issues that lie ahead for IT professionals and line-of-business (LOB) app owners to sort out are (1) lift-and-shift vs. other migration options for legacy workloads vs. ongoing SaaS ‘sprawl’, (2) data gravity/data risk compliance, and (3) embracing data pipelines and incorporating relevant machine learning approaches.

[7] Edge and IoT networks: grouped together here for simplicity; I will break them out in more detail in future posts. The implications for processing massive, real-time data streams, and the cost/complexity of implementing ML inferencing at the ‘edge’ and across IoT networks, are enormous. Significant disruption to Telcos/SPs is possible here, with 5G + effective “inference per watt” systems gathering momentum.

[8] Machine Learning (ML) and AI apps. and services, as well as their necessary data pipeline architecture and infrastructures: AI research and techniques have been around for decades. Current-day machine learning (ML) and deep learning (DL) technologies have continually evolved from knowledge-based AI systems of the late 1990s, through neural networks in the 2010s, to recent rapid improvements in reinforcement learning (Google AlphaGo, Oct 2015). Combined with truly massive data proliferation everywhere, vast, cheap, ubiquitous pools of computing, and a renaissance in hardware/silicon accelerators, ML/AI has taken hold in theory and practice for businesses and industries everywhere. The current ML/AI hype-cycle is unrelenting and loud. Yet it’s early days, and end-to-end data + analytics architectures/infrastructure within enterprises are simply not mature enough, or consistent enough, to deliver sophisticated ML solutions for broad business needs. If we recall the evolution of modern web applications, there are clear waypoints that show how organizations used tools to push applications to production, monitor performance and reliability, and increase the frequency of their deployments.

The modern equivalents of a DevOps best-practices culture and mature CI/CD toolchains simply do not yet exist for ML/AI ecosystems; we don’t yet have the equivalent of a web ‘LAMP’ stack for ML/AI. But these tools and best practices are coming. I believe a CI/CD and DevOps platform equivalent for AI developers is a tremendous opportunity for enterprises everywhere, and it’s another area I’m personally very excited about.
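Returning to element [4] above, the idea of a platform-agnostic, ‘infrastructure-as-code’ style central policy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Resource` model, the two policy names, and the `evaluate` function are hypothetical and do not correspond to any vendor's API or policy engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Resource:
    """A normalized view of a resource, whatever platform it lives on (hypothetical)."""
    name: str
    platform: str            # e.g. "on-prem", "aws", "azure", "gcp"
    encrypted_at_rest: bool
    publicly_exposed: bool


# A policy is a predicate that returns True when the resource complies.
Policy = Callable[[Resource], bool]

# Central policies are declared once and applied to every platform alike.
POLICIES: Dict[str, Policy] = {
    "encryption-at-rest": lambda r: r.encrypted_at_rest,
    "no-public-exposure": lambda r: not r.publicly_exposed,
}


def evaluate(resources: List[Resource]) -> Dict[str, List[str]]:
    """Return {resource name: [violated policy names]} for non-compliant resources."""
    violations: Dict[str, List[str]] = {}
    for resource in resources:
        failed = [name for name, check in POLICIES.items() if not check(resource)]
        if failed:
            violations[resource.name] = failed
    return violations


inventory = [
    Resource("orders-db", "on-prem", encrypted_at_rest=True, publicly_exposed=False),
    Resource("logs-bucket", "aws", encrypted_at_rest=False, publicly_exposed=True),
    Resource("ml-features", "gcp", encrypted_at_rest=True, publicly_exposed=False),
]
print(evaluate(inventory))
```

The point of the sketch is the shape, not the rules themselves: resources from every platform are normalized into one model, and the same policy set is evaluated against all of them, so trust does not depend on which cloud a workload happens to run in.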

It’s been over a decade since the Apple iPhone and AWS were introduced, and during that time we endured one of the worst financial crises in history. Despite that, there has been a phenomenal buildout since and the rate of technological progress has only accelerated. We are fast approaching a post-mobile and post-cloud world. What’s happening next is a tsunami of data-driven, learning algorithms and AI-first orientation everywhere, especially in enterprise IT. Existing business models are due for refactoring and re-balancing courtesy of real-time data and inference-based decision making.

In my ongoing series of posts here, I hope to cover each of the segments above in more detail and discuss how their trends and issues continue to inform our understanding of the evolution of Hybrid Cloud IT.

Thanks for visiting and please do share your feedback!
