Scaling Hybrid/Multi Cloud Infrastructure Efficiently at Enterprises — Q4 2021 EAB Meeting
We hosted the 2021 Q4 Executive Advisory Board (EAB) meeting, where 25 corporate executives got together to share challenges and exchange insights on enterprise cloud management.
Adoption of hybrid/multi cloud in the last decade
Data shows that cloud infrastructure service spend increased throughout the last decade, reaching $125 billion in 2020 and $41 billion in the first quarter of 2021 alone. In contrast, spend on data center-related hardware and software stayed flat over the same ten years, hovering around $90 billion a year. Additionally, global VC investment in the cloud and DevOps sectors grew from almost zero in 2010 to $15 billion in the first half of 2021. (source: PitchBook)
Within cloud spend, public cloud platforms are the most significant piece. AWS started the cloud shift back in 2006 and is still the dominant player in the market. However, Microsoft Azure, Google Cloud Platform (GCP), and Oracle Cloud have quickly caught up, and we are seeing signs that AWS’ dominance is weakening. In our own survey of the attendees of this event, the answers to the question “Which Cloud Service Provider(s) does your company currently use?” were 84% AWS, 58% Azure, 37% GCP, and 16% other CSPs (and 32% on-premise). In terms of spending on each platform, nearly 10% of the AWS users are already spending more than $12 million annually, and 36% of the Azure and GCP users are spending more than $1.2 million annually, so public cloud is becoming a meaningful spend for enterprises. (source)
Challenges seen in hybrid/multi cloud environments
As the cloud shift played out over the last decade, many early adopters ran into new challenges. For at least the last four years, the highest priority has been “to optimize the cloud spend,” as people started to realize that 30 to 50% of cloud spend typically goes to waste unless it is optimized at a deeper level. (source) There was (and still is) a strong belief that cloud migration would bring a significant reduction in CAPEX by moving to a smaller OPEX spend model. However, early and large adopters have found that cloud spend can grow out of control to the point that it impacts their gross margin, so optimizing cloud infrastructure has become a critical demand.
So, what does “optimization” mean here? The concept chart above is a good example from one of the largest streaming companies. The left chart shows the actual traffic pattern, which corresponds to CPU consumption on the infrastructure side. You can see that it is highly concentrated in certain time slots of the day, for reasons that are obvious for a streaming service. Without optimization, one would want to keep cloud capacity at the dotted line to avoid any latency during peak time. But you can quickly spot the underutilized blank areas on either side of the peak in the left chart; that’s the waste. In the right chart, with an appropriate forecast and precise optimization, the allocated capacity (blue line) sits slightly above the actual usage of the workloads (white line), achieving nearly zero waste. This is a very simplified example, and in reality optimization involves the cloud instance type, CPU type and volume, memory, redundancy, geo setting, and so on, but you get the message: “you’d better optimize the infrastructure.”
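To make the chart's argument concrete, here is a minimal Python sketch comparing the two provisioning strategies. Everything in it is an assumption for illustration: the hourly demand numbers are made up (not data from the streaming company), the function names are hypothetical, and the 10% headroom stands in for "slightly over the actual usage." It also assumes a perfect forecast, which real systems do not have.

```python
from typing import List

# Hypothetical hourly CPU demand (in cores) over one day for a
# streaming-style workload with an evening peak. Illustrative only.
HOURLY_DEMAND: List[float] = [
    20, 15, 10, 10, 10, 15, 25, 40, 50, 45, 40, 45,
    50, 45, 40, 45, 55, 70, 90, 100, 95, 80, 50, 30,
]


def static_allocation(demand: List[float]) -> List[float]:
    """Provision for the daily peak at every hour (the dotted line)."""
    peak = max(demand)
    return [peak] * len(demand)


def tracking_allocation(demand: List[float], headroom: float = 0.10) -> List[float]:
    """Provision slightly above demand each hour (the blue line).

    Assumes the forecast is perfect; headroom is an assumed safety margin.
    """
    return [d * (1 + headroom) for d in demand]


def waste_fraction(allocated: List[float], used: List[float]) -> float:
    """Share of allocated capacity that is never used."""
    total_allocated = sum(allocated)
    return (total_allocated - sum(used)) / total_allocated


static_waste = waste_fraction(static_allocation(HOURLY_DEMAND), HOURLY_DEMAND)
tracking_waste = waste_fraction(tracking_allocation(HOURLY_DEMAND), HOURLY_DEMAND)

print(f"Peak-provisioned waste:   {static_waste:.1%}")
print(f"Demand-tracking waste:    {tracking_waste:.1%}")
```

With this made-up profile, provisioning for the peak leaves more than half of the allocated capacity idle, while tracking demand with a small margin keeps waste close to the headroom itself. The exact numbers depend entirely on the assumed demand curve.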
Underlying elements of the cloud shift
Other key technology evolutions have also pushed enterprises to adopt cloud, and to shift further to hybrid/multi cloud. The first is the way an application is run and managed. In a traditional deployment, applications ran directly on hardware and operating systems. Then “virtualization” technology was introduced, which allowed enterprises to utilize hardware more efficiently, as multiple applications/VMs could spin up and down based on demand on top of a cluster of hardware. Next came the container, which has properties similar to a VM but brings many other benefits: lighter and more agile application creation and development, app observability, a good fit with the DevOps cycle of highly frequent releases, isolation from OS dependencies, and so on. Most importantly, it enabled portability to a broader environment (cloud!). Finally, Kubernetes came out and enabled enterprises to run and manage multiple containers in a cluster in a reliable way. It’s a relatively new layer of technology but is quickly being adopted in the enterprise DevOps space. Currently, 88% of enterprises use Kubernetes as their container orchestration layer, and 74% are already using Kubernetes in production. (Source)
The other element is that application architecture has shifted from the traditional monolith to modern applications built on a microservices architecture, with containers running each of the workloads. Each microservice is a very small unit, like a micro application running inside a large application. The modern DevOps cycle is very fast: teams code, build, test, and release, then monitor the actual production environment, and the feedback loops back to the team so they can iterate on the code to improve performance, UI, and features in near real time. Releases used to happen once every three to six months, but nowadays they happen multiple times a day. Our internal survey also reveals that some teams release on cycles as short as 11 minutes. And all of this is happening in the cloud.
Business impact — increased cost and decreased velocity
Although cloud migration is in motion with all the underlying technology shifts laid out above, we have started to see some early adopters make a reverse migration, pulling out of the public cloud. The biggest issue is the economic impact. Dropbox is a great example: it used to be one of the biggest cloud spenders among enterprises, then realized the spend was hurting its gross margin and moved back from the public cloud to its own infrastructure. As a result, its gross margin improved from 33% to 67% (a16z has a great blog post on this). We’re starting to see some cloud-centric service companies do the same, with expected cost improvements of 30 to 50%. This is one of the biggest pushbacks against cloud migration: it’s not as simple as moving the whole workload into the cloud. You have to be smart about how you set up the whole environment so that you can run it efficiently.
Summary — All coming in at once
Hybrid/multi cloud is becoming the mainstream IT infrastructure for enterprises. It’s one of the few effective ways to keep deployment flexible while gaining benefits in risk reduction, backup, redundancy, disaster recovery, and cost efficiency.
- There was a strong belief that the migration of workloads from on-premises to the cloud would give a reduction in CAPEX, but many started to realize a significant increase in overall cost.
- Application architecture has shifted from monolithic to containerized microservices, which has made it much easier to port applications to cloud environments.
- Modern DevOps teams release at very high frequency and need robust ways to dynamically optimize their workloads in the cloud.
- DevOps talent shortage is pushing enterprises to manage their development environments more efficiently, preferably in an automated way.
More and more enterprises will migrate their containerized workloads to the cloud, most likely across multiple CSPs. There will be multiple Kubernetes clusters and legacy VM workloads in the environment. With the shortage of talent and the pressure of a high release cadence, you need a way to manage and optimize everything uniformly through a single pane of glass, with full automation. That’s a lot to ask. It requires next-generation tools, and that’s why we believe companies like CloudNatix could contribute to the market.
Accelerate innovation with efficient compute infrastructure anywhere — CloudNatix
We had Rohit Seth, Founder and CEO at CloudNatix, speaking and joining the discussion during the meeting. Rohit has spent the last 20 years tackling IT infrastructure challenges. He was responsible for enabling the Linux ecosystem at Intel in the 2000s, then moved to Google and co-invented containers in 2006, as “even a tech-savvy company like Google was only using 15% of its capacity on a good day.” With that deep understanding of the sector and expertise in the technologies, Rohit launched the CloudNatix platform, which not only provides business intelligence but also optimizes capacity and availability and scales DevOps through autopilot.
Then, we had a discussion where the EAB members shared their thoughts on cloud management from their own perspectives. They offered various points of view on existing challenges as well as the potential for a breakthrough. Some of the points raised during the discussion were:
- Migrating to the cloud and managing it across heterogeneous systems, business units, or applications can be challenging, both because DevOps works differently on different infrastructures and because there is a shortage of people who can operationalize the changes that come with migration.
- Operating seamlessly across VMs, containers, and cloud-native environments is key.
- The DevOps framework is different when it operates in the cloud, which can cause security problems, so enterprises need to be aware of this and take action accordingly.
Hybrid/multi cloud migration and management remains a major ongoing discussion for every enterprise. The key is understanding that it is not just about moving to the cloud and managing the infrastructure, but about actually changing the architecture to be well suited for the cloud.