I used to work on Eucalyptus, one of the earliest private cloud implementations, out of which OpenStack was born. Eucalyptus used AWS API and service compatibility as its core differentiation. Eucalyptus was to AWS what Azure Stack is to Azure: a subset of AWS services delivered on-premise with hardware. I’ve found the parallels uncanny, and I’ve been pretty surprised by the lack of attention Eucalyptus has received as folks debate the merits of Azure Stack, because it’s a natural reference point.
Certainly there is something very appealing about being able to deliver public cloud on premise.
“What if we could put the public cloud into our own facility?” — said many, often
The quick route to this is to think about packaging up public cloud services into some sort of software stack to deploy onto your usual kit (servers, network switches, storage arrays). Sounds straightforward but in reality it’s very difficult to be successful with this approach because of the long tail. There are a tonne of technical and commercial issues which follow.
It was along these lines that a bit of a barney erupted a while back on the interwebs, the focus being on the relevance of Microsoft’s Azure Stack. Snake oil, legacy company, that sort of stuff. Is it just more of the same private cloud stuff or is it really something groundbreaking?
I’m a big fan of continuous learning and experimentation, so I love seeing past lessons applied to similar problems later on. I think that some of the lessons learned at Eucalyptus relate directly to the journey Azure Stack will take. I figured I would try to describe these, in no particular order. For the record, I generally think Azure Stack in its current form is a waste of time and effort and should be avoided by most.
At Eucalyptus we started with “pure” private cloud and infrastructure-heavy messaging. Over time we began to talk more about “workloads” rather than “virtual machines”. We began to speak about use cases and how a user could extend AWS back on-premises for niche workloads. This was a complete pivot because very early on (~2008) the public versus private debate was still very heavily weighted towards the latter. Anyhow, the key word was “extend” and not “replicate”. Unfortunately “replicate” is the viewpoint many take, particularly with Azure Stack and especially among hosting companies and service providers (more on that later). In our later-formed vision AWS was the centre and Eucalyptus deployments would be satellites, deployed into countries with no AWS region, yet. We also began to talk more about template re-use (CloudFormation templates) across both platforms and sharing tools and architecture patterns. All of this appears similar to Microsoft and their Azure Stack messaging: it started as a partner-heavy infrastructure pitch but now you hear more about edge use cases, serverless and architectural practices.
Another component of popular messaging around Azure Stack is the notion of “dev in public, prod in private”, that is to say conducting development of a software solution in Azure public cloud and then deploying that solution onto Azure Stack when entering production, presumably due to security concerns. On paper this sounds great. Use some elastic utility infrastructure for your development purposes but when it comes to sensitive data, bring it in-house. This tends to ignore feature parity. If I want to develop my solution using Managed SQL, a bit of Stream Analytics and some Machine Learning, how can I bring that in-house when Azure Stack doesn't, and probably won’t, have those capabilities? I need to build to the lowest common denominator. Once again this was exactly the issue we faced in the field with Eucalyptus. Many customers want to exploit “higher level features” because these are the very capabilities which are harder and more complex to build themselves, thus they turn to cloud primitives or building blocks for speed and reduced time-to-market.
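To make the lowest common denominator point concrete, here is a rough sketch in Python. The two service sets are assumptions lifted from the examples in this post (Functions and plain VMs on the private side, the managed services only in public); they are not a real catalogue for either platform, and the function name is made up for illustration.

```python
# Illustrative only: these sets are assumptions taken from the examples in
# this post, not an actual service catalogue for Azure or Azure Stack.
PUBLIC_SERVICES = {
    "virtual machines",
    "functions",
    "managed sql",
    "stream analytics",
    "machine learning",
}
PRIVATE_SERVICES = {"virtual machines", "functions"}


def portable_services(wanted, public=PUBLIC_SERVICES, private=PRIVATE_SERVICES):
    """Split the services a solution wants into two sorted lists:
    those available on both platforms, and those only in public."""
    wanted = set(wanted)
    portable = wanted & public & private
    public_only = (wanted & public) - private
    return sorted(portable), sorted(public_only)


if __name__ == "__main__":
    portable, public_only = portable_services(
        ["functions", "managed sql", "stream analytics", "machine learning"]
    )
    print("portable:", portable)
    print("public only:", public_only)
```

Anything that lands in the second list is something you either avoid during development or rebuild yourself before going to production in-house.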
I mentioned the satellite scenario earlier. This is a challenge for some customers and service providers. Rather than seeing it as just that, they will see it as a method by which to compete with Azure, the underlying objective being to keep customers on a platform operated by them and not Microsoft. In many ways I see this as creating unwanted competition between Azure Stack and Azure. In the early Eucalyptus days we spent many cycles working with service providers and customers (mostly traditional IT departments) who figured that deploying a Eucalyptus cloud gave them the ability to offer something more compelling and bespoke than AWS. This was the “compete” rather than “complement” nature of an opportunity or lead which would often get us worried. Over time we learned to avoid these and saw “compete” as a warning sign during sales qualification. We forcibly de-prioritised the service provider market because of it but, 5 years later, these scenarios were still hitting the pipeline with much the same regularity. The point is that the “private cloud software on some servers” approach feeds old behaviours and can stall cloud adoption (i.e. new tech, new methods etc.) for legitimate workloads.
This follows on nicely from my last point. I expect that part of the reason Azure Stack exists is to please traditional on-premise hardware and software partners as they, in some cases, get left behind by the move to cloud. The problem of “compete” versus “complement” is accentuated by these partners. Many partners have legacy hosting businesses which ultimately compete with Azure, even when not in the same league. They are hearing the familiar sound of a vacuum cleaner sucking up workloads from one to the other. On top of a conflict of interest, these partners have sales teams who are used to selling hypervisor management software and servers. That is much easier than selling this cloud adoption and transformation malarkey, which is hard and where the deals are smaller. Big fish to little fish and a huge rebalancing act which will never truly match up: you’re not likely to find that old 5-year commit and $500m deal. It’s too easy for Azure Stack to become the default for partners rather than Azure because (1) it’s like the old hardware, software and hosting stuff and (2) it’s going to perpetuate some managed hosting sales (right?).
At Eucalyptus we prided ourselves on ease of installation and operation. We had some larger deployments (100+ servers) where the platforms were run by small numbers of staff, but these teams needed to be *very* highly skilled. Running a feature-rich distributed system is hard. As you add more features, complexity increases. As complexity increases you need more and more smart people and domain experts (e.g. networking, DB, storage) to keep it running nicely and you need to make bolder and bolder design and build decisions. This is an economy of scale issue, which is why it suits public cloud managed services so well.
This was the future facing Eucalyptus: take a different approach to making it simple (refuse to implement some services, focus on niche use cases, create a fully integrated appliance etc.) or have a customer hire more and more clever folks to keep it running. It taught us some valuable lessons on qualification, sizing, staffing, integrations and ops.
We also spent large amounts of effort trying to enable partners and customers to operate Eucalyptus effectively, let alone use it. Something Microsoft also seem to be articulating. They say you will need a new breed of sysadmins:
Azure Stack will need special sysadmins, says Microsoft
Microsoft reckons its forthcoming Azure Stack on-premises cloud needs a special breed of sysadmin to keep it humming…
Do you really want to focus business effort on building teams around the plumbing?
It’s on this basis that some of the main implementation issues with partners and customers come in. We struggled to find those who had technically capable consultants and architects who could make a Eucalyptus deployment successful. It’s not to say there weren’t any, it’s just that an implementation covers a broad set of technology disciplines, at depth. Much of the burden fell on us to pick up the slack. I have to give some serious kudos to members of my team and our customer success team who worked hard on making customers successful. When it comes to Azure Stack, I would expect Microsoft to have to put a tonne of effort into implementations, perhaps disproportionate levels of effort, often taking the reins on failed or struggling deployments. It’s going to be more popular than Eucalyptus, I expect, so if we were busy they’ll be very busy. However, this could be effort better placed somewhere else in the business. This has technical implications too: imagine a service team working on Azure Service X that spends 50% of its time trying to fix issues with just getting its service packaged, deployed and running reliably in Azure Stack, let alone used by anyone. Is that sensible? They could be spending that 50% working on new features and capabilities within public.
It’s also worth mentioning the serverless aspect of Azure Stack. It makes for a nice marketing slide that Azure Functions is shipped with the platform. Serverless is the next level of abstraction away from server management, but private cloud by its very nature demands that you operate the underlying infrastructure you would otherwise have someone else handle. So you’ve merely shifted the problem around internally rather than removed it or offloaded it externally via a managed service.
I think over time Microsoft might also see that there is a strong case for ditching Azure Stack partners and producing an integrated and limited appliance which they fully own and control. This could increase their chances of success if they find partners unable to implement effectively.
With Eucalyptus we also found that implementing some AWS-compatible services presented higher order challenges. A good example would be RDS. It wasn’t so much the technical challenges of creating the service but more the ongoing support and maintenance that such a service requires. You need a tonne of automation and some very good SREs with very strong database knowledge to keep the RDS service running on a daily basis. Not all customers or users have those. Microsoft will experience the same: customers will want more and more service implementations but it opens Pandora’s box should they implement them. This is a part of why you don’t see that many Azure services implemented in Azure Stack. Some just don’t fit easily: a service which requires a tonne of management components and custom hardware just isn’t easy to compress into a typically smaller private implementation on ISS (Industry Standard Servers) over which there is considerably less control (change/config control, hardware etc.). Could CosmosDB be implemented on Azure Stack with very little change to the management plane requirements? Probably not.
What about blowback? A feature in Azure Stack dictating a feature in Azure or vice-versa. It could mean that a feature in Azure needs to be held for something in Azure Stack to catch up. Maybe a joint or synchronised release? Maybe a critical feature? It comes back to this effort conundrum. Its severity will differ but this problem will always be there. Users utilising a service on Azure Stack will always complain that it’s missing a particular feature present in Azure public. Do they wait? Does Microsoft divert some resources to prioritise the feature (thus de-prioritising something else)? Maybe this is noise they can do without.
Circling back around to serverless again, on the back of the feature/service subset problem. One of the biggest benefits of serverless is the ecosystem and integrations. If your on-prem deployment lacks a tonne of public features this benefit is heavily diminished. We found the same thing with Eucalyptus when considering Lambda. It sounded great on paper but without direct access to services like API Gateway, DynamoDB, RDS, Elastic Transcoder etc. its usefulness was severely curtailed. You then have to architect hybrid scenarios which, for the vast majority of use cases you previously deployed private cloud for, are simply not suitable on principle or in practice.
Serverless and an Intelligent Edge?
So is Azure Stack a valid edge play? In its current form I’m unconvinced, since it leaves too much that can go wrong in the hands of the partner and is too closely tied to a legacy concept of private cloud. It’s interesting to see that Functions has been a top priority for Microsoft in terms of Azure Stack release features: they recognise the new trend as a marketing strength and see the old mould as a weakness.
I think it’s useful here to compare Microsoft and AWS to illustrate the different paradigms, as they currently stand. Amazon also have a plan for the edge. It involves Snowball Edge, Lambda@Edge and Greengrass. I’m sure they will become even bolder here, but the point is that their offerings are far more sophisticated in ushering in a new serverless computing paradigm as the primary form of on-premise play, rather than pure VMs or even containers, as Azure Stack does. As mentioned, a couple of the biggest benefits of serverless are ecosystem integration and doing away with infra management.
Anyway, coming from a Eucalyptus background, that’s my 2p.
If Azure Stack and service engineers have pagers, they can expect many sleepless nights in future.
Disclaimer: since writing this article I have joined AWS.