Headed to Amazon re:Invent 2017? Here’s a recap of all things Cloud, Serverless & AI from this past year.

Vipin Chamakkala
10 min read · Nov 27, 2017


At Work-Bench, our investment ideas are informed by the pain points of Fortune 500 technology leaders who attend our roundtables and events. While we see a lot of “marketecture” around treating infrastructure as code and end-to-end automation across clouds, what we’re hearing is that this is still a ways from reality at a large enterprise today. Making the right engineering choices is hard, and most infrastructure executives are struggling with the rapid pace of innovation, the weight of existing investments, and the need to evolve their teams to implement, run, and manage these systems. Our roundtables function more like a support group.

In this post, I want to share themes that have been top of mind over the past year that could be useful for practitioners and executives to think about in preparation for upcoming conferences like Amazon’s re:Invent (and CloudNativeCon in Austin). For a full overview of enterprise technology trends, be sure to check out our Enterprise Almanac.

The Big Three

The public cloud provides major advantages around cost efficiency and flexibility for end users, and 2017 was an all-out battle between Amazon, Microsoft, and Alphabet to become the #1 plumbing for public cloud infrastructure and applications, a market that Gartner projects will reach $400B in 2020. Amazon continues to benefit from its first-mover advantage, portfolio of web services, and developer mindshare, but we recently noticed that half of our roundtable attendees are using Azure in tandem with AWS (with some sprinkles of Google GCE usage, too). Just this last year, AWS has grown and beaten analyst expectations, leading the market, and is expected to close out the year with $17B+ in public cloud revenue. Microsoft is behind but catching up quickly, thanks to Satya Nadella’s remarkable transformation of the entire company and its focus on open source, choice, and usability. This has repainted what’s possible for enterprise customers, and Microsoft knows this segment better than anyone else thanks to its Office suite, depended on by so many companies worldwide. At Google, it’s a different story: it has the most sophisticated infrastructure but a limited understanding of the enterprise customer. It remains to be seen whether the company can develop a strong enterprise sales strategy, but its two biggest open source products, Kubernetes and TensorFlow, are powerful beachheads that could help it win this war in the long run.

Shifts & Standards

We’re in the middle of some profound shifts in enterprise infrastructure. I’m referring to the architectural designs inspired by warehouse-scale computing at web-scale companies like Google and Facebook, which have enabled trends like microservices and cloud native to go mainstream. Open source software is another shift in the new enterprise IT stack, and it goes way beyond replacing traditional product marketing and distribution models: it changes the game on how all companies develop, build, improve, monetize, engage, and partner. Traditionally, incumbents had a huge advantage in access and delivery to customers… but that’s changing as software becomes easier than ever to download in a few clicks. To see the companies and technologies that make up this landscape, be sure to check out the Cloud Native Computing Foundation’s Landscape (h/t Lenny Pruss).

Kubernetes

While Docker has become the standard for containers, with some great value streams flowing around it, Kubernetes has emerged as the de facto container orchestration platform, allowing larger organizations to make the potential of containers an operational reality. We spotted this early, thanks to our network, and wrote a blog post, which generated some criticism in the broader ecosystem around actual growth metrics. Listing out the data points or reasons why Kubernetes is the leader here would take forever (and after all, Kevin Casey has already done it), but to demonstrate how relevant and real it has become: Kubernetes was used by HBO to stream the season 7 premiere of Game of Thrones to millions of users. Will Amazon announce a managed Kubernetes offering at re:Invent? 100%.

Nevertheless, while some tech-forward enterprises are finding success with their k8s deployments, there are still challenges ahead for the rest of the Fortune 500, especially larger players with bigger IT operations. We frequently hear that it’s still extremely difficult to implement K8s and keep it up and running. Some of the distributions also fork way too many features and present lock-in risk (although a very recent agreement within the CNCF may change that). Players like Heptio, which is building a solid reputation through high-touch customer engagement, will be able to use that consultative approach to identify and build the core products enterprises actually need.

Solving Multi-Cloud

Standards around container format have been fought and won, and now orchestration is getting sorted as vendors shift resources to Kubernetes. So what’s the next largest opportunity? Solving multi-cloud. It’s top of mind for nearly all infrastructure executives, who want to develop and adopt these strategies as the technology improves. A multi-cloud strategy helps ensure that you’re not keeping all of your eggs in one basket: relying on a single cloud vendor presents a slew of problems, including vendor lock-in, risk of outages, and potential bandwidth issues. Being cloud agnostic, Kubernetes is most definitely an enabler for multi-cloud, but it will be up to startups like CoreOS and HashiCorp to offer enterprises a better path. CoreOS makes Kubernetes (and the best open source technologies) more enterprise-friendly and builds Automated Operations around it. HashiCorp’s products, Terraform and Vault in particular, are becoming go-to choices for defining and using data center infrastructure as code. Companies still early in their cloud journey should expect challenges with applications that were not architected for multiple clouds. Rest assured, the right level of abstraction between a company’s IT footprint and the underlying cloud APIs is evolving quickly through open source and commercial offerings.

Observability is a big problem (and opportunity)

Transparency is critical when dealing with distributed systems across clouds. Microservices multiply the number of resources that operators need to track, and tooling that provides visibility is more important than anything else. There’s a slew of new data across the stack that needs to be collected and tracked in order to debug failures that commonly occur in production. Observability not only opens the hood into the health and activity of applications across clouds and on-premise, but also encompasses enterprise features like monitoring, alerting, visualization, tracing, log aggregation, and analytics. On the flip side, it’s also required for securing these systems. There’s promising work happening on all of this now through open source projects like Prometheus and OpenTracing (inspired by Google’s Dapper paper), service mesh frameworks like Istio and Linkerd, and commercial solutions like LightStep, Netsil, and Honeycomb. It’s still a little early to say what will actually be subsumed by the broader Kubernetes ecosystem and what will manifest itself in commercial software offerings. We’re at risk of too many science projects out there that serve as cool technology without a business need… and it makes me a bit cautious about the viability of some of these tools.
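The core idea behind distributed tracing is simpler than the vendor landscape suggests: every hop of a request shares a trace identity, and parent/child span ids let you reassemble the request’s path after the fact. Here’s a toy sketch of what a tracer records (purely illustrative; this is not the OpenTracing API or any real client):

```python
# Toy illustration of the data a tracer records (NOT a real tracing client):
# every hop in a request shares a trace_id, while parent/child span ids
# let you reconstruct the request's path across microservices.
import time
import uuid

class Span:
    def __init__(self, operation, trace_id=None, parent_id=None):
        self.operation = operation
        self.trace_id = trace_id or uuid.uuid4().hex  # shared by all hops
        self.span_id = uuid.uuid4().hex               # unique to this hop
        self.parent_id = parent_id
        self.start = time.time()
        self.tags = {}

    def child(self, operation):
        # A downstream service continues the same trace with a new span.
        return Span(operation, trace_id=self.trace_id, parent_id=self.span_id)

    def finish(self):
        self.duration = time.time() - self.start
        return self

# One request passing through a gateway and then a database call:
root = Span("api-gateway")
db = root.child("user-db.query")
db.tags["db.statement"] = "SELECT * FROM users"
db.finish()
root.finish()
```

A real system adds the hard parts this sketch skips: propagating the trace context over the wire, sampling, and shipping spans to a collector, which is exactly where the open source projects and commercial vendors above compete.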

Other problems exist, though it remains to be seen whether they can support venture-scale businesses:

  • Identity / Access Management: a standard for authenticating and authorizing each workload in production brings security benefits to cloud applications. Scytale.io is championing SPIFFE and is making inroads towards adoption.
  • Data & Compliance: data sovereignty and regulations like GDPR create opportunities for new companies, and there’s been heavy interest in solutions like BigID and Integris, which find and protect PII in the cloud. Cluster Federation on Kubernetes has also matured and can be architected on AWS for hosting regional data.
  • Application Portability: could help every CIO hit their top stated priority of “moving to the cloud” by providing workflow and automation for managing applications in a way that’s conducive to company culture.
  • Smart Placement Logic / Predictive Scaling: allows applications running on clusters to be packed efficiently to improve resource utilization. HyperPilot and Magalix are two startups focused on this problem.

So what about “Serverless?”

Serverless refers to a new generation of PaaS offerings where the infrastructure provider handles most of the responsibility for incoming client requests, like capacity planning, task scheduling, and monitoring. Developers only need to prescribe the logic for processing those requests, allowing them to focus on building and running auto-scaling applications without the pain of provisioning or maintaining servers.
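That division of labor is easiest to see in code. A minimal sketch in the shape of an AWS Lambda Python handler; the event fields and response body here are invented for illustration, but the `handler(event, context)` entry-point shape is the real programming model:

```python
# Sketch of the serverless programming model: the developer writes only the
# request-handling logic below; provisioning, scaling, and scheduling are
# the platform's job. The event/response fields are illustrative.
import json

def handler(event, context=None):
    """Entry point the platform invokes once per incoming request."""
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Locally you can just call it; in production the platform does.
resp = handler({"name": "re:Invent"})
```

Everything that is *not* in this file, the fleet behind it, the scaling policy, the retry behavior, is exactly the surface area the serverless provider takes over, and also where the lock-in and debugging challenges discussed below come from.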

It could be our generation’s demand for instant results that has made serverless platforms the new thing, but as with any early trend, these technologies have challenges that limit their usability and applicability for production use cases today: slow performance, increased complexity, lock-in risk, difficulty deploying and debugging in a multi-cloud setup, and performance monitoring (to keep bills low).

Challenges aside, it’s still too early to draw lines in the sand, but I’m amazed by the efforts of the small but passionate community that is forming. Because it’s still new, serverless technology does not have a proper ecosystem of tools for developers, and no standards have yet sprouted around the right format for deployment. I’m actively looking for a standard to emerge here, but this could take some time. I also think there’s an opportunity for a platform play that allows developers to write, compose, and share functions more easily. Further, any platform that makes it easier for developers to add functionality or intelligence to existing applications can be successful, especially if it abstracts away complex deployment and monitoring responsibilities. This is what we saw when we invested in Algorithmia, which applies this concept to data science at every scale and across any cloud with its Serverless AI Layer.

Which reminds me… this post would not be complete without talking about AI, which, aside from being all the hype nowadays, will be a big part of Amazon’s keynote announcements.

AI (Infrastructure)

There’s a good chance that Amazon will announce a managed ML & AI platform… It’s top of mind for many of its customers, who need to provide access, tooling, and some peace of mind for their data scientists, engineers, and operators. Much of this requires a data pipeline and a production-grade system to manage DevOps on AI workloads. I’ll be waiting to see if Amazon can offer this as a platform with pay-as-you-go billing, elastic scaling, and speed. Anything less and it risks falling behind companies like Algorithmia, which is democratizing state-of-the-art AI by making it easier to create, share, and productionize machine learning models in a serverless fashion.

If last year’s announcements are any indication, Amazon will forgo the opportunity to solve real industry problems. Announcements like Glue (a cloud ETL service) show that Amazon isn’t interested in solving the larger problems at hand but instead wants to pressure traditional incumbents (like Oracle, SAP, and IBM) for their business. There are limitations to the traditional ETL approach, cloud-based or not. Large enterprises have gone through years of re-orgs, M&A, and technology waves, and as a result have multiple data silos with differing schemas, formats, and quality. Data variety is the biggest challenge stopping large companies from achieving analytic and operational breakthroughs, and moving ETL or MDM to the cloud won’t solve it. Database vets (and gods) Michael Stonebraker, Andy Palmer, and Ihab Ilyas are actively tackling this problem using probabilistic data unification at Tamr.

It’s also about time to see some automation and AI magic applied to workloads running on AWS to make them more efficient. Predictive scaling, for example, can be made better with machine learning. Aside from tooling and infrastructure, I’m most interested in startups focused on Vertical AI, a notion pioneered by Bradford Cross, CEO of Merlon Intelligence, an “AI for Compliance” startup that we’ve invested in. At their core, these startups use subject-matter expertise, state-of-the-art AI, and proprietary data to deliver their product’s core value proposition, typically in industries outside of technology that haven’t seen much innovation in recent years.
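The predictive-scaling idea is simple to sketch: instead of reacting to load after it arrives, forecast the next interval and size capacity ahead of demand. Here’s a deliberately trivial version using a linear trend over recent samples (the thresholds and helper names are made up; a production system would use a proper time-series model):

```python
# Hedged sketch of predictive scaling: extrapolate the next interval's load
# from recent samples, then size replicas before demand arrives. The
# requests-per-replica figure and minimum are illustrative assumptions.
def forecast_next(samples):
    """Extrapolate one step ahead using the average recent slope."""
    if len(samples) < 2:
        return samples[-1]
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    return samples[-1] + slope

def replicas_needed(samples, rps_per_replica=100, minimum=2):
    predicted = forecast_next(samples)
    # Ceiling division, with a floor so we never scale to zero.
    return max(minimum, -(-int(predicted) // rps_per_replica))

# Load has been climbing 100 rps per tick; scale out *before* the next one.
n = replicas_needed([400, 500, 600, 700])  # forecasts 800 rps -> 8 replicas
```

Reactive autoscalers lag demand by at least one measurement interval; the whole pitch of ML-driven scaling is shrinking that lag, and better models than a straight line are exactly where the startups play.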

Other areas I’m tracking that are still a little early but promising:

  • Authorized Dataset Management & Ops Platforms, which allow AI systems to train on datasets owned internally across lines of businesses as well as different organizations without compromising confidentiality.
  • Model Explainability, especially for deep learning, which yields impressive results but is notoriously difficult to debug and explain; explainability is a necessity for regulated businesses.
  • Security for Mission Critical AI, from banking to any machine performing autonomous driving, surgery or workplace automation. Are security enclaves enough?
  • The Edge for AI, becoming increasingly important with the advent of self driving cars, robotics and personalization. Some functionality that’s latency sensitive can be offloaded to the edge device (like a mobile phone) while training across devices and more sophisticated modeling needs to be done in the datacenter.

Speaking of…

The Edge

The edge represents a move from centralized computing to a more distributed model, and Silicon Valley believes that this is where the puck is headed. If this sounds interesting, be sure to check out Peter Levine’s talk on The End of Cloud Computing. While all of this makes sense and is exciting, I don’t believe the timing is there (yet), though Amazon and Microsoft are beginning to develop solutions. Today, the right thing to do at the enterprise level is to use open source, hire great talent, evolve your workforce (both the old army and the new), and keep going. There is no one-size-fits-all solution to your infrastructure needs; you keep working at it until you reach a more open, easy, and agile setup that is conducive to the way your business operates. In turn, new products and capabilities will unfold, and you’ll be ready to play in any technology shift.

This piece was a reflection on the past year ahead of re:Invent… Our F500 network helps us suss out shiny toys from must-have technology that solves a business problem, and I look at infrastructure technology through that lens. If there’s something I missed (or misunderstood), please comment and let me know. If you’re a founder with thoughts on any of these topics, I’d love to talk to you.

Note: Algorithmia, CoreOS, Merlon Intelligence and Tamr are Work-Bench Portfolio companies.

Last but not least, a special thanks to Jonathan Lehr, Jessica Lin, Michael Yamnitsky, and Kelley Mak for a Sunday-night set of eyes on this post. These views would not have been possible without our fantastic roundtable of Next Gen Infrastructure leaders in New York.
