Capturing Lightning: Moving to a Microservices and Serverless World
Three and a half years ago I wrote a blog post about a novel open source project called Docker that was starting to make waves in the OSS infrastructure world.
The world has changed since 2014. San Francisco median rent prices are falling. Trap music has overtaken electro-house in pop music. And microservices platforms like Docker, which were once considered “rocket science projects” even in the geeky arcane world of infrastructure technology, have become standard form factors for computing in Silicon Valley.
However, unlike the bro-romper, microservices are definitely not a Silicon Valley oddity. With large, non-Silicon Valley enterprises such as JP Morgan Chase and Walmart openly embracing microservices architectures in production environments (and entrusting tens of billions of dollars annually to those architectures), it is becoming plainly apparent that microservices and even more granular styles of computing architecture are becoming the global norm in infrastructure technology.
This movement towards increasingly atomic units of virtual computing is being propelled by some extremely attractive opportunities in terms of performance and economics for computing. But it is not without consequences, and moving to a microservices and serverless world poses a unique set of challenges given our current conventions around infrastructure and software development.
We’re Not in Kansas Anymore
Before we jump into looking at how the disruption of microservices platforms changes the world of IT, it’s probably important to revisit just exactly what microservices and serverless platforms are as well as where we’re coming from as a whole in virtual computing.
Most virtual computing form factors today focus on virtual machines, logical pieces of software that capitalize on the spare computing power of a server to run multiple full operating systems on a single physical machine. Powered by a software hypervisor that brokers access to that machine’s computing resources (memory, networking, processor time, etc.), VMs allow you to write and run software that takes advantage of existing physical resources more efficiently and effectively do more with the same, a process we tend to call virtualization.
Virtual machines have been an extremely disruptive force in computing and even the global economy over the last decade. Combined, server virtualization and desktop virtualization have reached an astounding $30B+ in global market size. For comparison, this is nearly double the amount of money spent every year on the global coffee industry with over triple the growth rate.
But a quarter of this massive market has reached, as Gartner calls it, a sense of “maturity.” In 2016, for the first time, the growth rate of virtualization focused solely on performance computing and traditional application server workloads began to stall. Says Gartner’s recent report on the subject:
“New software licenses [for x86 virtual servers] have declined for the first time since this market became mainstream more than a decade ago.”
This data provokes an interesting thought exercise around whether we’ve simply reached the carrying capacity of the market or if something else is afoot.
Indeed the data seems to point to the latter: something big has begun drawing people away from on-prem, self-managed environments where virtual machines have thrived.
That something is the public cloud.
A Not-So Invisible Hand
While license counts (and accordingly Gartner and Forrester’s unit of measurement for capturing server virtualization) have measurably declined in the last year, the public cloud market has continued to roar. In 2017, the market size for cloud services platforms like AWS and Azure has grown to an astounding $246 billion globally, or roughly 85% of the GDP of Singapore in 2015.
Much of this has to do with the mechanics of VMware and other server virtualization vendors’ license models. The popularity of using AWS or Azure as a primary computing platform (or as an alternative / shared computing platform à la a hybrid cloud) has meant that there are fewer companies directly purchasing licenses from vendors like VMware or their resellers. This certainly changes how one measures this market in an algebraic sense, but it also changes the dynamic of how developers and infrastructure operators “see” virtualization.
In a world where Amazon and Microsoft obviate the need for me to care about my servers, as a developer I am increasingly incentivized to care less about the form factor that my computing platform — and thereby its hypervisor that hosts that platform — is delivered on.
This lack of a preference, need, or nostalgia for a particular computing form factor means that the only thing a developer really needs to care about is solving problems with code. And some of those problems have optimal sub-problems.
Divide and Conquer
One of my favorite quotes about computer science is that “computer science is as much about computers as astronomy is about telescopes.”
Indeed while the most common educational path into software engineering is CS, most undergraduate educations (and especially their graduate counterparts) focus instead on the mathematics of solving problems rather than the technical underpinnings of a computer’s physical components.
The core of how we solve problems is the algorithm. This term, which gets thrown around Silicon Valley almost as much as the phrase “crushing it”, refers to a mathematically analyzable solution. Not all algorithms are the same. When the input for an algorithm gets excessively large, we have to start worrying about how long it takes for a computer to actually finish analyzing and solving the problem.
This is plainly apparent in areas like cryptography (where we purposely make things hard to secure the secrecy of data) or large scale data analysis. But even in seemingly “non-geeky” realms, algorithm optimization can make or break one’s ability to use a computer.
Sometimes optimized algorithms seem weird or, in the case of sorting a list of integers, much more complex than other simpler but less efficient systems.
For example, in the above graphic we analyze the real-time running speed of a set of algorithms sorting a list of numbers. On the far left we have insertion sort, an algorithm that is very similar to bubble sort in both its efficiency and its approach. On the far right we have the well-named quicksort, a much more efficient algorithm that takes a wildly different approach.
Selection sort and bubble sort make much more intuitive sense as ways of sorting a list smallest to largest: you look through each number and, at each step, re-arrange each previous number in order. While they differ in how they pull this off, they’re very akin to what most folks would normally do when given something like a set of cards and told to arrange them in order.
Think of how most people arrange cards in their hand when they play poker or bridge. This is basically a form of selection sort.
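To make the intuitive approach concrete, here is a minimal Python sketch of selection sort. The function name is my own and it returns a sorted copy rather than sorting in place, so treat it as an illustration rather than a reference implementation.

```python
def selection_sort(values):
    """Return a new list sorted smallest to largest using selection sort."""
    result = list(values)  # copy so the caller's list is left untouched
    for i in range(len(result)):
        # Find the smallest remaining item in the unsorted tail.
        smallest = i
        for j in range(i + 1, len(result)):
            if result[j] < result[smallest]:
                smallest = j
        # Move it to the end of the sorted prefix.
        result[i], result[smallest] = result[smallest], result[i]
    return result
```

The nested loops are the giveaway: every pass rescans what is left of the list, so the work grows roughly with the square of the input size.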
Quicksort is different. In this algorithm, the computer selects a “pivot point” in the middle of the set being analyzed and moves elements that are larger to one side and smaller to the other. It does so recursively (as in over and over until it hits a base case where you can’t bisect the analyzed group) rather than iterating through the list linearly like bubble sort and insertion sort. It’s also a little weird to conceptualize and probably unintuitive compared to how you might normally sort a list of numbers or arrange a deck of cards. But it’s a lot faster.
Quicksort is an example of a divide and conquer algorithm. In this case, you divide up the problem into subproblems, solve each one recursively, then combine the results to work your way back to the solution. There are a host of other algorithms that take this approach.
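A minimal Python sketch of quicksort shows the divide and conquer shape. It assumes a middle-element pivot and builds new lists instead of partitioning in place, which trades memory for readability; production implementations usually differ on both points.

```python
def quicksort(values):
    """Return a new list sorted smallest to largest using quicksort."""
    # Base case: nothing left to bisect.
    if len(values) <= 1:
        return list(values)
    pivot = values[len(values) // 2]  # middle element as the pivot point
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    # Each side is an independent subproblem, solved recursively.
    return quicksort(smaller) + equal + quicksort(larger)
```

Note that the two recursive calls never need to talk to each other, which is what makes this style of algorithm easy to hand off to separate workers.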
Similar to but distinct from divide and conquer is another approach called dynamic programming. Dynamic Programming (or “DP” for geeks like me who spent more time in TopCoder or an ACM match than rightfully drinking at a tailgate in college) is an algorithmic optimization technique that takes a similar approach by reducing complex problems to a series of subproblems.
While DP looks a lot like divide and conquer, it’s different in a few key ways. The subproblems DP looks to solve are not independent: they overlap, so solving one subproblem is linked directly to solving another. Additionally, you hold state in a DP solution, “carrying” the information you store at each step up with you as you scale up a tree of subproblems rather than pulling it down from the top.
Arcane as DP and divide and conquer algorithms may seem (and there is a deep and messy set of math lying far underneath this that I’m desperately trying to avoid bringing into play), breaking a problem down into subproblems and using one of these two approaches is critical to understanding how modern software engineers trained in CS work on real-world problems with big data.
For example, the knapsack problem is a classic use case for dynamic programming. The problem focuses on attempting to build the best possible collection of items given the value of those items, the cost of carrying those items (a “weight”), and a maximum capacity of the container.
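Here is a minimal sketch of the textbook 0/1 variant of knapsack in Python, where each item is either taken once or left behind. It assumes integer weights, and the function name and the (value, weight) tuple layout are my own choices for illustration.

```python
def knapsack(items, capacity):
    """Return the best total value that fits within `capacity`.

    `items` is a list of (value, weight) pairs with non-negative integer weights.
    """
    # best[c] holds the best value achievable with total weight <= c.
    best = [0] * (capacity + 1)
    for value, weight in items:
        # Walk capacities downward so each item is counted at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]
```

The `best` table is the state being carried along: every new item reuses the answers already stored for smaller capacities instead of recomputing them.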
Classic examples of using knapsack in the real world are in shipping. Carriers like FedEx use more complex knapsack algorithms to determine what packages to put in their planes, given the shipping rate paid for the transport. But knapsack has other use cases beyond the obvious shipping and transport.
Given that it’s a means of solving other constrained optimization problems, it can be used in a wide swath of areas in industrial organization and management economics: from quantitatively determining what’s the optimal way to run a factory given the costs of labor and capital, to even running a war machine.
In fact, the math used in knapsack was even repurposed by mathematicians in the Allied forces during WW2 to help determine how many planes and tanks the United States could ship for Lend Lease and ultimately need to win the war.
Hacking the Gibson (with Containers and Serverless)
Whether crafting divide and conquer or DP solutions, the focus on reducing a large problem to a set of optimal subproblems has a potentially dramatic impact on solving complex computing problems.
This poses a unique opportunity for containerized architectures and (increasingly) serverless architectures that take virtualization to the next level by focusing on an isolated task rather than importing an entire computer.
Let’s step back for a moment. A container is another abstraction in virtualization that uses shared computing resources to run a single or small set of applications. Rather than full VMs, which carry with them all of the possibilities but also all of the resource cost of a full operating system, containers are lean and singularly focused on that single application or task.
For enterprising developers who remember their computer science, a container poses a unique form factor to run the subproblems used in DP or divide and conquer style problems. Rather than occupying full precious processing threads or draining a full VM of its resources, containers could be used to dynamically spin up and spin down resources in solving a subproblem — echoing the solution back to the original process on its VM with minimal draws from memory or the machine’s processor.
Effectively for problems that can elegantly be solved with a DP or divide and conquer approach, a container allows you to do more with less.
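As a rough local analogy, the fan-out pattern looks like the Python sketch below. A thread pool stands in for a fleet of short-lived containers purely so the example runs anywhere; the function names and the default of four workers are assumptions of mine, and a real deployment would dispatch each chunk through an orchestrator instead.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge


def solve_subproblem(chunk):
    """Stand-in for the work one short-lived container would do."""
    return sorted(chunk)


def fan_out_sort(values, workers=4):
    """Split a list into chunks, sort each in a worker, then merge the results."""
    if not values:
        return []
    size = max(1, -(-len(values) // workers))  # ceiling division
    chunks = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each worker echoes its partial answer back to the caller.
        sorted_chunks = list(pool.map(solve_subproblem, chunks))
    # Combine step: merge the already-sorted pieces.
    return list(merge(*sorted_chunks))
```

The caller only holds the chunk boundaries and the merged answer; everything in between can be spun up and torn down on demand.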
The application-focused nature of containers has certainly been talked about ad nauseam in tech. But when we take a more problem-focused analysis of when and how to use containers based on their applicability to certain problems, the reason why they’ve become adopted so quickly in the enterprise starts to become apparent.
Yes, it’s exciting to use a container to run a mirrored copy of a web server. But you know what’s really exciting? When you use containers to help sequence genomes by solving optimal subproblems in calculating the Needleman-Wunsch score of two combined DNA strands. Or how about solving FedEx and other carriers’ knapsack problems when they load planes and cargo ships with goods in an automated fashion? Or how about figuring out purchasing orders for inventory for similar types of constrained optimization in an automated fashion at scale for someone like Costco or Walmart?
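For the curious, a Needleman-Wunsch alignment score can be sketched in a few lines of Python. The scoring values here (+1 for a match, -1 for a mismatch, -1 for a gap) are a common teaching default that I am assuming, not something any particular genomics pipeline prescribes, and the function name is my own.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences `a` and `b`."""
    # prev[j] is the best score aligning the first i-1 symbols of `a`
    # with the first j symbols of `b`; row 0 is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + pair,  # align a[i-1] with b[j-1]
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            ))
        prev = curr
    return prev[-1]
```

Each cell depends only on its three neighbors, which is why only two rows are kept in memory at a time.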
For many developers this is the real value of a container: it’s a compute form factor that costs fewer resources than a full VM and is logically built to handle sub-problems that I may face when using a DP or divide and conquer-style approach to solving problems. Many of these problems are challenges seen in large enterprises, and containers have become a great way of doing more with automation with the same (or less) IT infrastructure.
Serverless is similarly another way to support these kinds of solutions but with an added kick: I don’t even need the container. Even leaner than a Docker container or a Kubernetes pod, a serverless architecture relies on a process-specific form factor where the computer needs to go out and accomplish a single task rather than host an application and its dependencies. The classic example of serverless is AWS Lambda, an interface where I can simply export a set of code to run and let an intelligent backend infrastructure “spit out” the answer before moving on.
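In Python, a Lambda function boils down to a single handler that takes an event and a context object. The sketch below assumes a hypothetical payload with a "numbers" field and returns a response shaped like an HTTP reply; both the field name and the response layout are illustrative choices rather than requirements.

```python
import json


def lambda_handler(event, context):
    """Solve one small subproblem per invocation and return the answer."""
    # "numbers" is a hypothetical field in the incoming event payload.
    numbers = event.get("numbers", [])
    return {
        "statusCode": 200,
        "body": json.dumps({"sorted": sorted(numbers)}),
    }
```

There is no server setup, no container image, and no state in the function itself: the platform invokes the handler, collects the return value, and moves on.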
Serverless is, in some ways, the nirvana of divide and conquer style solutions. For developers who don’t need to fine-tune every aspect of their containers, it’s an elegant means of dividing up subproblems and letting someone else handle the whole messy business of infrastructure. The technology is still in its infancy, however, and beyond its use with cloud providers, on-prem, easy-to-manage serverless platforms are still very much on the horizon.
But while serverless and container infrastructures provide a new set of benefits, they also provide a unique set of challenges to developers and dev-ops alike.
There Is No Such Thing as a Free Lunch
For one, containers and (especially) serverless infrastructures are much more complex to manage than their traditional VM or bare metal counterparts. Both are ephemeral: they may live for a short period of time and only persist for as long as needed.
You don’t manage a container as much as you manage applications’ access to the resources needed to create and run certain types of containers, and this requires significant changes to SPOG (Single Pane of Glass) IT management platforms that focus on more static resources like a physical or virtual server.
The ephemeral and lean nature of both also makes writing applications in a traditional fashion challenging. For example, Kubernetes containers don’t have a traditional persistent filesystem contained within their specific pods (the container form factor within Kubernetes). So persistent data used by the pod needs to sit somewhere else outside the container, either in a database or another repository visible to the container.
Containers don’t have a ton of intelligence or even know they’re containers either, so in infrastructures like Kubernetes you need to layer containers with other helpers like init containers or FlexVolumes that help bootstrap the container with the information it needs to get started and working on a subproblem.
This gets much more complicated when we think about serverless. Again, with serverless architectures, the end goal is to simply not think about infrastructure. In that case, if the process you want to run needs access to resources, you have to figure out some means of dynamically addressing those external resources without the aid of “helper” serverless processes. Much like containers, you need to draw on resources outside of the solution, but with the even leaner nature of a serverless process this can prove extremely challenging in complex use cases.
Ultimately, the surging rise of containers and serverless architectures shows how we can solve a massive swath of computing problems with a unique, lean architecture. But just as we can do more with less from a compute point of view, the dependency of both architectures on external static resources means that instead of “everything moving onto a container” we’re going to enter into a much more heterogeneous world in terms of computing form factors for virtualization.
To put it plainly: containers and serverless won’t take away traditional VMs or servers. Containers and the like will instead change how you use VMs and servers: less as places to monolithically run data, more as a place to manage the orchestration of containers and serverless processes as well as hold data and state for both.