Five Frequently Asked Questions about Serverless

This post owes a great debt to Amiram Shachar’s article, “The hidden cost of serverless”, and Mike Roberts’ article, “Serverless Architectures”.

We’re all really excited about serverless, and with good reason: it’s a pretty interesting alternative to traditional infrastructure. It’s also relatively easy to get up and running quickly, thanks to specialized toolkits developed by the open source community and, of course, frameworks provided by vendors.

There are many advantages to using serverless. The two I find most compelling are on-demand and near instant scalability, and only paying for what you need, when you need it.

With all the excitement, I think we should strive to exercise pragmatism and make sure we’re making the right long-term choice for our businesses and engineering teams. What follows are the five questions I’ve encountered most frequently regarding the move to serverless.

Question #1: Do I need SRE or DevOps if I move to serverless?

In some very specific and rare circumstances, you won’t need a dedicated SRE or DevOps team to run your serverless environment. However, I don’t think it’s safe to assume that you will never need the skills and abilities of a Site Reliability Engineer on your team if you’re moving to this new architecture.

It would be a mistake to assume that you’ll never need any other piece of infrastructure to run your application, or that your cloud provider’s serverless environment will cover all of your needs for the life cycle of your application. SREs and DevOps teams will still manage the areas of your infrastructure that are not serverless functions.

In other words, we need to ensure that we’re not assuming SREs are just the people on a team who stand up servers.

A good definition of SRE comes from Ben Treynor, who called it,

“what happens when a software engineer is tasked with what used to be called operations.”

Site Reliability Engineers have very specific skills: much of their talent lies in being able to predict failure points in your infrastructure and build solutions to remediate them, all while keeping the lights on and your application environment humming. A good SRE will be able to tell when you’re in a bad spot, and address the issue before it becomes a production incident.

Consider, too, that most serverless deployments need other infrastructure to run. You’ll need API gateways to handle and route requests, IP addresses and DNS entries to actually serve the whole bundle to the wider world, and, in most cases, serverless doesn’t eliminate the need for persistence layers like S3 or cloud-hosted databases.

Question #2: Does serverless work with any application?

Serverless environments are designed to be limited. They’re great for event-based architectures, for example. But if you’re running an application that needs intense CPU cycles to do its job, or if your app needs guaranteed low latency, serverless may not be the right fit for you.
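The event-based fit is easiest to see in code. Below is a minimal sketch, assuming an AWS-Lambda-style Python handler reacting to S3 “object created” notifications; the event shape follows the S3 notification format, and the function and key names are illustrative.

```python
import json

def handle_s3_upload(event, context=None):
    """Minimal Lambda-style handler: collects the keys of newly created
    objects from an S3 notification event. Illustrative, not production code."""
    keys = [
        record["s3"]["object"]["key"]
        for record in event.get("Records", [])
        if record.get("eventName", "").startswith("ObjectCreated")
    ]
    return {"statusCode": 200, "body": json.dumps({"processed": keys})}
```

The platform invokes the function only when an upload actually happens, which is exactly the shape of work serverless prices and scales well.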

I’ll echo a sentiment I’ve heard many times: serverless is best adopted right at the beginning, e.g. in a greenfield project. All of your functions need to fall within the constraints imposed by the environment, which means your code architecture needs to account for these requirements from the outset of development. In other words, the constraints of a serverless environment need to be a primary concern, and your coding practices need to follow accordingly.

Migrating a brownfield or legacy application to serverless without a wholesale rewrite seems nearly impossible. The requirement to break work into function-based units just doesn’t map well to applications that were not written serverless-first.

As an aside, I am curious to see which apps will eventually outgrow or migrate away from serverless offerings. We’ve seen similar trends in the move towards microservice architectures, and the eventual regression back to monoliths. The takeaway is that you might have to abandon a serverless implementation if its constraints restrict the application’s ability to grow and remain maintainable. And that regression might also come with high costs.

Question #3: When do I know my team is ready to embrace serverless?

We have to keep in mind that any shift in architecture or approach also needs to be met with a restructuring of your engineering processes. A drastic shift in architecture might also require a change in how you organize your engineering teams, which means there probably isn’t a single optimal moment to introduce serverless.

For example, if your team is currently building monolithic Rails applications on Heroku, the move to serverless is going to take considerable effort. Code will need to be rewritten to pivot to functions first. Dependencies and libraries are packaged completely differently in a serverless architecture, and integration testing changes significantly when your integration point is an event hook in a cloud-hosted function.
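To make the testing point concrete, here is a sketch assuming an API-Gateway-style proxy event: the “integration test” constructs the payload the cloud would deliver and invokes the function directly, with no HTTP server in the loop. All names here are illustrative.

```python
import json

def greet(event, context=None):
    # Handler in the API Gateway proxy style: input arrives as an event
    # dict rather than an HTTP request object.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {"statusCode": 200, "body": json.dumps({"greeting": f"hello {name}"})}

def test_greet():
    # The integration point is the event shape, so the test builds one
    # by hand instead of issuing an HTTP request.
    event = {"queryStringParameters": {"name": "serverless"}}
    response = greet(event)
    assert response["statusCode"] == 200
    assert json.loads(response["body"])["greeting"] == "hello serverless"
```

The upside is that such tests are fast and hermetic; the downside is that they are only as faithful as your copy of the event format.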

On many monolithic teams, everyone contributes to the same code base. This might have to shift in a serverless world, where application concerns are broken up into standalone projects. Or, if your team chooses to embrace the monorepo approach, there will still need to be a reorientation of who is responsible for which piece of the app, and it may not be a one-to-one pivot from your old code contribution practices.

Consider the amount of effort it would take to pivot a traditional web application from an MVC architecture to FaaS. The View layer, if it hasn’t been broken out of the monolith already, needs to be extracted into its own application. The Controller layer is probably where the most one-to-one portability could happen, but that’s an optimistic statement: monolithic controllers are notorious for their complexity and mixing of concerns. In the Model layer, things become even dicier: where do these entities live in FaaS? If your engineers are used to thinking of architecture in the MVC way, the jump to serverless is going to include the overhead of rethinking old patterns and replacing them with a completely different mindset.
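As a sketch of that pivot, here is a controller-style action reworked as a standalone function. Everything here is an assumption for illustration: the “before” is a generic in-process controller, and an in-memory dict stands in for a provider-hosted database.

```python
import json

# Before (inside the monolith, shown as a comment, Flask/Rails-style):
#   def show_order(order_id):
#       order = Order.query.get(order_id)            # Model lives in-process
#       return render_template("order.html", order)  # View rendered in-process
#
# After: the same concern as a FaaS handler. The View is served elsewhere,
# and the Model is reached over the network -- here a dict merely stands in
# for a hosted database.
HOSTED_DB = {"42": {"id": "42", "status": "shipped"}}  # stand-in for a real store

def get_order(event, context=None):
    order_id = event["pathParameters"]["id"]
    order = HOSTED_DB.get(order_id)
    if order is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(order)}
```

Notice what disappeared: no template rendering, no in-process ORM, just an event in and JSON out. Each of those removals is work someone has to do during the migration.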

If your team is on a microservice architecture today, you can’t assume that a move to serverless will take less work. While the concepts and mental models will be familiar, code still needs to be reoriented to work in FaaS, and I’ll wager much of your team structure will need to change as a result.

I think the best approach for existing applications is to introduce serverless functions slowly, possibly as orchestration or “glue code” that spares you the overhead of standing up yet another server to host these tools. Eventually, your team’s familiarity with the practice will grow, and you’ll be able to pragmatically weigh the costs and benefits as you introduce more serverless capabilities into your environment. This approach also allows your team to update their practices and processes to better suit the new architecture.
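As a sketch of what such glue code might look like (the webhook URL and event shape are assumptions for illustration), this function relays a storage notification to an existing internal service, replacing a small always-on server:

```python
import json
import urllib.request

WEBHOOK_URL = "https://internal.example.com/hooks/new-upload"  # hypothetical endpoint

def build_payload(event):
    """Extract the uploaded object keys from an S3-style notification."""
    return {"keys": [r["s3"]["object"]["key"] for r in event.get("Records", [])]}

def _post_json(url, payload):
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

def notify_on_upload(event, context=None, post=_post_json):
    # `post` is injectable so the function can be exercised locally
    # without a network call.
    return post(WEBHOOK_URL, build_payload(event))
```

Glue like this is low-risk precisely because it sits at the edge of your system: if the experiment fails, nothing in the core application has to be unwound.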

Question #4: Does serverless reduce vendor lock-in?

Every major cloud provider has a serverless offering, which, to some, might speak to its portability and the potential to move to another cloud provider should one become prohibitively expensive. But let’s revisit a limitation of serverless, namely, application state.

FaaS environments limit memory and local disk storage. What’s more, function-based state is ephemeral: it doesn’t stick around when the function spins down after doing its work. This means you need some other way to store state, at least if your application requires it. If your application is truly stateless, this won’t be a concern; with little effort, you’d be able to move your function layer to any of the major providers.
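A sketch of the problem, assuming a Python handler: anything kept in the function’s own memory is lost when the instance is recycled, so durable state has to be written to an external service. Here a module-level dict merely stands in for that external store; in a real deployment you would call the provider’s SDK.

```python
# The dict below *pretends* to be an external store reachable over the
# network (S3, a hosted database). A real FaaS instance gets recycled,
# so an actual in-memory dict would silently reset between invocations --
# which is the whole point of externalizing state.
EXTERNAL_STORE = {}  # stand-in for a provider-hosted persistence layer

def count_visits(event, context=None, store=EXTERNAL_STORE):
    user = event["user"]
    store[user] = store.get(user, 0) + 1  # in production: a network write
    return {"user": user, "visits": store[user]}
```

The portability question then becomes: how easy is it to swap what `store` points at? The function logic moves easily; the persistence layer behind it usually doesn’t.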

I haven’t seen many applications that are purely stateless, so you’re going to be beholden to your vendor to provide a persistence layer for your app. Whether that’s S3, Google Cloud Storage, or a provider-hosted database, state and the need for persistence are usually the status quo.

You incur costs when you transfer data from one provider to another. In the case of storage like S3, you’re bound by data transfer pricing, which is usually documented by the provider. Unless you have a trivial amount of data, which is rare in production systems, transferring data between providers can become very expensive.

S3 Request Costs provided by AWS
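A back-of-envelope calculation makes the scale clear. The $0.09/GB rate below is only an assumption for illustration; check your provider’s current egress pricing.

```python
def egress_cost_usd(terabytes, rate_per_gb=0.09):
    """Rough egress cost: convert TB to GB and apply a flat per-GB rate.
    The default rate is illustrative, not a quoted price."""
    return terabytes * 1024 * rate_per_gb

# Moving 50 TB at the assumed $0.09/GB:
print(round(egress_cost_usd(50), 2))  # prints 4608.0
```

Several thousand dollars just to move the bytes, before anyone has written a line of migration code.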

It’s also worth noting that any move between providers will not be automatic: you will need engineering personnel in place to facilitate the migration. Whether that means writing ETL-style scripts to move database instances or orchestrating S3 data transfers, these costs exist whether you run the migration in-house or outsource it to a consultancy.

Question #5: Will serverless reduce my cloud spend?

The classic consultant’s answer to this question is, “it depends”. The excellent post by Amiram Shachar goes into greater detail about the underlying costs of running on serverless infrastructure. The takeaway is that you have to consider the cost of your entire infrastructure, not just the functions themselves. API gateways, logs, alerts, DNS, IP addresses, and, yes, storage all add up on your infrastructure bill.

A well-thought-out, well-planned serverless deployment can save you money, depending on your needs. But as I’ve illustrated above, there are underlying costs in retooling engineering practices and teams, and real engineering effort in moving to serverless, that need to be factored into any conversation about cost savings.