My Database: Is it 🤷‍♂️serverless🤷‍♂️? An opinionated Checklist

When building a 🤷‍♂️serverless🤷‍♂️ application, ephemeral code execution is only half the story: the application likely needs one or more data stores, and those usually live on continuously running servers.

It takes advanced knowledge of systems and database administration to properly scale, replicate and/or shard RDBMS servers. A 🤷‍♂️serverless🤷‍♂️ backend introduces even more challenges into that domain:

  1. Connection pooling across unlimited 🤷‍♂️serverless🤷‍♂️ functions is not possible.
  2. Most 🤷‍♂️serverless🤷‍♂️ platforms give no guarantees about which network/subnet your function will eventually run in, but database servers often operate in a strict networking environment. Granting the function access can be a challenge.
  3. While the 🤷‍♂️serverless🤷‍♂️ function scales instantly, globally, and automatically, most database servers do not. Adding new replicas/shards to your cluster takes a long time, locality is key, and provisioning is usually done manually (although much could be automated). Not a good fit, right? 🤔

There are similar issues with auto-scaling microservice systems, although to a lesser extent. One (and by that I mean me 🙌) usually starts investing in humongous database servers to basically delay the issue. To quote my Twitter self:

Don’t assume I live by my tweets :P

Here is a checklist telling you whether your database is 🙅‍♂️serverfull🙅‍♂️:

  1. Pay per use not an option? (Usually includes storage size, network throughput and # of requests) [y/n]
  2. Do you tend to overspend on resources? ("Let’s add one more node to get at least to next quarter without any issues. Again.") [y/n]
  3. Does it require high provisioning and maintenance time? Are you thinking of hiring a “Database Administrator”? [y/n/I am a DB Admin, lol]
  4. Is scaling across multiple data centers in different world regions hard, or does it introduce new limitations? [y/n]

If your answer is yes to all 4 questions, congrats: you’re running Oracle, MySQL, or some derivative.

You could just negate the list and have yourself a 🤷‍♂️serverless🤷‍♂️ Database checklist, but it would also pass a globally replicated .txt file on S3. We still require a subset of the features given by traditional databases, depending on the use case: Analytics and Monitoring, ACID Consistency, High Security, Relations, Schemafull/lessness — to name the most prominent.

To deliver on the title-promise:

  1. Supports true pay per use? [y/n]
  2. Provisioning, scaling, backups, and other maintenance fully automated? [y/n]
  3. Offers global locality on-demand? [y/n]
  4. Networking is not an issue over multiple data centers / regions? [y/n]
  5. Provides an analytics and/or monitoring interface to study your current load? (e.g. query execution times) [y/n]
  6. A database model that fits your domain problem? (ACID Transactions, Schema(less), Relations, …) [y/n]

6 x yes means the thing you’re looking at is truly 🤷‍♂️serverless🤷‍♂️ and probably a database. I would say you’re good to go.
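If you prefer your checklists executable, here is a minimal sketch of the six questions as a Python dataclass. The field names are my own shorthand, and the example answers for the ".txt file on S3" case are illustrative assumptions, not measurements.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ServerlessChecklist:
    """One boolean per checklist question (field names are hypothetical shorthand)."""

    pay_per_use: bool
    fully_automated_ops: bool
    global_locality_on_demand: bool
    networking_is_a_non_issue: bool
    analytics_or_monitoring: bool
    model_fits_domain: bool

    def score(self) -> int:
        # Count the "yes" answers.
        return sum(bool(getattr(self, f.name)) for f in fields(self))

    def is_serverless_database(self) -> bool:
        # Only 6 x yes passes.
        return self.score() == len(fields(self))


# Assumed answers for a globally replicated .txt file on S3:
# fine on the first four questions, but no query analytics and no real data model.
txt_on_s3 = ServerlessChecklist(True, True, True, True, False, False)
print(txt_on_s3.score(), txt_on_s3.is_serverless_database())
```

The all-or-nothing `is_serverless_database` mirrors the "6 x yes" rule; the raw score is there for things that only get part of the way.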


The earliest and most popular thing to be called a 🤷‍♂️serverless🤷‍♂️ database was probably AWS’s DynamoDB, and as Forrest Brazeal correctly pointed out in Nov 2017: DynamoDB isn’t for everyone.

It has changed and gained new features since then, so here is the checklist filled out for DynamoDB:

1. Supports true pay per use?

Since on-demand pricing was introduced: yes.

2. Provisioning, scaling, backups, and other maintenance fully automated?

Yes!

3. Offers global locality on-demand?

Say “Hi!” to Global Tables: Yes.

4. Networking is not an issue over multiple data centers / regions?

More or less yes, especially when you vendor lock. The Serverless Framework even lets you set up DynamoDB tables in your serverless.yml, and running with AWS Lambda never caused issues for me. Running on other serverless platforms: 🤷‍♂️.

5. Provides an analytics and/or monitoring interface to study your current load? (e.g. query execution times)

Not really. DynamoDB does not provide much monitoring insight. There are some interesting CloudWatch metrics, but it is still a long way from the level of monitoring we are looking for. However, according to this blog post by Segment, AWS can provide deeper insights on request. 🙆‍♂️ That’s something.

6. A database model that fits your domain problem? (ACID Transactions, Schema(less), Relations, …)

There are a few use cases where it may fit. Up to you.
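To make the "yes" on question 1 a bit more concrete, here is a rough sketch of asking for an on-demand table with boto3. The table name, key name, and both helper functions are made up for illustration; the relevant part is `BillingMode="PAY_PER_REQUEST"`, which is how `create_table` selects on-demand capacity instead of provisioned throughput.

```python
def on_demand_table_params(table_name: str, hash_key: str) -> dict:
    """Build keyword arguments for boto3's DynamoDB `create_table` call.

    Hypothetical helper: assumes a single string hash key and nothing else.
    """
    return {
        "TableName": table_name,
        "AttributeDefinitions": [
            {"AttributeName": hash_key, "AttributeType": "S"},
        ],
        "KeySchema": [
            {"AttributeName": hash_key, "KeyType": "HASH"},
        ],
        # On-demand capacity: pay per request, so no ProvisionedThroughput is set.
        "BillingMode": "PAY_PER_REQUEST",
    }


def create_on_demand_table(table_name: str, hash_key: str):
    # Imported lazily so the parameter builder works without boto3 installed.
    import boto3

    client = boto3.client("dynamodb")
    return client.create_table(**on_demand_table_params(table_name, hash_key))
```

Calling `create_on_demand_table("events", "id")` needs AWS credentials and creates a real table, so treat it as a starting point rather than something to paste into production.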


So based on my opinionated checklist, DynamoDB scores somewhere between 3 and 4 out of 6. I would say it’s a “serverless database-like storage-thingy on AWS”. You’re free to use that term when selling it to your boss. U welcome.

MANY THANKS TO Jignesh Solanki, Maciej Winnicki, Alex DeBrie, and Forrest Brazeal, whose articles on the topic:

  1. Serverless Databases: The Future of Event Driven Architecture
  2. Serverless Database Wishlist
  3. Serverless Aurora: What it means and why it’s the future of our data
  4. WhynamoDB

are the basis for this list.

Cheers,

Alex

Edit 15th of March — 08:45 UTC

A friendly commenter on Reddit pointed out a feature I would label as nice to have, but not required for my definition of a serverless database: offline emulation, meaning I don’t need a server to test … so serverless testing? DynamoDB offers a JAR for that.
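For the curious, a small sketch of pointing boto3 at that local emulator. It assumes DynamoDB Local is already running on its default port 8000; the helper names, region, and dummy credentials are placeholders of mine (the emulator wants credentials to be present, as far as I know it does not care what they are).

```python
def local_dynamodb_client_kwargs(port: int = 8000) -> dict:
    """Keyword arguments for a boto3 client that talks to DynamoDB Local.

    Hypothetical helper: region and credentials are dummy placeholder values.
    """
    return {
        "endpoint_url": f"http://localhost:{port}",
        "region_name": "us-east-1",
        "aws_access_key_id": "local",
        "aws_secret_access_key": "local",
    }


def make_local_client(port: int = 8000):
    # Imported lazily so the kwargs helper works without boto3 installed.
    import boto3

    return boto3.client("dynamodb", **local_dynamodb_client_kwargs(port))
```

With the emulator running, `make_local_client().list_tables()` should answer without ever touching AWS.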