At this year’s Google Cloud Next I co-presented a session entitled “How to grow a spreadsheet into an application,” where the thesis was that it’s OK to start with a spreadsheet and layer on additional technologies as you gain adoption and your requirements change. However that doesn’t mean that Sheets is always a great choice for storing your application’s data, and in this post I’ll explore some of the signals that you should be looking into other database options.
Note: In this post I use the term “database” to refer to the backend data store a software developer uses when building an application. For the more colloquial meaning of “a list of stuff I’m trying to keep track of,” fear not, a spreadsheet is a great option!
Sheets are for people
First and foremost Sheets is a tool for users, allowing them to explore and operate on their data. It’s a data store that’s optimized for human behavior, not the scale and throughput that most applications require. No schema, no problem! Calculated fields that depend on other calculated fields, sure!
Although you can access and modify sheets programmatically — using the Drive API, Sheets API, or Apps Script — the underlying architecture was built and optimized for users. For example, Sheets live in a user’s Google Drive, and if their account is locked or deleted the data goes with it.
Rule of thumb: if you don’t expect users to open the spreadsheet then you probably shouldn’t be storing your data there.
Not all Sheets usage is the same
Generally speaking there are three broad categories of applications that store data in Sheets:
- Collaborating on spreadsheet — The application is like an assistant, doing something useful for the user and very much centered on the spreadsheet. For example, you may have an application that keeps some data in the sheet in sync with an external system. If the user deletes the spreadsheet that’s fine, the application was just there to help.
- Using a spreadsheet as input/output — The application is using the spreadsheet as a user-friendly interface for data entry or reporting. For example, allowing users to bulk add new records or download a report. If the user deletes the spreadsheet that’s fine, the data always lived somewhere else.
- Using the spreadsheet as a database — The application is using using Sheets as the permanent home for some data and serving from it. For example, small workflow applications like expense approval. If the user deletes the spreadsheet, the application is broken.
Apps that fall into category #1 and #2 are completely fine in my book, since they are operating with users at the center and at human-scale. Category #3 is the focus of this post and the most fraught.
You should only consider using Sheets as a database if the data is small, the traffic is low, and some subset of users (approvers, etc) are still regularly opening the spreadsheet. If any of those are not true then you should store your data to a proper database.
The cold, hard numbers
Even if you want to ignore all of the “right” ways to use spreadsheets, there are just some hard limits that make them a poor fit as a database. While the Sheets team has made great strides over the years in allowing more data to be stored in a spreadsheet, the current limit of 5 million cells likely won’t be enough for many large- or medium-sized applications. Additionally, Sheets API quotas cap your application somewhere between 1–5 QPS, putting an upper bound on the real-time features you can provide and simultaneous users you can support.
Database features you’ll miss
For all that Sheets has going for it, there are some common features you are going to miss if you rely on it as a database:
- Queries: aside from the
=QUERYformula, you can’t really search across your data
- Consistency: anything goes in a spreadsheet cell, and user edits add even more chaos
- Joins: you can fake a little with
=IMPORTRANGE, but when you get beyond a table or two of data things break down
Alternatives to consider
So I’ve convinced you to back away from the spreadsheet, now what?
The Firebase Realtime Database is a great way to store blobs of JSON and get them back again later. If you don’t care about database features and just want more space or throughput this is a great option. The newer Cloud Firestore database expands on this by further increasing scale and adding support for schemas. Both have a free tier that make it an attractive next step if you’ve outgrown a spreadsheet.
If you are in the mood for a more traditional database then Google Cloud SQL is an option, allowing you to setup fully-managed MySQL and PostgreSQL databases in just minutes. This is what Google App Maker uses under the hood, and there are tons of ORMs that make it easy to work with the data in your programming language of choice. The pricing here is a bit complex (to say the least) but TL;DR you’ll be spending at a minimum a couple of bucks a month just to keep the servers warm.
Lastly, if you need a data warehouse instead of a database (logging transactions, etc) then you should tale a look at BigQuery. It’s great at storing and analyzing large amounts of structured data, and with the upcoming connected sheets feature you can easily bring that data back into Sheets for further exploration.
But don’t forget about Sheets!
Just because you moved your data to another database doesn’t mean you need to give up on Sheets completely. You can pivot to the other categories of usage I outlined earlier, using Sheets as an extension of your UI instead of as a data store. For example, in our Sheets API codelab we demonstrate how you can use Sheets as a custom reporting tool for your application, syncing select data into a spreadsheet and automatically generating useful pivot tables and charts.