Google Sheets as a Relational Database

Shambhavi Jahagirdar
VLEAD-Tech
Published in
9 min readAug 19, 2023

Sheets are used to store tabular data in day-to-day work life. This data storage system can be used as a database for relatively smaller-scale projects, to collect and fetch data from users. But for various reasons like data consistency, organised and extensible data structure, and operational convenience, a simple database is not manageable for even slightly complicated data structures — a relational database is required.

Here’s a handy solution to put together both of these factors — small-scale data and complicated data structure — in Relational Database in a Google Sheet. Read on to find out how.

TL;DR

Various sheets of a workbook in Google sheets can be related by using Data Validation options. Consider each sheet to be a table in the database and structure you database thus. To relate tables with column references, create a Named Range of the column to be referenced from. Add a Data Validation rule to create a Dropdown from this Named Range to the column which refers to this foreign key.
Add Data Validations wherever possible to make the sheet well connected, unambiguous and easy to maintain.
Finally, write an Apps Script to serve as the API call for data to be fetched from this database. Deploy the API to complete your backend!

Google Sheets provides an intuitive user interface to add/modify data — the most common operations involving simple clicks and dropdowns to start typing away. Data on google sheets is easy to visualise as it stores and displays data in the simplest structure — tables. This way, the data can cater to the services of both manual data reading and automated scripts easily.

Limitations to be considered with this system are short and crisp, which can clearly determine whether this database will suit your needs or not —

  1. Performance Speed: The performance of GS as a database in fetching and rendering APIs is well at low volumes but the performance is affected when the dataset grows large.
  2. Storage Limit: Google Sheets also has a storage limit of 5 million cells, which would seriously hamper the amount of data that can be stored in a company’s database with ever-increasing data.

For a detailed debate over the pros and cons of this arrangement, check out previous blog in this series.

This blog is a part of a series of blogs that includes:
1. Google Sheets as a Database — Pros and Cons
2. Google Sheets as a Relational Database (this article)
3. Apps Script as an API layer
4. Google Sheets and Apps Script: A complete featherweight BaaS

If you think your usage of this database will not encounter the major issues above, then you don’t have a reason to worry about choosing Google Sheets for your database. Let me show you how to create one.

To start with, create multiple worksheets in the same workbook. The workbook will serve as the relational database and the sheets will analogously represent a table each in the database.

We shall follow a running example of a school database, which currently has two sheets named Teachers and Subjects and looks as follows —

Teachers Sheet
Classes Sheet

Primary Key

The primary Key in each table is required to uniquely determine or reference an entry in the table. A primary key is a field (column) that has unique values for each entry in the table, for example, Roll Number or Email ID.

In our example, we can consider ID in the Teachers sheet and Class Name in the Classes sheet as the primary keys. That is, to to get the details of any Teacher from the first sheet, we use their ID to get to the right entry, and then retrieve values of other fields from the same entry. We will look at this in the Apps Script later on.

Named Ranges and Data Validation — Value Ranges

Now suppose we wish to add a column Class Name to the table Teachers, we can add a dropdown of Class Names column from the Classes sheet.

Select the column Class Name in Classes. Go to Data → Named Ranges
Enter a name for this range, for example, class_names. Verify / Edit the column range to be referenced, and click on Done.

Named Range class_names

Now, in the Teachers sheet add a column titled Class Name. Select the entire column and right-click. Go to Data Validation. Click on Add Rule.

Adding a Data Validation Rule

Select the Dropdown from a Range option, and enter class_names as the range. Apply to Range from D2 to D1000 or so, to avoid adding dropdown for the column title. You now have a column titled Class Name with dropdown entries that are derived from the column in the Classes sheet.

Dropdown referencing to values in a separate table

This gives us a Teachers table with primary key ID, a Classes table with primary key Class Name, and a column Class Name in the Teachers table that references to the foreign key Class Name, thus establishing a relationship between the two tables — and there’s our relational database!

Why did we need to create a Named Range here? Validating a column data by keeping the entries restricted to a Named Range ensures that there is only one copy of data; everywhere else this data will only be referenced as required and not entered again separately. This way we avoid spelling mistakes and ensure consistency.

Psssssttt…..
Why couldn’t we just have single sheet instead of using two, since there are two columns in common out of three? The answer is that in reality both tables will very likely contain a lot more columns, and in that case having just a single sheet for both is both unmanageable and non professional.

More Data Validations

Let us add another column to Teachers sheet, called Email. To make sure that the email entered is valid, and there is no typing error or accidental change, we can add a data validation using the same steps as above to this column to check for “valid email”.

Similarly, many more data validations for valid URLs, valid date, date within range, numbers within range, as well as custom formula can be applied to a specific set of cells.

Another important data validation is allowing only unique values in the column. Though there is no direct operation present for this, it can be implemented using the custom formula option — See Here.

Automating Value Entries — VLOOKUP

We have added Named Ranges to ensure there is only one place where data will be stored, modified, and referenced, to ensure consistency. Now, you will not face the ambiguity raised by manual errors such as entering Class Name as 1A in one sheet and 1-A in the other.

This is taking care of individual entries. What about paired ones, like Teacher ID and Teacher Name? For example, you need to add another sheet that includes the Teachers’ Name, ID, and monthly salary details. You might want to avoid entering the wrong ID against the wrong name.

The solution is, as to every manual error problem, automating the data entry in one field based on the other. That is, you only need to enter the Teachers’ ID, and the corresponding Teacher’s Name will be filled in automatically from the Teachers Sheet.
We use the VLOOKUP function for this.

The syntax is as follows: = VLOOKUP ( key, range, index)
This function searches for the key value in the given range and returns the corresponding value from its field number index.

Let’s apply the solution to our example. Create a sheet named Salary Details with columns ID, Teacher’s Name, and Salary. Create a Named Range of Teachers Names from the Teachers Sheet. Add a dropdown from this named range to the Teacher’s Name column in the Salary Details sheet.

Salary Details Sheet

In the cell B2, enter the VLOOKUP formula:
= ifna (( VLOOKUP (A2, “Teachers: $A$2 : $B”, 2, false), “” )

Select the cells from B2 to B1000 and press Ctrl + D to copy the formula to the entire column, and we’re done! Try entering IDs to see the Names automatically filling in. Contributing to an error-free database, and also making data entry easier!

Metadata

Now let’s look up a new idea. Create a sheet named Metadata. This sheet will be a little different from the others, in that it will not contain a table, instead it will contain certain fields having a fixed set of values, in the form of independent columns. Let’s see how and why.

In the above example, let’s say we wish to add the subject names to the database.

Add a column Subjects to the Metadata sheet, and list all the subjects to be recorded in the column. Create a Named Range of this column by following the above steps.

Now, you can add a Subject column in any sheet and apply Dropdown from the Named Range Subjects as its Data Validation, by using the same method as above. You now have a column titled Subject with dropdown entries, that can be modified from the Metadata sheet.

With Metadata, the idea is that we want to create a range of primary values. In an RDBMS, each of these ranges will be a table that will be referenced through an ID and probably will need a join to get the data back.

Tips, Explanations, Reasoning

While maintaining a relational DBMS, it is essential to keep it error-free. Since manual rechecking is not always possible, we include as many data validations as possible to identify errors. Maintaining metadata columns for as many fields as possible prevents typing errors and maintains consistency in the sheet.

Having abstractions in the database through Metadata and rules for filling entries in most fields suggests the provision of a README sheet. This will maintain a record of your work for reference to others and even yourself later on, provide ease for extensibility as well a user guide for future maintainers.

Apps Script — API

Status: We have an RDBMS with all the data we want to store.
Access to this data: Visual

Our current scenario is analogous to this — having your work shelf organized, with labels and details written all over — very convenient for you to access. But it is still your work to go to the shelf and find the book you want. What you might need now, is to appoint a person that will get the set of books collated from different shelves, requested by any person with appropriate rights. That is, our RDBMS data needs to be accessed in order to be useful.

You may have some business rules to take care of, regular statistics to be generated, or maybe you just want to access the data in a particular format or collate data from tables. All of this is difficult manually.

In our running example, let us suppose we need a list of all the teachers teaching a class of more than 35 students generated every month. For this, we can create an API layer on top of the database.

This is achieved using Apps Script. Apps Script will serve as an API layer for us, making this system a BaaS (Backend as a Service) rather than just a database. We write Apps Script to create a web app that serves our needs by fetching from the sheets the data we want, in the structure we want. Apps Script is similar to JavaScript, with few additional functions to reference work with the sheets.

The web app created using the Apps Script is deployed as an API link. This will serve as the connection between your front end — be it website, app, or anything else — and the relational database backend you have now created using only one Google sheet workbook.

A convenient, easy-to-maintain backend is now at your service!

--

--