The quest to tame Firestore
This article aims to help you decide whether Firestore is the right database for your project. We’ll review the pros and cons, along with our team’s thoughts on the product.
Note that the new project I’ll be talking about is based on a Node.js stack written in TypeScript and deployed on Google Cloud Platform with Terraform.
I. From Postgres with love
Before trying to explain anything about Firestore, we need to talk about where we come from (technically speaking).
On our legacy project, we worked with a Postgres stack, using Doctrine or plain SQL to query it. Working with those tools meant that we had to deal with relations, foreign and primary keys, Postgres queues, schemas and everything else that comes with them. It made some parts of the code easy to develop and others hard (duh), but in the end it’s your classic relational database management system.
For some of us this was the only (or the primary) type of database we had worked with in our careers, and we had very little academic knowledge of the non-relational ones.
So we had some learning to do on Firestore and NoSQL for the new project.
II. Be positive
After some searching, learning and testing, the team was in a good position to understand what a NoSQL database had to bring to the table.
For the new project we needed a database that would cost less than our current hosted databases and most importantly one that would be faster and still be easy to use. There was also no real need for relations between our tables, so we gave it a try.
On paper, Firestore looked almost perfect:
Non-relational
During our review of the legacy project, we found that a non-relational database would be easy to manage and a better fit if we wanted to upgrade from the system we were using. It’s less of a headache to maintain, faster to query, and when we wanted field validation we could rely on the OpenAPI specifications of our APIs (which is what we did).
Natively included in the GCP ecosystem
This is a big selling point. Because it’s a Google product and because we had already decided to migrate to GCP, Firestore was our first choice for a NoSQL database. There was no infrastructure to provision (unlike with an RDBMS) and the Node.js SDK was easy to use and well documented (even if the documentation was hard to find).
Here are some resources we’ve used:
Costs close to nothing
Managing costs and optimizing all of them on the new project will be a real challenge. On this point Firestore is certainly the best solution we had to keep the costs low. There’s a free tier each day for reading, deleting, updating & storing documents and you only pay for the queries past these limits. It’s really convenient and costs almost nothing for a month if you don’t face enormous amounts of data. At the same time, even with big queries and an application that demands lots of requests on the database, it’s relatively cheap.
For more information on Firestore’s cost, check this link for the pricing and this link for some examples they made based on monthly usage.
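To make the "pay only past the free tier" idea concrete, here is a minimal sketch of the arithmetic. The function and type names are hypothetical, the free quota values are an assumption to verify against the pricing page, and the prices are caller-supplied placeholders, not Google's actual rates.

```typescript
// Number of operations performed during one day.
interface DailyUsage {
  reads: number;
  writes: number;
  deletes: number;
}

// Price per 100,000 operations, in whatever currency you are billed in.
type PricePer100k = DailyUsage;

// Assumed daily free quota; check the official pricing page before relying on it.
const ASSUMED_FREE_QUOTA: DailyUsage = { reads: 50000, writes: 20000, deletes: 20000 };

// Only the operations above the free quota are billed.
function billable(used: number, free: number): number {
  return Math.max(0, used - free);
}

// Rough cost of one day of traffic: billable operations times the unit price.
function estimateDailyCost(
  usage: DailyUsage,
  prices: PricePer100k,
  free: DailyUsage = ASSUMED_FREE_QUOTA,
): number {
  const cost = (op: keyof DailyUsage): number =>
    (billable(usage[op], free[op]) / 100000) * prices[op];
  return cost("reads") + cost("writes") + cost("deletes");
}
```

Storage and network costs are left out on purpose; the sketch only covers the per-operation part of the bill.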
Scales pretty well
This point is a no-brainer for anyone working in the cloud. One of the main selling points of migrating to GCP (or any cloud provider for that matter) is that you can scale your infrastructure rapidly and seamlessly. For a company like HiPay, especially in the reconciliation applications, we needed to scale up a lot when the processes became demanding and to avoid costs when there was little or no activity on the platform. We could go from a hundred to a thousand requests at any given time, depending on our processes.
It’s important to note that your queries scale to the size of the set of results and not to the size of your data set. It means that searching will be fast no matter how much data you store in the database.
Has a good emulator
Something that’s really convenient when you work with Firestore is its emulator. Google made an entire Firebase Emulator bundled with some of their products to make your local development easier and more in line with what will really happen in a live environment.
Speaking from our point of view, we only used the Firestore part of the emulator, but I’m sure that the Cloud Storage or even the Pub/Sub emulator could be really helpful if you try them out.
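Google's Node.js clients pick up the emulator through the FIRESTORE_EMULATOR_HOST environment variable. As a small sketch, assuming the emulator listens on localhost:8080 (adjust to your own setup), a hypothetical helper like `pointAtEmulator` can set it before the client is created:

```typescript
// Minimal shape of an environment map such as Node's process.env.
type Env = Record<string, string | undefined>;

// Assumed address; use whatever host:port your local emulator listens on.
const ASSUMED_EMULATOR_HOST = "localhost:8080";

// Sets FIRESTORE_EMULATOR_HOST unless it is already defined,
// and returns the value that will be in effect.
function pointAtEmulator(env: Env, host: string = ASSUMED_EMULATOR_HOST): string {
  const current = env["FIRESTORE_EMULATOR_HOST"];
  if (current !== undefined && current !== "") {
    return current; // respect an explicit configuration
  }
  env["FIRESTORE_EMULATOR_HOST"] = host;
  return host;
}
```

In a local entry point or test setup you would call it with `process.env` before instantiating the Firestore client, and never in production code.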
III. There will be blood
With every product, there are pros and cons, and Firestore is no exception. There are some drawbacks to choosing Firestore, and you must be well informed about them before jumping into implementation. We’ll try to give you some ways to work around them, but keep in mind that not all of them can be avoided and some solutions might not be efficient in your case.
The “IN” clause
Firestore suffers from a huge problem that we faced almost instantly while developing our new project: its “IN” clause. At the time of writing, an “IN” clause accepts at most 10 values, which means that for every additional batch of 10 values you need to run another query. Google may have chosen this limit to contain costs or for some other reason, but for an application that could need any document at any point it’s a real headache.
In our code, we worked around this limitation by doing exactly that: we split the values into batches of 10 and ran one query per batch. We then merged all the results and removed the duplicates (if there were any).
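The workaround described above can be sketched as follows. This is not our production code: `chunk`, `queryInBatches` and the `runInQuery` callback are hypothetical names, and the callback stands in for whatever actually executes one “IN” query.

```typescript
// Only the id matters here: it is what we deduplicate on.
interface DocLike {
  id: string;
}

// Splits values into consecutive batches of at most `size` items.
function chunk<T>(values: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < values.length; i += size) {
    batches.push(values.slice(i, i + size));
  }
  return batches;
}

// Runs one query per batch, then merges the results and drops duplicates by id.
async function queryInBatches<T, D extends DocLike>(
  values: readonly T[],
  runInQuery: (batch: T[]) => Promise<D[]>,
  maxPerQuery: number = 10, // the limit described in this article
): Promise<D[]> {
  const uniqueValues = Array.from(new Set(values));
  const results = await Promise.all(
    chunk(uniqueValues, maxPerQuery).map((batch) => runInQuery(batch)),
  );
  const byId = new Map<string, D>();
  for (const docs of results) {
    for (const doc of docs) {
      byId.set(doc.id, doc); // a later duplicate simply overwrites the earlier one
    }
  }
  return Array.from(byId.values());
}
```

With the Node.js SDK, `runInQuery` would typically wrap a call such as `collection.where(field, "in", batch).get()` and return the snapshot's `docs`. Keep in mind that every batch is billed as its own query.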
No incremental IDs
For someone who has worked almost exclusively with relational databases, auto-incrementing IDs are a real help. You can insert new rows freely without ever having to think about the identifier field (or any other field that is incremented automatically). In Firestore you need to forget this: the only way to have incremental IDs is to manage them yourself. It’s a real problem and, let’s be honest, not something you want to deal with when you need to insert hundreds or thousands of documents at the same time.
Our solution was to generate a UUID with the uuid package (https://www.npmjs.com/package/uuid) for our unique fields. That way, we could create a document identifier, or any other field that needs to stay unique within the collection, without coordinating with the database. But if you really need incremental IDs, you’ll have to generate them yourself without conflicting with any user or app inserting into the same collection at the same time (and that’s not an easy task).
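As a rough illustration of client-side identifiers, the sketch below attaches an ID to each payload before a bulk insert. `assignIds` and its types are hypothetical; the generator is injected so that you can pass `v4` from the uuid package (`import { v4 as uuidv4 } from "uuid"`) or any other source of unique strings.

```typescript
// Any function returning a fresh unique string, for example `v4` from the uuid package.
type IdGenerator = () => string;

// A payload paired with the identifier it will be stored under.
interface Identified<T> {
  id: string;
  data: T;
}

// Gives every item its own identifier up front, so a bulk insert
// needs neither a read nor a shared counter.
function assignIds<T>(items: readonly T[], generateId: IdGenerator): Identified<T>[] {
  return items.map((data) => ({ id: generateId(), data }));
}
```

Calling `assignIds(rows, uuidv4)` yields pairs that can each be written under their own document ID. The IDs carry no ordering, so if you need to sort by insertion you still have to store a timestamp or similar field.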
The SDK query builder
This point is more of a personal preference than a real problem with Firestore. By using the Node.js SDK, we were forced to query the database the Firestore way. It’s not that bad technically speaking, but it made our new project harder to implement, because external users need to send us queries through our APIs.
If you only use the query builder inside your JavaScript (or any other language) app, it’s perfectly functional, but the moment you need to handle external calls (or anything that implies translating to a Firestore query) you’ll understand the pain. You need to think about every inconvenience the Firestore query language throws at you.
To name only a few (some of them are taken directly from the official documentation):
- You need to convert every “OR” query into multiple requests and merge the results together at the end
- “You can perform range or not-equals comparisons only on a single field”
- “You can’t combine not-in and != in a compound query”
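When queries come from external callers, it helps to reject the impossible ones before they reach Firestore. The sketch below encodes only the two documentation rules quoted in the list above, exactly as quoted; the `Filter` shape and the `checkFilters` function are hypothetical and not part of the SDK.

```typescript
// Comparison operators as they are spelled in Firestore queries.
type Operator =
  | "==" | "!=" | "<" | "<=" | ">" | ">="
  | "in" | "not-in" | "array-contains" | "array-contains-any";

// Hypothetical shape of one filter received from an external API call.
interface Filter {
  field: string;
  op: Operator;
  value: unknown;
}

// Operators covered by the "single field" rule quoted above.
const RANGE_OR_NOT_EQUAL: readonly Operator[] = ["<", "<=", ">", ">=", "!="];

// Returns a list of human-readable problems; an empty list means
// the filters pass the two rules checked here (not every Firestore rule).
function checkFilters(filters: readonly Filter[]): string[] {
  const problems: string[] = [];

  const constrainedFields = new Set<string>();
  for (const filter of filters) {
    if (RANGE_OR_NOT_EQUAL.indexOf(filter.op) !== -1) {
      constrainedFields.add(filter.field);
    }
  }
  if (constrainedFields.size > 1) {
    problems.push("range or not-equals comparisons must target a single field");
  }

  const ops = filters.map((filter) => filter.op);
  if (ops.indexOf("not-in") !== -1 && ops.indexOf("!=") !== -1) {
    problems.push("not-in and != cannot be combined in a compound query");
  }

  return problems;
}
```

Returning the problems instead of throwing lets an API answer with a clear validation error. The rules themselves should be checked against the current documentation, since they can change between Firestore releases.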
The GCP interface
This point could be a bit controversial. For our use cases the Firestore interface in the GCP console is a bit clunky, but for others it can be the perfect way to query from a graphical interface.
You have two different views:
- The panel view
You can only view one document at a time, and filtering only applies to a single field. This means you can’t run a precise query, and you can’t compare documents on the fly because several documents are never displayed side by side.
This view is useful only if you need to check one document at a time.
- The query builder view
This view allows you to make more complex queries with buttons, selectors and text areas. This way of querying is more flexible than the previous one, but you cannot make “OR” queries, array results are truncated (resulting in ‘[{first_key: 1, second_k…}]’ for example) and the results are not clickable (to sort them, for example).
It’s a really basic view where your queries still have to follow the constraints of the Firestore query language. One thing that needs to be said is that the interface doesn’t tell you whether a query is valid before executing it, so you need to know beforehand what you can and cannot do to avoid building your query all over again.
IV. Try it yourself!
To conclude, I have only one piece of advice for you: try it yourself!
It sounds like an easy answer, but it’s only by working with it and trying things in your projects or in PoCs that you’ll see whether Firestore is for you. It has pros and cons, and you must know beforehand what they are and what they will mean as your project evolves.