Firebase Data Structures: Complex Data

If you missed my first Firebase Data Structures installment on pagination, you might want to catch that before continuing on.

All caught up? Great! Let’s talk about complex data.

UPDATE 6/30/2016: I’ve published my free class, Firebase 3.0 for Web. Check it out.

Example Data

I’ve created exactly one user. Notice how all of the push keys are the same (-KB_wr7Obntw9yrQNVK7)?

Complex data needs a server

When a new user logs into my system—using Firebase Authentication of course—I create a new entry in the /users node to house their data.

  • As the user creates an account history, I log that history under /userReadable/-KB_wr7Obntw9yrQNVK7/accountHistory.
  • As the user creates addresses, company data, and posts, I save those under /userOwned/-KB_wr7Obntw9yrQNVK7/{address, company, posts}.
  • And just for kicks, I save a log of the user under /userLogs/-KB_wr7Obntw9yrQNVK7.
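
Put together, the tree from those bullets looks roughly like this; the leaf fields are illustrative stand-ins, not the exact shape of my data:

```json
{
  "users": {
    "-KB_wr7Obntw9yrQNVK7": { "username": "someUser", "email": "someUser@example.com" }
  },
  "userReadable": {
    "-KB_wr7Obntw9yrQNVK7": { "accountHistory": { "...": "..." } }
  },
  "userOwned": {
    "-KB_wr7Obntw9yrQNVK7": {
      "address": { "...": "..." },
      "company": { "...": "..." },
      "posts": { "...": "..." }
    }
  },
  "userLogs": {
    "-KB_wr7Obntw9yrQNVK7": { "...": "..." }
  }
}
```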

These operations require either a server or Google Cloud Functions to duplicate the data. For instance, the /userReadable node will have a security rule preventing client writes, so I’ll have to write user data to /userReadable from my server.
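
Here’s a sketch of what that rule could look like. One assumption to flag: my keys are push keys rather than auth uids, so this sketch pretends each /users record stores its owner’s uid at /users/$userKey/uid for the read rule to match against. Adapt the lookup to however you map uids to user keys.

```json
{
  "rules": {
    "userReadable": {
      "$userKey": {
        ".read": "root.child('users').child($userKey).child('uid').val() === auth.uid",
        ".write": false
      }
    }
  }
}
```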

The /userOwned node will have read/write privileges for the user, but I’ll still need a server to fan out changes from /userOwned to other parts of my data structure that will be readable to other users and/or the public.
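
Here’s a minimal sketch of that kind of server process, using the firebase-admin SDK, which bypasses security rules entirely. The paths match the structure above; the child_added trigger and the fields being written are illustrative assumptions.

```typescript
import * as admin from 'firebase-admin';

// Placeholder credentials and database URL — swap in your own.
admin.initializeApp({
  credential: admin.credential.applicationDefault(),
  databaseURL: 'https://my-app.firebaseio.com'
});

const db = admin.database();

// When a new user record appears, seed the privileged /userReadable and
// /userLogs nodes. Clients can't write here; the Admin SDK can.
db.ref('users').on('child_added', userSnapshot => {
  const userKey = userSnapshot.key;
  db.ref().update({
    [`userReadable/${userKey}/accountHistory/created`]: admin.database.ServerValue.TIMESTAMP,
    [`userLogs/${userKey}/createdAt`]: admin.database.ServerValue.TIMESTAMP
  });
});
```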

Why are my queries slow?

Imagine if I’d nested all of this data under the /users/-KB_wr7Obntw9yrQNVK7 node. I could have pushed the address, company, posts, and accountHistory nodes directly onto my user object and saved a bunch of time!!! 💩

Now consider displaying a list of my users in the admin section of my website. When I queried a user, I’d receive every node nested under that user, whether I wanted it or not.

Imagine if the accountHistory node grew past 100KB per user. If I wanted to display 10 users in a list, I’d be querying over 1MB of data, even if I only wanted to display usernames and email addresses.

I’ve done this. The page grinds to a halt. 🚨 🚑 🚒

Keep your data shallow

Breaking your data up into its component parts enables you to query just the data that you need, and using a server to duplicate data appropriately will let you avoid client-side joins. Duplicate the data so that it looks exactly like you’ll display it to the user.
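
With the data broken out like this, listing users for that admin page becomes a single shallow read. A sketch assuming the pre-v9 namespaced web SDK; the config values are placeholders:

```typescript
import * as firebase from 'firebase';

// Placeholder config — swap in your own project's values.
firebase.initializeApp({ databaseURL: 'https://my-app.firebaseio.com' });

// Pull only the shallow /users node: usernames and emails,
// no accountHistory, posts, or other heavyweight children.
firebase.database()
  .ref('users')
  .limitToFirst(10)
  .once('value')
  .then(snapshot => {
    snapshot.forEach(userSnapshot => {
      const { username, email } = userSnapshot.val();
      console.log(username, email);
      return false; // returning true would stop iteration early
    });
  });
```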

Shallow data and view-specific data duplication will make write processes a little heavier. However, most apps read data far more than they write it, so duplicated, broken-out data can be super performant.

If, however, you find your app to be write heavy, try writing in the simplest way possible and using a server process to combine those writes into easier-to-read structures. Utilizing multi-location updates makes this a breeze.
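
As a hypothetical sketch of that pattern: clients push minimal records to a raw write path, and a server process fans them out into the read-optimized structure with a single multi-location update. Every path name here is made up for illustration.

```typescript
import * as admin from 'firebase-admin';
// Assumes admin.initializeApp(...) has already run, as in the earlier sketch.

const db = admin.database();

// Clients write bare-bones posts to /rawPosts; this process denormalizes
// each one into the structures the app actually reads.
db.ref('rawPosts').on('child_added', rawSnapshot => {
  const raw = rawSnapshot.val(); // e.g. { uid: '...', title: '...', body: '...' }
  db.ref().update({
    [`userOwned/${raw.uid}/posts/${rawSnapshot.key}`]: { title: raw.title, body: raw.body },
    [`logs/posts/${rawSnapshot.key}`]: { uid: raw.uid, title: raw.title }
  });
});
```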

Logs

Do not fear duplicate data! NoSQL thrives on duplicate data. If a user changes her username, you may have to update a username node in three different places, but multi-location updates help with that… and again, duplication makes reads crazy fast.
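
For the username case, a multi-location update writes all of the duplicated copies atomically: either every path updates or none do. The three paths below are stand-ins for wherever your app duplicates usernames.

```typescript
import * as firebase from 'firebase';
// Assumes firebase.initializeApp(...) has run, as in the earlier sketch.

function updateUsername(userKey: string, username: string) {
  // One atomic write across every duplicated copy of the username.
  return firebase.database().ref().update({
    [`users/${userKey}/username`]: username,
    [`userReadable/${userKey}/username`]: username,
    [`userLogs/${userKey}/username`]: username // all three paths are illustrative
  });
}
```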

I love to create logs. I like server processes that watch for data changes and log them out to /logs/comments or /logs/deletedPosts or whatever /logs/* node makes sense. Firebase makes this sort of logging nearly instantaneous, and logs let me quickly sort through shallow lists of data to find what I need. Once I have the comment or deletedPost or whatever, it’s simple enough to find the original document for deeper inspection.
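
A sketch of one of those watchers: when a post disappears, write a shallow summary to /logs/deletedPosts. The /posts path and the post fields are assumptions for illustration.

```typescript
import * as admin from 'firebase-admin';
// Assumes admin.initializeApp(...) has already run.

const db = admin.database();

// Log a shallow summary whenever a post is removed. The summary holds
// just enough to scan the log quickly and find the original record.
db.ref('posts').on('child_removed', postSnapshot => {
  const post = postSnapshot.val(); // assumes posts have a title field
  db.ref('logs/deletedPosts').push({
    postKey: postSnapshot.key,
    title: post.title,
    deletedAt: admin.database.ServerValue.TIMESTAMP
  });
});
```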

And if my logs get too long, I delete the old logs, because they’re just logs. You can always loop through the original records if you need some forensic data analysis.

Next up!

I’ve got some thoughts on security rules to share. Do follow along!

And hit me up in the comments, on Twitter, on the new Firebase Slack Channel 😍!!!