On Web Apps and Databases

This article is mostly a stream of consciousness based on my own experience and observations.

I strongly believe that the next big thing in front-end web architecture is going to be advanced data storage: a fully fledged database. To be clear, I’m not referring to a database as a persistence mechanism, but rather as a set of facilities for building large-scale web apps. Software requirements grow every day, and systems grow in complexity with them. Today’s browsers execute much more application-specific code than they did five years ago.

We start simple, with a bunch of libraries, each doing its own thing. This is acceptable for really small projects and static websites. But it becomes messy as soon as the system starts growing, mostly because of granular interactions with the DOM and state spread across the whole codebase.

pre-React architectures

React and Flux/Redux solved this for us by abstracting away rendering and state management. It became possible to build truly advanced web apps, focusing on the actual domain instead of struggling with the underlying implementation.

React, Flux/Redux architectures

The problem, however, is that systems keep growing. We are building more UIs and loading and transforming ever more data, which introduces various issues, including performance problems. This requires more sophisticated solutions, so we start adding a library to handle each of them. The following are some of these problems, along with the libraries used to solve them in the majority of complex web apps built with React:

data normalization

When it comes to altering application state, we almost certainly don’t want deeply nested data, because it is hard to work with and will eventually introduce performance issues. Server response payloads are usually the source of such data structures. Normalizr is often used to flatten and rewrite state according to a defined schema, so the state becomes easier to work with. ImmutableJS also has a nice API for nested reads and writes on immutable data.
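To make the idea concrete, here is a minimal hand-rolled sketch of flattening a nested payload. It does not use Normalizr’s API; the `User`, `NestedPost`, and `normalizePosts` names are made up for illustration, and the payload shape is an assumption.

```typescript
// Hypothetical nested payload, as a server might return it.
interface User { id: number; name: string }
interface NestedPost { id: number; title: string; author: User }

// Flat shape: entities are keyed by id, relations are stored as ids.
interface FlatPost { id: number; title: string; authorId: number }
interface NormalizedState {
  users: Record<number, User>;
  posts: Record<number, FlatPost>;
}

// Walk the nested posts once and split them into two flat tables.
function normalizePosts(posts: NestedPost[]): NormalizedState {
  const state: NormalizedState = { users: {}, posts: {} };
  for (const post of posts) {
    // The same author may appear in many posts; it is stored only once.
    state.users[post.author.id] = post.author;
    state.posts[post.id] = {
      id: post.id,
      title: post.title,
      authorId: post.author.id,
    };
  }
  return state;
}

const normalized = normalizePosts([
  { id: 1, title: "Hello", author: { id: 7, name: "Ann" } },
  { id: 2, title: "World", author: { id: 7, name: "Ann" } },
]);
```

With this shape, renaming a user means updating a single record under `users` instead of hunting for every copy nested inside posts.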

data aggregation

The next thing that directly impacts performance is relations between entities and data aggregation. Having these means that we want to perform complex reads and combine the results before passing them on to the rendering pipeline. We also want these reads to be reactive, i.e. re-run automatically when the state changes, and efficient: the read output is memoized, so the read won’t execute again if the relevant part of the state didn’t change. The Reselect library is often used to solve this problem.
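The memoization part can be sketched in a few lines. This is not Reselect’s API, just a simplified illustration of the same idea; `memoizeSelector`, `State`, and `selectDoneCount` are hypothetical names, and the sketch assumes state is updated immutably so that a reference comparison is enough to detect change.

```typescript
interface Todo { id: number; done: boolean }
interface State { todos: Todo[]; filter: string }

// Hypothetical helper: re-runs `compute` only when the value picked by
// `input` changes identity (reference equality), otherwise returns the cache.
function memoizeSelector<S, I, R>(
  input: (state: S) => I,
  compute: (value: I) => R
): (state: S) => R {
  let hasCache = false;
  let lastInput: I | undefined;
  let lastResult: R | undefined;
  return (state: S): R => {
    const next = input(state);
    if (!hasCache || next !== lastInput) {
      lastInput = next;
      lastResult = compute(next);
      hasCache = true;
    }
    return lastResult as R;
  };
}

let computeCount = 0; // counts how often the expensive read actually runs
const selectDoneCount = memoizeSelector(
  (state: State) => state.todos,
  (todos) => {
    computeCount += 1;
    return todos.filter((todo) => todo.done).length;
  }
);

const s1: State = {
  todos: [{ id: 1, done: true }, { id: 2, done: false }],
  filter: "all",
};
const s2: State = { ...s1, filter: "done" }; // todos reference is unchanged
const s3: State = { ...s2, todos: [...s2.todos, { id: 3, done: true }] };

const r1 = selectDoneCount(s1); // computed
const r2 = selectDoneCount(s2); // served from cache, todos did not change
const r3 = selectDoneCount(s3); // recomputed, todos is a new array
```

Changing `filter` alone does not trigger a recomputation, which is exactly the "relevant part of the state" behavior described above.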

mess

We end up in the same situation as in the beginning, but this time it is state management logic that is spread across the codebase. At a certain scale things become too granular, and again we need to abstract them away. The next step is to choose a database system. Since we are in UI space, the database should be somewhat specific to our needs, especially if you are considering building offline-first, optimistic UIs. Here’s a list of properties we require:

  • In-memory database with optional persistence (IndexedDB, localStorage)
  • Preferably with a relational model
  • SQL as a query language, because everyone is familiar with it
  • An SQL DSL in the form of an idiomatic API is nice to have
  • Reactive queries (observing changes)
  • Transaction log

A sophisticated SQL engine with query optimization, together with reactive queries, reduces boilerplate code and improves performance. A transaction log can be used to implement state synchronization in distributed systems.
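As a rough illustration of the last two properties, here is a toy in-memory table with observed queries and a transaction log. It is a sketch under simplifying assumptions (no SQL, no query optimization, every query re-runs on every change), and `TinyTable`, `Row`, and `Tx` are hypothetical names, not the API of any real library.

```typescript
interface Row { id: number; title: string }
type Tx = { op: "insert"; row: Row } | { op: "delete"; id: number };

class TinyTable {
  private rows = new Map<number, Row>();
  private listeners: Array<() => void> = [];
  // Every applied change is appended here, in order.
  readonly log: Tx[] = [];

  // All writes go through `apply`, so they can be logged and observed.
  apply(tx: Tx): void {
    if (tx.op === "insert") {
      this.rows.set(tx.row.id, tx.row);
    } else {
      this.rows.delete(tx.id);
    }
    this.log.push(tx);
    this.listeners.forEach((notify) => notify());
  }

  // Reactive query: runs once immediately, then again after each change.
  observe<R>(query: (rows: Row[]) => R, onChange: (result: R) => void): void {
    const run = () => onChange(query(Array.from(this.rows.values())));
    this.listeners.push(run);
    run();
  }

  count(): number {
    return this.rows.size;
  }
}

const table = new TinyTable();
const seenCounts: number[] = [];
table.observe((rows) => rows.length, (count) => { seenCounts.push(count); });

table.apply({ op: "insert", row: { id: 1, title: "a" } });
table.apply({ op: "insert", row: { id: 2, title: "b" } });
table.apply({ op: "delete", id: 1 });

// Replaying the log on a fresh table reproduces the same state, which is
// the basic mechanism behind log-based synchronization.
const replica = new TinyTable();
for (const tx of table.log) {
  replica.apply(tx);
}
```

A real database would track which queries a change can affect instead of re-running all of them, and would ship the log over the network instead of replaying it locally.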

All of this is necessary only for large web apps, like Gmail’s Inbox, which actually uses a database in the browser. There’s no point in a database when building a simple or medium-sized web app.

There are quite a lot of implementations of in-memory JavaScript databases; Lovefield and LokiJS look promising. If you are looking for a complete solution, you might want to consider Relay and GraphQL, but keep in mind that this is a specific type of system.