State Architecture Patterns in React, Part 2: The Top-Heavy Architecture, Flux and Performance

Skyler Nelson
Mar 17, 2017

This is the second in a series of articles about state architecture patterns in React.

In the previous installment, we reviewed some basic concepts that are useful for thinking about information architecture, we looked at the most straightforward pattern for managing state in React and we considered some code complexity problems that can arise when using this pattern.

In this installment, we’ll start off by discussing how to use a top-heavy architecture to decouple information architecture from the component hierarchy. Then we’ll briefly discuss the relationship between the top-heavy architectural pattern and Flux, and finally we’ll look at some performance issues that can come up in top-heavy systems.

The Top-Heavy Architecture

In the conclusion of the previous part, we discussed some issues with the naive hierarchical architectural pattern that can force us to progressively move shared state up the component hierarchy.

It might seem counterintuitive, but one way to effectively decouple your application’s state architecture from the component hierarchy is to take that process to its limit and organize your application so that basically all of the application state comes in at the very top of the hierarchy. I call this a top-heavy architecture.

[Figure: The basic structure of a top-heavy architecture]

In a top-heavy architecture, one could have a single root component at the top of the hierarchy that manages shared state itself, but it’s common to actually factor all the state information into a separate object often called a store, which manages everything and publishes changes to the root component. The idea is that any time anything changes, the store passes the entire state of the application to the root component and everything flows down through props from there. If done right, the user interface is then basically a pure function of the state of the store.
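As a rough illustration of that idea, here is a minimal sketch of a store that publishes the whole state to a single root on every change. The names `createStore`, `update` and `renderRoot` are made up for this example and aren't taken from any particular library, and `renderRoot` returns a string as a stand-in for what a real root component would render.

```javascript
// Hypothetical minimal store: holds the whole application state and
// publishes it to subscribers whenever anything changes.
function createStore(initialState) {
  let state = initialState;
  const listeners = [];
  return {
    getState: () => state,
    subscribe(listener) {
      listeners.push(listener);
      listener(state); // publish the initial state right away
    },
    update(changes) {
      // Build a new state object rather than mutating the old one.
      state = { ...state, ...changes };
      listeners.forEach((listener) => listener(state));
    },
  };
}

// Stand-in for the root component: a pure function from state to output.
// In a real React app this would return elements and pass props to children.
function renderRoot(state) {
  return `${state.user} has ${state.todos.length} todo(s)`;
}

const store = createStore({ user: "Ada", todos: [] });
const frames = []; // every output the "UI" has produced so far

// The single point of contact between the store and the hierarchy.
store.subscribe((state) => frames.push(renderRoot(state)));

store.update({ todos: ["write part 3"] });
console.log(frames); // [ 'Ada has 0 todo(s)', 'Ada has 1 todo(s)' ]
```

Every change, however small, goes through the same path: the store produces a new state and the root is handed all of it.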

In fact, you can factor different concerns out into different objects and end up with multiple stores, but in this kind of architecture there’s only one real point of contact between the stores and the component hierarchy, and it’s at the root component. I’ll mostly refer to “the store” in the singular, for convenience, even though there may be a number of independently operating stores.

So in this architecture information distribution is still coupled to the component hierarchy — that’s somewhat hard to avoid with React — but information management and publishing is coordinated almost exclusively by the store.

The top-heavy architecture does away with a lot of the chaos that can come along with the naive hierarchical architecture. There’s less hunting down which component manages some particular bit of state, because the store manages pretty much everything. If you want to know how information is published to the component hierarchy, you only have to look for where the store contacts the root component. It’s less vulnerable to problems when refactoring as well, because changing the component hierarchy won’t typically force state management code to relocate if it’s all already in one place.

In practice, applications are rarely purely top-heavy. When it comes to components with some local state that no other component will ever depend on, the architectural consistency bought by factoring that state out so that it’s managed by the store is rarely worth the painful and unnecessary violation of encapsulation. Components that are externally sourced also tend to manage their own state.

But in general, wherever it’s feasible, top-heavy architectures discourage local state (and thus smart/stateful components) and encourage the use of dumb/stateless functional components, which are simpler to reason about and can be written in a much more concise way.

An Aside: Flux & Unidirectional Information Flow

Readers who are familiar with the Flux architecture may be wondering how it relates to the discussion so far, or why I haven’t brought it up yet.

Flux architectures are most often top-heavy, but they don’t strictly have to be. Certain variants of the next architecture we’ll discuss (which isn’t completely top-heavy) are compatible with Flux.

Flux is concerned mostly with the flow of information (e.g. concerning user interactions) from components that depend on state back to the stores that manage that state.

In the naive hierarchical architecture, if we wanted a child component to have an effect on some state managed by a component that’s higher up in the hierarchy, we’d often accomplish this by passing something down to the child component that it can use to send information back. Most often we’d just pass a callback, but we could also send a reference to some object with methods the child could call.
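The callback approach can be sketched without React at all. In this toy version, plain objects stand in for components and the `props` argument is just an ordinary object; `createCounterParent` and `createButtonChild` are hypothetical names.

```javascript
// The child only knows about the callback it was handed via "props".
function createButtonChild(props) {
  return { click: () => props.onClick() };
}

// The parent owns the state and decides how the child may affect it.
function createCounterParent() {
  const state = { count: 0 };
  const increment = () => {
    state.count += 1;
  };
  // The child never sees `state`, only the callback.
  const child = createButtonChild({ onClick: increment });
  return { state, child };
}

const parent = createCounterParent();
parent.child.click();
parent.child.click();
console.log(parent.state.count); // 2
```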

In a top-heavy architecture, this would mean passing loads of things from the store all the way down the hierarchy to wherever they’re needed. That can become very cumbersome, so top-heavy architectures often come with some other means of signalling events to the store. This is the basic idea behind Flux’s dispatcher: there’s a separate object that components have access to (generally because it’s a global) which they can use to signal events to the store.

The defining characteristic of Flux is what’s often called unidirectional information flow: we avoid building mechanisms to allow information about events (e.g. user actions, asynchronous data fetching, etc.) to flow backwards up the component hierarchy from child to parent. Instead, information about events is published to stores through a dispatcher that sits outside the component hierarchy, and information about any state changes that happen in response to those events flows back down through the component hierarchy from wherever the store makes contact with it (which is often, but not necessarily, the root component).

[Figure: Flux: Actions flow to the store independently of the hierarchy]

A key feature of Flux is the concept of actions. With Flux, when something like a user interaction happens, the component sends the dispatcher something called an “action” (typically an object with some variable fields) that specifies what has happened. The dispatcher forwards that to all of the stores, each of which can then decide whether to respond to it. Since components can often create actions themselves (in theory, all they do is describe what happened, not necessarily what it happened to or what to do in response), actions help us avoid passing callbacks or objects with methods down through props.

In some cases this can dramatically lessen the amount of information that has to be distributed through the component hierarchy, effectively decoupling part of the information architecture from the DOM.

Actions can free us from distributing information to a component that it doesn’t need. If the user clicks a particular button, the button doesn’t necessarily need to know what will be affected by it, and shouldn’t necessarily be responsible for coordinating those effects. It often makes sense for the button to announce “I’ve been clicked!” and let anything that cares about this fact deal with it accordingly. So under Flux, the button can just dispatch an “I’ve been clicked!” action, and any interested stores will update themselves accordingly. This gives us a nice ability to add consequences to a component’s actions without necessarily making that component aware of the things it affects.
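Here is a small sketch of that flow. The dispatcher below is a hand-rolled stand-in written for this example, not the API of any real Flux implementation, and the action types and store objects are likewise invented.

```javascript
// Hypothetical dispatcher: forwards every action to every registered handler.
function createDispatcher() {
  const handlers = [];
  return {
    register: (handler) => handlers.push(handler),
    dispatch: (action) => handlers.forEach((handler) => handler(action)),
  };
}

const dispatcher = createDispatcher();

// A store that cares about one action type and ignores everything else.
const clickStore = { clicks: 0 };
dispatcher.register((action) => {
  if (action.type === "BUTTON_CLICKED") clickStore.clicks += 1;
});

// A second store responds to the same actions without the button knowing.
const logStore = { seen: [] };
dispatcher.register((action) => logStore.seen.push(action.type));

// The "component" only describes what happened, not what should change.
function onButtonClick() {
  dispatcher.dispatch({ type: "BUTTON_CLICKED" });
}

onButtonClick();
dispatcher.dispatch({ type: "DATA_LOADED" });

console.log(clickStore.clicks); // 1
console.log(logStore.seen); // [ 'BUTTON_CLICKED', 'DATA_LOADED' ]
```

Adding `logStore` required no change to `onButtonClick`, which is the kind of decoupling actions are meant to buy.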

However, I find this comes up in practice more rarely than one might expect. It’s often the case that components need to know exactly which things they can have an effect on. For instance, it’s extremely common to build a component to represent (and allow interactions with) a particular type of thing, of which there might be many in your application (e.g. a list item). One way or another, the component has to end up with some kind of reference to the item it represents so that it can specify it as a target when dispatching actions that should affect it.

In such cases, the overhead that typically comes along with the Flux architecture (actions, action creators, machinery for referencing data structures with IDs, etc.) doesn’t actually serve much purpose and can end up being a burden when a more straightforward approach (e.g. passing the actual list item data structure as a prop and having the component simply call methods on it) would do.
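To make the comparison concrete, here are the two styles side by side in a toy form. Everything here (`itemsById`, `toggleItem`, `handleAction`, `ListItem`) is hypothetical and only meant to show the difference in moving parts.

```javascript
// Style 1: the action carries an ID and the store looks the item up.
const itemsById = { a1: { id: "a1", done: false } };

// Action creator: builds the action object for a given item ID.
const toggleItem = (itemId) => ({ type: "TOGGLE_ITEM", itemId });

// Store-side handler: resolves the ID back to the item and updates it.
function handleAction(action) {
  if (action.type === "TOGGLE_ITEM") {
    const target = itemsById[action.itemId];
    target.done = !target.done;
  }
}

handleAction(toggleItem("a1"));

// Style 2: the component is handed the item itself and calls a method on it.
class ListItem {
  constructor(id) {
    this.id = id;
    this.done = false;
  }
  toggle() {
    this.done = !this.done;
  }
}

const item = new ListItem("b2"); // would arrive as a prop in a real component
item.toggle();

console.log(itemsById.a1.done, item.done); // true true
```

Both end in the same place, but the first needs an action type, an action creator, an ID lookup and a handler to get there.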

In the fourth installment of this series, we’ll briefly return to Flux and elaborate on why its concerns are mostly orthogonal to the issues brought up in this series, but compatible with the architectural strategy we end up with.

Top-Heavy Performance Woes

Before discussing problems that can affect top-heavy architectures, I should make a disclaimer that’s similar to the one I made when talking about the naive hierarchical architecture: the possibility of problems doesn’t mean they’ll affect your application and one shouldn’t automatically dismiss this architecture, as it’s simple in certain ways and most often good enough.

Recall the motivating claim I made at the beginning of this series:

[I]t can be difficult to manage state dependencies that cut across the structure of the component hierarchy in a way that doesn’t introduce a lot of unnecessary complexity or inefficiency.

The top-heavy architectural pattern does away with a lot of the complexity associated with the naive hierarchical pattern, but it has certain inefficiency traps built in to the architecture at the conceptual level, which can ultimately end up manifesting as usability-affecting performance issues.

Naively implemented, applications with a top-heavy architecture have the following property: any time the state of anything on the page changes, the entire page re-renders.

This is fine in many cases. React is built on the idea that you shouldn’t worry too much about the cost of rendering virtual DOM. But when your component hierarchy gets bigger or requires complex calculations with each render, or the state starts changing more frequently, the time you spend rendering grows. And as you spend more time rendering, responsiveness degrades. As your application grows, re-rendering everything on every change becomes progressively harder to justify. Sometimes it’s just too slow.

The main way to deal with this problem in a top-heavy architecture is to (directly or indirectly) utilize React’s shouldComponentUpdate hook. Typically, when a React component re-renders, it recursively re-renders its entire component subtree. A change might affect only a small part of a re-rendering component’s subtree, which means whole branches of that tree might re-render even though nothing that affects them has changed.

The shouldComponentUpdate method fires before a component would re-render and can cut off that recursive descent before it happens, effectively telling React that nothing relevant to the part of the component hierarchy that this component manages has changed and React can just re-use the DOM that’s already in place. In essence, shouldComponentUpdate makes it possible for components to update (partially) independently of their descendants.
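The pruning effect can be seen in a toy model. This is not React's actual algorithm: nodes are plain objects, every node is handed the same props for simplicity, and the optional `shouldUpdate` function stands in for shouldComponentUpdate.

```javascript
// Toy model of recursive re-rendering (a simplification, not React itself).
const rendered = []; // names of the nodes that actually re-rendered

function rerender(node, prevProps, nextProps) {
  // Stand-in for shouldComponentUpdate: returning false skips this node
  // and, with it, the node's entire subtree.
  if (node.shouldUpdate && !node.shouldUpdate(prevProps, nextProps)) {
    return;
  }
  rendered.push(node.name);
  for (const child of node.children || []) {
    rerender(child, prevProps, nextProps);
  }
}

const tree = {
  name: "App",
  children: [
    {
      name: "Sidebar",
      // Only cares about `menu`; ignores every other change.
      shouldUpdate: (prev, next) => prev.menu !== next.menu,
      children: [{ name: "MenuItem" }],
    },
    { name: "Content", children: [{ name: "Article" }] },
  ],
};

const menu = ["home", "about"];
// Only `title` changes; `menu` is the same array in both sets of props.
rerender(tree, { menu, title: "Old" }, { menu, title: "New" });

console.log(rendered); // [ 'App', 'Content', 'Article' ]
```

`Sidebar` and `MenuItem` never show up in the output: one check at `Sidebar` cut off the whole branch.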

Intelligent use of shouldComponentUpdate can dramatically cut down on unnecessary re-renders and increase the efficiency of your application. Libraries that encourage its use behind the scenes (e.g. Redux) tend to tout their performance as a feature, as they will often outperform naive hierarchical implementations right out of the box.

But shouldComponentUpdate isn’t a magic bullet that automatically solves all performance woes. There are a couple of factors that determine how much it can speed up the rendering process. The first is how expensive it is to call for each component — if the calculations used to determine whether your component needs to re-render are as expensive as rendering, you’re not saving any time.

Libraries like Redux aim to make shouldComponentUpdate as cheap as possible by making use of immutable data, so checking whether a component needs to re-render requires nothing more than shallow identity comparisons between the previous props and the new props. This has its own cost: unfortunately, JavaScript wasn’t designed with immutability in mind, so there’s a certain amount of overhead (both in terms of code complexity and in terms of performance) that can come along with maintaining things like immutable arrays and trees. This is getting better as the language evolves, but it’s definitely a factor worth considering.
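A shallow identity comparison of this kind might look roughly like the sketch below. The `shallowEqual` helper is written from scratch for illustration rather than copied from any library, and the `todos`/`filter` state shape is made up.

```javascript
// Shallow comparison: same keys, and each value identical by reference.
function shallowEqual(a, b) {
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(b, key) && Object.is(a[key], b[key])
  );
}

const prevState = { todos: ["a", "b"], filter: "all" };

// Immutable update: copy the part that changed, reuse everything else.
const nextState = { ...prevState, filter: "done" };

// Props for a hypothetical list component that only reads `todos`.
const prevProps = { todos: prevState.todos };
const nextProps = { todos: nextState.todos };
const unchanged = shallowEqual(prevProps, nextProps);
console.log(unchanged); // true: the same array was reused, so skip the render

// Adding an item without mutation produces a new array reference.
const afterAdd = { ...nextState, todos: [...nextState.todos, "c"] };
const changed = !shallowEqual(nextProps, { todos: afterAdd.todos });
console.log(changed); // true: different reference, so re-render
```

The check itself is cheap, but note the copying in the update steps: that is where the overhead of maintaining immutable data shows up.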

The second factor that determines the usefulness of shouldComponentUpdate is the structure of the component hierarchy. If you end up skipping re-rendering for large branches of the tree, you can save a lot of work. But depending on the structure of your DOM, you might not be able to skip much even if you need to update very little.

Top-heavy architectures that intelligently use shouldComponentUpdate still have the following property: any time any component needs to re-render, every ancestor of that component all the way up to the root component must also re-render, and every child of each of those components must either run a shouldComponentUpdate check that lets it skip the update or re-render itself.

While shouldComponentUpdate enables a component to update (partially) independently of its descendants, it doesn’t allow that component to update independently of its ancestors.

[Figure: Unnecessary work propagates up the component hierarchy]

With a pure top-heavy architecture, it’s just not possible to update a component that’s low in your hierarchy completely independently of its parents (or siblings) because all changes must come in from the very top. In the end, even if only one small change was made deep in the hierarchy, a big part of the DOM tree can end up running through React’s reconciliation process.

This can be a problem when it comes to deep component hierarchies with frequently updating leaf nodes.
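To see which components get touched when a single leaf changes, here is a toy walk over a small tree. It's a simplified model rather than React's reconciler: each leaf reads one key of the state (the `reads` field is invented for this sketch), and a child is skipped when nothing in its subtree depends on a key that changed.

```javascript
// True if this node, or anything below it, reads a key whose value changed.
function dependsOnChange(node, prev, next) {
  const self = node.reads !== undefined && prev[node.reads] !== next[node.reads];
  return (
    self || (node.children || []).some((c) => dependsOnChange(c, prev, next))
  );
}

// Re-render `node`, then either descend into each child or skip it after
// a check (the stand-in for a shouldComponentUpdate that returns false).
function update(node, prev, next, log) {
  log.rendered.push(node.name);
  for (const child of node.children || []) {
    if (dependsOnChange(child, prev, next)) {
      update(child, prev, next, log);
    } else {
      log.skipped.push(child.name);
    }
  }
  return log;
}

const tree = {
  name: "App",
  children: [
    { name: "Header", reads: "title" },
    {
      name: "Main",
      children: [
        { name: "Sidebar", reads: "menu" },
        {
          name: "List",
          children: [
            { name: "ItemA", reads: "a" },
            { name: "ItemB", reads: "b" },
          ],
        },
      ],
    },
  ],
};

const prev = { title: "t", menu: "m", a: 1, b: 1 };
const next = { ...prev, b: 2 }; // only ItemB's data changed

const log = update(tree, prev, next, { rendered: [], skipped: [] });
console.log(log.rendered); // [ 'App', 'Main', 'List', 'ItemB' ]
console.log(log.skipped); // [ 'Header', 'Sidebar', 'ItemA' ]
```

One leaf changed, yet three ancestors re-rendered and three other components had to be checked before they could be skipped.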

But a broad hierarchy suffers as well. It can get especially bad when you’re updating elements inside a large list. If you make a change to one element of a list, that change propagates to that element’s container which can potentially re-render every other item in the list.
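A quick count shows how this scales. The sketch below assumes immutable item objects and a reference-equality check per item, standing in for shouldComponentUpdate; `updateOneItem` and the shape of `work` are invented for the illustration.

```javascript
// Toy count of the work triggered by changing one item in a list.
function updateOneItem(items, changedIndex) {
  // Immutable update: a new array with one replaced item.
  const nextItems = items.map((item, i) =>
    i === changedIndex ? { ...item, version: item.version + 1 } : item
  );

  // The list container re-renders once and visits every child.
  const work = { listRenders: 1, checks: 0, itemRenders: 0 };
  nextItems.forEach((nextItem, i) => {
    work.checks += 1; // every sibling runs its own check
    if (nextItem !== items[i]) work.itemRenders += 1; // only one has changed
  });
  return work;
}

const items = Array.from({ length: 10000 }, (_, id) => ({ id, version: 0 }));
const work = updateOneItem(items, 42);
console.log(work); // { listRenders: 1, checks: 10000, itemRenders: 1 }
```

Each check is cheap, but there are as many of them as there are items, and they run on every single update.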

[Figure: List items can’t be re-rendered fully independently of their siblings]

Small costs like this are often not worth worrying about, but they can add up. Sometimes you need to display a lot of data or have an especially complex DOM structure, and sometimes you need to update the page many times a second (e.g. in response to keystrokes/mouse events or new information from requests), and sometimes you need to do both. In such cases, performance can degrade to the point of unusability.

A Real-World Example

I first ran into performance problems with the top-heavy architecture on the job. I build web-based tools for working with an advanced symbolic A.I. system. One of the tools I built is called the CRC (for Compare, Relate and Contrast). The CRC is used for both external investor demos and internally as a debugging and development tool. Most of the details of the CRC aren’t important for this discussion, but the overall constraints I ended up having to work with are. When the project started, this was what I knew:

  1. The user would make a query, which would initiate an AJAX request to fetch an array of items, all of which would need to be displayed as complex elements in a scrollable list.
  2. Further information about these items would continuously flow in through further requests generated by the contents of the list. Subsets of the list or individual items within it would need to be updated to reflect this new information more than 20 times a second.

As I began working on this project, it became clear that many items within the list (and other elements of the page) had a complex pattern of shared state dependencies, so making the list items smart components that managed their own state quickly became impractical. I switched to a top-heavy architecture where all state information was managed by a store outside of the hierarchy and passed down through props, and things seemed to be working fine.

Then I found out that the test queries I was given didn’t accurately represent the scale of the data I was expected to work with. My test queries were returning lists with a few hundred items at most, whereas in the wild, I could expect much larger lists, with easily more than 10,000 items.

Given the top-heavy architecture, trying to update a single item in that list would cause the whole list to re-render. Trying to do that more than 20 times a second killed the page: it was essentially unusable.

I tried a number of optimization strategies. Gating item re-rendering with shouldComponentUpdate helped, but not enough: it was still slow enough to be hard to use in the worst cases.

I also investigated lazy rendering tricks. I hacked together a version of react-infinite that could deal with elements with non-uniform vertical sizes and got acceptable rendering performance by only ever rendering the list items that were actually on screen, given the user’s window size and scroll position.
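The core calculation behind that kind of lazy rendering can be sketched as follows. This is not react-infinite's API or the code used in the CRC, just a generic illustration; `visibleRange` and its parameters are made-up names, and it assumes each item's height is already known.

```javascript
// Find the first and last item indices that overlap the viewport,
// for a vertical list whose items have different heights (in pixels).
function visibleRange(heights, scrollTop, viewportHeight) {
  const viewportBottom = scrollTop + viewportHeight;
  let top = 0; // top edge of the current item
  let first = -1;
  let last = -1;
  for (let i = 0; i < heights.length; i++) {
    const bottom = top + heights[i];
    if (bottom > scrollTop && top < viewportBottom) {
      if (first === -1) first = i;
      last = i;
    }
    top = bottom;
  }
  return { first, last }; // { first: -1, last: -1 } if nothing is visible
}

const heights = [50, 120, 30, 200, 80, 60];
const range = visibleRange(heights, 100, 150);
console.log(range); // { first: 1, last: 3 }
```

Only items 1 through 3 would be rendered here; everything else stays out of the DOM until the user scrolls.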

Then during an investor demo, I watched my managers attempt to search within the page by hitting ctrl+f and get no results even though the items they were looking for were definitely in the list, because the lazy rendering trick kept them out of the DOM when they weren’t on screen. A slew of other usability issues came up involving things like printing and browser zoom levels, and I realized the lazy rendering trick wasn’t sustainable.

The requirements of the project collectively broke all the state architecture patterns I knew of at the time, even ones based on immutable tries and pervasive use of shouldComponentUpdate. I considered the possibility that I couldn’t get acceptable performance without caving on usability issues: I’d have to slow down the updates or implement annoying things like paged results and a custom search area. But a question kept nagging at me: why should the user-facing design be dictated by performance issues, if we can avoid them?

While thinking about this problem, I began to suspect a kind of code smell. Re-rendering the parent component (e.g. a list) whenever a child (e.g. a list item) changes seems to do a lot of unnecessary work. The store publishes all changes to the root component and lets the component hierarchy figure out what changed and which parts need to update by trickling down re-renders. Why re-render things that don’t need to be re-rendered, or do extra work to check if they do? If one item in a list updates, why can’t we just re-render that one item? What if instead of having each component check if its dependencies have updated every time anything happens, we could proactively push updates out to individual components?

It turns out that this can totally be done, and the complexity cost is surprisingly low. We’ll see how in the next installment.

Next up is Part 3: Articulation Points, react-zine, An Overall Strategy and Flux-duality

Links To All Chapters

  1. Information Architecture in React and the Naive Hierarchical Pattern
  2. The Top-Heavy Architecture, Flux and Performance
  3. Articulation Points, react-zine, An Overall Strategy and Flux-duality

Skyler Nelson

I’m a cognitive science nerd with a graduate degree in philosophy of physics who makes user interfaces for an A.I. company, for some reason.