Becoming fully reactive: an in-depth explanation of MobX
Due to popular demand (and to have a cool story for my grandchildren), these are the inner workings of MobX. A lot of people are surprised by how consistent and fast MobX is. But rest assured, there is no magic in play!
First, let’s define the core concepts of MobX:
- Observable state. Any value that can be mutated and might serve as a source for computed values is state. MobX can make most types of values (primitives, arrays, classes, objects, etc.) and even (potentially cyclic) references observable out of the box.
- Computed values. Any value that can be computed by using a function that purely operates on other observable values. Computed values can range from the concatenation of a few strings up to deriving complex object graphs and visualizations. Because computed values are observable themselves, even the rendering of a complete user interface can be derived from the observable state. Computed values might evaluate either lazily or in reaction to state changes.
- Reactions. A reaction is a bit similar to a computed value, but instead of producing a new value it produces a side effect. Reactions bridge reactive and imperative programming for things like printing to the console, making network requests, incrementally updating the React component tree to patch the DOM, etc.
- Actions. Actions are the primary means to modify the state. Actions are not a reaction to state changes but take sources of change, like user events or incoming web-socket connections, to modify the observable state.
Computed values and reactions are both referred to as derivations in the remainder of this blog-post. So far, this might all sound a bit academic so let’s make it concrete! In a spreadsheet all data cells that have values would form the observable state. Formulas and charts are computed values that can be derived from the data cells and other formulas. Drawing the output of a data cell or a formula on the screen is a reaction. Changing a data cell or formula is an action.
Anyway, here are all four concepts in a small example that uses MobX and React:
We could draw a dependency tree based on the above listing. Intuitively it will look as follows:
The state of this application is captured in the observable properties (blue). The green computed value fullName can be derived from the state automatically by observing the firstName and the lastName. Similarly, the rendering of the profileView can be derived from the nickName and the fullName. The profileView will react to state changes by producing a side effect: it updates the React component tree.
When using MobX the dependency tree is minimally defined. For example, as soon as the person being rendered has a nickname, the rendering will no longer be affected by the output of the fullName value, nor the first- or lastName (see listing 1). All observer relations between those values can be cleaned up and MobX will automatically simplify the dependency tree accordingly:
MobX will always try to minimize the number of computations that are needed to produce a consistent state. In the rest of this blog post, I will describe several strategies used to achieve this goal. But before diving into the magic of how computed values and reactions are kept in sync with the state, let’s first describe the principle behind MobX:
Reacting to state changes is always better than acting on state changes.
Any imperative action that an application takes in response to a state change usually creates or updates some values. In other words, most actions manage a local cache. Triggering the user interface to update? Updating aggregated values? Notifying the back-end? These can all be thought of as cache invalidations in disguise. To ensure these caches will stay in sync, you need to subscribe to future state changes that will enable your actions to be triggered again.
But working with subscriptions (or cursors, lenses, selectors, connectors, etc.) has a fundamental problem: as your app evolves, you will make mistakes in managing those subscriptions and either oversubscribe (continuing to subscribe to a value or store that is no longer used in a component) or undersubscribe (forgetting to listen for updates, leading to subtle staleness bugs).
In other words: when using manual subscriptions, your app will eventually be inconsistent.
The above image is a nice example of the Twitter UI being inconsistent. As explained in my Reactive2015 talk, there can only be two causes for this: Either there is no subscription that tells tweets to re-render if the profile of the associated author has changed. Or the data was normalized and the author of a tweet doesn’t even relate to the profile of the currently logged-in user, despite the fact that both pieces of data try to describe the same properties of the same person.
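The first of those two causes is easy to reproduce. Here is a minimal sketch in plain JavaScript, with no MobX involved. The names `createStore`, `profile`, `tweet` and `renderTweet` are made up for this illustration and do not come from any real library:

```javascript
// Hypothetical hand-written store with manual subscriptions (no MobX involved).
function createStore(state) {
  const listeners = new Set();
  return {
    get: () => state,
    set(patch) {
      state = { ...state, ...patch };
      listeners.forEach((listener) => listener());
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => listeners.delete(listener);
    },
  };
}

const profile = createStore({ name: "Michel" });
const tweet = createStore({ text: "Hello", likes: 0 });

// The tweet view reads from BOTH stores but subscribes to only one of them.
let renderedTweet = "";
const renderTweet = () => {
  renderedTweet = `${profile.get().name}: ${tweet.get().text} (${tweet.get().likes})`;
};
tweet.subscribe(renderTweet); // undersubscribed: profile changes are never heard
renderTweet();

profile.set({ name: "mweststrate" });
const afterProfileChange = renderedTweet; // still "Michel: Hello (0)", a stale cache

tweet.set({ likes: 1 });
// Only now does the view catch up: "mweststrate: Hello (1)"
```

Nothing in this code is wrong in isolation. The bug is a subscription that was never written, which is exactly why it is so easy to miss.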
Coarse-grained subscriptions like Flux-style store subscriptions are very susceptible to oversubscribing. When using React, you can simply tell whether your components are oversubscribing by printing wasted renderings. MobX will reduce this number to zero. The idea is simple yet counterintuitive: more subscriptions result in fewer recomputations. MobX manages many thousands of observers for you. You can effectively trade off memory for CPU cycles.
Note that oversubscribing also exists in very subtle forms. If you subscribe to data that is used, but not under all conditions, you are still oversubscribing. For example, if the profileView component subscribes to the fullName of a person that has a nickName, it is oversubscribing (see listing 1). So an important principle behind the design of MobX is:
A minimal, consistent set of subscriptions can only be achieved if subscriptions are determined at run-time.
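To make that principle tangible, here is a hand-rolled miniature of run-time subscription tracking, applied to the person example. It is only a sketch and not how MobX is actually implemented: `box` and this toy `autorun` are stand-ins written for illustration, and `fullName` is a plain function here rather than a cached computed value.

```javascript
// Hand-rolled miniature of run-time dependency tracking; NOT MobX's real code.
let current = null; // the reaction that is currently (re)running, if any

function box(value) {
  const observers = new Set();
  return {
    observers,
    get() {
      // Reading a box while a reaction runs subscribes that reaction to it.
      if (current) {
        current.deps.add(this);
        observers.add(current);
      }
      return value;
    },
    set(next) {
      if (next === value) return;
      value = next;
      [...observers].forEach((reaction) => reaction.run());
    },
  };
}

function autorun(fn) {
  const reaction = {
    deps: new Set(),
    run() {
      // Drop the old subscriptions, then re-collect them while fn runs.
      this.deps.forEach((dep) => dep.observers.delete(this));
      this.deps = new Set();
      const previous = current;
      current = this;
      try {
        fn();
      } finally {
        current = previous;
      }
    },
  };
  reaction.run();
  return reaction;
}

const firstName = box("Michel");
const lastName = box("Weststrate");
const nickName = box(undefined);
const fullName = () => firstName.get() + " " + lastName.get();

const rendered = [];
const profileView = autorun(() => {
  rendered.push(nickName.get() ? nickName.get() : fullName());
});

nickName.set("mweststrate"); // the view re-runs and now reads only nickName
firstName.set("Mike");       // no re-render: firstName is no longer a dependency
```

After the nickname is set, `profileView.deps` contains just `nickName`, so the later change to `firstName` triggers nothing. Nobody had to write or remove a subscription by hand.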
The second important idea behind MobX is that for any app that is more complex than TodoMVC, you will often need a data graph, instead of a normalized tree, to store the state in a mentally manageable yet optimal way. Graphs enable referential consistency and avoid data duplication so that it can be guaranteed that derived values are never stale.
How MobX keeps all derivations efficiently in a consistent state
The solution: don’t cache, derive instead. People ask: “isn’t that extremely expensive?” No, it is actually very efficient! The reason, as explained above, is that MobX doesn’t run all derivations, but ensures that only computed values that are involved in some reaction are kept in sync with the observable state. Such derivations are said to be reactive. To draw the parallel with spreadsheets again: only those formulas that are currently visible, or that are used indirectly by a visible formula, need to re-compute when one of the observed data cells changes.
Lazy versus reactive evaluation
So what about computations that aren’t used directly or indirectly by a reaction? You can still inspect the value of a computed value like fullName at any time. The solution is simple: if a computed value is not reactive, it will be evaluated on demand (lazily), just like a normal getter function. Lazy derivations (which never observe anything) can simply be garbage collected if they run out of scope. Remember that computed values should always be pure functions of the observable app state? This is the reason why: For pure functions it doesn’t matter whether they are evaluated lazily or eagerly; the evaluation of the function always yields the same result given the same observable state.
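The difference between the two modes can be sketched in a few lines. This is a simplified model under my own assumptions, not MobX's implementation: `box`, `computed` and `observe` are toy functions defined right here, and the `runs` counter exists only to show how often the function is evaluated.

```javascript
// Sketch only: a computed value with a lazy and a reactive mode (not MobX's code).
let collector = null; // Set that gathers the boxes read during a tracked run

function box(value) {
  const observers = new Set();
  return {
    observers,
    get() {
      if (collector) collector.add(this);
      return value;
    },
    set(next) {
      value = next;
      [...observers].forEach((notify) => notify());
    },
  };
}

function computed(fn) {
  let runs = 0;
  let cached;
  let deps = new Set();
  const listeners = new Set();
  const evaluate = () => {
    runs++;
    return fn();
  };
  const recompute = () => {
    deps.forEach((dep) => dep.observers.delete(onChange));
    const previous = collector;
    collector = deps = new Set(); // track what this run reads
    try {
      cached = evaluate();
    } finally {
      collector = previous;
    }
    deps.forEach((dep) => dep.observers.add(onChange));
  };
  function onChange() {
    const before = cached;
    recompute();
    if (cached !== before) listeners.forEach((listener) => listener(cached));
  }
  return {
    get runs() {
      return runs;
    },
    get() {
      // Reactive: serve the cached value. Lazy: behave like a plain getter.
      return listeners.size ? cached : evaluate();
    },
    observe(listener) {
      if (listeners.size === 0) recompute(); // lazy -> reactive
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
        if (listeners.size === 0) {
          // reactive -> lazy: let go of every subscription
          deps.forEach((dep) => dep.observers.delete(onChange));
          deps = new Set();
        }
      };
    },
  };
}

const firstName = box("Michel");
const lastName = box("Weststrate");
const fullName = computed(() => firstName.get() + " " + lastName.get());

fullName.get();
fullName.get(); // lazy: evaluated on every read, so runs is 2
const seen = [];
const dispose = fullName.observe((value) => seen.push(value)); // runs is 3
fullName.get();
fullName.get(); // reactive: served from the cache, runs stays 3
firstName.set("Mike"); // recomputed once (runs is 4), listener is notified
dispose(); // back to lazy; nothing is subscribed anymore
```

Because the function is pure, both modes return the same answer. The only thing that changes is when the work happens and whether any subscriptions are held.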
Reactions and computed values are both run by MobX in the same manner. When a recomputation is triggered the function is pushed onto the derivation stack; a function stack of currently running derivations. As long as a computation is running, every observable that is accessed will register itself as a dependency of the topmost function of the derivation stack. If the value of a computed value is needed, the value can simply be the last known value if the computed value is already in the reactive state. And otherwise it will push itself on the derivation-stack, switch to reactive mode and start computing as well.
When a computation completes, it will have obtained a list of observables that were accessed during execution. In the profileView for example, this list will either just contain the nickName property, or the nickName and fullName properties. This list is diffed against the previous list of observables. Any removed items will be unobserved (computed values might go back from reactive to lazy mode at this point) and any added observables will be observed until the next computation. When the value of, for example, firstName is changed in the future, it knows that fullName needs to be recomputed, which in turn will cause profileView to recompute. The next section explains this process in more detail.
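The stack and the diffing step can be sketched as follows. All names in this block (`Observable`, `Derivation`, `derivationStack`, `reportObserved`, `unobserve`, `suspend`) are invented for the sketch and are not MobX internals. Change propagation is deliberately left out, so the re-run is triggered by hand.

```javascript
// Illustrative only: these names are made up and are not MobX internals.
const derivationStack = []; // currently running derivations, innermost last

function reportObserved(node) {
  const top = derivationStack[derivationStack.length - 1];
  if (top) top.observing.add(node); // register with the topmost derivation
}

class Observable {
  constructor(name, value) {
    this.name = name;
    this.value = value;
    this.observers = new Set();
  }
  get() {
    reportObserved(this);
    return this.value;
  }
}

class Derivation {
  constructor(name, fn) {
    this.name = name;
    this.fn = fn;
    this.value = undefined;
    this.reactive = false;      // lazy until something starts observing it
    this.observing = new Set(); // what this derivation read during its last run
    this.observers = new Set(); // who read this derivation
  }
  get() {
    reportObserved(this);
    if (!this.reactive) this.run(); // switch to reactive mode and compute
    return this.value;              // already reactive: last known value
  }
  run() {
    const previous = this.observing;
    this.observing = new Set();
    this.reactive = true;
    derivationStack.push(this);
    try {
      this.value = this.fn();
    } finally {
      derivationStack.pop();
    }
    // Diff the new dependency list against the previous one.
    const added = [...this.observing].filter((node) => !previous.has(node));
    const removed = [...previous].filter((node) => !this.observing.has(node));
    added.forEach((node) => node.observers.add(this));
    removed.forEach((node) => unobserve(node, this));
    return { added: added.map((n) => n.name), removed: removed.map((n) => n.name) };
  }
  suspend() {
    // Nobody listens anymore: release own dependencies and go back to lazy mode.
    this.observing.forEach((node) => unobserve(node, this));
    this.observing = new Set();
    this.reactive = false;
  }
}

function unobserve(node, derivation) {
  node.observers.delete(derivation);
  if (node instanceof Derivation && node.observers.size === 0) node.suspend();
}

const firstName = new Observable("firstName", "Michel");
const lastName = new Observable("lastName", "Weststrate");
const nickName = new Observable("nickName", undefined);
const fullName = new Derivation("fullName", () => firstName.get() + " " + lastName.get());
const profileView = new Derivation("profileView", () => nickName.get() || fullName.get());

const firstRun = profileView.run();
// firstRun.added is ["nickName", "fullName"], firstRun.removed is []

nickName.value = "mweststrate";      // propagation is out of scope here,
const secondRun = profileView.run(); // so the re-run is triggered by hand
// secondRun.removed is ["fullName"]; fullName is lazy again and observes nothing
```

Note how `fullName` pushes itself onto the stack while `profileView` is still running, so `firstName` and `lastName` end up as dependencies of `fullName` only, never of the view itself.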
Propagating state changes
Derivations will react to state changes automatically. All reactions happen synchronously and more importantly glitch-free. When an observable value is modified the following algorithm is performed:
- The observable value sends a stale notification to all its observers to indicate that it has become stale. Any affected computed values will recursively pass on the notification to their observers. As a result, a part of the dependency tree will be marked as stale. In the example dependency tree of figure 5, the observers that will become stale when value ‘1’ is changed are marked with an orange, dashed border. These are all the derivations that might be affected by the changing value.
- After sending the stale notification and storing the new value, a ready notification will be sent. This message also indicates whether the value did actually change.
- As soon as a derivation has received a ready notification for every stale notification received in step 1, it knows that all the observed values are stable and it will start to recompute. Counting the number of ready / stale messages will ensure that, for example, computed value ‘4’ will only re-evaluate after computed value ‘3’ has become stable.
- If none of the ready messages indicate that a value was changed, the derivation will simply tell its own observers that it is ready again, but without changing its value. Otherwise the computation will recompute and send a ready message to its own observers. This results in the order of execution as displayed in figure 5. Note that (for example) the last reaction (marked with ‘-’) will never execute if computed value ‘4’ did re-evaluate but didn’t produce a new value.
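The four steps above can be condensed into a small sketch. To keep it short I made two simplifying assumptions: dependencies are declared up front instead of being tracked at run-time, and the graph is my own example rather than the one from the figure. `Derivation`, `observable`, `setValue` and `runLog` are names invented for this block.

```javascript
// Sketch of stale/ready counting; dependencies are static here for brevity.
const runLog = []; // names of derivations in the order they actually recompute

class Derivation {
  constructor(name, deps, fn) {
    this.name = name;
    this.deps = deps;
    this.fn = fn;
    this.observers = [];
    this.staleCount = 0;    // stale notifications still waiting for their ready
    this.sawChange = false; // did any ready notification report a changed value?
    deps.forEach((dep) => dep.observers.push(this));
    this.value = this.compute();
  }
  compute() {
    return this.fn(...this.deps.map((dep) => dep.value));
  }
  markStale() {
    // Pass the notification on once, when this node first becomes stale.
    if (++this.staleCount === 1) this.observers.forEach((o) => o.markStale());
  }
  markReady(changed) {
    if (changed) this.sawChange = true;
    if (--this.staleCount > 0) return; // some dependency is still unstable
    let selfChanged = false;
    if (this.sawChange) {
      const next = this.compute();
      runLog.push(this.name);
      selfChanged = next !== this.value;
      this.value = next;
    }
    this.sawChange = false;
    this.observers.forEach((o) => o.markReady(selfChanged));
  }
}

function observable(value) {
  return { value, observers: [] };
}

function setValue(source, next) {
  source.observers.forEach((o) => o.markStale()); // step 1: stale
  const changed = source.value !== next;
  source.value = next;
  source.observers.forEach((o) => o.markReady(changed)); // step 2: ready
}

const n = observable(1);
const double = new Derivation("double", [n], (x) => x * 2);
const sum = new Derivation("sum", [n, double], (x, d) => x + d);
const isOdd = new Derivation("isOdd", [sum], (s) => s % 2 === 1);
const printed = [];
const printer = new Derivation("printer", [isOdd], (odd) => printed.push(odd));

setValue(n, 3);
const firstPass = runLog.splice(0); // ["double", "sum", "isOdd"]: printer was skipped
setValue(n, 2);
const secondPass = runLog.splice(0); // ["double", "sum", "isOdd", "printer"]
```

`sum` receives two stale notifications (one from `n`, one from `double`) and therefore waits for two ready notifications, so it recomputes exactly once and never sees a half-updated state. In the first pass `isOdd` recomputes but yields the same value, which is why `printer` never runs.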
The previous two sections summarize how dependencies between observable values and derivations are tracked at run-time and how changes are propagated through the derivations. At this point you might also realize that a reaction is basically a computed value that is always in reactive mode. It is important to realize that this algorithm can be implemented very efficiently without closures and just with a bunch of pointer arrays. Additionally, MobX applies a number of other optimizations which are beyond the scope of this blog post.
People are often surprised that MobX runs everything synchronously (like RxJS and unlike Knockout). This has two big advantages. First, it becomes simply impossible to ever observe stale derivations, so a derived value can be used immediately after changing a value that influences it. Second, it makes stack traces and debugging easier, as it avoids the useless stack traces that are typical of Promise / async libraries.
However, synchronous execution also introduces the need for transactions. If several mutations are applied in immediate succession, it is preferable to re-evaluate all derivations only after all changes have been applied. Wrapping an action in a transaction block achieves this. Transactions simply postpone all ready notifications until the transaction block has completed. Note that transactions still run and update everything synchronously.
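Here is a rough sketch of the idea. It is a simplification under my own assumptions: instead of postponing ready notifications it parks whole reactions until the outermost block ends, and `transaction`, `observable` and `observe` are toy functions defined in this block, not the MobX API.

```javascript
// Toy batching sketch: reactions are parked until the outermost block ends.
let depth = 0;             // nesting level of open transactions
const pending = new Set(); // reactions waiting for the outermost block to finish

function transaction(block) {
  depth++;
  try {
    return block();
  } finally {
    if (--depth === 0) {
      const toRun = [...pending];
      pending.clear();
      toRun.forEach((reaction) => reaction()); // still synchronous
    }
  }
}

function observable(value) {
  const observers = new Set();
  return {
    get: () => value,
    set(next) {
      if (next === value) return;
      value = next;
      // Every write is its own tiny transaction, so outside of a block
      // the reactions run right away.
      transaction(() => observers.forEach((reaction) => pending.add(reaction)));
    },
    observe(reaction) {
      observers.add(reaction);
    },
  };
}

const firstName = observable("Michel");
const lastName = observable("Weststrate");
const seen = [];
const logFullName = () => seen.push(firstName.get() + " " + lastName.get());
firstName.observe(logFullName);
lastName.observe(logFullName);

firstName.set("Mike"); // runs the reaction immediately
transaction(() => {
  firstName.set("Jane");
  lastName.set("Doe"); // the intermediate "Jane Weststrate" is never seen
});
// seen is now ["Mike Weststrate", "Jane Doe"]
```

By the time `transaction` returns, the reaction has already run, so code right after the block can rely on fully up-to-date derivations.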
That summarizes the most essential implementation details of MobX. We haven’t covered everything yet, but it is good to know for example that you can compose computed values. By composing reactive computations it is even possible to automatically transform one graph of data into another graph of data and keep this derivation up to date with the minimum number of patches. This makes it easy to implement complex patterns like map-reduce, state tracking using immutable shared data, or sideways data loading. But more on that in a next blog post.
- The application state of complex applications can best be expressed using graphs to achieve referential consistency and stay close to the mental model of a problem domain.
- One should not imperatively act on state changes by using manually defined subscriptions or cursors. This will inevitably lead to bugs as a result of under- or oversubscribing.
- Use runtime analysis to determine the smallest possible set of observer → observable relationships. This leads to a computational model where it can be guaranteed that the minimum number of derivations are run without ever observing a stale value.
- Any derivation that is not needed to achieve an active side effect can be optimized away completely.
For more info on MobX, just check out:
Edit 2–3–2016: MobX was called Mobservable before version 2.0
Image by xt0ph3r