There is a lot of material on the web discussing model-view-controller (MVC) and its variants, but surprisingly little about one of the key issues in how it is actually implemented in real systems: the degree of coupling, or linkage, between the model and the view. I will make this concrete by looking first at some early interactive applications, then at a complex desktop application like FrontPage, and finally at more recent frameworks like Facebook’s React.
Most descriptions of MVC talk about using the observer design pattern to link the model and the view. A view registers as an observer on the underlying model and receives notifications when changes occur in the model. In response to the notifications, the view updates. When I talk about the degree of coupling between model and view, I am interested in the rate of these notifications, what information is actually transferred in a notification, how the view tracks these notifications, and then when and how the view responds by repainting the display.
Let’s look at an early text editor like vi (Emacs followed a similar approach). Both of these editors were implemented for character terminals like the VT100. These terminals could display a grid of characters (24x80 in the case of the original VT100). Each terminal provided a set of commands for updating the display. The most basic command displayed a string of characters starting at a given location on the screen. These terminals also implemented commands that took advantage of text already on the screen. For example, there were commands to scroll lines up or down, or to shift characters on a line left or right. These commands all had different costs (i.e., the time required to execute the command and update the display).
An editor wanted to optimize its use of these commands to update the screen after the user invoked some action that changed the underlying model (or changed which part of the model to display in the view). The faster the screen updated, the more responsive the editor would feel. These editors took advantage of a key insight. The cost of the redisplay process was dominated by the actual process of executing the display commands on the terminal. The editor could spend significant computation on an algorithm to come up with the optimal series of commands to issue to the terminal.
When changes were made to the model, the notification to the view simply set a single dirty bit. The view did not immediately do any other processing. Later, at idle (when no keyboard commands were pending, which typically happened within 100 milliseconds or so), the view would check this dirty bit and examine the current state of the model to reconstruct what the display should now look like. It would then use a saved copy of the old display state and this description of the new state to derive the optimal series of commands to send to the terminal to bring it up to date. vi and Emacs used different algorithms to come up with that sequence of commands, but the basic structure of the coupling between model and view was equivalent.
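The structure described above can be sketched in a few lines. This is a minimal illustration, not the code of vi or Emacs: the names `Model`, `View`, `on_idle`, and the `write_line` command are hypothetical, and the "optimal" command search is replaced by the simplest possible strategy of rewriting any line whose text differs from the saved copy.

```python
class Model:
    """A toy text buffer. Observers learn only that something changed."""

    def __init__(self, lines):
        self.lines = list(lines)
        self._observers = []

    def subscribe(self, callback):
        self._observers.append(callback)

    def set_line(self, index, text):
        self.lines[index] = text
        for notify in self._observers:
            notify()  # no payload: the notification carries no details


class View:
    def __init__(self, model, height):
        self.model = model
        self.height = height
        self.shown = [""] * height  # saved copy of what the terminal displays
        self.dirty = True           # force an initial paint
        model.subscribe(self.mark_dirty)

    def mark_dirty(self):
        self.dirty = True           # constant cost; repeated calls collapse

    def on_idle(self):
        """Called by the (hypothetical) input loop when no keys are pending."""
        if not self.dirty:
            return []
        self.dirty = False
        # Rebuild the desired screen from the current model state only.
        wanted = (self.model.lines + [""] * self.height)[: self.height]
        # Simplified diff: rewrite each line that differs from the saved copy.
        commands = [
            ("write_line", row, text)
            for row, (old, text) in enumerate(zip(self.shown, wanted))
            if old != text
        ]
        self.shown = wanted
        return commands
```

However many times `set_line` runs between two idle periods, the view does one pass over the screen, and the commands it emits depend only on the saved display state and the final model state.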
If we look at this design in terms of the issues I outlined above, we see a very loose coupling. The notification that is passed between model and view is a single dirty bit. The view does no immediate processing. This is critical because it means that multiple notifications batch together, and because all we are tracking is a single dirty bit, aggregation is both simple and very cheap. Because it is so cheap, notifications can be generated very low in the model manipulation code, which makes the overall system more robust: almost any editor has many more high-level commands than low-level operations, so notifying from the few low-level operations covers every command built on top of them. No special work is required to prevent display artifacts when composing multiple commands together, since ultimately the view only knows that it needs to repaint.
The key elements of this design are:
- View updates are asynchronous with respect to model updates. Model updates can occur at any rate, while view updates can be optimized for the display output device and the constraints of the human visual system.
- Notifications are very cheap and aggregate naturally. Cheap notifications can be used promiscuously. Because they aggregate simply (it’s always just that single dirty bit), the view, which is limited to the display size, cannot be swamped by notifications from the underlying model, which might be very large and might change faster than we want to display.
- The mechanism is not extensible. The lack of extensibility is a feature. It means that the linkage does not grow tighter over time. It is a knowledge barrier that ensures continued isolation between the model and the view as the overall system evolves. The knowledge barrier tends to eliminate path dependence: the correct behavior of the view does not depend on the historical and combinatoric sequence of model updates, only on the final model state.
The automatic way in which notifications are batched together and processed asynchronously is an example of a much more general design pattern. This pattern is useful any time requests arrive at a varying rate and the marginal cost of processing an additional request is much smaller than the fixed cost of processing a single request. In the text editing case, the marginal cost of displaying additional characters once they are inserted into the model is essentially zero, so there is strong incentive to batch. A critical part of this design pattern is that the batching happens automatically. Some UI systems instead implement an explicit FreezeDisplay/UnfreezeDisplay mechanism, which requires callers to batch their requests explicitly even though they have no detailed understanding of the internal performance trade-offs. The same pattern appears in client-server designs, where request compaction and aggregation can be done in a batching layer. Once you have an asynchronous coupling, the system is very amenable to performance tuning and optimization without changing the overall structure of the design.
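As a rough sketch of the general pattern, the hypothetical `Batcher` below accepts requests at any rate and pays the fixed cost once per flush. The class name, `submit`, `flush`, and the idea that an event loop calls `flush` at idle or on a timer are all assumptions made for illustration.

```python
class Batcher:
    """Collects requests and processes them together on the next flush."""

    def __init__(self, process_batch):
        self._process_batch = process_batch  # pays the fixed cost once per call
        self._pending = []

    def submit(self, request):
        # Cheap for the caller: just remember the request.
        self._pending.append(request)

    def flush(self):
        """Invoked by the event loop at idle or on a timer, never by callers."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._process_batch(batch)


round_trips = []                       # stands in for an expensive operation
batcher = Batcher(round_trips.append)
for key in ("a", "b", "c"):
    batcher.submit(key)
batcher.flush()
batcher.flush()                        # nothing pending, so no second round trip
print(round_trips)                     # [['a', 'b', 'c']]
```

The callers never say "begin batch" or "end batch". Whether three requests or three hundred arrive before the next flush is decided by timing alone, which is what leaves the flush policy free to be tuned later.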
The FrontPage HTML editor took a similar general approach. Rather than a single dirty bit, FrontPage maintained a dirty range that specified the range of the tree that had been damaged. Changes to the model (the HTML tree) specified the range of the tree that had been modified. The view would maintain a simple union of the dirty range when notified of changes. The actual view update happened asynchronously, either at idle or when a timer fired. Idle normally happens frequently enough, but the timer served as a backup to ensure the display was always repainted in a timely way. The redisplay process was much more complex than for vi or Emacs, since it required recomputing CSS properties and laying out the damaged region again. The incremental nature of this process was critical to overall performance, which is why a dirty range rather than a simple dirty bit was used.
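A dirty range aggregates almost as cheaply as a dirty bit. The sketch below is not FrontPage's code; it assumes, purely for illustration, that positions in the tree can be expressed as integer offsets in document order, and the `DirtyRange` name and its methods are hypothetical.

```python
class DirtyRange:
    """Union of damaged spans, kept as one half-open [start, end) interval."""

    def __init__(self):
        self.span = None  # None means nothing is damaged

    def add(self, start, end):
        # Called on each model notification: constant cost, no layout work.
        if self.span is None:
            self.span = (start, end)
        else:
            self.span = (min(self.span[0], start), max(self.span[1], end))

    def take(self):
        """Called at idle or on the timer: return the span and reset to clean."""
        span, self.span = self.span, None
        return span


damage = DirtyRange()
damage.add(40, 55)
damage.add(10, 12)
print(damage.take())  # (10, 55)
```

Keeping a single covering interval can mark more of the tree as damaged than was actually touched, as with the untouched offsets 12 to 40 above. That trades some extra redisplay work for notifications that stay trivial to aggregate.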
Facebook’s React framework is not a full model-view architecture, but it has some interesting parallels with these earlier systems. The React developers recognized that the overall cost of updating the display in response to model changes is dominated by making changes to the browser DOM and then the resulting browser layout and rendering process. Instead of having each developer optimize this process, a React application always regenerates its display tree from scratch using a fast “virtual DOM”. The React core then compares this new virtual DOM with the one from the previous render and applies a (hopefully) optimal set of updates to the browser DOM, which minimizes the expensive browser layout and rendering work.
A typical React application simply knows that it needs to redisplay (in effect, it maintains a single dirty bit), so model change aggregation is very cheap. React may also batch further by delaying the actual DOM update and aggregating multiple virtual DOM rendering passes. This design is strikingly similar to those early text editors, driven by a similar performance balance where display updates are the critical performance bottleneck.
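The regenerate-then-compare idea can be illustrated without any React code. The sketch below is not React's reconciliation algorithm or API: a flat list of keyed nodes stands in for the tree, and `render`, `diff`, and the patch names are hypothetical.

```python
def render(items):
    """Build a fresh description of the UI: a list of (key, text) nodes."""
    return [(f"item-{i}", text) for i, text in enumerate(items)]


def diff(old, new):
    """Compare two rendered descriptions and list the updates the display needs."""
    old_by_key = dict(old)
    new_by_key = dict(new)
    patches = []
    for key, text in new:
        if key not in old_by_key:
            patches.append(("insert", key, text))
        elif old_by_key[key] != text:
            patches.append(("set_text", key, text))
    for key, _ in old:
        if key not in new_by_key:
            patches.append(("remove", key))
    return patches


before = render(["a", "b", "c"])
after = render(["a", "B"])
print(diff(before, after))  # [('set_text', 'item-1', 'B'), ('remove', 'item-2')]
```

The application code only ever calls `render` on the current state. Working out which parts of the display to touch is left entirely to `diff`, just as the early editors left it to their redisplay algorithm.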
I have focused here on examples of effective loose coupling, but there are lots of MVC examples where the coupling is much tighter and done poorly. Toolkits that provide mechanisms for easily doing “data-binding” between view elements and model elements tend to drive a tighter and synchronous linkage between the model and the view. Establishing an asynchronous linkage typically requires more initial infrastructure but comes with significant long term advantages.
The standard “InvalidateRegion” mechanism in systems like Windows (InvalidateRect and InvalidateRgn in Win32) provides some degree of decoupling, since the WM_PAINT message is synthesized asynchronously and batches multiple invalidated regions. Unfortunately, the work to initially determine which region to invalidate is often non-trivial, so even if the actual rendering is asynchronous, this mechanism typically requires significant processing to happen in the view on each model notification.
The standard notification framework in MFC (Microsoft Foundation Classes) naturally drove developers toward special-purpose notifications, since it used a sub-classing mechanism that made creating new and more specific notifications easy. Each such notification tightens the linkage between model and view.
The “curses” programming library, which borrowed its cursor-movement optimization code from vi, packaged the hard display update optimization algorithms for other programs to use. This led to a generation of visual character applications that followed a similar loosely coupled model-view architecture. It will be interesting to see whether React helps initiate a similar improvement in overall application architecture for web applications (and for native applications as well, with React Native).