Model-View-Controller. And Selection?

The challenge of how to model selection in a model-view-controller system gets surprisingly little attention. For applications manipulating complex data structures, I find it is one of the most critical things to get right in your overall design.

Most of our GUI applications are designed around an object-verb pattern. You select the object you want to operate on and then invoke an action on that selected object. The selection records which object or partial object is selected to operate on.

Most UI toolkits treat selection as an aspect of the view (or perhaps controller if you want to argue that most toolkits we are familiar with merge the view-controller concept). Certainly as you look at Windows common controls, the controls themselves provide the interfaces for recording and manipulating the internal selection. This can seem to make sense at first since as you start designing your application, you normally specify the selection through actions in the view and then invoke commands through the view/controller on to the model. After the action is processed, the selection may also need to be updated so it is reflected properly in the view (e.g. if you delete an item from a list, the next item becomes selected).

In practice, I find that for any reasonably complex app, this is a bad way to think about the selection object. In a complex application, the selection plays a much more critical role in the overall architecture and needs to be abstracted from the view. There are a variety of reasons:

  • Underlying the concept of selection is the whole mechanism for how I specify what part of the model to operate on. This mechanism will be used throughout the code base and needs to be robust, performant and rich.
  • Since it is used internally, it needs to unambiguously specify what to operate on. While actions invoked from the view can apply heuristics to determine what object is really being acted on in ambiguous cases (if I have all the cells selected in a table, am I operating on the table or the cells?), we want to have a very clear dividing line between heuristics applied through the view and unambiguous behavior of lower levels of the code base. In fact, ambiguity at the view level is very common in complex structural applications, but it shouldn’t corrupt the entire system.
  • The selection is also typically used by the object model. This makes it critical that heuristics (which we may want to alter as we evolve application behavior) don’t pollute object model behavior, since that would break functionality coded against it. There is no reason that the selection as used by the object model should interfere with a selection maintained by the view; these should be distinct concepts (unless some add-in is explicitly written to update the view’s selection).
  • When the selection is maintained by the view, the code base accumulates assumptions that become painful later. This pain shows up in object model usage, in selection across multiple views, and in behavior that assumes a specific selection context rather than simply dealing with the selection it is given. Ultimately, code that assumes less about its calling context is more reusable and maintainable.

I have a couple of examples to talk through. A key driver for the design of the FrontPage HTML data structures was to support this rich internal selection object. FrontPage at its core uses a gapped buffer to represent the underlying text. Gapped buffers support markers that form the basis of robust selection objects. A marker keeps pointing between the same two characters in the text buffer, even as insertions and deletions occur elsewhere. This supports robust selection in the underlying text, but what about selecting structure?
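
To make the marker idea concrete, here is a minimal Python sketch of a gap buffer with markers. It is not FrontPage's code: the class and method names are hypothetical, and for simplicity the markers store logical offsets that are adjusted on every edit, which is an assumption of this sketch rather than a description of how any real editor tracks them.

```python
class Marker:
    """A position between two characters (hypothetical helper)."""
    def __init__(self, pos):
        self.pos = pos


class GapBuffer:
    """Text kept in one array with a movable gap of free slots."""

    def __init__(self, text="", gap_size=16):
        self.buf = list(text) + [None] * gap_size
        self.gap_start = len(text)      # first free slot
        self.gap_end = len(self.buf)    # first used slot after the gap
        self.markers = []

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])

    def add_marker(self, pos):
        marker = Marker(pos)
        self.markers.append(marker)
        return marker

    def _move_gap(self, pos):
        # Slide characters across the gap until the gap starts at pos.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, pos, s):
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(s):
            extra = max(len(s), 16)
            self.buf[self.gap_end:self.gap_end] = [None] * extra
            self.gap_end += extra
        for ch in s:
            self.buf[self.gap_start] = ch
            self.gap_start += 1
        for m in self.markers:
            # A marker exactly at pos stays before the new text
            # (a choice made for this sketch).
            if m.pos > pos:
                m.pos += len(s)

    def delete(self, pos, count):
        # Assumes pos + count does not run past the end of the text.
        self._move_gap(pos)
        self.gap_end += count           # the gap swallows the deleted text
        for m in self.markers:
            if m.pos >= pos + count:
                m.pos -= count
            elif m.pos > pos:
                m.pos = pos             # marker was inside the deleted range


buf = GapBuffer("hello world")
m = buf.add_marker(6)                   # between " " and "w"
buf.insert(0, ">> ")                    # edit elsewhere in the buffer
```

After the insertion the marker's offset has changed, but it still sits between the space and the "w", which is the property a selection built on markers relies on.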

I had built a structured editor at BBN (BBN/Slate) and felt that I hadn’t gotten it right there. BBN/Slate used gapped buffers and markers as well, but internally, the view would often have to map the selection to the structured object being manipulated (e.g. a list element or list) and then the code would explicitly pass around these internal objects. This made it awkward when the selection evaluated to lots of these objects. I would need to explicitly enumerate them into some data structure or come up with ad-hoc conventions. For example, some set of routines would take the first and last list item in a range to operate on. These internal node pointers were also not something you could hold on to across other modifications to the document (one aspect of robustness). I wanted a better way of handling this internally in FrontPage.

So for the HTML editor, I changed the gapped text buffer to also contain sentinel characters that marked the beginning and end of each node. Because of the presence of the sentinel characters, marker locations in the text buffer would always unambiguously specify a location somewhere in the tree (e.g. there is an explicit marker location that specifies “inside the list but outside the first list element in the list”). This gave me a single powerful mechanism that I could use throughout the code base to specify what was being operated on, or where an insertion should occur.
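
As a rough illustration of why sentinels make positions unambiguous, the following Python sketch models the buffer as a flat sequence in which `Open` and `Close` items stand in for the sentinel characters. The representation and the names (`Open`, `Close`, `enclosing_nodes`) are assumptions made for this example, not FrontPage's actual data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Open:
    tag: str    # stands in for a "start of node" sentinel


@dataclass(frozen=True)
class Close:
    tag: str    # stands in for an "end of node" sentinel


# <ul><li>a</li><li>b</li></ul> flattened into one buffer.
buffer = [
    Open("ul"),
    Open("li"), "a", Close("li"),
    Open("li"), "b", Close("li"),
    Close("ul"),
]


def enclosing_nodes(buffer, pos):
    """Tags of the nodes containing the position just before buffer[pos],
    outermost first. Every position yields exactly one answer."""
    stack = []
    for item in buffer[:pos]:
        if isinstance(item, Open):
            stack.append(item.tag)
        elif isinstance(item, Close):
            stack.pop()
    return stack
```

Position 1 falls after the list's opening sentinel but before the first item's, so `enclosing_nodes(buffer, 1)` reports only `ul`: "inside the list but outside the first list element". Position 2 is inside the first item, and position 0 is outside the list entirely.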

Now, it was still necessary to apply heuristics in some cases. The selection the user can specify often has lower fidelity than what is supported internally, which gives rise to ambiguity when mapping user gestures to intent, and that ambiguity needs to be explicitly resolved. In fact, FrontPage has a routine called “ApplyUIHeuristics” that wraps this logic into one place and attempts to prevent it from leaking out elsewhere.
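
The shape of that boundary can be sketched in a few lines of Python. This is not what “ApplyUIHeuristics” does internally; the function names, the `Target` type, and the particular rule (deleting with every cell selected means deleting the table) are all hypothetical, chosen only to show heuristics living in one place while lower layers receive an explicit target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    kind: str           # "table" or "cells"
    cells: frozenset


def resolve_user_selection(selected_cells, all_cells, command):
    """View-level heuristic: the only place that guesses at intent."""
    selected = frozenset(selected_cells)
    if command == "delete" and selected == frozenset(all_cells):
        return Target("table", selected)
    return Target("cells", selected)


def run_command(command, target):
    """Lower layer: acts on exactly the target it is handed, no guessing."""
    return f"{command} {target.kind} ({len(target.cells)} cells)"


all_cells = ["r1c1", "r1c2", "r2c1", "r2c2"]
result = run_command("delete", resolve_user_selection(all_cells, all_cells, "delete"))
```

If the heuristic changes later, only `resolve_user_selection` changes; anything coded against `run_command` keeps its behavior.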

There are a few cases where the view might need more fidelity than the underlying model. A simple example in a word processor is the end of a wrapped line. The actual cursor position in this case is ambiguous, since the end of one line and the beginning of the next are the same location in the underlying model (this problem also shows up when mixing right-to-left and left-to-right text). You want to deal with this so you don’t get anomalies like “Move-to-line-end” showing the cursor at the start of the next line rather than the end of the current one. So FrontPage (like many editors) maintains a hint in the selection object attached to the view for resolving these ambiguities. Commands can set this hint explicitly, but the hint affects only UI feedback, not the behavior of subsequent actions.
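
A small Python sketch of such a hint, under the simplifying assumption that text wraps at a fixed number of characters per line. The function name and the `at_line_end` flag are hypothetical; the point is only that the model offset is unchanged and the hint affects where the caret is drawn.

```python
def caret_row_col(offset, line_width, at_line_end=False):
    """Map a model offset to a (row, column) for drawing the caret.

    An offset on a wrap boundary is both the end of one row and the
    start of the next; at_line_end picks which one to display.
    """
    row, col = divmod(offset, line_width)
    if at_line_end and col == 0 and row > 0:
        return row - 1, line_width
    return row, col


# Offset 10 with 10 characters per row sits exactly on a wrap boundary.
default_caret = caret_row_col(10, 10)
after_move_to_line_end = caret_row_col(10, 10, at_line_end=True)
```

Both calls refer to the same location in the model; only the drawn position differs.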

There’s an interesting personal story around this whole selection concept. Shortly after joining Microsoft in 1996, I sat in on a big internal presentation (several hundred people) unveiling the Trident project (the basis for IE 4 and subsequent versions of IE’s HTML support). As part of the presentation, they described the selection model, which really was two separate ones, a low-fidelity text selection and a “control selection” which was essentially an array of nodes. I thought this was broken since it failed on a variety of the dimensions I described above and got up in the Q&A session and called it a “complete disaster”. (This was pretty unusual behavior for me — I’m usually pretty mild-mannered.) There was an internal faction who agreed and this led to later discussions with the Trident team on the issues. In IE5, they did a major rewrite of the entire system that included a significantly enhanced internal selection mechanism.

For me personally, this was one of those “hmmm, maybe I know more than I think I know” moments. I find that sometimes when you work in one area for a long time, you “forget” what you know about building and designing these kinds of systems. All that knowledge is embedded in the code. When you step outside what you’re doing or start working on a different project, suddenly you can reapply that knowledge in a different context. That’s one of the reasons I encourage folks to try something different after a couple of releases working on one technology. You can be more valuable (which is usually good for you personally and professionally as well) and also get a better sense of what you really know when you go to apply it in a new context. Developers can sometimes be “sticks-in-the-mud” because they feel more productive in a code base they are familiar with.

This is getting a little long but I’ll describe another interesting selection example. Outlook’s message list control has some interesting design constraints. Because it only has a window into the underlying folder it is displaying, it can’t simply record selection as indices into the list. More importantly, everything about the list (e.g. even the existence of the selected items) might change asynchronously when an update is received from the server. And actions passed to the underlying server can’t specify what to operate on based on index, since the index might have changed (due to server activity) by the time the action is received by the server. Outlook actually stores the selection as a list of EntryIDs. The view uses this to determine which items to show highlighted and it can reliably be used for specifying items to the server. Interestingly, I came to a similar design when working on the listbox control for BeyondMail, an early PC mail client. This design does make simple user actions (like “Select-All” in a large folder) surprisingly expensive. For this control, how the selection is modeled is a critical part of its design.
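
The idea of selecting by identity rather than by index can be sketched in Python as follows. This is an illustration, not Outlook's implementation: the class name and methods are hypothetical, the IDs are plain strings standing in for EntryIDs, and a set is used where the text above describes a list.

```python
class IdSelection:
    """Selection recorded as item IDs, independent of row positions."""

    def __init__(self):
        self.ids = set()

    def select(self, entry_id):
        self.ids.add(entry_id)

    def select_all(self, all_ids):
        # Needs every ID in the folder, not just the visible window,
        # which is why "Select-All" is costly in a large folder.
        self.ids = set(all_ids)

    def highlighted_rows(self, visible_ids):
        """Row indices to highlight in whatever window the view shows."""
        return [i for i, eid in enumerate(visible_ids) if eid in self.ids]

    def prune(self, existing_ids):
        """Drop selected items that no longer exist after an update."""
        self.ids &= set(existing_ids)


selection = IdSelection()
selection.select("c")
rows_before = selection.highlighted_rows(["a", "b", "c", "d"])

# An asynchronous update: new items arrive at the top, "b" and "d" vanish.
rows_after = selection.highlighted_rows(["x", "y", "a", "c"])
```

The selected item moves from row 2 to row 3 across the update, yet the selection itself never changes, and the same ID can be handed to the server without worrying that an index has gone stale.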