Making Semantic Highlighting Useful
A number of people in recent years have hit on the idea of ‘semantic highlighting’. In the most common variant, each local variable is given a distinct color.
While I think this works OK for simpler functions, as soon as a function gets a little busy, the highlighter quickly runs through the 7 colors of the rainbow and has to resort to finer shades that are hard to tell apart. Combine this with the fact that the meaning of a particular color changes from function to function across the codebase, and I think this highlighting scheme may be more hindrance than help.
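To make the shrinking-distinction problem concrete, here is a minimal sketch that spaces hues evenly around the color wheel, one per local variable. The function names `distinct_colors` and `hue_gap_degrees` are made up for illustration, and real highlighters may assign colors differently (for example by hashing the variable name).

```python
import colorsys

def distinct_colors(n):
    """Return n hex colors whose hues are spaced evenly around the color wheel."""
    colors = []
    for i in range(n):
        # Full saturation, medium lightness; only the hue varies.
        r, g, b = colorsys.hls_to_rgb(i / n, 0.5, 1.0)
        colors.append("#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255)))
    return colors

def hue_gap_degrees(n):
    """Angular distance between neighbouring hues when n colors are needed."""
    return 360 / n
```

With 7 locals, neighbouring hues are about 51 degrees apart; with 20 locals the gap drops to 18 degrees, which is roughly where adjacent colors start to blur together.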
In another variant proposed by Douglas Crockford, different levels of nested local scope are given distinct colors.
Because even the craziest JavaScript doesn’t nest scopes more than 7 levels deep (let’s hope), we don’t run out of clearly distinct colors. It also helps that what a color refers to is consistent: yellow, for example, always refers to scope level 2. On the other hand, this highlighting is arguably a band-aid over a deeper problem: ugly function nesting and the stateful closures it typically represents (which is one reason why I prefer Clojure over JavaScript).
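The scope-level idea can be sketched in a few lines. Crockford's proposal targets JavaScript; the sketch below instead runs Python's `ast` module over Python source, purely to keep the example self-contained and runnable. The palette is an assumption: only "yellow for level 2" comes from the text above, and `function_levels` and `color_for_level` are hypothetical names.

```python
import ast

# Hypothetical palette indexed by scope level. Index 0 is the global scope.
# Only "yellow" at index 2 is taken from the article; the rest are placeholders.
LEVEL_COLORS = ["white", "green", "yellow", "blue", "red", "cyan", "grey"]

def function_levels(source):
    """Map each function name to the nesting level of its body.

    Top-level functions are level 1. If two functions share a name,
    the later one wins, which is good enough for a sketch.
    """
    levels = {}

    def visit(node, level):
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                levels[child.name] = level + 1
                visit(child, level + 1)
            else:
                visit(child, level)

    visit(ast.parse(source), 0)
    return levels

def color_for_level(level):
    """Pick the color for a scope level, reusing the last color if nesting is too deep."""
    return LEVEL_COLORS[min(level, len(LEVEL_COLORS) - 1)]
```

Because the color depends only on depth, the same color means the same thing in every file, which is the consistency the per-variable scheme lacks.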
Still, in solving the wrong problem, I think Crockford comes really close to hitting on the right idea. What I want to see clearly in my code is a broad categorization of where all these names come from, i.e. where they are defined. In Clojure, I don’t care which local scope a symbol belongs to so much as I want a visual distinction between locals and non-locals. When I try to understand a function, I almost always need to understand the non-local names first: until I’ve loaded onto my mental stack what foo from elsewhere does, following the logic of what happens with local bar rarely does me much good. So clear visual separation of locals and non-locals helps my mental process.
For further clarity, I also want a visual separation between non-locals of the current module and non-locals of imported modules. When looking at code, I tend to be more familiar with the current module than with imported modules, and so generally need to focus on the names from imported modules first.
It would also be helpful to have a distinct color for standard library names and reserved words, because the programmer should already be intimately familiar with such names and so generally wants to focus on them only after the non-local names.
So the ‘order of understanding’ of code is:
- non-local names from imported modules
- non-local names of the current module
- standard library names (and reserved words)
- local names
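The ordering above can be expressed as a small classifier. This is only a sketch under simplifying assumptions: the three name sets are supplied by the caller, locals shadow everything else, and any name not found in a set is treated as coming from an imported module. `Category`, `classify`, and `reading_order` are hypothetical names, not part of any existing tool.

```python
from enum import IntEnum

class Category(IntEnum):
    # Lower values are the names to understand first.
    IMPORTED = 1
    CURRENT_MODULE = 2
    STANDARD = 3
    LOCAL = 4

def classify(name, local_names, module_names, standard_names):
    """Assign a name to one of the four categories."""
    if name in local_names:          # locals shadow everything else
        return Category.LOCAL
    if name in module_names:
        return Category.CURRENT_MODULE
    if name in standard_names:
        return Category.STANDARD
    return Category.IMPORTED         # assumption: anything unknown is imported

def reading_order(names, local_names, module_names, standard_names):
    """Return the unique names sorted by the 'order of understanding'."""
    unique = dict.fromkeys(names)    # de-duplicate, keeping first-seen order
    return sorted(
        unique,
        key=lambda n: classify(n, local_names, module_names, standard_names),
    )
```

For example, given the names `let`, `total`, `str/join`, `format-row`, and `map`, with `total` local, `format-row` defined in the current module, and `let` and `map` standard, `reading_order` puts `str/join` first and `total` last.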
Applying to Clojure a coloring scheme that distinguishes these categories, I’ve been using white for names from other namespaces, orange for names in the current namespace, blue for special forms and names from the standard libraries, and red for locals. I’ve also found it helpful to highlight the definitions of local names in bold, dark red and the uses of local names in non-bold, light red.
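Written down as data, the scheme is just a small lookup table. The sketch below is one possible encoding; the category keys, the `Style` type, and `style_for` are illustrative names, and the colors are kept as plain words because the exact shades are not specified here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Style:
    color: str
    bold: bool = False

# One entry per category; locals are split into definitions and uses.
STYLES = {
    "imported": Style("white"),
    "current-namespace": Style("orange"),
    "standard": Style("blue"),            # special forms and standard library
    "local-definition": Style("dark red", bold=True),
    "local-use": Style("light red"),
}

def style_for(category, is_definition=False):
    """Look up the style for a name; only locals distinguish definition from use."""
    if category == "local":
        return STYLES["local-definition" if is_definition else "local-use"]
    return STYLES[category]
```

Keeping the table separate from the classification step means the palette can be changed without touching the logic that decides which category a name belongs to.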
I haven’t yet created any implementation of this, but in my hand-cranked experiments, the resulting code is much easier to visually parse. Unfamiliar functions suddenly become much less scary when I can immediately sort out locals from non-locals, tune out the standard language stuff, and quickly find local definitions.
Rather than having editors attempt to parse code (and inevitably do so poorly), I think the best way to implement this scheme, or any other kind of semantic highlighting, would be for the compiler itself to emit a source map which editors would use to do the highlighting. (In fact, the source map could contain all sorts of potentially useful information for advanced editor conveniences, which I might discuss in a later article.)
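As a rough sketch of what such a compiler-emitted map might look like, the code below serializes a list of name spans to JSON and lets an editor query the spans for one line. The record layout (`line`, `column`, `length`, `category`, `definition`) is entirely hypothetical and is not the JavaScript source map format; `emit_source_map` and `spans_for_line` are made-up names.

```python
import json

FIELDS = ("line", "column", "length", "category", "definition")

def emit_source_map(entries):
    """Compiler side: serialize (line, column, length, category, definition) tuples to JSON."""
    return json.dumps([dict(zip(FIELDS, entry)) for entry in entries])

def spans_for_line(source_map_json, line):
    """Editor side: return the spans on one line, ordered left to right."""
    records = [r for r in json.loads(source_map_json) if r["line"] == line]
    records.sort(key=lambda r: r["column"])
    return [
        (r["column"], r["length"], r["category"], r["definition"])
        for r in records
    ]
```

The editor never needs to understand the language: it only reads positions and categories, then applies whatever colors the user has configured for each category.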