Pattern Overloading

C-like languages have a problem of overloaded syntax that I noticed while teaching high school students. Consider the following snippets in such a language:

foo(45)

function foo(int x) {

for(int i=0;i < 10; i++) {

if(x > 10) {

case(x) {

A programmer experienced with this family would see

  1. Function invocation
  2. Function definition
  3. Control flow examples

In my experience, new programmers see these constructs as instances of the same idea: name(some-stuff) more-stuff. This is not an unreasonable conclusion to reach. The syntax for each construct is shockingly similar given that their semantics are wildly different. This has resulted in some hopelessly wrong code along the lines of

foo(x < 20); { ... }

if(int x = 0; x < 10; x++);

for(playerName) { ...

On top of that parentheses in these languages have yet another use: grouping. Bringing it all together, we can write code that uses the same symbols and syntax to mean wildly different things in a small space.

int foo(int x) {
if(x < 10) {
return x * (width() + getSize(thing));
}
return 0;
}

Python goes even further and uses parentheses to denote tuple literals. My argument is not that this is overly difficult to learn, but that it is a burden on programmers to no meaningful gain. There is very little in the text of the code to remind the reader which particular interpretation of a pattern they are encountering. You are expected to memorize the different cases and the control flow keywords to get by, and there is no simple answer to the question “what does this symbol mean?”

The argument tends to be that there is a small number of symbols on the QWERTY keyboards we inherited, and that we have to make do. I don’t quite buy that. I think this is sloppy design that has persisted due to programmer comfort over the years.

I can sum up my feelings about this in a simple mantra

Syntactic similarity should mirror semantic similarity

Or, to take a quote from the UX world

Similar things should look similar and dissimilar things should look dissimilar

A language that I feel gets it right is Clojure. As a Lisp, its syntax is extremely consistent, and makes few compromises in the name of ‘convenience’.

(+ 1 2)

(map inc [1 2 3])

(vals {:foo "bar" :baz qux})

Here, parentheses mean exactly one thing: invocation. Square brackets mean exactly one thing: a vector. Curly braces mean exactly one thing: a hash map. In the cases where Clojure does overload symbols, the overloaded use is always prefixed by a symbol.

{:foo "bar"}

#{:foo "bar"}
(+ 1 2)
'(+ 1 2)

Those are a map literal, a set literal, function invocation, and a literal list. The context is explicitly in the text. This is good design.

One place where Clojure is less explicit is in the fact that invoking functions and macros looks identical, but in reality is doing very different things. I will not count this against the language, because macros are specifically meant as tools to modify the syntax itself, so this similarity makes more sense in this case.

I find myself valuing consistent, predictable syntax over anything overly ‘clever’, ‘easy’ or ‘convenient’ these days.