The Asymmetry of ++ Validations

found by Fede Bergero

Let’s kick off 2018 with a super simple Erlang oddity that can be summarized in a tweet from Fede Bergero:

I won’t add much to this, but I thought I would write an article anyway because I think this totally belongs on this blog.

Partial asymmetry found at an Iranian tomb ceiling

The Findings

Let’s try what Fede suggested in a console…

1> [] ++ something.
something
2> something ++ [].
** exception error: bad argument
in operator ++/2
called as something ++ []
3>

Since we already learnt that compiled code and evaluated code don’t always behave the same, let’s try this within a module…
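A minimal sketch of such a module, assuming a file named plusplus.erl that wraps the two expressions in first/0 and second/0 (the names used in the calls further down), plus a third/0 for the improper-list case discussed later. The layout is a guess, chosen so that the failing expression sits on line 7 like in the compiler warning:

```erlang
-module(plusplus).

-export([first/0, second/0, third/0]).

first() -> [] ++ something.      % empty list on the left: returns the atom

second() -> something ++ [].     % atom on the left: fails with badarg

third() -> [1,2,3] ++ something. % builds the improper list [1,2,3|something]
```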

1> c(plusplus).
plusplus.erl:7: Warning: this expression will fail with a 'badarg' exception
{ok,plusplus}

That’s cool! The Erlang compiler warns us, but only about the expression where the empty list is the second argument (as expected).

2> plusplus:first().
something
3> plusplus:second().
** exception error: bad argument
in function plusplus:second/0 (plusplus.erl, line 7)
4>

And this time compiled and evaluated code behave consistently. That’s good.


What’s going on here?

This time it’s not so hard to see: ++ only validates that its first argument is a proper list. The second argument can be anything, and whatever it is will become the tail of the generated list. Check this out…

5> [1,2,3] ++ something.
[1,2,3|something]
6>

It’s just how ++ works. What Fede found is just an edge case where the first list contributes no elements, so the result is only the tail. It’s as if we had constructed [head|something] and requested its tail…

7> tl([head|something]).
something
8>
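To make the asymmetry concrete, here is a hypothetical pure-Erlang model of ++. The module name append_sketch and the function append/2 are made up for illustration; the real operator is a built-in, not this code. The recursion only ever pattern-matches on the first argument, and the second one is returned untouched once the first runs out:

```erlang
-module(append_sketch).
-export([append/2]).

%% Illustrative model only: walks the first argument cell by cell
%% and never inspects the second one.
append([Head | Rest], Tail) ->
    [Head | append(Rest, Tail)];
append([], Tail) ->
    %% First list exhausted: whatever Tail is becomes the tail of the result.
    Tail.
```

With this model, append([], something) returns something and append([1,2,3], something) returns [1,2,3|something], while append(something, []) fails because no clause matches a non-list first argument (with function_clause here, rather than the badarg raised by ++).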

So, mystery solved. But to be fair, the only slightly relevant documentation I could find about this was the following one:

The list concatenation operator ++ appends its second argument to its first and returns the resulting list.

It might be a good thing to add some more implementation/spec details to the Erlang docs.

In any case, typer (and consequently dialyzer) seems to be aware of all this. Check this module annotated by typer with specs:
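A sketch of what that annotated module might look like, assuming the three functions first/0, second/0 and third/0 described in the next paragraph. The specs are written to match that description; the exact typer output may vary between OTP versions:

```erlang
-module(plusplus).

-export([first/0, second/0, third/0]).

%% Specs written to match the types typer is described as inferring.
-spec first() -> 'something'.
first() -> [] ++ something.

-spec second() -> none().
second() -> something ++ [].

-spec third() -> nonempty_improper_list(1 | 2 | 3, 'something').
third() -> [1,2,3] ++ something.
```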

Note how typer clearly understands that first/0 will return just the atom something, since the resulting improper list will have no head; that second/0 will fail (therefore the result is none()); and that third/0 will return an improper list. Actually, typer is very precise: it will be a non-empty improper list with 1|2|3 as its elements and something as its tail.