Ode to the Robot Butt

…or is it to Erlang Pattern-Matching?

Brujo Benavides
Erlang Battleground
4 min readMay 25, 2021

--

Working as a mentor in the Education Working Group at The EEF, there are several tiny simple lessons that I’ve delivered over and over again. The one in this article is so common that I already gave it a name: The Robot Butt Rule.

Sexy robot
Hey, girl! 😏

The rule goes like this…

To find if a list is not empty, don’t check its length. Use the robot butt instead.

TL;DR

Thanks to Filipe Varjão on Twitter, you can just skip this whole article.

TL;DR

The Problem

The situation usually arises when some exercise or problem requires the developer to do something with a list only if it is not empty. For instance, return its first or last element. To exemplify this scenario, let’s write a generic function for that scenario: it either returns an error if the list is empty or runs another function on the list if it’s not.

-spec if_not_empty(FunctionType, ListType) -> ErrorType | OKType
when FunctionType :: fun(([AnyType,...]) -> AnotherType)
, ListType :: [AnyType]
, ErrorType :: {error, empty}
, OKType :: {ok, AnotherType}.

That spec might be a bit too crowded, let’s unpack it:

This is the spec of a function called if_not_empty, with 2 arguments of types FunctionType and ListType, respectively. It can return elements of one of two possible types: ErrorType or OkType.

Elements of type FunctionType are functions that receive a non-empty list of elements of type AnyType ([AnyType,…]) and return an element of another type (AnotherType).

ListType, in turn, is a possibly empty list of elements of type AnyType ([AnyType]).

The result of a function is either ErrorType (that is the tuple {error, empty}) or OKType (a tuple of type {ok, AnotherType}, as in AnotherType, the result of the application of FunctionType).

It’s safe to assume that if_not_empty/2 will return {error, empty} if the list it receives is empty and {ok, Function(List)} otherwise.

Initial Solution

Faced with that specification, developers (particularly folks coming from an object-oriented background) tend to implement their solution like this…

if_not_empty(Function, List) ->
if length(List) > 0 ->
{ok, Function(List)};
true ->
{error, empty}
end.

It’s a correct solution, but not the most elegant nor the most performant one.

Before we tackle the performance issue, let’s deal with that nasty nasty if…

Without If

As I’ve said in the past, in Erlang (and sometimes, in general) it is better to stay away from the if statement. So, I generally recommend changing that code and (following another guideline) use function clauses, instead.

That generally ends up in something similar to…

if_not_empty(Function, List) when length(List) > 0 ->
{ok, Function(List)};
if_not_empty(_, _) ->
{error, empty}.

Much better! We’re almost there. But, let’s tackle the performance issue: Let’s benchmark this code in a totally unscientific way, just for the sake of exemplifying the situation…

% First, let's create a really long list…
9> LongList = [rand:uniform() || _ <- lists:seq(1, 100000)].
[0.420515222637019,0.8779006956486717,0.8478602700250025,
0.1570291508306687,0.9922807451124196,0.7020290584226024,
0.22356701178759975,0.5231326543184913,0.8368749225021561,
0.6239065648187836,0.4439549897507573,0.7351771695343225,
0.5519649363326614,0.049439759267886574,0.7885233763802781,
0.3857834407777109,0.7297454606625307,0.5908430855802675,
0.2654111104532171,0.7267499922471473,0.3542637164641538,
0.8517958528371183,0.4310782304329325,0.6897478919755892,
0.13163761878786606,0.01160366378331068,0.9391918781985158,
0.9608111320278309,0.9974567206076438|...]
% Then, a function that does nothing…
10> DoNothing = fun(_) -> nothing end.
#Fun<erl_eval.44.79398840>
% And then run some benchmarks…
11> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{182,{ok,nothing}}
12> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{163,{ok,nothing}}
13> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{208,{ok,nothing}}
14> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{232,{ok,nothing}}
15> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{190,{ok,nothing}}

Perfection

At this point, I generally tell the developer three things:

  1. You’re not using the whole power of pattern-matching
  2. Even when length/1 is written in C, the code still traverses the whole list to compute it.
  3. You should use the robot butt!

And what is the robot butt?

Well… This is the robot butt

if_not_empty(Function, [_|_] = List) ->
{ok, Function(List)};
if_not_empty(_, _) ->
{error, empty}.

That expression ([_|_]) will match on any non-empty list because it’s matching on a list with a head and a tail. They’re both disregarded since we used underscores (_) in their positions, but they must exist. And therefore, if the list has a head, it can not be empty.

And you can bet that this implementation is faster than the previous one…

19> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{4,{ok,nothing}}
20> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{4,{ok,nothing}}
21> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{9,{ok,nothing}}
22> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{5,{ok,nothing}}
23> timer:tc(s, if_not_empty, [DoNothing, LongList]).
{3,{ok,nothing}}

So, next time you need to match on a non-empty list, remember to use the robot butt instead of checking its length. 🧐

Of course, you can also reverse the function clause order and match with an empty list in the first clause… but where is the fun in that? 😜

As a matter of fact, if you run this code through the formatter, it will promptly tell you that it should be written as…

if_not_empty(Function, [_ | _] = List) ->
{ok, Function(List)};
if_not_empty(_, _) ->
{error, empty}.

Because, of course…

Thank you, Sir MixALot. Thank you very much!

Oh, and don’t forget to register for the upcoming 2021 edition of Spawnfest, organized by The Fellowship of the BEAM!

--

--