Down the Rabbit Hole: Defining a guard-safe modulo operation in Elixir
In this article, I’m going to talk a little bit about a tiny new language feature I proposed that got completely out of hand, and the crazy roller-coaster ride that happened next.
Let me start with a little bit of background:
Elixir is a fresh and clean dynamically-typed functional programming language, built on top of Erlang.
As José Valim, the creator of Elixir once said: “I liked the parts of Erlang that I saw, but I disliked what I didn’t see”. Elixir adds quite a few interesting features on top of Erlang’s functionality, including a different syntax that will even feel natural to you if you don’t know Prolog.
But Erlang is quite the amazing tool to build on top of: It has the accrued stability, resilience and interoptability that can only be obtained by being actively maintained for now more than 30 years. The features of the architecture of Erlang and the BEAM are quite marvelous. But that is a topic of another article.
In these thirty years, there are also some weird choices that were made, often for historical reasons that now seem strange; there are some limitations in the language that Elixir inherits from Erlang: the most prevalent being what is and what isn’t allowed inside guard clauses.
What is a guard clause, you might ask?
Basically, because Erlang (and Elixir) are dynamically typed languages, the only way to make sure that the things that get passed in to your functions adhere certain rules (so you can do better pattern-matching), is to query certain functions on them. What makes this harder, however, is the fact that both languages are also impure functional programming languages: There is no way to know if a function will return the same result when it is called twice with the same input.
So, the solution that was implemented were guard-safe built-in functions (BIFs): These are a very limited subset of functions in the standard library that the compiler knows to have deterministic results (and no side-effects), and that are reasonably fast, so pattern-matching stays fast.
This approach works, and has done so with great success for the last 30 years. However, there are many operations that are not allowed in guards. And sooner or later, every Erlang/Elixir programmer will come across something they’d want to test for in a pattern match, but cannot.
Elixir tries to make it easier for developers to enhance the things that can be done in guard clauses, by allowing you to write a macro: a piece of code that is replaced with its implementation during compilation. Of course, for this new operation to be guard-safe, all of its components need to be guard-safe as well.
How this all got started
Two days ago, I hit something where I needed a `modulo` function for. The modulo operation, or mod for short, returns the remainder of an integer division, rounding towards negative infinity (this is called floored division), and takes the sign of the divisor (the number that you divided by).
Elixir does not have a built-in mod function. There is a remainder-function called rem, but it does truncated division: It rounds towards zero, and takes the sign of the dividend (the number that was divided). There are certain mathematical operations that really need mod, because the result will stay positive as long as the divisor is positive.
I wanted a mod function in Elixir. At first I thought this would be impossible to do in a guard-safe way. But then I read the Wikipedia article about the Modulo operation, which states that one can be built in an environment that does not support it natively, by doing:
mod(a, n) === a-(n * trunc(a/n))
‘Aha!’, I thought. ‘We can use this to define a guard-safe variant!’.
However, the integers inside Elixir are BigNums with unlimited size (until the number fills all of your computer’s memory, that is). Above function would only work when first converting the value to a float (which happens during the `/`- operation above). This would mean that it would only work with a subrange of all possible integers.
But, Elixir does have a built-in integer division operation called `div`. So, I decided that it was possible to use that one (the new version of the mod implementation would then be `a -(n * div(a, n))`, and ventured out to the Elixir-Core mailing list, to proclaim that it would be wonderful if this would be used to implement a guard-safe mod operation. I got the green light to get started, so I did.
Down the Rabbit Hole
Turns out it was not that simple. Twelve hours later I had a working version, which was more than ten times as long as I originally expected it would become. In the end, the decision was made to not merge it because of its size (and therefore inefficiency), and instead only add a non-guard-safe version to the core language. But the road there was a very interesting, to say the least. Here we go:
Subproblem 1: Elixir does have a guard-safe floored division operation
Remember that I said that we could use div to write a mod function?
I was wrong. div is the matching counterpart to rem. That means that it also does floored division: Negative numbers are rounded towards zero, and therefore ‘up’: div(-99, 2) is -49, whereas the required result, which is what the default integer division operation does in many other environments, would be -50.
So, I was stuck. The solution? Write a guard-safe floor_div operation, of course!
The mod implementation would then be:
mod(a, n) === a - (n * floor_div(a, n))
However, how do we get that floor_div operation?
We know that the result of div is correct, except in the cases where its answer is off by one. This happens when the signs of dividend and divisor do not match, and if the remainder of this division is non-zero. Only in these cases, we need to subtract one to get the desired floored result.
So, I came up with a truth table that represents this:
a | n | rem(a, n) | div(a,n) + ?
+ | + | 0 | 0
+ | + | !0 | 0
- | + | 0 | 0
- | + | !0 | -1
+ | - | 0 | 0
+ | - | !0 | -1
- | - | 0 | 0
- | - | !0 | 0
0 | + | 0 | 0
0 | + | !0 | 0
0 | - | 0 | 0
0 | - | !0 | 0
Now, how do we come up with an operation that is only -1 when the signs of a and n do not match, and their remainder is not zero?
This looked like it was similar to what the sign function provided: sign(x) returns -1 if x < 0, 0 if x == 0, 1 if x > 0.
When taking the sign of the product of two numbers, the answer will be -1 if the signs are different (one number is positive and one negative), 1 if the signs are the same, and 0 if one of the two is zero. So, sign(a*n) was a step in the good direction, only returning -1 when the signs of a and n do not match.
To make this also take the remainder in account, we can take the sign(rem(a, n) * n) instead. As rem(a,n) will have the sign of a, the result is the same as before, except when the remainder is 0, in which case the sign will also return zero.
So, the solution to this subproblem:
sign(rem(a, n) *n)
But this will still output 1 if a and n have the same sign! This can be changed by translating the output as follows: div(sign(x) -1, 2). The truth table for this operation is:
sign(x) | div(sign(x) - 1, 2)
+1 | 0
0 | 0
-1 | -1
Hooray! Success!
Are we done? No… because there still is a little problem…
Subproblem 2: Elixir does not have a guard-safe`sign(x)` operation
Right! There is no sign function in Elixir. But, even the guard-safe implementation should be simple, right?
sign(x) === div(x, abs(x))
Wooh! It works! That is… until you pass 0 into this function and everything breaks apart.
How to prevent the division-by-zero here was what I was stuck on for the longest time. In the end, I found out that there was a way to ‘cheat’ a little:
- When we divide 0 by 1, we get the output `0`, which is what we want.
- When we divide any non-zero integer x by its absolute value, it is certain that this value is always >= 1.
These two statements mean that we can introduce max into the mix:
sign(x) === div(x), max(abs(x), 1))
Woohoo! This clever trick prevents the division by zero!
There is only a sliiiight problem with this approach. Yep, you guessed it:
Subproblem 3: Elixir does not have a guard-safe `max` operation
There does exist a max(a, b) function in Elixir, but it is not guard-safe. But is it possible to define a guard-safe version of it that works for integers? Wikipedia came to the rescue again:
max(a, b) === div(a+b + abs(a-b), 2)
All right! All done! All functions inside that definition already exist, so we’re finally finished.
When fully writing everything out, the final code is:
x - (n * div(x, n) + div(div(rem(x,n) * n, div(abs(rem(x, n) * n) + 1 + abs(abs(rem(x, n) * n) - 1), 2)) - 1, 2))
That’s 4 × div, 3× rem, 3× abs and a bunch of multiplications and subtractions. But it works!
Luckily, there are ways to separate these parts in Elixir, which keeps the result reasonably readable. For the adventurous, the resulting code can be found here on GitHub.
In the end, this was not included in the core language because the long macro would be rather inefficient to compute (probably at least ten times slower than its `rem` counterpart). Still, it was a very interesting experience.
I want to thank José for the hours where we were brainstorming in the #elixir-lang IRC-channel about ways to improve this implementation. I also want to thank all people that contributed to the discussion about this on its PullRequest page on GitHub. Even though it did not make it into the language in the end, it was a lot of fun to tinker with this and get a deeper understanding of both math functions and Elixir’s macro system.
So, there you have it! Thank you very much for reading. Feel free to reach out to me with any questions you might have. And I also always have a listening ear for ridiculous or reckless coding ideas. I’ll be waiting!
;-)
~Wiebe-Marten Wijnja