Elm & Guarantees
In this post, I’ll show you a bug I found in an Elm program recently. This bug is interesting because it shows us both how the guarantees Elm gives us really help in debugging, and where those guarantees end. I’ll compare a bit with Redux/React along the way.
A user of elm-mdl reported that using elm-mdl buttons caused his application state to occasionally revert itself. He supplied this video:
The expected behaviour was that whenever the user clicks one of the flat buttons at the bottom, numbers are put in the highlighted row, and the highlight advances. But as you can see, it only works reliably for the first row. Subsequent clicks sometimes cause the highlight to flicker briefly but stay where it is.
We saw this only when using elm-mdl buttons; using standard buttons instead made the glitch go away. What gives?
So let’s look at how the app uses those buttons. In the app, the sub-component “Form.QuestionSet” uses elm-mdl buttons. The actions of those buttons live somewhere deep inside the “QuestionSetAction …” action in the code below (simplified for clarity):
It looks innocent enough: “QuestionSet” is a subcomponent, which has its actions wrapped in “QuestionSetAction”. To dispatch one of those, we call the update function of “QuestionSet”. All standard Elm Architecture so far.
But wait! Notice that “QuestionSetAction” is carrying (line 6) the model argument to “QuestionSet.update” (line 8). So we will be updating not the current model, but whatever the model was when the action was constructed!
When was it constructed, then?
In the view function. So if the first action after rendering that view is “QuestionSetAction state.questionModel”, the captured “state.questionModel” will be current, and the app will behave correctly. If the first action is something else, we will revert the model when the “QuestionSetAction …” happens.
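To see why the capture happens at render time, here is a minimal sketch of the pattern, not the application's actual code. Only “QuestionSetAction” and “state.questionModel” come from the text; every other name (QuestionModel, QuestionAction, Answer, RippleStop, highlighted, tagFor, questionUpdate) is an assumption made up for illustration, and tagFor merely stands in for the part of the view that hands a wrapper to the buttons. Line numbers mentioned in the prose refer to the original listing, not to this sketch.

```elm
-- Hypothetical sub-component model: which row is highlighted.
type alias QuestionModel =
    { highlighted : Int }


-- Hypothetical sub-component actions.
type QuestionAction
    = Answer
    | RippleStop


-- The problematic shape: the wrapper carries a model snapshot.
type Action
    = QuestionSetAction QuestionModel QuestionAction


type alias State =
    { questionModel : QuestionModel }


-- Stand-in for the view: it builds the wrapper given to the buttons.
-- The partial application freezes state.questionModel at render time.
tagFor : State -> (QuestionAction -> Action)
tagFor state =
    QuestionSetAction state.questionModel


questionUpdate : QuestionAction -> QuestionModel -> QuestionModel
questionUpdate action model =
    case action of
        Answer ->
            { model | highlighted = model.highlighted + 1 }

        RippleStop ->
            model


-- Top-level update: it runs the sub-update on the captured snapshot,
-- so whatever state.questionModel currently holds is overwritten.
update : Action -> State -> State
update action state =
    case action of
        QuestionSetAction captured subAction ->
            { state | questionModel = questionUpdate subAction captured }
```

Because “QuestionSetAction state.questionModel” is a partial application, every action produced through tagFor carries the model as it was when tagFor ran, however much later that action is dispatched.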
So why are there extra actions with elm-mdl buttons, but not with standard elm-html buttons? Because elm-mdl buttons have animation and so dispatch actions asynchronously, in this case with delays. The ripple component issues a delayed stop-the-animation action some 200ms after the click. That action is wrapped in “QuestionSetAction state.questionModel”, and when it is dispatched after 200ms, the captured “state.questionModel” replaces the current model (line 10 of the first Gist above).
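The timing can be replayed without any UI. The sketch below is self-contained and uses hypothetical names throughout (Answer, RippleStop, highlighted, step, replay); it assumes, as a simplification, that both the click and the delayed ripple action carry the snapshot taken before the click, and it models the 200ms delay simply as "a second action arriving later".

```elm
type alias QuestionModel =
    { highlighted : Int }


type QuestionAction
    = Answer
    | RippleStop


-- The wrapper carries a snapshot of the model, as in the buggy program.
type Action
    = QuestionSetAction QuestionModel QuestionAction


step : QuestionAction -> QuestionModel -> QuestionModel
step action model =
    case action of
        Answer ->
            { model | highlighted = model.highlighted + 1 }

        RippleStop ->
            model


-- The current model is ignored; the captured one is updated instead.
update : Action -> QuestionModel -> QuestionModel
update action currentModel =
    case action of
        QuestionSetAction captured subAction ->
            step subAction captured


-- Apply actions in arrival order, starting from a given model.
replay : List Action -> QuestionModel -> QuestionModel
replay actions start =
    List.foldl update start actions


-- Snapshot taken while row 0 was highlighted.
rendered : QuestionModel
rendered =
    { highlighted = 0 }


-- Plain button: the click is the only action, so the highlight advances.
withStandardButton : QuestionModel
withStandardButton =
    replay [ QuestionSetAction rendered Answer ] rendered
-- withStandardButton.highlighted == 1


-- Animated button: a late RippleStop still carries the old snapshot
-- and puts the highlight back on row 0.
withAnimatedButton : QuestionModel
withAnimatedButton =
    replay
        [ QuestionSetAction rendered Answer
        , QuestionSetAction rendered RippleStop
        ]
        rendered
-- withAnimatedButton.highlighted == 0
```

The second run advances to row 1 and then falls back to row 0, which matches the brief flicker described earlier.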
This bug was easy to find because Elm gives us a guarantee: the only way to change application state is via the top-level update function.
In contrast, it’s flatly not possible to write an Elm Architecture program which mutates any state from any place other than the top-level update function(*)—it’s not possible to commit any of the above React/Redux errors in Elm. Debugging the present bug is reduced to asking the question “How does update get called with a stale model?”, which is straightforward to work out. Pure languages FTW, baby!
Voiding the warranty
If Elm is so great, how come there was a bug in the first place?
Elm Architecture experts would tell you that you are not supposed to capture your model in actions, like the above program does. It’s not “best practice”. Elm as a language cannot enforce this best practice.
So that seems to be one limit of Elm’s guarantees: it’s still up to the programmer to get asynchronous computation right.
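For comparison, here is a sketch of what following that practice could look like: the wrapper carries only the sub-action, and update reads the model from the current state. This is one possible shape under assumed names (QuestionModel, QuestionAction, Answer, RippleStop, highlighted, questionUpdate are all hypothetical), not the fix that was actually applied to the reported program.

```elm
type alias QuestionModel =
    { highlighted : Int }


type QuestionAction
    = Answer
    | RippleStop


-- No model inside the action: only the sub-action is wrapped.
type Action
    = QuestionSetAction QuestionAction


type alias State =
    { questionModel : QuestionModel }


questionUpdate : QuestionAction -> QuestionModel -> QuestionModel
questionUpdate action model =
    case action of
        Answer ->
            { model | highlighted = model.highlighted + 1 }

        RippleStop ->
            model


-- The sub-update always runs on the model held by the current state.
update : Action -> State -> State
update action state =
    case action of
        QuestionSetAction subAction ->
            { state
                | questionModel =
                    questionUpdate subAction state.questionModel
            }


afterClick : State
afterClick =
    update (QuestionSetAction Answer) { questionModel = { highlighted = 0 } }
-- afterClick.questionModel.highlighted == 1


-- A late RippleStop now leaves the advanced highlight alone.
afterRipple : State
afterRipple =
    update (QuestionSetAction RippleStop) afterClick
-- afterRipple.questionModel.highlighted == 1
```

With this shape a delayed action has no stale model to bring along, so the order in which actions arrive no longer decides which model survives.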
I’m not aware of any language that gives guarantees for asynchronous computations strong enough to statically eliminate the present bug. That’ll have to be in the future ;)
(*) There should be small print here, but we’ll leave that for a future post.