Confessions of a threading macro addict
Since the very beginning of my work in Clojure, I’ve been a fan of the threading macros: ->, ->>, and related. For me, they are essential tools.
For example, consider this code snippet:
(map #(assoc % :full-name (str (:first-name %) " " (:last-name %))) (sort-by :last-name (filter #(= "97204" (:zip %)) (get-contacts db))))
That’s a reasonably typical sequence of transformations … but I would never call such a code sample readable, and you can certainly start to hear the chorus chant “all those parens”. It’s the kind of thing that scares traditional programmers away from Clojure entirely.
This can be improved with some formatting:
(map #(assoc % :full-name (str (:first-name %) " " (:last-name %)))
     (sort-by :last-name
              (filter #(= "97204" (:zip %))
                      (get-contacts db))))
The indentation makes it easier to follow, but this code remains something of a puzzle to figure out, as you must carefully dig down to the deepest point, (get-contacts db), and work outwards. Meanwhile, the code is starting to inch towards the right margin.
A common approach is to break this sequence up using let:
(let [all-contacts (get-contacts db)
      local-contacts (filter #(= "97204" (:zip %)) all-contacts)
      sorted-contacts (sort-by :last-name local-contacts)]
  (map #(assoc % :full-name (str (:first-name %) " " (:last-name %)))
       sorted-contacts))
This gets things in the right order, but you have to keep coming up with new names for each step … and naming is, of course, the hardest thing to do. This approach has benefits, but to my eye at least, it seems more cluttered than what follows, and the reader is forced to jump back and forth between each new name and its corresponding expression.
This is where the threading macros come in; in the above example, the transformed collection is always the last argument in each function call, and this is absolutely not a coincidence.
(->> (get-contacts db)
     (filter #(= "97204" (:zip %)))
     (sort-by :last-name)
     (map #(assoc % :full-name (str (:first-name %) " " (:last-name %)))))
This last example tells a story, one that starts with contacts coming out of a database, and continues through filtering, sorting, and adding the :full-name key. I would maintain that, without creating new names, or (worse) reusing existing names, I can concentrate on the operations, not the names.
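To try the threaded version at a REPL without a database, here is a minimal sketch. It assumes a hypothetical in-memory vector, contacts, in place of (get-contacts db); the sample records and the function name local-contacts-with-names are invented for illustration.

```clojure
;; Hypothetical sample data standing in for (get-contacts db).
(def contacts
  [{:first-name "Ada"   :last-name "Young" :zip "97204"}
   {:first-name "Grace" :last-name "Adams" :zip "97204"}
   {:first-name "Alan"  :last-name "Brown" :zip "10001"}])

(defn local-contacts-with-names
  "Keeps contacts in the 97204 zip code, sorts them by last name,
  and adds a :full-name key to each."
  [contacts]
  (->> contacts
       (filter #(= "97204" (:zip %)))
       (sort-by :last-name)
       (map #(assoc % :full-name (str (:first-name %) " " (:last-name %))))))

(mapv :full-name (local-contacts-with-names contacts))
;; => ["Grace Adams" "Ada Young"]
```

Each step receives the result of the previous one as its last argument, so the forms read top to bottom in the order they execute.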
In fact, I often do not feel satisfied with a chunk of code until I have it in this form, as a flow of transformations.
Ok, and now for a couple of tips and tricks.
Injecting cond-> into ->
Sometimes you have a simple enough thread of operations, but then need to add one, or a handful, of conditional steps.
Perhaps you start with a standard -> block:
(-> {:env (:env options :dev)
     ::http/routes (route/expand-routes (graphql-routes compiled-schema options))
     ::http/port (:port options 8888)
     ::http/type :jetty
     ::http/join? false}
    http/create-server)
(This example is adapted from this code).
But then you need to add a single optional step, so you convert the entire block into a cond->:
(cond->
  {:env (:env options :dev)
   ::http/routes (route/expand-routes (graphql-routes compiled-schema options))
   ::http/port (:port options 8888)
   ::http/type :jetty
   ::http/join? false}
  (:graphiql options) (assoc ::http/resource-path "graphiql")
  true http/create-server)
That’s not ideal, as every form after the first must have a triggering expression, which explains the existence of the true expression before http/create-server. But that feels odd and unnecessary, and gets worse if there’s a series of clauses, mixing truly conditional clauses with hard-wired “always on” clauses.
Fortunately, there’s a better option:
(-> {:env (:env options :dev)
     ::http/routes (route/expand-routes (graphql-routes compiled-schema options))
     ::http/port (:port options 8888)
     ::http/type :jetty
     ::http/join? false}
    (cond->
      (:graphiql options) (assoc ::http/resource-path "graphiql"))
    http/create-server)
Now the conditional part stands out, but leaves the overall flow intact, without the true non-sequitur before http/create-server.
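The same shape can be tried without a web server on the classpath. The sketch below uses plain, unqualified keys, and the :host option, the :started? key, and the function name service-map are all hypothetical stand-ins rather than anything from the original code.

```clojure
(defn service-map
  "Builds a plain configuration map. The keys and options here are
  hypothetical; they only mimic the shape of the example above."
  [options]
  (-> {:port (:port options 8888)
       :type :jetty}
      ;; -> places the map as the first argument of cond->, which then
      ;; threads it through each clause whose test is truthy.
      (cond->
        (:graphiql options) (assoc :resource-path "graphiql")
        (:host options)     (assoc :host (:host options)))
      ;; Unconditional steps continue in the outer ->, with no `true` test.
      (assoc :started? false)))

(service-map {})
;; => {:port 8888, :type :jetty, :started? false}

(service-map {:graphiql true :port 9000})
;; => {:port 9000, :type :jetty, :resource-path "graphiql", :started? false}
```

This works because cond-> takes the threaded value as its first argument, which is exactly where -> places it.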
Is it a collection or a value?
The rule of thumb is that you use -> on values and ->> on collections. So -> works with assoc, update, and so forth (functions to modify a single map), while ->> works with filter, map, reduce, and other functions that transform a collection of values.
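As a small illustration of the rule of thumb, the sketch below shows where each macro places the threaded value. The data and the names updated-contact and total are made up for the example.

```clojure
(require '[clojure.string :as str])

;; -> inserts the threaded value as the FIRST argument of each form,
;; which matches assoc, update, and friends.
(def updated-contact
  (-> {:name "Ada"}
      (assoc :zip "97204")
      (update :name str/upper-case)))
;; Equivalent to:
;; (update (assoc {:name "Ada"} :zip "97204") :name str/upper-case)
;; => {:name "ADA", :zip "97204"}

;; ->> inserts the threaded value as the LAST argument of each form,
;; which matches filter, map, reduce, and friends.
(def total
  (->> [1 2 3 4]
       (filter even?)
       (map inc)
       (reduce +)))
;; Equivalent to:
;; (reduce + (map inc (filter even? [1 2 3 4])))
;; => 8
```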
Very occasionally, these lines can get blurred. Here’s a hypothetical example (I don’t have a shareable real example):
(-> []
    (conj "first" "second")
    (->> (mapv str/upper-case))
    (conj "last"))
Here the embedded ->> steps out of the normal single-value flow, to perform an operation that is associated with a collection. The result of that is threaded back into the value flow.
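To see why the nesting works, it helps to look at a single expansion step: -> puts the threaded value in the first argument position of the ->> form, and ->> then moves it to the end of its own inner form. A minimal sketch, where the name mixed is hypothetical:

```clojure
(require '[clojure.string :as str])

;; One step of expansion: the vector lands as the first argument of ->>.
(macroexpand-1 '(-> [] (->> (mapv inc))))
;; => (->> [] (mapv inc))

;; The hypothetical example from above, runnable as-is.
(def mixed
  (-> []
      (conj "first" "second")
      (->> (mapv str/upper-case))
      (conj "last")))
;; => ["FIRST" "SECOND" "last"]
```

Note that mapv matters here: it returns a vector, so the final conj appends at the end. With map, the result would be a lazy sequence and conj would add "last" at the front.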
The rest of the bunch
Clojure’s core library is so rich, it’s easy to miss the other threading macros.
some-> and some->> are useful in that they short-circuit as soon as the threaded value or collection evaluates to nil.
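A minimal sketch of the short-circuiting, using hypothetical functions zip-prefix and sum-of-evens and a made-up nested :address map:

```clojure
(defn zip-prefix
  "Returns the first three characters of a contact's zip code, or nil
  if the contact, its :address, or its :zip is missing."
  [contact]
  (some-> contact
          :address
          :zip
          (subs 0 3)))

(zip-prefix {:address {:zip "97204"}}) ;; => "972"
(zip-prefix {:address {}})             ;; => nil
(zip-prefix nil)                       ;; => nil

(defn sum-of-evens
  "Sums the even numbers, or returns nil when there are none.
  An empty sequence is not nil, so seq converts it to nil first."
  [numbers]
  (some->> numbers
           (filter even?)
           seq
           (reduce +)))

(sum-of-evens [1 2 3 4]) ;; => 6
(sum-of-evens [1 3])     ;; => nil
```

With plain ->, the second zip-prefix call would reach (subs nil 0 3) and throw; some-> stops as soon as :zip returns nil.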
as-> is the true Swiss Army knife; with as->, you can inject the flow value anywhere in each threaded form.
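As a rough illustration, as-> binds the flow value to a name of your choosing (v below, picked arbitrarily) and rebinds it after every form, so each step can place it wherever it is needed. The name second-smallest is hypothetical.

```clojure
(def second-smallest
  (as-> [3 1 2] v
    (sort v)             ; only argument:  (1 2 3)
    (map inc v)          ; last argument:  (2 3 4)
    (nth v 1)            ; first argument: 3
    (str "second: " v))) ; last argument, now a single value
;; => "second: 3"
```

Because the name comes second in the argument list and the initial expression comes first, as-> also fits inside a -> block for the one step that needs the value somewhere other than the first position.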
Wrapup
The threading macros are one of those features that really highlight what a tightly integrated design Clojure represents. It takes persistent data structures and collections, careful thought about parameter order with an eye towards composing functions, and the almost magical power of macros, all working together, to yield a system that is simultaneously concise, readable, and efficient.
Alas, there are no hard-and-fast rules about when code gets too concise and readability is sacrificed: that varies based on the nature of the code, and preferences of the developers writing (and reading!) that code … but my preferences lead towards more visible operations, and fewer awkward intermediate names.