Static Analysis: For, and in Clojure

Yuri Vendruscolo da Silveira
Funding Circle
Jan 31, 2023

Don’t you love it when you can write code down and your program just runs™? No compiler warnings, no type errors, no CI breaking over unformatted code, and so on. However, as with all things in life, freedom comes at a cost: not just eventual runtime errors, but in complex codebases (such as ours) readability may go down the drain too.

That’s why many projects rely on tooling to help mitigate such problems, usually in the form of static analysis. Before we continue, let’s define what “static analysis” means for us:

what can we tell about this program,

without actually running it?

Keep in mind that the above only makes sense if we rely on our knowledge of the language itself, that is, if we know what the symbols stand for.

So we need to understand and parse the syntax before we can analyse what it is trying to tell us. In many languages this is already a challenge, due to their complex, inconsistent, or extensible nature. That is generally not a problem for Lisp family members, as homoiconicity helps us stay away from complexity without sacrificing extensibility or ease of parsing and expression.

Good, now that we know we can make machines understand our code without running it, what do we do with it? Many languages use this knowledge to query and check your codebase for errors (like type errors or invalid syntax); others have tools to help detect practices that the community would like to shy away from (even if valid). Most linters and static analysis tools can be configured to use or ignore certain rules, or to integrate plugins that add further checks.

As helpful as some of those might be, none of them is tailored to your project; they are merely practices that a part of the community agrees on. If you do try to customise them for your own purposes, like detecting perfectly valid code and reporting it as undesirable for a specific reason, it may once again prove challenging, requiring knowledge that goes way beyond your project and the language being used. In Clojure, though, you can try it out with zero dependencies, just pure plain Clojure:
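The original snippet is not reproduced here, so the following is only a minimal sketch of the idea. The macro name `forbidden` is hypothetical; the point is that a macro body runs while the compiler expands the form, so throwing there stops compilation.

```clojure
;; Hypothetical macro: its body runs at macroexpansion time,
;; i.e. while the file is being compiled, not when the program runs.
(defmacro forbidden []
  (throw (ex-info "This form is not allowed in this project" {})))

;; Uncommenting the next line makes loading/compiling this file fail:
;; (forbidden)
```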

Try to compile/run a Clojure file with that and you will get an error. Yes, all you need is to throw an exception at compile-time, and compilation will halt!

Once again, this is thanks to another cool Lisp family feature: running arbitrary code at compile-time.

This specific example is rather useless, so here is a still basic, but helpful one:
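As an illustration of such a check (a sketch, not necessarily the code used in the original post), the hypothetical `defn-checked` macro below behaves like `defn` but refuses to compile a function that mentions a `!`-suffixed symbol unless its own name also ends in `!`. The helper `bang?` and the example function `remember!` are assumptions made for this example.

```clojure
(require '[clojure.string :as str])

(defn bang?
  "True when x is a symbol whose name ends in `!`."
  [x]
  (and (symbol? x) (str/ends-with? (name x) "!")))

(defmacro defn-checked
  "Like defn, but throws at compile time when the body mentions a
  bang symbol while the function's own name has no trailing `!`."
  [fn-name & fn-tail]
  ;; Walk every nested form of the definition and keep the bang symbols.
  (let [bangs (distinct (filter bang? (tree-seq coll? seq fn-tail)))]
    (when (and (seq bangs) (not (bang? fn-name)))
      (throw (ex-info (str fn-name " mentions " (first bangs)
                           ", so its own name should end in `!`")
                      {:function fn-name :offenders bangs}))))
  `(defn ~fn-name ~@fn-tail))

;; Compiles: the name advertises the side effect.
(defn-checked remember! [store item]
  (swap! store conj item))

;; Would fail at compile time if uncommented:
;; (defn-checked remember [store item]
;;   (swap! store conj item))
```

Note that this sketch flags any mention of a bang symbol, not only calls, which is a deliberately crude approximation.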

If you are a clojurist yourself, you will notice that this is just plain normal Clojure, nothing weird or advanced going on, and it helps us check for the common convention of using `!` (bangs) to make I/O, side effects, and mutation explicit.

When working with a domain as complex as ours, with many people interacting with it, things can get a bit more complicated than we would like them to be, so static analysis and quality control become essential for the project’s health.

The following section assumes basic familiarity with clara.rules, the library we use to decouple our rules and make decisions on applications. If you have no idea how clara works, I suggest you just check the overview and come back here!

Take the following fictitious rule:
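The rule from the original post is not reproduced here, so the one below is a stand-in with the same shape. It assumes clara.rules is on the classpath; the namespace, the `Application` and `Decision` records, and the thresholds are all hypothetical.

```clojure
(ns example.rules.before
  (:require [clara.rules :refer [defrule insert!]]))

;; Hypothetical fact types, made up for this example.
(defrecord Application [id amount])
(defrecord Decision [application-id outcome])

(defrule decide-application
  ;; LHS: only says "there is an application".
  [Application (= ?id id) (= ?amount amount)]
  =>
  ;; RHS: the real requirements hide inside a cond,
  ;; and for some amounts nothing is inserted at all.
  (cond
    (> ?amount 100000) (insert! (->Decision ?id :refer))
    (pos? ?amount)     (insert! (->Decision ?id :approve))))
```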

From the LHS (Left-Hand Side, before =>, where we have the requirements for the rule to fire), can you be sure what, or even if any, fact will be inserted?

No, because there is a cond on the RHS (Right-Hand Side, after the =>, where we have “what happens when the rule fires”).

We don’t really like that: we prefer to have all the requirements on the LHS and save the RHS for the effect, our insert! form.

Since this isn’t really about working with clara specifically, I’m just going to give you right away our answer to it:
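A sketch of what such a split could look like, using a hypothetical application-decision rule (the namespace, records, and thresholds are assumptions for illustration, and clara.rules is assumed to be on the classpath). Each branch of the conditional becomes its own rule, with the branch test moved into the LHS constraints.

```clojure
(ns example.rules.after
  (:require [clara.rules :refer [defrule insert!]]))

;; Hypothetical fact types, made up for this example.
(defrecord Application [id amount])
(defrecord Decision [application-id outcome])

(defrule refer-large-application
  ;; The former first cond branch is now an LHS constraint.
  [Application (= ?id id) (> amount 100000)]
  =>
  (insert! (->Decision ?id :refer)))

(defrule approve-regular-application
  ;; The former second cond branch, made explicit and exclusive.
  [Application (= ?id id) (pos? amount) (<= amount 100000)]
  =>
  (insert! (->Decision ?id :approve)))
```

Reading only the LHS of each rule now tells you exactly when a `Decision` is inserted and which one.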

With the rule broken into two, we can more easily express the requirements exclusively on the LHS (similar to what one can achieve by using Clojure’s multimethods to break down and decouple gnarly and possibly nested conditionals).

Great, now that we know what we want and what we don’t want, we can check on PRs and prevent it from being added to our codebase! But checking it manually still takes time, and wasn’t this post about static analysis?

Getting back to our ability to halt compilation, can we detect this pattern? Can we break compilation on the given rule?

In fact we can, we just need to:

  • start from rule definitions
  • descend the syntax tree (code is data 🧘)
  • If we meet any conditional on the way?

(╯°□°)╯︵ ┻━┻

Again, since this article is not about data manipulation, I’m just going to give you the answer we arrived at:
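The original macro is not reproduced here; the following is one possible sketch of the steps listed above, in plain Clojure. The names `conditionals`, `rhs-forms`, `rhs-conditionals`, and `no-rhs-conditionals` are hypothetical, and the set of conditional forms is an assumption you would adjust to your own codebase.

```clojure
;; Forms treated as conditionals; an assumed, non-exhaustive list.
(def conditionals
  '#{if if-not if-let if-some when when-not when-let when-some
     cond condp case})

(defn rhs-forms
  "Everything after the `=>` separator of a rule definition."
  [rule-form]
  (rest (drop-while #(not= '=> %) rule-form)))

(defn rhs-conditionals
  "All list forms on the RHS whose head is a known conditional."
  [rule-form]
  (->> (rhs-forms rule-form)
       (tree-seq coll? seq)            ; descend the syntax tree
       (filter #(and (seq? %) (contains? conditionals (first %))))))

(defmacro no-rhs-conditionals
  "Wraps a rule definition; throws at compile time when its RHS
  contains a conditional, otherwise emits the rule unchanged."
  [rule-form]
  (when-let [found (seq (rhs-conditionals rule-form))]
    (throw (ex-info (str "Conditional `" (ffirst found)
                         "` on the RHS of rule " (second rule-form))
                    {:rule (second rule-form) :found found})))
  rule-form)
```

Because the check only looks at data, it can be tried at the REPL without clara: `(macroexpand-1 '(no-rhs-conditionals (defrule ok [A] => (insert! :x))))` returns the `defrule` form untouched, while the same call with a `cond` on the RHS throws.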

With that we can try out some forms:

Yay, it works as expected! You can try out other forms and tweak the macro to your liking.

But wait, compared to other static analysis tools, the usage here is rather awful really:

  • It is not easily runnable from CLI (you have to compile the whole project)
  • You can’t have it in a separate CI step
  • You can’t collect all the errors, it halts compilation at the first one
  • Also it messes with the source code itself, and there is an easy escape hatch: just don’t wrap the form in the macro

Can we fix this?

Enter the BorkVerse

clj-kondo by borkdude is a well-known Clojure linter that sparks joy, and I hope most of you are using it by now. Better yet: it is not an island, as it can be easily extended with custom code, called hooks.

In fact, our crazy macro kind of mostly works already! All we need to do is:

  • turn it into a normal function
  • change it to use what clj-kondo passes to it (a function call)
  • put it into where clj-kondo expects it to be
  • and let clj-kondo know we want to use this hook!

So the final definition should be changed to look like:
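The original hook is not reproduced here, so this is a sketch of what an `:analyze-call` hook of this kind could look like. It relies on `clj-kondo.hooks-api/sexpr` to turn the node clj-kondo passes in back into plain data; the `conditionals` set and the error message are assumptions for this example.

```clojure
(ns hooks.defrule
  (:require [clj-kondo.hooks-api :as api]))

;; Forms treated as conditionals; an assumed, non-exhaustive list.
(def conditionals
  '#{if if-not if-let if-some when when-not when-let when-some
     cond condp case})

(defn defrule
  "Hook for defrule calls: throws when the RHS holds a conditional.
  Returning nil leaves the original node untouched."
  [{:keys [node]}]
  (let [form  (api/sexpr node)                        ; node -> plain data
        rhs   (rest (drop-while #(not= '=> %) form))  ; forms after =>
        found (->> rhs
                   (tree-seq coll? seq)
                   (filter #(and (seq? %)
                                 (contains? conditionals (first %)))))]
    (when (seq found)
      ;; clj-kondo reports the ex-info message at the call site.
      (throw (ex-info (str "Conditional `" (ffirst found)
                           "` on the RHS of rule " (second form))
                      {})))))
```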

Then place it in .clj-kondo/hooks/defrule.clj

And finally add the config below in .clj-kondo/config.edn:
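A minimal config fragment for this, assuming the hook function is named `defrule` and lives in the `hooks.defrule` namespace (matching the file path above). Your real config will likely contain other entries as well.

```clojure
;; .clj-kondo/config.edn
;; Maps calls to clara.rules/defrule onto our custom hook function.
{:hooks {:analyze-call {clara.rules/defrule hooks.defrule/defrule}}}
```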

After that, if you have your editor set up to use clj-kondo correctly, you should even see red squiggly lines complaining with our custom error messages!

You may still be unhappy about how it just points to the beginning of the defrule form and not the infraction itself.

That is because we are using the same mechanism as our macro: throwing!

At that point clj-kondo only has the file and line where the form begins in scope. To get precise detection and better reporting, one can use clj-kondo's hooks API to register findings (for example with `reg-finding!`) and do more sophisticated things. For this use case, we judged it enough to use as it stands today.
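For the curious, here is a rough sketch of that more precise variant. It is not what we run; it assumes the hook walks clj-kondo's nodes directly (instead of converting them with `sexpr`) so that each offending node's position metadata can be passed to `reg-finding!`. The finding type `:clara/rhs-conditional` and the helper names are made up for this example.

```clojure
(ns hooks.defrule
  (:require [clj-kondo.hooks-api :as api]))

(def conditionals
  '#{if if-not if-let if-some when when-not when-let when-some
     cond condp case})

(defn arrow-node?
  "True for the `=>` token that separates LHS from RHS."
  [n]
  (and (api/token-node? n) (= '=> (api/sexpr n))))

(defn conditional-node?
  "True for a list node whose head is a known conditional symbol."
  [n]
  (and (api/list-node? n)
       (let [head (first (:children n))]
         (and head
              (api/token-node? head)
              (contains? conditionals (api/sexpr head))))))

(defn defrule [{:keys [node]}]
  (let [rhs (rest (drop-while (complement arrow-node?) (:children node)))]
    ;; Visit every node on the RHS and report each conditional
    ;; at its own row/col, taken from the node's metadata.
    (doseq [n (mapcat #(tree-seq :children :children %) rhs)
            :when (conditional-node? n)]
      (api/reg-finding!
       (assoc (meta n)
              :message "Conditional on the RHS of a rule"
              :type :clara/rhs-conditional)))))
```

With this approach the custom finding type also needs a level in the config, for example `{:linters {:clara/rhs-conditional {:level :error}}}`, and all offences in a file are reported instead of only the first.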

With this newfound power, I encourage you to try building some custom hooks: they will save you time on reviews and can help project newcomers pick up the team’s recommended practices!

Thank you for reading! Make sure to follow us on Medium and our engineering page on Linkedin.
