Better Haskell with Custom hlint Rules

Most programmers think of linters (automated code-checking tools such as CheckStyle and JSHint) as merely stylistic tools aimed at readability and visual consistency. Though this is true, a good linter provides deeper benefits in a collaborative environment: not only does it ensure that new contributions fit a project’s style, it also prevents pointless arguments and reduces cognitive load and decision fatigue. Linters are available for nearly every programming language out there, and I don’t think it’s an exaggeration to say that every large project can benefit from judicious use of one.

The Reconfigure.io compiler is implemented in Haskell. And while we love Haskell, it presents challenges that differ from those of just about every other language out there. Not only is Haskell very syntactically flexible, but its semantic flexibility admits purely-functional, imperative, interpretive, combinatorial, and category-theoretical approaches, to name but a few. And though this flexibility is welcome, especially for those of us coming from more restrictive languages, it can be bewildering, particularly in a diverse environment with varying levels of Haskell expertise. How do we, as Haskell programmers, ensure that we use a consistent, readable, maintainable, and safe subset of Haskell?

The hlint linter is a venerable and battle-tested tool for catching common errors and inefficiencies in Haskell code. The recent 2.0 release makes it significantly more powerful: if hlint finds a .hlint.yaml file in your project root, it uses the definitions in this file to augment and customize its hinting rules. Combined with careful thought about your stylistic guidelines, this lets you bring machine-checked rigor to your Haskell codebase. We at Reconfigure.io have instituted several useful rules, and our codebase has improved considerably as a result.

hlint in Action

A simple example of customizing hlint is to ensure that debugging statements never land in customer-facing code. Though the trace, traceShow, and traceShowId functions provided by the Debug.Trace module are exceptionally useful during development, it's important that none of them inadvertently slip into our master branch, so we disallow them in every module by adding the following entries to the functions block of the .hlint.yaml in our project root.

- functions:
  - {name: trace, within: []}
  - {name: traceShow, within: []}
  - {name: traceShowId, within: []}
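As a rough illustration of what these entries are meant to catch, here is a small sketch. The function names sumSquaresDebug and sumSquares are hypothetical and not taken from our codebase. With the rules above in place, we would expect hlint to flag the call to trace in the first definition and to leave the second alone.

```haskell
import Debug.Trace (trace)

-- Hypothetical development-time version: the call to `trace` is the kind of
-- leftover debugging statement the restricted-function entries should flag.
sumSquaresDebug :: [Int] -> Int
sumSquaresDebug xs = trace ("input: " ++ show xs) (sum [x * x | x <- xs])

-- The same logic with the debugging call removed; nothing here is restricted.
sumSquares :: [Int] -> Int
sumSquares xs = sum [x * x | x <- xs]
```

Both functions compute the same value; the only difference is the message that trace writes to stderr when the first one is evaluated.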

Barring certain exceptionally unusual situations, unsafePerformIO has no place in production code: at best, its effects are unpredictable, and at worst it can outright crash your program, given that it's trivial to implement unsafeCoerce with unsafePerformIO. As such, we disallow its use:

  - {name: unsafePerformIO, within: []}
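To make the unsafeCoerce remark concrete, here is a minimal sketch of the well-known polymorphic-reference trick, assuming GHC. The names anyRef and badCoerce are hypothetical, chosen for illustration only, and this is code to avoid, not to imitate.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafePerformIO)

-- A single global reference that claims to hold a value of *any* type.
-- NOINLINE keeps GHC from duplicating the reference at each use site.
{-# NOINLINE anyRef #-}
anyRef :: IORef a
anyRef = unsafePerformIO (newIORef undefined)

-- Write a value at one type and read it back at another: the type checker
-- accepts this, which is exactly why it is dangerous.
badCoerce :: a -> b
badCoerce x = unsafePerformIO $ do
  writeIORef anyRef x
  readIORef anyRef
```

Used at matching types this simply returns its argument, but used at mismatched types (say, reading an Int back as a String) it type-checks and then misbehaves at run time, typically with garbage output or a crash.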

We at Reconfigure.io prefer monad operations that read left to right, so we enforce the use of >>= and >=> over their right-to-left counterparts =<< and <=<. We also prefer do-notation to chains of >>:

- error: { lhs: a <=< b
         , rhs: b >=> a
         , name: "Left to right monad operations" }
- error: { lhs: a =<< b
         , rhs: b >>= a
         , name: "Left to right monad operations" }
- error: { lhs: "(a =<<)"
         , rhs: "(>>= a)"
         , name: "Left to right monad operations" }
- error: { lhs: "(a >> b)"
         , rhs: "(do a; b;)"
         , name: "Use do-notation" }

[Figure: hlint output for the rules above]

Note that we can provide both an unwanted expression we want to match (with the lhs key) and its replacement (with rhs). This lets hlint both provide better error messages and perform automated refactoring (which we will discuss in a moment).
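To show what the lhs and rhs of these rules correspond to in real code, here is a small sketch. The helpers parseInt and halve, and the four definitions built from them, are hypothetical examples. Each right-to-left definition matches an lhs pattern above, and the left-to-right definition beside it is the form the matching rhs would suggest.

```haskell
import Control.Monad ((<=<), (>=>))
import Text.Read (readMaybe)

-- Two small Kleisli arrows in the Maybe monad (hypothetical helpers).
parseInt :: String -> Maybe Int
parseInt = readMaybe

halve :: Int -> Maybe Int
halve n
  | even n = Just (n `div` 2)
  | otherwise = Nothing

-- Matches the lhs pattern `a <=< b`.
composedRTL :: String -> Maybe Int
composedRTL = halve <=< parseInt

-- The corresponding rhs form, `b >=> a`.
composedLTR :: String -> Maybe Int
composedLTR = parseInt >=> halve

-- Matches the lhs pattern `a =<< b`.
boundRTL :: String -> Maybe Int
boundRTL s = halve =<< parseInt s

-- The corresponding rhs form, `b >>= a`.
boundLTR :: String -> Maybe Int
boundLTR s = parseInt s >>= halve
```

The paired definitions behave identically, so the rewrite is purely a matter of reading order: in the left-to-right forms the data flows in the same direction as the text.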

On the other hand, there are several hlint warnings that we want to disable by default. Reconfigure is built on a large legacy codebase (dating back to 2007, which is practically prehistoric in Haskell terms) that contains a number of warts. Though we’d like to fix these issues someday, we’re much more focused on features, performance, and bugfixes than on patching up legacy code, so we disable some of these warnings:

- ignore: {name: Use newtype instead of data} 
- ignore: {name: Use section, within: [ Latch ]}
- ignore: {name: Reduce duplication, within: [ NetParts ]}

While it’s possible to disable a given hlint error per-file with the ANN pragma, it's often much simpler to configure it globally.
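For comparison, a per-file suppression with the ANN pragma might look like the sketch below. The UserId type and shout function are hypothetical. The annotation strings follow hlint's "HLint: ignore ..." convention, and the hint names are the ones hlint itself reports.

```haskell
import Data.Char (toUpper)

-- Suppress one hint for the whole module.
{-# ANN module "HLint: ignore Use newtype instead of data" #-}

-- Without the annotation above, hlint would suggest a newtype here.
data UserId = UserId Int
  deriving (Eq, Show)

-- Suppress one hint for a single definition only.
{-# ANN shout "HLint: ignore Eta reduce" #-}
shout :: String -> String
shout s = map toUpper s
```

Every file that needs an exception has to carry its own pragmas, which is why a single ignore entry in the YAML file is usually less work.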

Extensive use of hlint can bring readability and clarity to your codebase, but it can come at a price. The more rules you add to your linter, the longer you have to spend fixing errors before your linter passes successfully. Though there is a degree of in-editor support for hlint, sometimes you just want the linter to take care of your issues for you.

Enter hlint --refactor. When a refactor can be automatically inferred, hlint --refactor FILE will apply all the refactors it can to a given file (this relies on the separate apply-refact tool being installed). This saves a considerable amount of time, especially when you've added a rule that's triggered often.

Building a Robust Pipeline

Tools like hlint truly shine when incorporated into a continuous-integration (CI) system. No matter what platform you use for CI (whether it's Jenkins, Travis, Circle, or any of the many other excellent systems on the market), it will support custom verification stages, during which tools like hlint can allow or block a build based on whether they find any errors. At Reconfigure.io we take a draconian approach to hlint compliance: any patch that does not pass muster with hlint will fail to integrate.

hlint is by no means the only useful tool available in the Haskell ecosystem. weeder is a robust tool to detect dead code and unused exports, stylish-haskell is a configurable autoformatter, and Liquid Haskell lets you express sophisticated invariants on top of the Haskell type system thanks to refinement types. By taking advantage of Haskell's deep ecosystem, you can bring concrete improvements to your Haskell codebase with a minimum of effort—and it'll pay off in the long run!