Continuous Integration in Haskell

Matthias Benkort
Apr 19, 2020 · 9 min read


Introduction

In this post, which is more of a memo to myself, I will go through how to set up a continuous integration (abbrev. CI) environment for Haskell projects and cover the following:

  • Github Actions workflows
  • Efficient build caching
  • Haddock documentation previews
  • Code coverage reports (with overlays)

This setup heavily relies on stack, for I am still not convinced by the added benefits of Nix’s complexity for simple Haskell projects which do not have any particular deployment ambition apart from publication on Hackage or Stackage. Here we go…

About Github Workflow

A little while ago, Github launched its own continuous integration platform called Github Actions. The idea is pretty similar to existing competitors (Travis, CircleCI, Jenkins…) yet it is directly integrated into Github. From what I’ve experienced so far, the feedback can be quite fast: less than 2 minutes between a push and the results from the CI environment.

Actions are made of Workflows, declared in a dedicated folder .github/workflows on the source repository. Each Workflow defines rules and conditions for triggering one or several jobs, as well as the steps to execute for each job. What I find particularly interesting with Github Actions is how they can be shared and re-used across environments. Github provides some built-in actions for the basic stuff, and the collection gets extended by the community every day. So, a big thank you to Justin Le (@mstksg) for pulling together some of the essential pieces of the workflow detailed below.

Let’s start with a minimal workflow that compiles and runs all tests on a Haskell project set up with stack:
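Something along those lines should do. This is only a minimal sketch: the job and step names are arbitrary, and it assumes stack is available on the runner (it comes preinstalled on Github-hosted Ubuntu images).

# .github/workflows/ci.yml -- a minimal sketch, adjust flags to taste
name: ci

on:
  pull_request:
  push:
    branches: [ master ]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2

      - name: Build
        run: stack --no-terminal build --fast

      - name: Test
        run: stack --no-terminal test --fast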

Basic Workflow

This basic workflow works great and is relatively simple. In a few lines, we get a continuous integration environment for compiling source code and running tests! Awesome!

Yet, it suffers from at least one big problem inherent to stack: it rebuilds the stack snapshot and all dependencies on every build. Even on small projects, this can already take 10 or 15 minutes. Furthermore, snapshot dependencies don’t usually change that often, so they’re a good candidate for caching.

Caching Dependencies

Caching is supported through a dedicated Github action, which however requires providing a cache key. That cache key is used to invalidate the cache when necessary. Ideally, this means that our cache key must change when anything in the snapshot also changes. One option could be to simply hard-code the snapshot resolver (e.g. lts-14.25) as a cache key, although:

a. This means we need to manually update the CI configuration every time we make a change to the snapshot. Even though it may not be that frequent, this is still quite annoying.

b. Stack is quite “smart” about what it builds from the snapshot: it doesn’t actually build the whole snapshot, but only the dependencies it has to (based on what’s specified in the `.cabal` file). Thus, adding some external dependencies to our project may end up requiring new packages from the snapshot as well, even though the resolver itself hasn’t changed.

Therefore, it is much nicer to use a hash of stack.yaml, which lists the project’s current snapshot and its extra dependencies, as the cache key. No need for a strong hash here; pretty much anything will do, like MD5. Then, it’s only a matter of knowing what to cache.
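Both the snapshot installation folder (~/.stack) and the project’s local build artifacts (.stack-work) are worth keeping. Here’s a sketch of the corresponding steps, using Github’s built-in cache action; the hashFiles expression conveniently computes a hash of stack.yaml for us:

# A sketch of the caching steps; ~/.stack holds the compiled snapshot
# packages, .stack-work holds the project's own build artifacts.
- name: Cache dependencies
  uses: actions/cache@v1
  with:
    path: ~/.stack
    key: ${{ runner.os }}-stack-${{ hashFiles('stack.yaml') }}

- name: Cache build artifacts
  uses: actions/cache@v1
  with:
    path: .stack-work
    key: ${{ runner.os }}-stack-work-${{ hashFiles('stack.yaml') }}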

Great, almost there. If we are library authors, we like to write nice documentation and make sure we give our future users all the keys they need to understand our library. So, it’d be great to:

a. Check that the documentation annotations in our source compile as well!

b. Export a preview of that documentation so that the latest documentation is always available even if the library hasn’t been published on Hackage and/or Stackage yet.

Haddock Preview

In this section, I’ll describe how we can automatically build and export the Haddock documentation of a project to Github pages. This makes sure that the documentation of a project always reflects the latest version, and is automatically hosted and available.

Before anything, we’ll have to make sure that the documentation is only exported upon successful merges. As a matter of fact, it would be quite unfortunate if any external contributor could override the project documentation via a mere PR! Therefore, we’ll restrict the documentation export to commits that are merged and pushed to the master branch.
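In terms of workflow, this boils down to a condition on the job plus a deployment step. Here’s a sketch; the deployment action is an assumption (peaceiris/actions-gh-pages is one popular choice for pushing a folder onto the gh-pages branch):

# A sketch of a documentation job, restricted to pushes on 'master'.
documentation:
  if: github.ref == 'refs/heads/master'
  runs-on: ubuntu-latest
  steps:
    - name: Checkout
      uses: actions/checkout@v2

    - name: Build Haddock
      run: |
        stack --no-terminal haddock --no-haddock-deps
        mv $(stack path --local-doc-root) haddock

    - name: Deploy to Github Pages
      uses: peaceiris/actions-gh-pages@v3
      with:
        github_token: ${{ secrets.GITHUB_TOKEN }}
        publish_dir: haddock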

With this, a preview of the Haddock documentation is automatically published every time something is merged to the master branch. See for instance input-output-hk/cardano-addresses.

As a library developer, this helps a lot when working on nice top-level interfaces for modules, making sure to hide the irrelevant parts and promote the relevant parts for library users. Even internally in a company, it encourages teams to push out nice documentation and use Haddock to share interfaces across teams.

Code Coverage

Disclaimer

Last but not least: program coverage. This is a somewhat controversial topic in many projects, and particularly in Haskell. I’ve even seen folks claim that “Haskell code doesn’t need testing” because the type-system is already sufficient for catching bugs.

This is wrong.

Programs need testing. Haskell is great because it allows for discarding a whole class of tests and makes it easier to focus on testing the parts that actually matter! I recall a previous JavaScript project where we’d have to test whether functions would not throw the infamous “undefined is not a function”. We had hundreds if not thousands of such tests in our project, just to make it possible to refactor code with some level of confidence. In Haskell, those concerns usually go away and we can instead write actual tests to look for bugs in the logic or in component integration.

It can be somewhat hard to know whether or not we have covered what needed to be covered. Code coverage can help here, to some extent. Looking at program coverage is a good way to identify areas that clearly lack testing. It doesn’t tell one much, however, about areas that are already covered (and whether they are sufficiently covered). Reaching 100% will never prove that a program can’t fail; it isn’t a formal proof. Still, aiming for good code coverage reduces the chances that something will effectively go wrong in practice. This, coupled with property testing and strong static typing, is what makes Haskell programs so robust.

When looking for code coverage in Haskell, one can’t escape the Haskell Program Coverage (abbrev. hpc) tool, developed at Galois some years ago. Hpc is a very capable albeit capricious tool, and it hasn’t got half of the love it deserves. The documentation is quite terse and it’s poorly supported by existing tooling.

There are some projects out there which attempt to circumvent some of these difficulties, but they are often old and only partially maintained. Besides, I find the local reports generated by hpc a lot nicer and more insightful than what platforms like coveralls.io or codecov.io present (not blaming them though; it’s hard to get language-specific features right on a generic platform, especially when the language is a bit exotic itself).

Indeed, hpc is tailored to Haskell programs, and gives us coverage information about top-level definitions, alternatives and expressions. It also reports partial coverage, which makes it easier to spot redundant conditions or untested branches. In addition, hpc supports coverage overlays to discard sections that are either not coverable (because they are actual code invariants) or pointless to test (like automatically derived instances). Overlays can be hard to set up, especially with stack, but I’ll show below how we can work around that nicely. To conclude, hpc HTML reports are already great, so what about simply exporting those to Github pages in our workflow as well?

As a first step towards improving on the Haskell Program Coverage tooling, I’ve compiled a little Makefile with what I’ll show below. This Makefile should make it slightly easier to integrate code coverage into existing continuous integration environments (on Github Actions or others).

Basic Coverage Reports

Getting coverage reports is pretty simple with stack, though it requires running the test suites with a special flag:

$ stack test --coverage 
$ stack hpc report --all

This can then be integrated into our workflow pretty easily, in a similar fashion to what we did for the Haddock preview.
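For instance, as an extra step in the master-only job (a sketch; the coverage folder name is arbitrary, and the result can be deployed with the same action as the documentation):

# A sketch: generate the coverage report and stage it for the same
# gh-pages deployment used for the Haddock preview.
- name: Code coverage
  run: |
    stack --no-terminal test --coverage
    stack hpc report --all
    mv $(stack path --local-hpc-root) coverage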

This is sufficient in most cases.

Nevertheless, program coverage in Haskell will often look quite low because some pieces of code do not easily get “tested”. Indeed, hpc can only tick parts of the code that end up being evaluated to WHNF. This is primarily a good thing: more than once have I been bitten by lazy evaluation in tests, where errors wouldn’t actually be triggered because they were never fully evaluated by test assertions! Looking at the coverage can help here, to notice entire sections that do not get evaluated although they clearly should.

On the contrary, there are some parts that don’t get evaluated and present no risk. For example, it’s quite common to have Show instances on data-types when using QuickCheck. Yet, Show is only truly required upon test failures, to display counter-examples! If the tests pass, Show will never be evaluated. Another example would be some uses of Proxy, which is typically not evaluated at the term-level because it carries information only relevant at the type-level.

In the end, it’s useful to remove these false alarms when doing a code coverage analysis, to reduce the amount of noise in the reports. It can look somewhat alarming to see that a module is only 50% covered by tests whereas in practice, the other 50% are just derived instances and Proxies. Hpc can achieve this through overlays! So let’s dive in:

Program Coverage Overlays

Hpc computes code coverage from .mix and .tix files, which capture which parts of the source were triggered by the test suites.

An overlay is an additional file that marks some of the source code as magically “covered”. Hpc can convert an overlay to a .tix file and combine it with other .tix files to make a final report. To construct an overlay file, one typically starts by generating a draft overlay which covers 100% of the program, and trims it down to make the final overlay.

Here’s the trick: hpc and stack don’t really play well together with regards to overlays. Indeed, stack generates packages prefixed with specific hashes, which helps it manage builds (via ghc-pkg) and know what should be rebuilt when code changes. There’s currently no built-in support in stack for working around this, so we’ll have to get our hands dirty a little bit.

One approach is to use hpc directly and to override module names with their corresponding hashes using some grep and sed commands. Thus, one can generate a draft with:

$ stack test --coverage
$ stack hpc report --all
$ stack exec hpc -- draft \
    --hpcdir=$(stack path --dist-dir)/hpc \
    --srcdir=. \
    $(stack path --local-hpc-root)/combined/custom/custom.tix \
  | sed "s/module \".*:/module \"/g" > draft.overlay

From there, we have a nice overlay file that gives 100% coverage of our library. We can now tweak it to make an overlay template that we’ll use in our final report. For example, let’s say I want to magically cover every derived Show and Eq instance in a module called “Cardano.Address”. I can simply declare:

"Cardano.Address" {                         
tick function "showsPrec";
tick function "==";
}

As a next step, I need to figure out which hashes are actually used by ghc-pkg underneath, in order to generate a report that combines the output of our test suites with this overlay:

# Extract the package name from the .cabal file
PKG_NAME=$(cat *.cabal | grep "name:" | sed "s/name:\s*\(.*\)/\1/")

# Recover the hash that stack / ghc-pkg assigned to the package
PKG_HASH=$(ls $(stack path --dist-dir)/hpc \
  | grep $PKG_NAME \
  | sed "s/$PKG_NAME-[0-9.]*-\([0-9A-Za-z]*\)/\1/g")

# Prefix module names in the template with that hash
TMP=$(mktemp)
cat template.overlay \
  | sed "s/module \"/module \"$PKG_HASH\//g" > $TMP

# Convert the overlay into a .tix file
stack exec hpc -- overlay \
  --hpcdir=$(stack path --dist-dir)/hpc \
  --srcdir=. \
  $TMP > overlay.tix

With this, we can generate a .tix file corresponding to our overlay, which can be included in our final report:

stack hpc report --all overlay.tix

That report can then be exported as shown in the previous section, so long as we version the overlay with the source code. As mentioned in the introduction, I’ve compiled these steps into a Makefile so it’s easy to re-use in a workflow:
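From the workflow’s point of view, it then boils down to a single step (a sketch; the target name here is hypothetical, check the Makefile for the actual ones):

# A sketch; 'make report' is a hypothetical target standing for the
# draft / overlay / report steps detailed above.
- name: Code coverage (with overlay)
  run: make report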

And, 🎉! A blazing-fast continuous integration setup with meaningful reports, documentation and caching. I’ve hosted a version of the final workflow setup as a Gist right here. The good news is that this is a drop-in workflow: project names and dependencies are entirely figured out from the project configuration. From there, it’s possible to add your favorite linter and/or extra checks! Enjoy!

Feedback is greatly appreciated. Thanks in advance ❤️
