Why Not Both?

Build Haskell projects with either cabal or stack.

fommil

--

The State of Haskell 2018 survey results show that the community is split between cabal (aka cabal-install) and Stack as its build tool.

Although this split can get heated at times (duplicated effort, risk of tribalism, etc), I see it as a user choice that could lead to greater levels of happiness for every developer.

However, when build tools are mandated by management or legacy decisions, it is no longer a choice and we can become resentful of a tool that is not a good fit for our personal workflow.

“The Tooling War is a limiting factor to Haskell adoption. We must endure any cost for a truce.” — Thomas Tuegel

In this letter I will explain what cabal, Stack, Hackage and Stackage are, and describe how developers can use freeze files to access Stackage curated sets from cabal builds. I’ll also show how to build stack.yaml projects with cabal.

Truly, both excellent tools can live together and we can get on with the business of writing beautiful Haskell together!

Cabal, Hackage and Stackage

When I was learning Haskell, the tooling infrastructure was by far the most difficult thing to understand. This is a tour of what I learnt.

When we say cabal, we mean the command line tool cabal (lowercase). Both cabal and Stack are built on top of the Cabal (capitalised) library.

Cabal reads .cabal files, called package descriptors, a simple hierarchy of keys and values. Here is an example: jaeger-flamegraph.cabal

With this standalone description of a package, Free Software contributors can upload their applications and libraries to Hackage, the Haskell community’s central package archive. Here is an example: jaeger-flamegraph.

Once on Hackage, it becomes searchable by Hoogle with docs and source code available for browsing by everybody. Here is an example: Data.List

.cabal files, consumed by both cabal and Stack, can also be used to produce proprietary binaries for distribution to private customers.

In most major programming languages, transitive dependencies between libraries cause painful compatibility problems. For example, the Scala community recently decided to invest in managing transitive dependencies because it has become such a problem. Cabal already has a solution: each dependency can have a series of constraints, listing both the minimum compatible version (as per industry standards) and a maximum compatible version, which I believe is unique to Haskell. More details in the Package Versioning Policy FAQ.

From a practical point of view, library and application authors typically declare “foo version 1.0.3 works for me” and get on with writing their code. This implies it won't necessarily work for foo version 1.1.0 or 2.0.0 (when/if they are released). It is not always possible to know the maximum compatible version of a dependency when publishing. Therefore, Hackage has the concept of Revisions, which allow trustees and maintainers to change the constraints without requiring the code to be re-published.
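To make the bounds concrete, here is a minimal sketch of a package description, written from the shell so it can be pasted into a scratch directory. The package name example, the file name example.cabal and the dependency foo are all hypothetical. The caret operator is shorthand: foo ^>= 1.0.3 means the same as foo >= 1.0.3 && < 1.1.

```shell
# Write a hypothetical package description with explicit version bounds.
cat > example.cabal <<'EOF'
cabal-version: 2.2
name:          example
version:       0.1.0.0
build-type:    Simple

-- base: minimum and maximum compatible versions, spelled out
-- foo:  "works for me" bound, equivalent to >= 1.0.3 && < 1.1
library
  exposed-modules:  Example
  hs-source-dirs:   src
  default-language: Haskell2010
  build-depends:    base >= 4.11 && < 4.13
                  , foo  ^>= 1.0.3
EOF
```

With these bounds the solver may pick foo 1.0.4 when it is released, but not foo 1.1.0 or 2.0.0 until the constraint is widened, either in a new release or through a Hackage revision.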

Stackage, on the other hand, is a list of packages, versions, and build flags that is kept up to date by the Stackage Curators. Stackage is amazing: it is effectively a list of packages that are known (asserted by automated builds and tests) to work with each other. Every JVM project I’ve ever worked on has maintained an independent list of maybe-compatible libraries, which often blew up at runtime. Some larger organisations centrally managed these lists, but they were never shared with the wider community. The Haskell community has centralised this job and we can all update our dependencies on a weekly rolling schedule, together!

Hackage and Stackage are the Yin and Yang of package management: how to publish a Free Software ecosystem, and how to gradually evolve commercial applications against that ecosystem.

cabal uses constraint solving of raw Hackage data by default, and Stack uses Stackage curated sets by default. It should therefore be no surprise that library authors typically prefer cabal and application developers typically prefer Stack. But these are just the defaults, and can be changed.

Build Tools

The command line tools cabal and stack both build on top of the Cabal library and its .cabal file format.

The main problem that was addressed by Stack, and why it is so popular, was solving “cabal hell”: cabal once had a single repository for all applications and libraries installed by a user. This meant that everything had to be compatible. Having to delete the ~/.ghc folder and start again was a common complaint among Haskell developers, with a stopgap "sandbox" solution being available (but unpopular) for a short time.

stack, in contrast, used isolated and shared caches out of the box, which Just Worked for most developers. Recent versions of cabal have caught up with the v2-* (also called new-*, or "nix like") commands, which also use isolated, shareable caches. The hellscape is a thing of the past!

cabal

A modern cabal-based project may have three files in addition to the .cabal file, all of which are ignored by Hackage and downstream dependencies:

  1. cabal.project listing where to find the .cabal files for a multi-package build. Additional constraints, flags or optimization levels may be specified. This file can be used to be more specific about how developers can build an application, including specifying git repository overrides of Hackage, and requiring a specific version of ghc.
  2. cabal.project.local is exactly the same format as cabal.project but typically ignored by git, allowing individual developers to customise their workflow, e.g. adding -ferror-spans may improve IDE integration.
  3. cabal.project.freeze supports only a subset of the cabal.project format and is typically auto-generated by cabal v2-freeze. This stores the state of the constraint solver, effectively producing a package set specific to your project!

For example, if we have an application that depends on foo ^>= 1.0.3 (i.e. "works for me" constraints), then when foo version 1.0.4 is released, cabal will automatically upgrade our dependency. With a freeze file, we stay frozen at the previous version.
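A rough sketch of how the three files might look for a small project follows, written from the shell so it can be tried in a scratch directory. Everything specific here is an assumption for illustration: the package names my-app and foo, the compiler version, the git URL and the commit hash are placeholders, and a real freeze file generated by cabal v2-freeze would list every transitive dependency.

```shell
# cabal.project: shared by the whole team and committed to git.
cat > cabal.project <<'EOF'
packages: ./*.cabal
with-compiler: ghc-8.4.4
optimization: 2

-- use a git checkout instead of the Hackage release (placeholder URL and tag)
source-repository-package
  type: git
  location: https://example.org/foo.git
  tag: 0123456789abcdef0123456789abcdef01234567
EOF

# cabal.project.local: personal tweaks, usually listed in .gitignore.
cat > cabal.project.local <<'EOF'
package my-app
  ghc-options: -ferror-spans
EOF

# cabal.project.freeze: normally written by `cabal v2-freeze`.
cat > cabal.project.freeze <<'EOF'
constraints: any.foo ==1.0.3
EOF
```

Deleting cabal.project.freeze and running cabal v2-freeze again lets the solver pick up newer versions that still satisfy the constraints in the .cabal file.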

But the package set doesn’t have to be produced by manually solving the constraints: we can use Stackage’s freeze files! For example, using LTS 12.21 from cabal takes a single command:

curl https://www.stackage.org/lts-12.21/cabal.config > cabal.project.freeze

stack

stack uses .cabal files but encourages users to write package.yaml files that are pre-processed by hpack to produce .cabal files. This allows common stanzas and constraints to be shared across .cabal files, although many users do not make use of this feature!

An additional benefit of hpack is that it can automatically generate the other-modules section by scanning the files on disk.

For example: this package.yaml produces this registry.cabal.

Hopefully the Cabal library will adopt support for sharing common stanzas and auto-generating the modules, as these are convenient features to have. In the meantime, both cabal and Stack users can use hpack in their projects: so long as they remember to create .cabal files before uploading to Hackage!

The other required file is stack.yaml, which is where the version of Stackage is defined along with version overrides and extra dependencies (including git repositories). Much like cabal.project, it is ignored by Hackage.
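As an illustration, a small stack.yaml could look like the sketch below, again written from the shell. The extra dependency foo-1.0.3, the git URL and the commit hash are hypothetical placeholders; only the resolver name comes from the LTS version used earlier in this letter.

```shell
# Write a hypothetical stack.yaml for a single-package project.
cat > stack.yaml <<'EOF'
# which Stackage snapshot to build against
resolver: lts-12.21

packages:
- .

extra-deps:
# a Hackage release that is not in the snapshot (hypothetical)
- foo-1.0.3
# a git repository pinned to a commit (placeholder URL and hash)
- git: https://example.org/bar.git
  commit: 0123456789abcdef0123456789abcdef01234567
EOF
```

The resolver plays the role that cabal.project.freeze plays for cabal, while extra-deps covers roughly the same ground as extra constraints and source-repository-package stanzas in cabal.project.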

It is possible to access the cabal solver from stack, overriding the Stackage package versions in stack.yaml, see stack solve. If we are writing a library with stack, it is still up to the library author to provide sensible constraints in their package.yaml / .cabal files.

Strengths

Stack takes a holistic approach and comes with many features that are aimed at beginners. For example, the stack binary will automatically download the version of ghc that should build the project. I've personally not had a good experience with these automated ghc downloads on Ubuntu, CentOS, Debian or ArchLinux but it seems to work for the majority of Haskell developers.

cabal prefers to follow the UNIX philosophy and often requires an extra step. For example, ghc must be installed through the operating system's package manager (whose packages tend to be very reliable) or with the ghcup script, which makes it easy to build ghc from source (an attractive option for security-conscious organisations). The ghcup approach is comparable to the one taken by popular languages such as Java, Python, Scala and Rust.

stack will automatically convert package.yaml files into .cabal files, whereas the equivalent behaviour in cabal requires a wrapper script:

#!/bin/sh
# save this file as cabal-hpack and run instead of cabal
find . -maxdepth 2 -name package.yaml -exec hpack {} \;
exec cabal "$@"

stack comes with a --docker feature that builds the project inside a dockerised environment. With a small shell script, cabal-docker, we can accomplish the same in cabal. Arguably cabal-docker is the most minimal Haskell installation of all, as it can be used to build projects and run tests with no dependencies besides docker!
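The cabal-docker script itself is not reproduced in this letter. As a rough idea of what such a wrapper could look like, here is a sketch that is not necessarily how the real script works. It assumes the official haskell image from Docker Hub; the tag 8.4.4, the IMAGE variable and the volume name cabal-docker-store are all assumptions made for illustration.

```shell
# Create a hypothetical cabal-docker wrapper in the current directory.
cat > cabal-docker <<'EOF'
#!/bin/sh
# Sketch only: run cabal inside a container, mounting the project directory.
# IMAGE is an assumption; override it to pick another image or ghc version.
IMAGE="${IMAGE:-haskell:8.4.4}"
# A named volume keeps the package index and build cache between runs.
exec docker run --rm \
  -v cabal-docker-store:/root/.cabal \
  -v "$PWD":/workdir \
  -w /workdir \
  "$IMAGE" cabal "$@"
EOF
chmod +x cabal-docker
```

Under these assumptions, ./cabal-docker v2-update followed by ./cabal-docker v2-build would fetch the package index and build the project without ghc or cabal being installed on the host.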

Although stack began its life with best-in-class caching, cabal has now caught up and overtaken it. stack struggles to support changing compiler flags (e.g. swapping between --fast builds and regular builds) whereas cabal can swap between -O0 / -O1 / -O2 without missing a beat. stack is unable to share the caches of extra-deps or git sources, whereas cabal treats everything equally and can share builds between projects.

Both stack and cabal can be used to create reproducible builds, with a caveat: ghc is non-deterministic and can produce different binary outputs for the same input. This means that Haskell builds cannot, at present, be validated by a third party or repeated exactly.

Arguably, Hackage revisions can interfere with the reproducibility of a build: some aspects of the build may change (e.g. versions of build-depends code generators). stack and Stackage go to some additional lengths to pin to specific revisions of packages. In cabal, these corner cases are mitigated by using allow-newer, allow-older, or index pinning.
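For instance, index pinning and bound relaxation can both be expressed in cabal.project. The sketch below appends them from the shell; the timestamp is an arbitrary example and foo is a hypothetical dependency.

```shell
# Append pinning settings to cabal.project (creates the file if missing).
cat >> cabal.project <<'EOF'

-- only consider Hackage uploads and revisions made before this instant
index-state: 2018-12-20T00:00:00Z

-- ignore the upper bound that foo declares on base (foo is hypothetical)
allow-newer: foo:base
EOF
```

With index-state set, revisions published after that timestamp no longer affect the solver, so two developers building the same commit see the same view of Hackage.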

Of course, the easiest way to achieve reproducible builds is to ban network lookups during the build and to have everything locally available, which is the industry standard in military contracting and investment banking.

Unfortunately, cabal can lag behind stack for features, simply because it is currently the second-most popular build tool. However, it usually only takes a little bit of coding to reach parity. For example, stack comes with hoogle integration and can generate a database for the current project in a single command. cabal-hoogle will soon fill this void.

stack2cabal

I wanted to use cabal on a project that is defined using stack.yaml because caching semantics have a huge impact on my productivity.

It turns out that there are several tools already available to make the conversion (apparently unaware of each other’s existence, as they have the same name)!

  1. stack2cabal by Edsko de Vries
  2. stack2cabal by Lars Brünjes
  3. stack2cabal by Tseen She
  4. jenga by Erik de Castro Lopo

I tried all four but was most satisfied by Tseen’s variant. I installed it with

cabal v2-install stack2cabal

Simply run stack2cabal every time the stack.yaml or package.yaml changes and it will generate a cabal.project and a cabal.project.freeze.

For example, try building stack itself:

git clone git@github.com:commercialhaskell/stack.git
cd stack
stack2cabal
cabal v2-build

and then install it into ~/.cabal/bin:

cabal v2-install exe:stack --overwrite-policy=always

If overwrite-policy isn't available, make sure to upgrade to the latest cabal.

The Cabal User’s Guide is a great place to go from here to learn how to run tests, benchmarks, and more.

Now we can get back to the real holy war: Emacs vs Vim!!!
