HSoC — Hadrian Optimisation: Update 3

5 min readJul 9, 2019

Calling this week 3 would be a bit too much of a stretch considering I’m posting this near the beginning of week 5, so instead I’m going to switch to “updates” and aim to write them roughly weekly.

Most of the past week or so has been spent moving house (which was quite the ordeal, one I’m not looking forward to doing again next month), writing code that compiles HsInstances with different configurations of flags, and spending hours upon hours running that code, so let’s jump right in.

The Code

This section doesn’t relate to Hadrian directly and can mostly be ignored if you just want to cut to the chase, but it contains bits and pieces of the code I wrote for testing and things I found interesting.

Running system commands

It turns out running system commands in Haskell is pretty easy and basically any argument you could make for python shell scripts applies here in addition to the type safety and general functional goodness you get with Haskell. Who ever said Haskell wasn’t practical?

From System.Process you get system :: String -> IO ExitCode, which simply takes a command as a String and runs it, returning the resulting ExitCode. This is nice, but what if you want to know what the stdout and stderr for the process was (to see why a command failed for example)? Luckily there is systemOutput :: String -> IO (ExitCode, String) from System.Process.Extra , which works the same way but returns the ExitCode with stdout and stderr as a String.

The other essential function here is duration :: IO a -> IO (Seconds, a) from System.Time.Extra which simply performs an IO action and returns the time it took in Seconds (which is just a type synonym for Double), along with its output.

Getting the flags

Each optimisation is represented with the following data types, based off of the information for optimisations in ghc/docs/users_guide/using-optimisation.rst, using record syntax and Template Haskell to create lenses for Opt:

The weird spellings of reverse and default are because they (or more accurately the corresponding Template Haskell lenses created for them) conflicted with other functions, so I just added another letter to the end of each of them because I wasn’t feeling too creative at the time.

At first I tried to write my own parser with Megaparsec, but since Yoda is the only parser combinator library I had used up to this point I struggled with this. Yoda backtracks automatically, which Megaparsec can do manually to an extent with try, but crucially Yoda keeps track of partial parses, which is inefficient in most cases, but does make it easy to consume an unknown amount of text, reach a token, backtrack to just before the token, and save all that text to a particular part of a data structure before moving on. There must be ways of doing this with Megaparsec and similar libraries, but it’s certainly not as simple as wrapping the same code I’d write for Yoda with try. I’d like to explore that further at some point, but luckily I eventually had the sense to look for a .rst parser and found Pandoc does the job nicely.

The custom parsing wasn’t all for nothing though since Pandoc just gives me a string for _defaultt and I need an Info. Here’s the code that does that:

Then it’s mostly just a matter of unwrapping the Pandoc I get from the file using runIO and readRST to convert it into an Opt.

After extensive testing, it became clear that using-optimisation.rst had flags that did not apply to GHC 8.6.5, the version of GHC used by the Stack build script, and even versions of the file much older than the release of 8.6.5 seemed to have flags that were too new and therefore unrecognised. So instead I used man ghc and copied all the flags from there into a list and manually set parameters etc. Less fun but it mostly works, except for the reverse of -fmax-relevant-binds being listed as -fno-max-relevant-bindings instead of the correct -fno-max-relevant-binds. I’ll make an issue/merge request for that soon.

Testing the flags

To test a list of positive flags, this is the basic code I used:

We use some function genSubset to take a list of flags and make a list of subsets of those flags. I used different functions for this, but the main one I used is inits :: [a] -> [[a]] since it completes in a reasonable amount of time and tests all the flags, including all of them all together.

The reverse version (setting -O1 and then using reverse flags) is very similar and so is the one using [Opt] (it’s just slightly more involved with some use of lenses).

Results

To make a very long story short, none of the positive flags can account for the ridiculous compile times since even with all of them on they barely increase the time from -O0, but turning on -O1 and adding reverse flags does bring the time down by a decent amount (some flags more than others), but not all the way down to -O0.

This is a bit confusing since -O1 and -O2 are described as sets of compiler flags, but seemingly there’s either more to them than that or they use different parameters than the default for certain flags (though I have no idea how those parameters would be determined). The documentation on this issue is a bit thin and I don’t know where to start looking in GHC’s massive codebase to find out exactly what’s happening, so I’ll probably jump onto the GHC IRC and hope somebody knows.

I haven’t managed to find the problematic flag(s), but one thing is for certain: HsInstances compiles much faster with -O0. The question, then, is whether this unoptimised HsInstances slows down the rest of the build process. Spoiler: not really.

The following times were taking on my laptop using these commands:

./hadrian/build.stack.sh clean
./hadrian/build.stack.sh -j -B --flavour=quickest --integer-simple --configure --profile

As a baseline, the build took just over 30 minutes (two runs: 30m04s and 30m17s). By making HsInstances compile with -O0 by putting {-# OPTIONS_GHC -O0 #-} at the top of the file, well over 2 minutes was slashed off the time (two runs: 27m33s and 27m47s). It could be that other parts of the build process perform slightly slower, but that slow down is negligible compared to the reduced compile time.

However, adding the GHC options to the top of the file means that it will be compiled with -O0 for every stage of the build process, including the final one, which could result in a slower output GHC. It might be the case that HsInstances is already being compiled with -O0 for the output, which would explain why the stage1 compile is faster, but to be on the safe side it would probably be better to make a rule for the stage0 compile instead, since this would result in a purely beneficial speed-boost for the exact same output.

To make this rule requires properly diving into to Hadrian and figuring out what patterns are responsible and how to override them with special cases, since there’s no specific rule for HsInstances like there was for .dependencies.

Next Steps

Try to find out what’s going on with -O1 and -O2 from the IRC
Make a rule for HsInstances in stage0
Do a similar process for DynFlags after first checking if there’s any one thing in particular that’s slow to compile