HSoC — Hadrian Optimisation: Profiling and getting ahead of Shake changes (Update 5)

James Foster
4 min read · Aug 5, 2019


These past couple of weeks I’ve fixed some documentation, submitted a bug report about --rebuild and -B doing seemingly wrong (but useful) things, profiled Hadrian, and opened an MR to get ahead of some upcoming Shake changes.

The recent heatwave and my aging laptop (its age will hit double digits soon) have meant my timings have been pretty inconsistent. Whether it boots at all isn’t even a guarantee anymore.

But enough about that, let’s talk a little about profiling and the MR.

Profiling

I profiled Hadrian and Shake by changing hadrian/build.stack.sh from stack build --no-library-profiling ${HADRIAN_NIX:+--nix} to stack build --library-profiling --profile ${HADRIAN_NIX:+--nix}. You can do something similar with build.cabal.sh.

Then I used Profiterole to restructure the profile and hopefully gain new insights. Here are the results.

From the profile, it seemed like split :: (a -> Bool) -> [a] -> [[a]] might be causing problems, taking up 6% of the runtime in the profile. But after looking at the code, running some tests, and talking to my mentors, it seemed like it was already linear and not much could be done there. So instead I looked at the callers of split, the main one being (?==) :: FilePattern -> FilePath -> Bool, which takes up ~11% of the profile’s runtime.
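To make what these two functions do concrete, here’s a rough GHCi sketch. I’m assuming split here is the one from Data.List.Extra (or an equivalent helper inside Shake) and that (?==) is the matcher re-exported from Development.Shake; the paths are made up:

    ghci> import Data.List.Extra (split)
    ghci> import Development.Shake ((?==))
    ghci> split (== '/') "src/Rules/Library.hs"   -- break a path into components
    ["src","Rules","Library.hs"]
    ghci> "src/*.hs" ?== "src/Main.hs"            -- does the path match the pattern?
    True
    ghci> "src/*.hs" ?== "src/Rules/Library.hs"   -- * doesn't cross directory separators
    False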

That seems like quite a lot for one function, although after talking to Neil Mitchell about the accuracy of profiling, it seems best to take the results with a grain of salt.

If I’ve understood correctly, each time a function is called, some information is pushed to a stack for the purpose of recording how long it took. This is a very small operation, but it doesn’t come for free. The problem is that the time it takes is not proportional to how long the function runs, so the shorter the function call, the larger the relative overhead profiling adds: e.g. 2 ticks to record a 2000-tick function is just 0.1% overhead, but 2 ticks to record a 10-tick function is 20% overhead. This means that String operations in particular can be over-represented, because they are often defined recursively, going element by element, which means lots of short function calls, and the overhead builds up significantly.

If that description is reasonably accurate then I imagine this problem doesn’t apply to tail-recursive String functions, but I can’t say for sure. Either way, I don’t believe split or match (which account for most of that 11%) are tail recursive, so this overhead problem almost certainly applies to them.
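For intuition, here’s a simplified sketch of what a split like this typically looks like (not Shake’s actual code): the result is built with (:) around the recursive call, so it isn’t tail recursive, and it makes a short call per chunk, with break itself walking the String character by character.

    -- Simplified sketch of a predicate-based split; not Shake's real implementation.
    -- e.g. split (== '/') "a/b/c" == ["a","b","c"]
    split :: (a -> Bool) -> [a] -> [[a]]
    split p xs = case break p xs of
      (chunk, [])       -> [chunk]
      (chunk, _ : rest) -> chunk : split p rest   -- (:) wraps the recursive call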

FilePattern Matching

Accuracy of the profile aside, it turns out Shake is going to be moving over to a different, more efficient implementation of ?== anyway, so it’s worth getting ahead of. The semantics are slightly different, however, so the new implementation can’t be swapped in directly without a few changes to Hadrian’s FilePatterns first.

(?==) :: FilePattern -> FilePath -> Bool works how you’d probably expect it to: it tells you whether a given FilePath (type FilePath = String) matches a given FilePattern (type FilePattern = String), returning True if it does and False if it doesn’t. From the documentation of Development.Shake:

The new System.FilePattern ?== is similar, but doesn’t have any support for //, and ** is expanded to work for absolute path prefixes too. This means "**/foo.bar" ?== "/home/foo.bar" will be True for the new ?==, when previously for the old ?== it would be False.
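Restating the quoted difference as a GHCi sketch (the qualified imports are my assumption: Shake’s own matcher as it behaved at the time of writing, versus the new one from the filepattern package’s System.FilePattern):

    ghci> import qualified Development.Shake as Old   -- Shake's matcher at the time of writing
    ghci> import qualified System.FilePattern as New  -- the new filepattern library
    ghci> "**/foo.bar" Old.?== "/home/foo.bar"        -- ** doesn't cover the absolute prefix
    False
    ghci> "**/foo.bar" New.?== "/home/foo.bar"        -- ** covers the absolute prefix too
    True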

Thankfully this slight difference doesn’t seem to have caused any problems with removing // from Hadrian (including functions which use it, like <//>), which is what my latest MR does; everything builds just fine.

The conversion from // to ** is roughly as follows:

  • //a ==> **/a
  • a// ==> a/**
  • a//b ==> a/**/b
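As a rough sanity check of those rewrites against the new matcher, here’s a GHCi sketch using System.FilePattern from the filepattern package (the concrete paths are just made-up examples):

    ghci> import System.FilePattern ((?==))
    ghci> "**/a"   ?== "x/y/a"    -- was "//a"
    True
    ghci> "a/**"   ?== "a/x/y"    -- was "a//"
    True
    ghci> "a/**/b" ?== "a/x/y/b"  -- was "a//b"
    True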

Optimisation

Will this change lead to a significant speedup when ?== gets moved over to the new implementation? I’ve tried to collect timings using a local Shake repo to answer this question, but sadly my laptop is really on its way out after 9 years of use (the stress of 100% CPU usage for hours on end probably didn’t help), and I’m having real trouble building Hadrian on any other machine (getting the stack build fixed would really help in that regard). This has been a problem the whole time I’ve been working on this project, but it hasn’t been too big of a deal up until now since the stack build was working on my laptop. However, now that the runtimes are getting too inconsistent to be useful, I’m going to have to spend some time getting a build working on my desktop.

All that said, as much as I’d like for it to be faster, and System.FilePattern’s ?== does seem to be faster from the preliminary tests I did, I have doubts about how significant the difference will be. There’s no doubt that ?== gets called a decent amount, but the pessimist in me says that the profile is significantly over-representing its runtime because it works on Strings, and that in reality it isn’t taking up that much time.

More importantly, however, after doing all of this and writing this post, I only just thought to check how much time Hadrian itself actually runs for vs. how much of the runtime is spent in the commands it invokes. It turns out Hadrian is only running for about 30 seconds of the total ~26 minutes, so the maybe 10% of Hadrian’s own runtime that ?== takes up works out to roughly 3 seconds of the whole build, which is pretty insignificant. I feel rather silly for not thinking of this earlier, but on the bright side, at least the MR should help future-proof Hadrian, even if it doesn’t really speed it up all that much.

Next Steps

  • Write some formal documentation for Hadrian expressions
  • Sort out a hardware/software setup that gives me consistent timings
  • Try to reduce the critical path somewhat by looking for unused imports in files with high wall-clock times


James Foster

Computer Science student. I worked on Hadrian Optimisation for Google’s Summer of Code.