Formatting code in CodeWorld

Chris Smith
Jan 2, 2020

I’ve made some big changes in the last few days to code formatting in CodeWorld. You can try this out with Ctrl-I in any editor window.

Formatting in Haskell mode (an aside)

On Halloween, I switched CodeWorld’s Haskell formatter from hindent to Ormolu. I suppose I should say something about the reasons for the change.

First, hindent was no longer maintained, and it seemed like the community was moving in different directions. The other two popular formatting tools for Haskell are Brittany and Ormolu. I find a lot to like about Brittany, but unfortunately I was unable to use it because of its license. This led me to evaluate Ormolu.

I find a lot to like about Ormolu’s philosophy that it will keep the programmer’s choices about formatting and make them consistent. My best experiences with autoformatting code come when I can type something like what I meant, and then hit a button and let the system just clean it up for me. I don’t much care whether the same AST always produces the same formatting (and, in fact, I don’t even believe the notion is well-defined), and if I see something that doesn’t look great, I love the idea that I can nudge things in a different direction, and the formatter will take the hint.

So I’m happy to jump on Ormolu’s bandwagon for a bit, particularly since this isn’t a production-critical use case. But Ormolu is definitely not yet ready for prime time. It mangles formatting in lots of ways, but the worst is removing all blank lines in do blocks, turning your well-written code into garbage. So use at your own risk. I felt like including a premature but promising project was a better choice than sticking with a dead-end tool.

Formatting in the educational dialect

That’s the less interesting change, though. The other thing I’ve been working on is formatting in the educational dialect of CodeWorld. Here, no existing Haskell formatter will do the job, since this dialect has different conventions and requirements.

Some of the highlights:

  • The educational dialect uses uncurried functions, which should be written in standard math syntax like f(x), and not f (x) with a space. This is a minor point, but a big deal, and it was surprisingly tricky to retrofit into an existing formatting engine, since changes in columns in one line trickle down into alignment decisions in later lines.
  • Concerns like minimizing diffs don’t matter at all for this tool. Concerns like making the structure of code more apparent at a glance matter a lot. So alignment, in particular, is very valuable.
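To make the first point concrete, here is a minimal sketch of the spacing rule alone. It is not CodeWorld's actual implementation: `tightenCalls` and its keyword list are hypothetical names made up for this illustration, and the sketch ignores string literals and comments. It also does nothing about the harder part mentioned above, which is re-aligning later lines once columns have shifted.

```haskell
import Data.Char (isAlphaNum)

-- Hypothetical helper (not CodeWorld's code): remove the spaces between a
-- function name and the parenthesis that opens its argument list, so that
-- `f (x)` becomes `f(x)`. Keywords are skipped, so `if (x > 0)` keeps its
-- space. String literals and comments are not handled in this sketch.
tightenCalls :: String -> String
tightenCalls = go ""
  where
    -- `word` is the identifier read so far, stored in reverse order.
    go word (c : rest)
      | isIdentChar c = c : go (c : word) rest
      | c == ' ',
        not (null word),
        reverse word `notElem` keywords,
        ('(' : _) <- dropWhile (== ' ') rest =
          -- Drop this space and any that follow; the '(' is emitted next.
          go word (dropWhile (== ' ') rest)
      | otherwise = c : go "" rest
    go _ [] = []

    isIdentChar c = isAlphaNum c || c == '_' || c == '\''
    keywords = ["if", "then", "else", "let", "in", "where", "case", "of"]
```

For example, `tightenCalls "if (x > 0) then f (x) else g (x)"` evaluates to `"if (x > 0) then f(x) else g(x)"`, while `"a + (b)"` is left unchanged because no identifier precedes the parenthesis.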

I already had an implementation from last summer of auto-indent for the educational dialect. It wasn’t always perfect, but teaching with it in the Fall semester dramatically improved student experiences, as students were much less likely to run into problems with layout and indents. Fernando Alegre filed a bug a few days ago complaining that it should be a lot better. (He said less opinionated, but I am convinced that what it needed was better opinions, not fewer opinions.)

So I started working on that. In the process, I ended up realizing that I can do a quick formatter for the educational dialect by just running the auto-indent algorithm on each line! Okay, there’s more subtlety than that: just auto-indenting can choose the wrong layout levels for some lines, so I first walk through the code, noting the layout level at the beginning of each line. Then I walk through again, run the auto-indent, and then ask it to increase or decrease the indent until the line is back at its correct layout level.
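The two-pass idea can be sketched roughly as follows. This is a simplified model under loud assumptions, not the real formatter: "layout level" is approximated here as nesting depth in a stack of indentation columns, and a fixed two-space step stands in for the auto-indent algorithm, so the "increase or decrease until the level matches" step collapses into computing the column directly. All names (`layoutLevels`, `reindent`, `formatSource`) are hypothetical, and tabs are not handled.

```haskell
-- Column of the first non-space character on a line.
indentOf :: String -> Int
indentOf = length . takeWhile (== ' ')

isBlank :: String -> Bool
isBlank = all (== ' ')

-- Pass 1: note the layout level at the beginning of each line.
-- `stack` holds the indentation columns of the enclosing blocks,
-- innermost first. Blank lines get level 0 and do not affect the stack.
layoutLevels :: [String] -> [Int]
layoutLevels = go []
  where
    go _ [] = []
    go stack (l : ls)
      | isBlank l = 0 : go stack ls
      | otherwise = (length stack' - 1) : go stack' ls
      where
        col = indentOf l
        -- Close every block that is indented deeper than this line.
        open = dropWhile (> col) stack
        -- Stay in the current block, or open a new one at this column.
        stack' = case open of
          (top : _) | top == col -> open
          _ -> col : open

-- Pass 2: re-indent one line so that it sits at the recorded level.
-- A real auto-indent would pick alignment-aware columns instead of 2 * level.
reindent :: Int -> String -> String
reindent level l
  | isBlank l = ""
  | otherwise = replicate (2 * level) ' ' ++ dropWhile (== ' ') l

-- Run both passes over a whole source file.
formatSource :: String -> String
formatSource src = unlines (zipWith reindent (layoutLevels ls) ls)
  where
    ls = lines src
```

In this model, `layoutLevels ["a", "    b", "    c", "a2"]` gives `[0,1,1,0]`. Recording the levels first is what keeps the second pass from moving a line into a different block, which is the subtlety described above.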

I did this originally just to quickly see the results of my auto-indent changes on a large body of code, and it was super helpful. I found a dozen or so little bugs and regressions that would have been missed otherwise. But when I was done, I realized that this is a really useful tool. I now plan to tell my students that before they ask for help on a parse error, they must run the formatter. I predict that much of the time, their mistake will be obvious after just that step.

My second lesson: formatting Haskell (even a small educational dialect) is not easy! I’m glad I have collected a large number of samples of code (my own, my students’, and others’ students’) to test on, because they turned up all kinds of crazy corner cases and weird examples.

Chris Smith

Software engineer, volunteer K-12 math and computer science teacher, author of the CodeWorld platform, amateur ring theorist, and Haskell enthusiast.