Incremental Parsing in Amp

With the latest release of Amp, syntax highlighting has been overhauled. This post dives into the performance-related redesign that landed in v0.5.

Background

First and foremost, it’s worth noting that in any text editor, only a subset of the buffer’s content is on-screen at any given time. A relatively small window into the buffer is scrolled up and down as required:

The highlighted section (B) is the window of visible content.

When rendering a buffer, lines in the scrolled region (A) are ignored, and those in the visible region (B) are rendered to the screen. Even though we’re not rendering lines in the scrolled region, we still need to parse them, as they establish state before entering the visible region. After the visible region is rendered, the process stops; the lines in the pending region (C) aren’t processed at all.
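To make the three regions concrete, here’s a small sketch that partitions a buffer’s lines given a scroll offset and window height. The function and its signature are illustrative, not Amp’s actual API:

```rust
use std::ops::Range;

/// Split a buffer's lines into the scrolled (A), visible (B), and
/// pending (C) regions, given the first visible line and window height.
fn regions(
    line_count: usize,
    scroll_offset: usize,
    height: usize,
) -> (Range<usize>, Range<usize>, Range<usize>) {
    let visible_end = (scroll_offset + height).min(line_count);
    (
        0..scroll_offset,           // A: parsed, but not rendered
        scroll_offset..visible_end, // B: parsed and rendered
        visible_end..line_count,    // C: untouched
    )
}
```

With a 4000-line buffer scrolled to the bottom of a 60-line window, region A is 3940 lines that must be parsed before a single visible line is drawn.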

Unfortunately, things get slow as the scrolled region grows. If you scrolled to the bottom of a 4000-line file, rendering just the last 60 lines would require parsing 3940 scrolled lines just to prepare the parser state. It’s also important to remember that the buffer is re-rendered on every keystroke. This inefficiency has a tangible impact: on a 2015 MacBook Pro, the latency starts to become noticeable around ~2k lines; at 7k, the editor is completely unusable.


Fortunately, there’s an interesting property of data formats and languages that might not be readily apparent: edits can only reclassify content at or after the edit location. We’ll call this “forward invalidation”, for later reference.

Let’s start with a simple example:

let buffer = Buffer::new();
buffer.insert(content);

Let’s then add a quote to the start of the last line:

let buffer = Buffer::new();
"buffer.insert(content);

Content after the quote is re-classified as a string, but the preceding line hasn’t been affected. This is important: the parser state leading into the second line is the same in both examples. If we cache its state at the end of the first line, we can skip it the next time and start on the second line without issue.
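A toy model makes the idea concrete. Suppose the parser’s entire state were a single “am I inside a string literal?” flag — a deliberate simplification; the real parser state is much richer:

```rust
/// Toy parser state: just whether we're inside a string literal.
#[derive(Clone, Copy, PartialEq, Eq, Debug, Default)]
struct ParseState {
    in_string: bool,
}

/// Parse one line, threading through the state from the previous line.
fn parse_line(mut state: ParseState, line: &str) -> ParseState {
    for ch in line.chars() {
        if ch == '"' {
            state.in_string = !state.in_string;
        }
    }
    state
}
```

The state at the end of the first line is identical in both examples, so resuming from a cached copy of it and re-parsing only the edited second line produces the same result as parsing from the top.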

That’s how Amp’s render process works as of v0.5. We resume the parser just before the visible region and skip a whole lot of work.

Implementation Details

Of course, the devil is always in the details, and “just cache the parser state” ended up being anything but. First off, the initial parser design wouldn’t work:

let data = String::from("buffer content");
let tokens = TokenSet::new(data);
for token in tokens.iter() {
    // render token to the screen
}

There’s a subtle data relationship above that might not be obvious. The TokenSet instance, which acts as a parser, takes ownership of data. The call to iter() creates an iterator that then borrows and yields slices of data (if you’re curious why we need two types for this, I’ve written about that, too).

This kind of design is idiomatic in Rust. The standard library’s String type has a similar chars() method for iterating over individual characters.

Here’s the problem: all of these types are inextricably linked to one another, and if we want to cache the parse state, we’re also stuck caching the data it points to. It’s almost comical: tens, possibly hundreds of copies of a buffer kept in memory, just to support caching. Even worse: it also prevents us from re-using a cached parser with modified data. TL;DR: rm -rf.

In the new design, parser state and buffer data are kept separately. The parser doesn’t take ownership of data; it borrows it one line at a time and yields offsets that point to ranges of data, rather than the data itself. This keeps the parser memory footprint small, and allows it to be resumed with modified content.
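Sketched in code — with illustrative names, not Scribe’s actual API — the resumable parser holds only its own state, borrows each line, and yields byte ranges rather than slices:

```rust
use std::ops::Range;

/// Minimal resumable parser that classifies string literals.
/// It owns no buffer data; its state is a single flag.
#[derive(Clone, Default)]
struct Parser {
    in_string: bool,
}

impl Parser {
    /// Borrow one line and return the byte ranges of its string tokens.
    fn parse_line(&mut self, line: &str) -> Vec<Range<usize>> {
        let mut tokens = Vec::new();
        // If the previous line left us inside a string, it continues at 0.
        let mut start = if self.in_string { Some(0) } else { None };
        for (i, ch) in line.char_indices() {
            if ch == '"' {
                match start.take() {
                    Some(s) => tokens.push(s..i + 1), // closing quote
                    None => start = Some(i),          // opening quote
                }
            }
        }
        if let Some(s) = start {
            tokens.push(s..line.len()); // string runs off the line's end
            self.in_string = true;
        } else {
            self.in_string = false;
        }
        tokens
    }
}
```

Because the state is a small, self-contained value, cloning it into a cache is cheap, and a cached clone can be resumed against a buffer that has since been edited.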

The Cache

Populating the cache is easy enough: Amp’s view manages a collection of per-buffer render caches. When a buffer is rendered, its parser state is cached every 100 lines.

Using the cache is also fairly straightforward: the cache is searched for a parser state that precedes, and is as close as possible to, the visible region. The render process skips ahead to that point and picks up where it left off.
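Under assumed names (Amp’s real types differ), the cache could be as simple as an ordered map from line number to parser state:

```rust
use std::collections::BTreeMap;

/// Stand-in for the real parser state.
#[derive(Clone, Copy, PartialEq, Debug, Default)]
struct ParserState {
    in_string: bool,
}

/// While rendering, record the state every 100 lines.
fn maybe_cache(cache: &mut BTreeMap<usize, ParserState>, line: usize, state: ParserState) {
    if line % 100 == 0 {
        cache.insert(line, state);
    }
}

/// Find the cached state closest to, but not after, the first visible line.
fn resume_point(
    cache: &BTreeMap<usize, ParserState>,
    first_visible: usize,
) -> Option<(usize, ParserState)> {
    cache
        .range(..=first_visible)
        .next_back()
        .map(|(&line, &state)| (line, state))
}
```

An ordered map makes the “closest preceding entry” lookup a single range query, and (as we’ll see next) makes forward invalidation a single split.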

Unsurprisingly, invalidation is more involved. When a buffer is modified, any cached states at or after the modification need to be cleared (remember that “forward invalidation” property we mentioned earlier?). The trick is, buffers are modified in a lot of different places in Amp. To handle this, we need to dig a little deeper: we’ll track this event in the Buffer type itself.

The thing is, Amp doesn’t define the Buffer type. All of the generic, editor-independent types were built as a separate crate called Scribe. Polluting that crate with a bunch of Amp-specific cache invalidation cruft would be a mistake; we want this to be as subtle as possible. All we really need is a “change” hook; we can let the caller decide how to handle it.

Given that, Scribe’s edit operations have been updated to trigger an optionally-configured change_callback closure. You can build a closure to do whatever you’d like in response to the event. Amp configures one with shared access to the render cache, guarded by a RefCell. Every time an edit is made, cached entries at or beyond the edit location are invalidated via the change callback.
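Put together, the hook might look like this sketch — the Buffer here is a stand-in with invented fields and method signatures, not Scribe’s real API:

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;
use std::rc::Rc;

/// Stand-in buffer type with an optional change hook.
struct Buffer {
    change_callback: Option<Box<dyn Fn(usize)>>,
}

impl Buffer {
    fn insert_at_line(&mut self, line: usize, _text: &str) {
        // ...apply the edit to the buffer data, then fire the hook.
        if let Some(callback) = &self.change_callback {
            callback(line);
        }
    }
}

/// Wire a buffer's change hook up to a render cache shared via Rc<RefCell>.
fn connect(cache: Rc<RefCell<BTreeMap<usize, u32>>>) -> Buffer {
    Buffer {
        change_callback: Some(Box::new(move |line| {
            // Forward invalidation: drop every entry at or after the edit.
            cache.borrow_mut().split_off(&line);
        })),
    }
}
```

The Rc<RefCell<…>> pairing is what lets the view keep using the cache while the closure, owned by the buffer, retains mutable access to it for invalidation.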

Right. Now that the implementation is in place, let’s run the numbers.

Measuring Impact

Here are benchmarks taken of the render process before the caching implementation is in place:

scrolled buffer rendering (127 lines)
 time: [7.1372 ms 7.1492 ms 7.1728 ms]
scrolled buffer rendering (6916 lines)
 time: [453.05 ms 453.41 ms 453.83 ms]

And here are benchmarks taken after:

scrolled buffer rendering (127 lines)
 time: [1.7706 ms 1.7723 ms 1.7754 ms]
 change: [-75.698% -75.437% -75.209%] (p = 0.00 < 0.05)
 Performance has improved.
scrolled buffer rendering (6916 lines)
 time: [2.0095 ms 2.0107 ms 2.0121 ms]
 change: [-99.558% -99.557% -99.555%] (p = 0.00 < 0.05)
 Performance has improved.

Nice, even small buffer performance has improved! More importantly, performance remains consistent at both extremes (within 0.3 ms); scrolled buffer content size has very little effect on rendering performance.

Well, mostly: the first render still has to populate the cache. In practice, this is minor (see the initial benchmark for numbers) and infrequent; editing large files is considerably more enjoyable!

Interested in giving it a try? You can find more info on Amp here.