Detecting invariants on the web

Alistair Roche
5 min read · Jul 21, 2015


The QA process sucks. Manual testing is slow and expensive, and scripting those manual tests with a tool like Selenium or PhantomJS is a pain that every developer I know tries to avoid. As a result, automated browser tests (when they exist!) often lie around broken and disused, despite their clear benefit. The cost is still too high.

A while back I went around Melbourne interviewing lots of tech companies about their experiences with QA. It was a pain point, but not one worth devoting their engineering time to — I was told repeatedly that they’d rather let bugs go to production than force their developers to maintain a boring, flaky test suite. One recurrent theme was everyone’s wish that they could just point something at their web app that would magically monitor it for breaking changes — and not just visual regressions.

At the time I wanted something I could start building and selling straight away, so I didn’t spend much time thinking about how to solve this pretty nebulous problem. But recently I’ve stumbled across a few bits of research that have made me think it might be possible soon. One of these is “invariant detection”, and tonight I read a 2001 paper on Daikon, one of the only successful implementations.

You hand Daikon your program, and it runs it with a variety of semi-random inputs, watching as the variables change throughout the program’s execution. Then it detects patterns in how the variables change, and how they relate to one another. When a pattern holds across every single input you can throw at it, it’s called an “invariant”. Daikon detects these, and tells the user about the ones it thinks are important.
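The core loop is simple enough to sketch. Here’s a toy version in JavaScript (not Daikon’s actual machinery — the candidate set and names are mine): start with a pool of candidate invariants, observe snapshots of the variables at some program point, and throw away any candidate that a snapshot falsifies. Whatever survives every observation is reported.

```javascript
// Toy invariant detection: candidates that survive every observed
// snapshot of the variables are reported as invariants.
const candidates = [
  { name: "a <= b",             holds: (s) => s.a <= s.b },
  { name: "a === b",            holds: (s) => s.a === s.b },
  { name: "b === items.length", holds: (s) => s.b === s.items.length },
];

function detectInvariants(snapshots) {
  // Keep only the candidates that hold on every snapshot.
  return candidates.filter((c) => snapshots.every((s) => c.holds(s)));
}

// Snapshots as they might be captured at one program point:
const snapshots = [
  { a: 0, b: 1, items: ["x"] },
  { a: 1, b: 2, items: ["x", "y"] },
  { a: 2, b: 3, items: ["x", "y", "z"] },
];

const invariants = detectInvariants(snapshots).map((c) => c.name);
// "a === b" is falsified by the very first snapshot and drops out;
// the other two hold everywhere and are reported.
```

The expensive part (as the paper discusses later) is that the candidate pool grows quickly with the number of variables you track.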

Invariants are useful to know about:

  • they tell you something about what a program is doing that you could only otherwise get by parsing and simulating the code in your head
  • tools like Agitar can generate regression tests to make sure that invariants you care about stay invariant after your code changes
  • when they’re unexpected or “wrong”, they can indicate faults in your programs.

Daikon depends on being able to inspect the source code of the program it’s working with. It works with a bunch of languages, and even with spreadsheets. But what would an invariant detector for web apps look like — especially one without access to the source code?

We can think about the user interface of an application as being equivalent to the interface of a class in OO. A class exposes methods that mutate and return internal state, and on the web the UI exposes “methods” in the form of user actions (e.g. clicking and typing) that mutate state in the form of the DOM tree, which the user directly sees.

So a script inside the browser could work out all the things it could interact with (buttons, text boxes, etc.), explore the space of possible user journeys, and notice changes in how things appear to the user. For example, to add an item to a todo list, you enter some text in a box and click a button. The action of clicking the button appends an element to a list in the DOM. This script would be able to see this, and given enough examples it would be able to detect that the text in the new list item is always the same as the value that was in the text box. And it’d be able to see that after clicking the button, the text box always gets wiped.
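To make the todo example concrete, here’s one way such a script could mine rules from observed transitions. Each transition records a before/after pair of states (flattened to plain objects here, standing in for a real serialized DOM), and the candidate rules are illustrative — a real tool would generate them, not hard-code them:

```javascript
// Candidate rules about what happens when the "add" button is clicked.
const rules = [
  {
    name: "new list item text equals the text box value",
    holds: (t) =>
      t.after.items[t.after.items.length - 1] === t.before.inputValue,
  },
  {
    name: "text box is wiped after the click",
    holds: (t) => t.after.inputValue === "",
  },
];

// Observed "click the add button" transitions: state before and after.
const transitions = [
  { before: { inputValue: "buy milk", items: [] },
    after:  { inputValue: "", items: ["buy milk"] } },
  { before: { inputValue: "walk dog", items: ["buy milk"] },
    after:  { inputValue: "", items: ["buy milk", "walk dog"] } },
];

// A rule survives if it holds on every observed transition.
const surviving = rules
  .filter((r) => transitions.every((t) => r.holds(t)))
  .map((r) => r.name);
// Both rules survive these two transitions.
```

The interesting part is treating the user action itself as the “method call”, and the before/after DOM pair as the observation.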

Once it has confidence in a rule (the Daikon paper explains how they assign relevance to rules, so the user isn’t flooded with noise), the script (maybe a browser extension at this point =P) can say “hey, want to make this a rule?”, and then an attached service could check that rule periodically in a variety of browsers, using SauceLabs.

A key point the Daikon paper makes is that the computational complexity of inference is mostly determined by the number of variables you’re tracking. So tracking every single part of the DOM would be insane. But if you restrict it to the bits that actually change over the period you’re interested in, and if you get the user to (via the extension, say) constrain the area of the interface to track (or even explicitly tell the extension what inputs / buttons it can interact with, and which bits of the DOM to track), then it’s tractable, even just running in the browser.
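The “only the bits that change” restriction is cheap to implement: diff successive snapshots and only feed the changed paths into the inference step. A sketch, with snapshots as flat path-to-text maps (an assumption standing in for a real serialized DOM; in a browser you’d get the same effect from a MutationObserver):

```javascript
// Return the paths whose values differ between two snapshots.
function changedPaths(snapA, snapB) {
  const paths = new Set([...Object.keys(snapA), ...Object.keys(snapB)]);
  return [...paths].filter((p) => snapA[p] !== snapB[p]);
}

const before = { "ul>li:1": "buy milk", "input#new": "walk dog", "h1": "Todos" };
const after  = { "ul>li:1": "buy milk", "ul>li:2": "walk dog",
                 "input#new": "", "h1": "Todos" };

const tracked = changedPaths(before, after).sort();
// The heading and the untouched list item drop out, so inference only
// runs over two variables instead of four.
```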

The idea of the user helping the extension is important. If the extension comes up with a rule that’s slightly off, the user could nudge it toward the right one. The user could tell the extension which kinds of invariants they don’t care about. And they could give it hints on what kind of input to put in a certain field if it can’t work that out from the input’s type attribute (e.g. try dates in this one, or even “leave this as admin@website.com”).
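Those hints could be as simple as a config object the user edits. Every field name below is an assumption — a sketch of one possible shape, not an existing API:

```javascript
// Hypothetical hints a user might hand the extension.
const hints = {
  interact: ["#new-todo", "button.add"],         // elements it may drive
  track: ["ul.todo-list"],                       // DOM region to watch
  ignore: ["timestamp-changes"],                 // invariant kinds to skip
  inputs: {
    "#due-date": { kind: "date" },               // try date-like values here
    "#email":    { fixed: "admin@website.com" }, // always leave this as-is
  },
};
```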

A big problem is going to be the asynchronous nature of JavaScript. How do you know that an email that gets sent three seconds from now is a result of clicking the send button? What about your inbox updating based on push notifications? These are tricky problems, but there will definitely be ways to work around them (I’m imagining bundling up changes in a given time period after a user action as a single “state transition”, but I haven’t thought deeply about it or read around yet).
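The bundling idea can be sketched: group every change that lands within a quiet-period window of the previous one into a single “state transition”, and start a new transition when the gap exceeds the window. The 500ms window is an arbitrary assumption, and the example also shows where this breaks — the three-second email lands in its own bundle:

```javascript
// Bundle timestamped changes: a new transition starts whenever the gap
// since the previous change exceeds windowMs.
function bundleTransitions(events, windowMs = 500) {
  const bundles = [];
  let current = null;
  let lastT = -Infinity;
  for (const e of events) {     // events sorted by time t (ms)
    if (e.t - lastT > windowMs) {
      current = [];
      bundles.push(current);
    }
    current.push(e.change);
    lastT = e.t;
  }
  return bundles;
}

const events = [
  { t: 0,    change: "click send" },
  { t: 120,  change: "spinner shown" },
  { t: 400,  change: "message appended" },
  { t: 3000, change: "email lands in sent folder" }, // arrives much later
];

const bundles = bundleTransitions(events);
// First three changes form one transition; the late email is wrongly
// split into a second one — exactly the attribution problem above.
```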

One thing I haven’t seen in the literature yet (there’s not much — it’s a pretty young research topic) is the idea of using genetic programming to create hypothetical rules to test against. In Daikon and Agitar, the heuristics for finding rules (things like “this variable equals this one” or “this variable is the length of this array, plus one, then multiplied by negative one”) are built-in. But using something like fungp (a Clojure library for evolving functions I’ve messed around with a few times) you could evolve arbitrarily complex rules, and score them based on how well they fit the observations. And as it’s a Clojure library, it will of course compile to JavaScript and work in the browser…
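The fitness half of that idea is easy to show. Real GP (as in fungp) would also generate, crossover, and mutate the expression trees; to keep this sketch deterministic the “population” below is hand-written rather than evolved, and the names are mine:

```javascript
// Score candidate rules by the fraction of snapshots they hold on;
// in real GP these candidates would be evolved expression trees.
const snapshots = [
  { count: 1, items: ["a"] },
  { count: 2, items: ["a", "b"] },
  { count: 3, items: ["a", "b", "c"] },
];

const population = [
  { name: "count === items.length",     holds: (s) => s.count === s.items.length },
  { name: "count === items.length + 1", holds: (s) => s.count === s.items.length + 1 },
  { name: "count > 1",                  holds: (s) => s.count > 1 },
];

function fitness(rule) {
  return snapshots.filter((s) => rule.holds(s)).length / snapshots.length;
}

const best = population.reduce((a, b) => (fitness(b) > fitness(a) ? b : a));
// "count === items.length" holds on all three snapshots (fitness 1);
// the other two score 0 and 2/3, so it wins.
```

Selection pressure toward perfectly-fitting rules, plus a penalty for rule complexity, is roughly what you’d want before surfacing anything to the user.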

Anyway, I still have a massive pile of papers on this topic to make notes on, so I’d better hop to it. One day we’ll have that magical service that tests our web apps for us, and I’m curious about how it’s going to happen.
