Let’s Stop Using the Meaningless Terms TAR 1.0 and TAR 2.0

Karen Williams
Aug 3, 2018 · 3 min read

Long ago I knew a very good artist who had been hired to create the art for a video game. One day he came to work and was surprised that his software worked differently than it had the day before. “Oh,” said one of the engineers, “your software was upgraded. Now you’re running version 3.2.” The artist had no idea what the engineer was talking about, because he hadn’t realized that the numbers after the software’s name meant anything. He thought they were like the letters and numbers for cars, so that the numbers after ArtPro 3.2 were like the designation after Lexus RX 350. For attorneys new to Technology Assisted Review (TAR), terms like TAR 1.0 and TAR 2.0 can be just as confusing. Unlike version numbers for a single software program, these names refer to multiple different software programs, all of which do the same thing: use machine learning techniques to help attorneys determine which documents in a review are relevant.

So what is TAR 1.0? Lumped together under this name are most predictive coding programs, including the earliest ones. These tools, in general, first create a random sample of documents, called a control set, that humans tag as relevant or non-relevant. Then the software creates smaller training sets of documents that humans also tag for relevance. After each training round, the software builds a model of what it thinks a relevant document is, and tests that model against the human-tagged control set to see how many of the control set’s relevant documents it tagged correctly. These training rounds continue until the best possible computer model has been created. That’s roughly the routine these programs follow, with some variations: some use human-chosen documents, called “seed sets,” in their training rounds; how they decide that training is complete varies; and each has different tools for validating success. But that is how these programs work in general.
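The control-set/training-round routine above can be sketched in a few lines of Python with scikit-learn. This is only an illustration of the general workflow, not any vendor’s actual implementation; the synthetic corpus, batch sizes, and choice of logistic regression are all my own assumptions.

```python
# Sketch of the Predictive Coding ("TAR 1.0") loop: random control set,
# small human-tagged training batches, and a model test after each round.
# The corpus is synthetic and all sizes are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

rng = np.random.RandomState(0)

# Stand-in for a document collection: feature vectors plus the "true"
# relevance tags a human reviewer would supply.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# 1) Draw a random control set; "humans" tag it once, up front.
control_idx = rng.choice(len(X), size=200, replace=False)
pool_idx = np.setdiff1d(np.arange(len(X)), control_idx)

# 2) Iterative training rounds on small human-tagged batches.
trained_idx = np.array([], dtype=int)
model = LogisticRegression(max_iter=1000)
for round_num in range(5):
    batch = pool_idx[round_num * 50:(round_num + 1) * 50]  # next batch to tag
    trained_idx = np.concatenate([trained_idx, batch])
    model.fit(X[trained_idx], y[trained_idx])
    # 3) Test the model against the human tags in the control set.
    recall = recall_score(y[control_idx], model.predict(X[control_idx]))
    print(f"round {round_num + 1}: control-set recall = {recall:.2f}")
```

In a real product, the stopping rule (when recall on the control set plateaus, for instance) and the validation tooling are where the vendors differ, as noted above.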

So what term should we use instead of TAR 1.0? I’m not an attorney; I’m a product manager with years of software development in my background. I’ve performed deep tests of several of these programs, including extensive testing of Equivio’s Relevance and Xerox’s Omnix. I kept trying to come up with something pithy, but “control/training set” was the best I could do. Then I realized that Predictive Coding, which is how I first described these tools, is a good name for them.

What is TAR 2.0, then? TAR 2.0 refers to a program that uses Continuous Active Learning (CAL). CAL differs from Predictive Coding in that, first, there are no control or training sets. The human reviewer can tag as few as one document as relevant, and CAL can use that document to build its first model and tag the remaining documents. Humans review a set of the newly CAL-tagged documents, and if the humans disagree with the tags, CAL rebuilds its model and retags the documents. CAL also feeds human reviewers documents that may be “edge cases,” to help improve its model of what a relevant document looks like.
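The CAL loop just described can also be sketched in Python. Again, this is a rough illustration under my own assumptions (synthetic documents, logistic regression, ten documents fed to the reviewer per round), not a description of any shipping CAL product: the key points are that there is no control set and that training can begin from a single tagged document.

```python
# Sketch of a Continuous Active Learning (CAL) loop: start from one
# human-tagged relevant document, rank the untagged pool with the current
# model, and feed the top-ranked documents back to the reviewer each round.
# The corpus and all batch sizes are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# Start from a single human-tagged relevant document (the "seed").
seed = int(np.flatnonzero(y == 1)[0])
labeled = {seed: 1}

model = LogisticRegression(max_iter=1000)
for _ in range(8):
    idx = np.fromiter(labeled, dtype=int)
    tags = np.array([labeled[i] for i in idx])
    if len(set(tags)) < 2:
        # Only one class tagged so far: fall back to ranking by
        # similarity to the seed document instead of a trained model.
        scores = -np.linalg.norm(X - X[seed], axis=1)
    else:
        model.fit(X[idx], tags)
        scores = model.predict_proba(X)[:, 1]
    # Feed the top-ranked untagged documents to the human reviewer; their
    # tags (simulated here by the true labels) go straight back into training.
    scores[idx] = -np.inf
    batch = np.argsort(scores)[-10:]
    for i in batch:
        labeled[int(i)] = int(y[i])

found = sum(labeled.values())
print(f"reviewed {len(labeled)} docs, found {found} relevant")
```

Mixing in likely edge cases (documents the model is least sure about) alongside the top-ranked ones is a common variation on the ranking step above.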

As far as I know, there are only two CAL software products out there, and John Tredennick of Catalyst was kind enough not only to give me a demo of their CAL product, Insight, but also to set up an account for me to test it. I was impressed with the demo, not only with the results shown but in particular with how quickly Insight built the model. Unfortunately my schedule meant I couldn’t actually test Insight, so I can’t speak with any authority on the results, but Insight does solve the dreariness of having to tag 500 documents before any model is built.

So I recommend we refer to these TAR products as either Predictive Coding or Continuous Active Learning. Even these terms aren’t perfect: Equivio’s Relevance (a Predictive Coding program) would add documents it considered edge cases, both relevant and non-relevant, to its training rounds to improve its algorithm, much as CAL feeds edge cases to human reviewers. The best solution is to know what you want to accomplish with your review, and to make sure you understand the software you’re going to use.

Oh, and the artist? The video game flopped (though it won an award for its beautiful art), and he went on to have a distinguished career as a US Marshal.

Written by Karen Williams

Product manager for machine learning and data science, aikido nidan, published fiction writer, MS survivor