Introduction to TLA+ Model Checking in the Command Line

Otherwise known as “wait, you mean I can use the TLA+ Toolbox from the command line?”

I started playing around with TLA+ and its more programmer-friendly variant, PlusCal, about a year ago. At that time the amount of documentation that was understandable to people like me, who do not enjoy reading white papers full of mathematical notation, was slim to none. (Today you can buy Hillel Wayne’s excellent book Practical TLA+.) So I started reading other people’s code (errr… specs) and began piecing together some idea of how things were done.

TLA+ is a tool to model and verify concurrent systems, finding bugs in them before you have written any code by testing every possible combination of inputs. You might say “well, how is that different from fuzzing inputs in more conventional functional tests?” and the answer is that TLA+ also simulates the kind of timing issues commonplace in distributed or concurrent systems. As microservices continue to dominate conversations about modern architecture, the question of what happens when a service sends a request that never arrives, or arrives before something else has finished, or gets repeated becomes more and more relevant. TLA+ allows you to test for these scenarios.

Beyond that it’s a radically different way of thinking about computer programs. It’s not a programming language so one has to abandon certain mental crutches and really think about how we represent what we mean. You can write an excellent model and then write code that doesn’t actually do things the way the model is describing. So you have to think about how your favorite frameworks and libraries actually do what they do.

I’m a huge fan of anything that might bring unvocalized assumptions up to the surface where people can explore them. It’s hugely difficult to build a complex system of services if two teams are not in sync about their assumptions.

The best way to think of TLA+ is like drawing a blueprint for software you’re going to build. I’ve seen people use it to test algorithms, but I’ve also seen people use it to model user flows throughout an application.

However, the Toolbox is Awful

When you’re ready to test your models in TLA+, virtually every resource will direct you to the TLA+ Toolbox. The Toolbox is a Java application that looks a bit like it’s supposed to be an IDE and allows you to translate PlusCal into TLA+, configure models, test them and explore errors. Its UX is awful, simply awful. How you open a TLA file in the Toolbox is not intuitive, and things will often fail without any errors (or rather the Toolbox will do what you’ve told it to do, which is nothing, because you needed to click some buttons you didn’t click first). It’s literally exhibit A for why software developers should respect the skills of designers on their teams.

On top of that I find it a little clunky. I generally don’t work with full-scale IDEs if I can avoid it. I prefer the stripped-down elegance of TextMate, Sublime or Atom. Sometimes this means I miss out on valuable features, but I’m a person who currently has over thirty tabs open in Chrome, three projects open in TextMate and at least one WIP up in Scrivener… I like to use my computer’s memory, okay? This is a light day.

At some point while I was nosing around other people’s TLA+ specs to try to understand the language, I realized that there seemed to be command line versions of the key tools from the Toolbox. Rather than open up the Toolbox I could compile PlusCal and test my models from my terminal and, even more exciting… from other scripts.

Wait… there’s a command line version?

Yup! If you’ve downloaded the TLA+ Toolbox, you already have the command line tools installed on your computer. To configure them, there’s a helpful set of shell scripts you can download here.

One of the first things I did with this knowledge was build a little proof-of-concept Python application that parsed other Python programs and tried to write models in PlusCal based on what they were going to do. I wanted to learn more about bytecode and ASTs and this seemed like a fun way to do it. It was partially successful in that I could write a few simple programs that became working models, but eventually I hit a wall trying to learn two new things at the same time.

But the potential of command line interfaces is still the same. If model checking can be executed via command line, then it can be executed by other programs. You can update your models and test them automatically the same way you might do with other forms of testing in continuous integration.
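As a sketch, a CI step could simply chain the translator and model checker described below (assuming pcal and tlc are on your PATH; myspec.tla is a placeholder name, and whether a violation produces a nonzero exit code is worth verifying on your install):

```shell
#!/bin/sh
set -e          # stop the build if any step fails

pcal myspec.tla # translate the PlusCal comment into TLA+
tlc myspec.tla  # model-check it; a violation should fail the build
```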

pcal: Translating PlusCal to TLA+

Once you have installed and configured the command line toolbox, you can start translating PlusCal comments in your specs to TLA+ using the pcal command.

Note that pcal will assume the extension .tla so you can leave it off if you want. For the sake of clarity I will include it.

The basic command is:

pcal myspec.tla

Unlike command line tools you might be used to, pcal is very particular about the order of its arguments. The name of the file you want to translate should always come last; any flags should come before it.

-version does something that might be unexpected. Rather than telling you the version of pcal you have installed, it lets you pick which version of PlusCal the translator will assume it’s translating (this currently goes up to 1.8).

If you want to see the AST for your PlusCal, you can use the -writeAST flag to forego the rest of the translation and just output the AST.
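Putting the ordering rule and these flags together, invocations look like this (a sketch based on the flags described above; pcal -h prints the authoritative list on your install):

```shell
# File name last, flags first:
pcal -version 1.8 myspec.tla

# Skip the full translation and just emit the AST:
pcal -writeAST myspec.tla
```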

This is a super useful tool, but not without its bugs. For example, I understand that -spec and -spec2 are supposed to let you choose between version 1 and 2 of TLA+, but I’ve never been able to get them to work.

Defining Models with Configuration files

If you’re used to the Toolbox interface, you’re probably wondering how you can configure your invariants and other test cases. The answer is the .cfg file automatically generated for you by the pcal command.

Normally it will look like this:

SPECIFICATION Spec
\* Add statements after this line.

But there are a variety of parameters you can set before checking your models. The most useful of these are INVARIANT and PROPERTY, which specify which invariants and which temporal properties to test, respectively.
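For example, a configuration that checks one invariant and one temporal property might look like this (TypeOK and EventuallyDone are hypothetical names; they must match definitions in your spec):

```
SPECIFICATION Spec
\* Add statements after this line.
INVARIANT TypeOK
PROPERTY EventuallyDone
```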

Here’s a full list of all the possible configuration settings:

INVARIANT
INVARIANTS
PROPERTY
PROPERTIES
CONSTANT
CONSTANTS
CONSTRAINT
CONSTRAINTS
ACTION_CONSTRAINT
ACTION_CONSTRAINTS
INIT
NEXT
VIEW
SYMMETRY
TYPE
TYPE_CONSTRAINT

I have not been able to figure out what the difference is between the singular form of a parameter (e.g. CONSTANT) and the plural form (e.g. CONSTANTS). Both can take multiple values if separated by a space. It might be that there’s some backwards-compatibility issue involved here, but… ¯\_(ツ)_/¯

Make sure you add your custom configuration below that comment line. If you put settings above it, pcal will overwrite them every time you translate.

tlc: Checking the Models

Now it’s time to run the model checker! Like pcal, tlc treats the .tla extension as optional. The basic command looks like this:

tlc myspec.tla

And this will produce the following output:

TLC2 Version 2.10 of 28 September 2017
Running breadth-first search Model-Checking with 1 worker on 4 cores with 910MB heap and 64MB offheap memory (Mac OS X x.x.x x86_64, Oracle Corporation x.x.x_x x86_64).
Parsing file myspec.tla
Parsing file /var/folders/sx/tb9t1_890xd0hs9y0xb1k5fr0000gn/T/Integers.tla
Parsing file /var/folders/sx/tb9t1_890xd0hs9y0xb1k5fr0000gn/T/Naturals.tla
Semantic processing of module Naturals
Semantic processing of module Integers
Semantic processing of module myspec
Starting... (2019-01-17 23:06:59)
Implied-temporal checking--satisfiability problem has 1 branches.
Computing initial states...
Finished computing initial states: 25 distinct states generated.
Progress(5) at 2019-01-17 23:07:00: 332 states generated, 222 distinct states found, 0 states left on queue.
Checking temporal properties for the complete state space with 222 total distinct states at (2019-01-17 23:07:00)

This tells you what modules were imported and how many states the model has. If you’re curious, the actual spec we’re running here is the wire transfer example in Hillel Wayne’s book. That particular spec has a temporal property that we know will be violated by the algorithm. This is what that looks like:

Error: Temporal properties were violated.
Error: The following behavior constitutes a counter-example:
State 1: <Initial predicate>
/\ people = {"alice", "bob"}
/\ sender = <<"alice", "alice">>
/\ receiver = <<"bob", "bob">>
/\ acc = [alice |-> 5, bob |-> 5]
/\ amount = <<1, 1>>
/\ pc = <<"CheckAndWithdraw", "CheckAndWithdraw">>
State 2: <Action line 55, col 27 to line 61, col 77 of module wire>
/\ people = {"alice", "bob"}
/\ sender = <<"alice", "alice">>
/\ receiver = <<"bob", "bob">>
/\ acc = [alice |-> 4, bob |-> 5]
/\ amount = <<1, 1>>
/\ pc = <<"CheckAndWithdraw", "Deposit">>
State 3: <Action line 55, col 27 to line 61, col 77 of module wire>
/\ people = {"alice", "bob"}
/\ sender = <<"alice", "alice">>
/\ receiver = <<"bob", "bob">>
/\ acc = [alice |-> 3, bob |-> 5]
/\ amount = <<1, 1>>
/\ pc = <<"Deposit", "Deposit">>
State 4: <Action line 63, col 18 to line 66, col 68 of module wire>
/\ people = {"alice", "bob"}
/\ sender = <<"alice", "alice">>
/\ receiver = <<"bob", "bob">>
/\ acc = [alice |-> 3, bob |-> 6]
/\ amount = <<1, 1>>
/\ pc = <<"Done", "Deposit">>
State 5: Stuttering
Finished checking temporal properties in 00s at 2019-01-17 23:07:00
332 states generated, 222 distinct states found, 0 states left on queue.
Finished in 01s at (2019-01-17 23:07:00)
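Because tlc writes structured lines like these, a script can pick the state counts and errors out of the log, which is handy if you’re driving tlc from another program or a CI job. Here’s a minimal sketch; the line formats are assumed from the output above:

```python
import re

# Sketch: pull error lines and state counts out of a tlc log.
def summarize(log):
    errors = []
    counts = None
    for line in log.splitlines():
        if line.startswith("Error:"):
            errors.append(line)
        m = re.search(r"(\d+) states generated, (\d+) distinct states found", line)
        if m:
            counts = (int(m.group(1)), int(m.group(2)))
    return errors, counts

sample = (
    "Error: Temporal properties were violated.\n"
    "332 states generated, 222 distinct states found, 0 states left on queue."
)
errors, counts = summarize(sample)
print(errors)  # ['Error: Temporal properties were violated.']
print(counts)  # (332, 222)
```

A CI job could fail the build whenever the errors list is non-empty.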

If you are already familiar with the Toolbox, this looks pretty much like the output you’d see before, except without the highlighting.

By default tlc stops at the first invariant error. If you’d like it to continue, use the -continue flag. In general, tlc's options are split into two types: MODE-SWITCHES and GENERAL-SWITCHES. Mode switches change how the models are run, from switching from breadth-first to depth-first search to picking a seed number for randomization. General switches let you specify a different configuration file or control how much detail is printed to stdout.
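For example (a sketch; -continue is described above, while -dfid and -config are flags I believe tlc supports — tlc -h prints the full list on your install):

```shell
# Keep going past the first invariant violation:
tlc -continue myspec.tla

# Mode switch: depth-first search to a maximum depth of 100:
tlc -dfid 100 myspec.tla

# General switch: use a different configuration file:
tlc -config other.cfg myspec.tla
```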

I hope this is enough to get you started. I will update this post from time to time as I figure out more about how the tools work. Happy Model Checking!