Bounded Buffers with TLA+ and Coyote — Part 2: Conditionals

Sam Miller
3 min readDec 3, 2022

--

When I wrote Part 1, I wasn’t really expecting a Part 2 so soon. However, funny things happen when you post on Twitter. Markus Kuppe, who I know from the many amazing TLA+ videos he’s made, replied to my tweet. Markus said, “Most textbook solutions and runtime implementations… use two conditionals instead of notifyAll/PulseAll.” Markus linked to an implementation in the JDK as well as his tutorial-talk for “Weeks of debugging can save you hours of TLA+” (the video is worth watching!).

I’m not sure if Markus was throwing down a gauntlet, but I considered it thrown. Surely if TLA+ can show that the “two conditionals” can work, and it can be implemented in Java, then certainly we can implement it in dotnet/C# and use Coyote to test it.

There are a few things to note about the TLA+ spec. The bugfix notes explain that we can “Remove notifyAll and instead introduce two mutexes (one for Producers and one for Consumers). A Consumer will notify the subset of Producers waiting on the Producer mutex (vice versa for a Producer).” Okay, simple enough. Except the spec doesn’t actually use two mutexes! Markus explains, “we can just pick the right thread type from the set of waiting threads.”

This was subtle and a little confusing to me until I looked at the commit. Once looking at this section of code, it’s pretty clear what’s happening. If the waitlist and others aren’t the same set (defined as their intersection not being equal to the empty set), then we peel off one of the threads. Just above the commit we can see that Producers and Consumers are the two sets which make up the waitset non-deterministically. This is a couple cool TLA+ tricks: letting threads sort themselves out to Producers and Consumers while also being able to use the Producers and Consumers to appropriately determine which thread to unblock. Markus knows what he’s doing!

What about me? Do I know what I’m doing? Do I have any real idea of what I’m doing? Well… sort of. I don’t really know a lot about dotnet/C# so I struggled to implement the mutexes appropriately. Reviewing the error messages, it was clear that I needed to notify a lock that might be held by another thread, which isn’t really how the original sample was setup. After analyzing the JDK and reading some docs, I came across the System.Threading.Interlocked feature. This is a little different from how the JDK implemented the code, but ultimately I was able to get the same result.

I replaced the PulseAll approach with a Conditional approach, leaving most of the logic in place so that the program can be run both with the bug and without the bug. I put the commands in the readme, but they’re exactly the same as the original. The only difference is the implementation. Instead of notifying all the threads, we interleave the interlocks so that Put waits for notFull and notifies notEmpty while Take waits for notEmpty and notifies notFull. If we run the sample, we get the exact same result as the original. Woo! Here’s a second TLA+ spec that we can implement and check with Coyote.

--

--