TDD works on algorithms

Kanela Kaligosi
3 min readJun 28, 2024

--

Driven by a recent LinkedIn post by Jason Gorman I would like to bring to the attention of its author, as well as to the attention of Kent Beck and the TDD community, the concept of Certifying Algorithms. It is a concept developed in the algorithms community. Certifying Algorithms could be used in order to do TDD on Algorithms. This article might bring light to the hidden connections between the two communities and suggest a link between theory and practice.

Certifying Algorithms were introduced by Kurt Mehlhorn, an esteemed Computer Scientist (and also my advisor). One could start from this article to begin learning on the topic. Here is also the Wikipedia page on the topic.

The opinion I present here is only my own. It is about my understandings after reading and thinking about the topic and I do not claim expertise in the area.

I show how to do TDD on algorithms, by using a Certifying Algorithm for the membership problem considered in Gorman’s post. The solution is very simple and the additional amount of time is only O(1), i.e., constant.

As my reply became too long for a LinkedIn post I created a Medium article. Here is the link.

#TDD #Algorithms #CertifyingAlgorithms

Let me start with one note. Maybe testing for the correctness and testing for the performance of an algorithm are two different tasks better addressed by different techniques.

I first discuss testing for correctness. So, what is a Certifying Algorithm? A certifying algorithm is an algorithm that outputs, together with a solution to the problem it solves, a proof that the solution is correct. Then, given the proof, the input, and the solution of the algorithm, a checker (another program) can check that the solution is correct.

The idea is for the algorithm to produce such a proof that the checker, the program that checks for correctness, is simpler than the original algorithm.

Let us apply this idea to the problem considered by Jason Gorman. Namely, given a sorted array a[1..n], containing n distinct elements, we like to find whether element x belongs in the array or not. A regular algorithm would return a simple true or false to this question. (Like for example the “contains” method in ArrayList in Java.) A certifying algorithm for this problem would return true or false as a solution to the question, but also it would return as proof an index k in the array, that locates x, with the property that a[k-1] < x <= a[k]. (Care needs to be taken for the ends of the array.) Given the proof, i.e., index k, the above inequalities are exactly and simply what the checker needs to test in order to verify the correctness. In particular, by considering index k and the solution of the algorithm the checker makes the following checks.

— If solution = true and a[k] = x, the checker readily knows that x belongs in the array and accepts the solution.

— If solution = false and a[k] = x, the checker rejects the solution. (False negative.)

In case it does not find x at index k, it checks the following.

— If solution = false and a[k-1] < x < a[k] the checker knows that x does not belong in the array, since the array is sorted and accepts the solution.

— if solution = true and a[k-1] < x < a[k], it rejects the solution. (False positive.)

It only takes an additional O(1) (constant) time to make these checks. Moreover, the original algorithm does not need much change in order to become certifying. Internally, it already has to locate x, therefore index k, in order to answer the question. It’s only a matter of also outputting k, instead of only outputting true or false. This is an example of a Certifying Algorithm with a simple proof and a simple checker that only costs an additional constant time.

It is also TDD since the checker can be written before the algorithm. It does not need to know anything about the internals of the algorithm.

Several algorithms have become certifying so far, but the area is still wide open for more research. For more complicated questions, it is not trivial to find a “good” proof and check it fast.

Now, regarding the performance test. I will improvise here a bit. What about wrapping a timer around the algorithm and test that the time it takes to run is within the required time?(Study the problem well enough to allow for a large enough constant that is hidden in the Big O() notation.) (Some averaging would be required for randomized algorithms where we are interested in their expected running time.)

--

--