To save me from writing this over and over on Twitter, I’m going to explain the difference between coding for randomness and bugs that introduce randomness.
A calculation function should always give the same results when provided with the same parameters. If it can’t do that, you can’t prove that it works. You take a specification for a program and, by hand, calculate the expected results based on the parameters supplied. You then provide the program with those parameters and check that the results match. And they should match perfectly. You do this with many tests, covering each rule in your specification with a variety of values. If you can’t do that, you can’t prove that your code is correct.
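To make that concrete, here is a minimal sketch in Python. The specification ("the case count doubles once per doubling period"), the function name `cases_after` and the test values are all made up for illustration; the point is only that the expected results come from working through the spec by hand, not from running the code.

```python
def cases_after(initial_cases: int, doublings: int) -> int:
    """Hypothetical spec: the case count doubles once per doubling period."""
    if initial_cases < 0 or doublings < 0:
        raise ValueError("inputs must not be negative")
    return initial_cases * 2 ** doublings


# Expected results worked out by hand from the (made-up) spec.
HAND_CALCULATED = [
    ((100, 0), 100),  # no doublings: unchanged
    ((100, 1), 200),
    ((100, 3), 800),  # 100 -> 200 -> 400 -> 800
    ((0, 5), 0),      # nothing to double
]


def run_tests() -> None:
    """Check every hand-calculated case; same parameters, same answer, every time."""
    for args, expected in HAND_CALCULATED:
        assert cases_after(*args) == expected, (args, expected)


run_tests()
```

Run it once or a thousand times and the tests pass in exactly the same way. That repeatability is what lets you say the code matches the specification.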
Much is being said about how the Covid-19 code built by Imperial College is stochastic, how it’s supposed to run multiple random variations. Many calculation functions do this and still work consistently. Your hand-calculated tests should include which random variations were applied. You should have expected results based on those, and the results should be correct every time you run it.
Maybe you then want it to have true random values? In that case, you have a program that gets a random number from the machine, calls the function, and passes the number in. By separating that out, the calculation function stays testable (and testing the calling program might include things like ensuring that the results aren’t the same on every run).
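One way to sketch that separation in Python is shown here. The names `total_with_variations` and `run_with_machine_randomness`, and the arithmetic itself, are hypothetical stand-ins for whatever the real calculation would be; what matters is that the random values arrive as a parameter rather than being generated inside the calculation.

```python
import random
from typing import Sequence


def total_with_variations(base: int, variations: Sequence[int]) -> int:
    """Hypothetical calculation: apply each supplied variation to the base and sum."""
    return sum(base + v for v in variations)


def run_with_machine_randomness(base: int, runs: int) -> int:
    """Calling program: draws random values from the machine and passes them in."""
    variations = [random.randint(-10, 10) for _ in range(runs)]
    return total_with_variations(base, variations)


# Test of the calculation: the variations are fixed, so the expected result
# can be worked out by hand: (100 - 5) + (100 + 0) + (100 + 7) = 302.
assert total_with_variations(100, [-5, 0, 7]) == 302

# Test of the caller: the result is random, but it must stay within the
# bounds implied by the range of the draws (3 runs, each between 90 and 110).
assert 270 <= run_with_machine_randomness(100, 3) <= 330
```

The calculation gives the same answer for the same variations every time, while the randomness lives entirely in the caller. Passing in a seeded `random.Random` instance would be another way to get repeatable draws.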
The variations in the Covid-19 function’s results aren’t stochastic. They’re a bug caused by poor management of threads in the code. This causes random variation, so multiple runs with the same parameters give different results. The response from the team at Imperial is that they run it multiple times and take an average. But this is wrong, because the results should be identical each time. Including the buggy results as well as the correct ones means that the result is an average of the correct and the buggy ones, and so wouldn’t match the expected results if you did the same calculation by hand.
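A small arithmetic sketch in Python shows why averaging doesn’t rescue this. Every number here is invented purely for illustration and none of it comes from the Imperial code: assume the hand-calculated answer is 800, and that five runs with identical parameters produced three correct results and two corrupted ones.

```python
from statistics import mean

EXPECTED = 800  # hypothetical hand-calculated result

# Made-up outputs of five runs with identical parameters:
# three correct, two corrupted by a bug.
runs = [800, 800, 790, 800, 830]

# Runs with the same parameters should all agree; here they don't.
all_identical = len(set(runs)) == 1

# Averaging blends the wrong values into the answer: 4020 / 5 = 804.
average = mean(runs)

assert not all_identical
assert average == 804
assert average != EXPECTED
```

The average is pulled away from the correct value by however far off the buggy runs happen to be, so it can’t be checked against a hand calculation either.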
As an aside, we can’t even do the calculations by hand, because there is no specification for the function, so whether the code is even doing what it is supposed to do is impossible to tell. We should be able to take the specification and write our own tests and check the results. Without that, the code is worthless.