Computer models, or models in general, are widely used to predict the weather, the economy, or pandemics. When a model has been built by a large group of scientists who have worked on it for years, you can assume it is correct, or at least more or less correct. But what about the models the media use to demonstrate the effects of climate change, or of the coronavirus outbreak? How do we know those models are reliable? Let’s look at an example. On Twitter today, I saw an article in The Washington Post by Harry Stevens that uses a model to show why we need to “flatten the curve” of the outbreak.
The model shows a rectangle with dots: healthy people are blue-gray, sick people are orange, recovered people are pink. A simulation starts with all blue-gray dots; then one dot turns orange (it is infected), every dot it touches turns orange, and every orange dot turns pink after a few seconds, when the simulation says it has recovered. In the first simulation all dots move, symbolizing people traveling, and everyone gets infected rather rapidly. In another simulation, a majority of the dots is static: they don’t move. One of the moving dots gets infected, then they all get infected and finally recover. This process is slower than the first, demonstrating that when a lot of people stay home, the curve is flattened, our health care system does not get overloaded, and fewer people die. However, I noticed one thing. In the first run I saw, a coincidental cluster of static dots in the middle effectively blocked the moving dots from passing. The first infected dot started on the right, and it took quite a while before the infection passed through the middle of the field and reached the left part.
I tried to match this with the real world, where people stay home or not and get infected or not. Stationary dots can block moving dots from passing from one side of the field to the other, but people staying home do not block travelers from getting to the other side of town. The blocking effect I saw would not happen in the real world. Hence, the model is flawed.
The model can be fixed in a very simple way, I think. The dots act like pool balls: they touch and change direction. If the dots did not change direction, but simply passed through one another, the behavior of the model would be slightly different, and more realistic.
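To make the fix concrete, here is a minimal sketch of such a dot simulation with pass-through movement. This is my own simplification, not Stevens’ actual code; the infection radius, recovery time, dot counts, and class names are all assumptions made for illustration.

```python
import math
import random

RADIUS = 0.02          # infection distance (assumed)
RECOVERY_STEPS = 50    # steps until a sick dot recovers (assumed)

class Dot:
    def __init__(self, moving):
        self.x, self.y = random.random(), random.random()
        angle = random.uniform(0, 2 * math.pi)
        speed = 0.01 if moving else 0.0
        self.vx, self.vy = speed * math.cos(angle), speed * math.sin(angle)
        self.state = "healthy"   # healthy / sick / recovered
        self.sick_for = 0

def step(dots):
    for d in dots:
        # Pass-through movement: positions update independently, with no
        # collision response, so static dots cannot block moving ones.
        d.x = (d.x + d.vx) % 1.0
        d.y = (d.y + d.vy) % 1.0
        if d.state == "sick":
            d.sick_for += 1
            if d.sick_for >= RECOVERY_STEPS:
                d.state = "recovered"
    # Infection on proximity ("touching"), as in the original model.
    for s in [d for d in dots if d.state == "sick"]:
        for d in dots:
            if d.state == "healthy" and math.hypot(d.x - s.x, d.y - s.y) < RADIUS:
                d.state = "sick"

random.seed(1)
dots = [Dot(moving=(i % 4 == 0)) for i in range(200)]  # 25% of dots move
dots[0].state = "sick"
for _ in range(400):
    step(dots)
print(sum(d.state != "healthy" for d in dots), "dots were infected at some point")
```

Because positions are updated independently, a cluster of static dots in the middle slows the spread only by not traveling themselves, never by physically walling off the field.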
At university — I have a degree in astrophysics — I built models of interstellar gas clouds, among other things. What I found is that building a model is one thing; testing it is another. In my career as a software developer, I learned that more often than not, testing your models takes more time than building them. The models the weather institutes use have been tested against the actual weather: they predict the weather, afterwards the programmers compare the prediction with what actually happened, and then they adjust the models. The same goes for climate: weather and climate models create predictions, which are tested against climate change over ten or twenty years. Models of flu epidemics can be tested that way too, if you assume one flu behaves like another. When they don’t, you can still test the models, assuming you can categorize flu types and compare the model for flu X₁ with those for Y₁, Y₂ and Y₃.
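The predict-compare-adjust loop can be sketched in a few lines. The numbers below are made up purely for the example; the point is only the mechanism of measuring the error and feeding it back.

```python
# Toy illustration of testing a forecast model against reality:
# compare predictions with measurements, then correct systematic bias.
predicted = [14.0, 15.5, 13.0, 16.0]   # forecast temperatures (°C), made up
actual    = [15.0, 16.5, 14.5, 16.5]   # measured temperatures (°C), made up

errors = [a - p for a, p in zip(actual, predicted)]
bias = sum(errors) / len(errors)
print(f"mean bias: {bias:+.2f} °C")    # this toy model runs consistently cold

# Adjust the next forecasts by the observed bias.
corrected = [p + bias for p in predicted]
```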
If you have a model that predicts something you don’t have data for yet, you need to find other ways of testing. For the coronavirus, we don’t have data because it’s brand new. To some extent you can use data from SARS or Ebola, but as we don’t know much about the behavior of the coronavirus yet, such comparisons are of limited value.
When I created a model to calculate the total value of the expert knowledge in a company — I did that for Kema in 1995; the results were published in their 1994 annual report — I did not have any test or verification data. I decided to create three independent models and compare the results; that is, I would test each model against the other two. If the results were very different, my models weren’t reliable, or at least, if one was, I wouldn’t know which one. If the results were the same or very similar, I would assume the models were more or less correct. Fortunately, the results were strikingly similar, so we published them.
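The cross-check can be sketched like this. The three “models” below are hypothetical stand-ins (not Kema’s actual formulas), and the 15% agreement threshold is a judgment call I made up for the example.

```python
# Cross-checking independent models: three different approaches estimate
# the same quantity, and we only trust the result if they roughly agree.

def model_a():  # e.g. a salary-based estimate (hypothetical)
    return 10.2

def model_b():  # e.g. a replacement-cost estimate (hypothetical)
    return 9.8

def model_c():  # e.g. a revenue-based estimate (hypothetical)
    return 10.5

estimates = [model_a(), model_b(), model_c()]
mean = sum(estimates) / len(estimates)
spread = (max(estimates) - min(estimates)) / mean

if spread < 0.15:  # agreement within 15% (threshold is a judgment call)
    print(f"models agree: ~{mean:.1f} (spread {spread:.0%})")
else:
    print("models disagree: at least one is unreliable, but we don't know which")
```

Note the asymmetry: agreement suggests the models are more or less correct, but disagreement only tells you that at least one is wrong, not which one.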
Another way of testing models is to test edge cases. Say you create a model that calculates your travel time by car from Amsterdam to Rome, given a certain amount of traffic on the freeways. One edge case is “there is no other traffic”. This is unrealistic, but your model should still give the right result, which you can calculate yourself by dividing the distance by the maximum speed for each segment. Another edge case is “there is maximum traffic, speed is zero everywhere”; your model should yield “infinite travel time”. You can also try “no traffic” combined with “zero speed”, or with “3,000 km/h speed”. These cases are highly unrealistic, but your model should still yield the results you calculate by hand.
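A minimal sketch of such a travel-time model and its edge-case tests might look like this. The route segments and speeds are invented for illustration, not real Amsterdam–Rome data.

```python
# Edge-case testing of a toy travel-time model. Route segments are
# (length_km, max_speed_kmh); traffic_factor scales the achievable
# speed (1.0 = free flow, 0.0 = complete standstill).

def travel_time_hours(segments, traffic_factor):
    total = 0.0
    for length_km, max_speed_kmh in segments:
        speed = max_speed_kmh * traffic_factor
        if speed == 0:
            return float("inf")   # standstill: infinite travel time
        total += length_km / speed
    return total

route = [(100, 130), (500, 120), (700, 130)]  # made-up Amsterdam–Rome legs

# Edge case 1: no traffic -> just distance / max speed, summed per segment.
free_flow = travel_time_hours(route, 1.0)
assert abs(free_flow - (100/130 + 500/120 + 700/130)) < 1e-9

# Edge case 2: total standstill -> infinite travel time.
assert travel_time_hours(route, 0.0) == float("inf")
```

Each edge case has an answer you can compute by hand, which is exactly what makes it useful as a test even though the scenario itself is unrealistic.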
Testing the corona model like this, I would introduce a case in which everyone in the city center of Amsterdam stays at home and everybody else behaves normally. In the real world, with people visiting friends in the city center, everybody still gets infected and recovers. In the model, however, the static dots blocking the middle of the field mean that people on the left get infected, people in the center maybe, and people on the right not at all. That, of course, is unrealistic.
I don’t know how Stevens’ model works; it probably reflects reality well enough. I couldn’t find the model’s source code, so there is no way for me to verify it. Drew Harris saw the model and, according to Stevens, said that “some of the dots should disappear”, reflecting people dying from the virus or from other causes. If a model does not show an essential aspect of what you’re modeling — people dying from the virus — I tend to think the model is not accurate in other ways either.
Models are a good way to demonstrate what is happening and what will happen. A model is an abstraction of our world, leaving out unnecessary detail so the essence becomes clearer. But a model only gives useful insights if it is correct. I can create an intricate model of Earth showing you that Earth is flat. You know Earth is not flat, so you know my model is rubbish. I can create a model that shows Earth as a perfect sphere. You know Earth is a sphere, so you think my model is correct. Which it is, mostly. Then I create a model that shows Earth as a slightly flattened sphere that rotates with a 24-hour period and wobbles (“precession”) with a 26,000-year period. Can you verify that? Do you think my model is correct? You can check my resume, see that I have a degree in astronomy and work as a software developer, and conclude my model is probably fine. And when I make such a model, I’ll publish the source code on GitHub for you to check. I’ll also provide the tests, so you can double-check.
In the end, there’s not much you can do to verify a model other than think and wonder whether the model makes sense to you. Think of an edge case and see if the model behaves correctly. Think of what happens in reality (people die from the virus) and check how the model behaves (people don’t die). Most of all, check the article for how the author created the model, and whether and how they tested it. My thesis says “I tested the model against a normal Boltzmann temperature distribution”. You may or may not know what that means, but at least you know I did test the model. The Kema publication says the model was tested against two independent models. Again, you can’t verify this in detail, but at least you know some testing was done.
Conclusion: models are useful, models are cool, models provide insight, models predict the future, models are fun, models help us understand the world, or in this case, the spread of the coronavirus. But one reminder: give some thought to the validity of the model. Check who created it; check whether you can find obvious flaws. When in doubt, do not retweet the link to the model. If the model is correct, others will know and will retweet it. If the model is flawed, experts won’t retweet it, but you do damage by retweeting it.