Unobserved Heterogeneity in Covid19 Models

Carlos Carvalho
Salem Center for Policy
Apr 11, 2020

By Richard Lowery, CEPA Senior Scholar

As data on infections and hospitalizations for SARS-CoV-2 come in, it is beginning to look like case peaks will be earlier and lower than originally projected by leading epidemiological models. It is essential to evaluate the performance of these models objectively, both to determine their scientific validity and because the models may be needed again if spread resumes once mitigation efforts are reduced. In particular, it is essential to avoid declaring victory because case counts came in below predicted values if the real issue is that the models were incorrect in the first place. Without such caution, we may end up validating counterproductive and dangerous policies. Models must be held to the standards they set ex ante; they must not be taken as true, with all of the discrepancy between model and data then attributed to mitigation effects. We cannot repeat the debacle of the stimulus following the financial crisis, where economic performance came in well below the performance predicted under the stimulus package, and the architects of the package claimed that this just showed how much worse than expected things would have been had they not pushed through such a policy. Subsequent research (Mulligan 2012, Glaeser 2017) shows that it is far more likely that the stimulus package hampered the recovery; and if that is not the case, the failure of the predictions simply means that the architects could not model the economy, which invalidates their claim that they could anticipate the effects of their policies.

In the pandemic setting, we have one major concern about the modeling that we have not seen addressed. It appears that many of the models calibrate an R_0 from the doubling rate of the virus in the early phase of the spread. In an immunologically naive population, this is treated as the right approach to back out R_0. But in the models we have seen, there is no attempt to control for unobserved heterogeneity within the coarse divisions by, for example, age and hospitalization risk. This omission seems to generate a potentially very large upward bias in the number of expected cases under any policy. Economists have long known that it is essential to account for such unobserved heterogeneity when the probability of being included in a sample depends on it. The classic example is estimating the wage available to workers in a population where some people work and some do not. Simply averaging the observed wages is not enough, because those who are not working are likely not working in part because the wage they could earn is lower than the wages of those we do observe working, possibly for reasons that are unobservable to the analyst.
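The wage example can be illustrated with a short simulation. This is only a sketch under assumed numbers: the wage offers, the reservation wages, their normal distributions, and the rule that a person works only when the offer exceeds the reservation wage are all hypothetical choices made for illustration, not estimates from any data set.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is repeatable

N = 100_000  # hypothetical population size

# Assumed wage offer available to each person (observed only if they work).
offers = [random.gauss(20.0, 5.0) for _ in range(N)]
# Assumed reservation wage, unobservable to the analyst.
reservations = [random.gauss(18.0, 5.0) for _ in range(N)]

# A person works, and so enters the sample, only if the offer beats the reservation wage.
observed_wages = [w for w, r in zip(offers, reservations) if w > r]

mean_all_offers = statistics.fmean(offers)
mean_observed = statistics.fmean(observed_wages)

print(f"share working:             {len(observed_wages) / N:.2f}")
print(f"mean offer, everyone:      {mean_all_offers:.2f}")
print(f"mean wage, workers only:   {mean_observed:.2f}")
```

Under these assumptions the average among workers sits above the average offer in the full population, because people with low offers are disproportionately missing from the sample. The same logic applies when the people who are infected early are disproportionately those with the most contacts.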

In the pandemic context, it seems like there is a potentially extremely strong selection effect if the R_0 is estimated from the doubling rate early in the pandemic. Those individuals most likely to become infected and to infect a lot of other people are also likely to be infected early in the outbreak. Thus, the initial doubling rate before any policy intervention may not represent the growth rate later in the pandemic, leading to badly mistaken predictions about total cases. Of course, epidemiological models exhibit a decline in the effective reproduction rate as more individuals develop immunity. However, at least in many cases, including the work that influenced decisions in Austin (Wang et al., 2020), the R_0, which is the reproduction rate that would occur in a fully susceptible population, is held constant (Bauch and Oraby, Lancet, 2013 and Breban, Riou and Fontanet, Lancet, 2013). But even within whatever subgroups the model defines, the rate of spread, even holding the number of susceptible individuals fixed, would be different, and potentially appreciably lower, later in the outbreak than earlier. To take an extreme example, consider a population consisting of half straphanging subway commuters and half hermit monks, with both groups having the same age profile. Even assuming no one changes behavior as the epidemic gets going, modeling based on the initial doubling rate will overstate the peak number of cases by a factor of close to 2. Those individuals who engage in the behaviors most likely to expose them to the virus will be infected first and will generate more follow-on infections; since they were infected first, they will have an outsized effect on the estimates of transmissibility. This overestimate will be even more severe if individuals respond by changing their behavior to act more like monks and less like commuters. And both the selection effect and the behavior effect will combine to make any policy put in place look to have been effective, even if it was completely useless or came too late to affect the spread.
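The commuter and monk example can be made concrete with a small simulation. The sketch below is not the model used in any of the cited papers. It is a textbook SIR model with proportionate mixing between groups, integrated with simple Euler steps, and the transmission rate, recovery rate, seed size, and activity levels are all assumed values picked for illustration. The monks are given an activity level of exactly zero, which is the extreme case described above.

```python
def simulate(activity, fractions, beta, gamma, days=300, dt=0.05, seed=1e-4):
    """Euler-integrate an SIR model with proportionate mixing between groups.

    Returns (peak infected share, share ever infected), both measured
    as fractions of the whole population.
    """
    mean_activity = sum(a * n for a, n in zip(activity, fractions))
    # Seed infections in proportion to each group's share of all contacts.
    infected = [seed * a * n / mean_activity for a, n in zip(activity, fractions)]
    susceptible = [n - i for n, i in zip(fractions, infected)]
    peak = sum(infected)
    for _ in range(int(days / dt)):
        # Share of all contacts that are with a currently infectious person.
        contact_prevalence = sum(a * i for a, i in zip(activity, infected)) / mean_activity
        new_infections = [beta * a * s * contact_prevalence * dt
                          for a, s in zip(activity, susceptible)]
        susceptible = [s - x for s, x in zip(susceptible, new_infections)]
        infected = [i + x - gamma * i * dt for i, x in zip(infected, new_infections)]
        peak = max(peak, sum(infected))
    ever_infected = sum(fractions) - sum(susceptible)
    return peak, ever_infected


BETA = 0.5    # assumed transmission rate per unit of activity, per day
GAMMA = 0.2   # assumed recovery rate per day
activity = [1.0, 0.0]    # commuters, monks (hypothetical activity levels)
fractions = [0.5, 0.5]   # half the population in each group

# "True" outbreak: commuters spread the virus, monks never meet anyone.
het_peak, het_attack = simulate(activity, fractions, BETA, GAMMA)

# Homogeneous model calibrated to the same early growth rate. Under
# proportionate mixing, early growth scales with E[a^2] / E[a].
mean_a = sum(a * n for a, n in zip(activity, fractions))
mean_a2 = sum(a * a * n for a, n in zip(activity, fractions))
beta_hom = BETA * mean_a2 / mean_a
hom_peak, hom_attack = simulate([1.0], [1.0], beta_hom, GAMMA)

print(f"peak infected share, heterogeneous: {het_peak:.3f}")
print(f"peak infected share, homogeneous:   {hom_peak:.3f}")
print(f"overstatement of the peak:          {hom_peak / het_peak:.2f}x")
```

Both runs start with the same early growth rate, so an analyst looking only at the initial doubling time could not tell them apart. The homogeneous projection nonetheless predicts a peak about twice as high, with no policy and no behavior change in either run. Giving the monks a small positive activity level shrinks the gap but does not remove it.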

Little work seems to address this effect. Accounting for unobserved heterogeneity is challenging, but there are well-established methods that reduce or eliminate the resulting bias. If these are not directly applicable, the models must still include some adjustment for this effect if they are not going to mechanically provide bad policy advice. Further, it is not even clear that this is entirely a problem of unobserved heterogeneity; the data used to determine frequency of contact should allow an estimate of the dispersion of contacts within age-risk bins, which can then be used directly to partially correct for selection early in the outbreak.
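As a rough illustration of how a dispersion estimate might enter, consider the standard result for proportionate mixing: early transmission scales with the contact-weighted average E[k^2]/E[k], which equals the mean plus the variance divided by the mean, rather than with the simple mean number of contacts. The sketch below computes both for one age-risk bin. The contact counts are invented for illustration and do not come from any survey, and proportionate mixing is itself an assumption.

```python
import statistics

# Hypothetical daily contact counts reported by ten people in one age-risk bin.
contacts = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]

mean_contacts = statistics.fmean(contacts)
var_contacts = statistics.pvariance(contacts)

# Contact-weighted average, E[k^2] / E[k] = mean + variance / mean.
# This is what drives early growth when high-contact people are infected first.
early_scale = mean_contacts + var_contacts / mean_contacts

# How far the early-phase figure exceeds the simple average for the bin.
inflation = early_scale / mean_contacts

print(f"simple mean contacts:       {mean_contacts:.1f}")
print(f"contact-weighted average:   {early_scale:.1f}")
print(f"early-phase inflation:      {inflation:.2f}x")
```

The more dispersed the contacts within a bin, the larger the gap between the transmissibility seen early in the outbreak and the transmissibility of the typical member of that bin. A model that treats everyone in the bin as identical has no way to represent that gap.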

If the epidemiology models are in fact flawed in this way, it is very hard to use the outcomes of the outbreak to evaluate the effectiveness of the policies we put in place. Fortunately, there is a much better guide for policy going forward: South Korea achieved better outcomes at lower cost with a combination of test and track, selective isolation, and universal mask wearing, with limited shutdowns focusing on the most likely sources of large-scale transmission. Perhaps even South Korea would have seen a drop in cases without its interventions (though this seems unlikely), but in any case its policy comes at a much lower cost. The crucial difference seems to be that South Korea based its policy on sound virology and science instead of conjectural epidemiological models with biased assumptions. We should invest resources in emulating these successful policies; while we cannot know which pillar of the approach was most effective, it makes sense to copy the entire package. Since the infrastructure was not in place to implement these policies up front, certain emergency measures may have been necessary; in particular, a lockdown of New York seems likely to have been correct, but at this point we have no evidence that the economically damaging lockdown elsewhere was worth the costs. Going forward, a careful understanding of what activities lead to transmission, based on detailed case studies of real virus spread, will likely guide policy better than complex but potentially deeply flawed big-picture mathematical epidemiology models. There seems to be ample direct evidence of extensive spread associated with large social events, relatively small social events in confined spaces, and public transportation. Indiscriminately shutting down all contact because the models do not account for these differences seems to impose a huge cost for a potentially very small benefit.

We must get the evaluation of these models and the policies they imply correct for many reasons. (1) The virus may return when shutdowns are relaxed; establishing the most cost-effective approach to containing the outbreak is essential to getting buy-in from the community to take the most effective measures, particularly from relatively young people who are extremely unlikely to become severely ill but can potentially be spreaders. Telling everyone that the world will come to an end if they go golfing is simply asking to be ignored when you suggest the virologically more sensible policy of, say, banning spectators from Texas football games in the fall due to a risk of a resurgence of the virus. (2) The virus may not die off completely, and we may need to maintain some significant restrictions for some period of time, which will be both costly and potentially impossible if the measures are not targeted. Ultimately, we do live in a free society and buy-in is necessary; destroying credibility by promoting biased models that quickly prove to overstate the short-term risk will justifiably lead people to trust the government less, and if sustained efforts can be effective in reducing the total number of infections, then we need such trust. The authoritarian regime in China may have been able to suppress the virus below epidemic levels and perhaps even eradicate it from some areas, which suggests it can be done; we might be able to do it without such extreme measures if we are more judicious in the approaches we use to advise behavior. Authoritarian governments do have an advantage on some dimensions in combating respiratory viruses, but our society can operate with a greater level of trust, voluntary buy-in, and innovation, particularly if the government makes efforts to restore credibility following the decision to make policy based on potentially flawed models. (3) Most importantly, this virus, as bad as it has been, does not really seem to be the big one. Just in the past 20 years there have been three deadly novel coronaviruses; SARS and MERS were relatively easily contained due to the high mortality rate limiting the spread. SARS-CoV-2 has spread aggressively, but its mortality rate is disturbingly, though not extremely, high. Perhaps there can never be a coronavirus that combines a longer period of mild symptoms and very effective spread with a much higher probability of eventual severe disease, but it seems unwise to rule out that possibility. Further, another pandemic flu that avoids our vaccines, or some new, horrifying pox, could strike at any time. SARS-CoV-2, for all its problems, is not the big one we have been waiting for. Consequently, we must have credible, low-cost policies ready to implement at the first sign of a potential spread; we can never again rely on a massive economic shutdown to stave off a virus. If we do, we will either end up destroying the economy with repeated false alarms or waiting too long to implement such policies.
