COVID-19: Population Testing vs. Thoughts and Prayers?
Why have public health officials offered misleading advice, wrapped in social distancing, mask-wearing, and wishful thinking as the primary means of pandemic management?
The Case We Make: A large portion of the population is only mildly affected by COVID-19, so carriers walk around not knowing they have the virus, spreading it to others. The process continues, and 125,000+ have died thus far. While social distancing and mask-wearing measures are important elements of managing the return to work/school through the COVID-19 crisis, they are unimaginative at best, and potentially fatal at worst, if not paired with proactive testing to identify the infectious agent. The absence of mass-scale population testing has a singular outcome: more people will become infected. This seems plainly obvious, but there has been a significant contingent of public health professionals who have suggested testing is not a viable option as a countermeasure against COVID-19. Why? Their logic: The prevalence of COVID-19 is low, perhaps less than 1% in the United States. Since the prevalence of the disease is very low, the proportion of false positives to true positives will be high for any given test. The social and economic cost of false positives is high. Therefore, the cost of large scale testing swamps any possible benefit. We demonstrate that many well-meaning public health professionals have misinterpreted the math behind test efficiency, Bayes’ Theorem. This misinterpretation has led to dangerous public policy positions. Any congregate community must test its entire population with regular frequency to avoid outbreaks of COVID-19.
The world changed in January as SARS-CoV-2 was identified outside of its center of origin and community spread commenced (Rabin, NY Times) on several continents. In the United States, few of us, if any, could have anticipated the magnitude of the changes that were to come. By early March, it was obvious that no large scale social enterprises could remain open. Communities closed down except for essential services. Educational communities were not immune and quickly pivoted to distance learning.
College campuses are communities unto themselves, complete with all of the social enterprising of a vibrant city or town. The difference, however, is that college campuses are generally more tightly knit and interconnected than most cities and towns. In a city or town isolation by default can occur, because not all members of the community work, dine, recreate, and live in the same spaces. College campuses can be viewed as congregate housing facilities filled with late teens and early 20-somethings, where the combination of risky behaviors (Bella, Wash Post; Steinberg, NY Times), social enterprise and cohabitation lay the foundation for a potent spreading center that has communities and individuals appropriately concerned (Weeden and Cornwell, InsideHigherEd; Gressman and Peck).
Amidst this background of highly contagious centers of activity, how will communities of higher education operate safely, without risking the health and lives of the people who live and work on campus and in the surrounding community until a vaccine is widely available? Out of an abundance of caution, some institutions of higher education will remain online for the upcoming fall semester and possibly the entire academic year, while others have plans for re-opening following a two-week on-campus quarantine for all students. Roughly 20% are somewhere in between with varied and creative plans for social distancing, shortening academic calendars, and testing individuals for the presence of the SARS-CoV-2 virus (Aspegren and Zwickel, USA Today; Chronicle of Higher Ed). Some suggest that the risks are too high to return to a new normal of in-person instruction in a residential community, despite the myriad of safety measures designed to minimize community spread (NewField, The Conversation; Tierney, InsideHigherEd). Other campuses are faced with financial ruin if students are not on campus (Quilantan and Perez, Politico). Beyond the tangible benefits, colleges and universities are thought centers that extend into their host communities. Cities and towns greatly benefit from institutions of higher education, for example as an important economic driver at local and state levels (Semuels, Atlantic). So, how do we, within our living and learning communities, come together during a pandemic?
It has generally been recognized that testing and tracing are the essential tools for minimizing community spread. Testing allows infection to be identified and isolated, reducing the virus’s ability to spread. When an effective testing and tracing program is in place, the average number of people each infected person goes on to infect (the reproduction number, often written R0 or Rt) can be driven down to less than 1, and the pandemic collapses. For months it was expected that the federal government would eventually develop large scale coordinated efforts around testing, but those efforts never materialized. Why is this? Surprisingly, many health policy specialists suggest that large scale testing would be ineffective due to the relatively low incidence of disease in the population (see table below). But, considering that individuals could be infectious for 4 days or more without knowing, the absence of testing has a singular outcome: more people will become infected. This played out dramatically in New York City in March and April 2020, where thousands of people died because tens of thousands of people simply didn’t know they were infected with a deadly virus.
While identifying the asymptomatic or presymptomatic proverbial needles in the COVID-19 haystack is challenging, it is not out of reach. Methods for testing are well established, and scaling can be solved, but the philosophy of whole community testing (surveillance testing) must first prevail. The United States healthcare system is currently using the most expensive method to finance pandemic testing: billing insurance on an individual basis at the Medicare floor, $100 per test. This exorbitant cost structure of testing combined with the relatively low incidence of the disease has led many to conclude that the benefits are too little and the costs too great to adopt wide-scale testing. Thus, the federal government and many institutions of higher learning have decided to forgo investment in surveillance testing programs, instead relying on social distancing and wishful thinking as the primary means of pandemic management. While simple social distancing measures are highly preventive and important elements of managing the return to work/school during the COVID-19 crisis, they are unimaginative at best, and potentially fatal at worst, if not paired with proactive testing to identify the infectious agent. We can and must do better, especially considering the reality that the true cost of a qPCR test for this virus is around $5, significantly less than the Medicare floor price of $100 (qPCR costs calculator, Santos et al. 2017, cost of qPCR developed tests).
The lack of wide-scale population testing is due in part to the FDA and CDC’s hesitation to act on such testing. The FDA is only just now providing guidance about the possibility of pooled/group samples, a technique that can reduce the resources required to test groups by an order of magnitude (Lanese, LiveScience). While we can point to failures of leadership for many of these failures to scale, there has also been a significant contingent of public health professionals who have suggested that testing is not a viable option as a countermeasure against COVID (see list at bottom of the article).
Mass scale testing seemed obvious to me, but many with epidemiology training suggest that it would be ineffective in managing COVID-19. Which is it, plainly obvious or too complex for a marine biologist to understand? Answering that question requires an understanding of mathematics, which is where mathematician Robert Jacobson comes in (Robert and I have worked together for years to tackle wildlife crime).
Testing is the Key to the Resumption of Production
The Unions and Guilds quickly determined that a comprehensive, mandatory testing regimen would need to be the cornerstone of a safe return to production in a pre-vaccine landscape. Without testing, the entire cast and crew would be working in an environment of unknown risk. Confirmed cases would be determined days after people have been shedding the virus — potentially endangering the health of cast and crew members. Moreover, they could lead to the quarantining of others on set, and should those individuals include a key actor or director, to production delays or even a production shutdown. Not to mention the public health implications associated with cast and crew members interacting with the public and going home to their families.
In March Robert and I began exploring the methods for testing entire populations of people. The efficacy of the test was among the first of our questions. As we modeled our ideas about population testing with statistics and computer simulations, an answer started to emerge…
…and it has to do with the application of Bayes’ Theorem. I will let the mathematician explain. For a great visualization of testing metrics for COVID-19, Robert built a web-based demonstration that allows anyone to change the settings and see the results. The most important result for me is, if a population is infected and you do not test, the positive cases greatly increase. If you regularly test, you find, isolate, and contain the spread. The current crisis in Florida and Texas illustrates the challenges of contact tracing without mass scale testing. It seems the federal government might be reconsidering their ill-positioned stance on testing.
How understanding Bayes’ Theorem is crucial in informing health policies
Consider the Venn diagram of intellectual ideas as depicted above. The first orange circle represents those ideas that are important for the well-being, smooth functioning, and progression of human society, like the germ theory of disease, the moral imperative of the Golden Rule, and basic arithmetic. The second blue circle represents ideas that are confusing or subtle and includes the Hegelian dialectic and every book with the word semiotics in the title. If you just said, “What?,” that’s the point: these are difficult, confusing ideas. Finally, we have the green circle of ideas that are just too easy to believe one understands when one actually does not. These are ideas that, whether or not they appear simple, have the peculiar ability to lure us into a completely unfounded confidence in our understanding of them. It includes your Uncle Franky’s grasp of international politics, of course, but also some surprises: Robert Frost’s “Stopping by Woods on a Snowy Evening,” and everything you know about leprosy, to name two.
Frightening is the overlapping intersection of these three circles. Here live challenging ideas that are important for human society that we often get wrong while confidently believing we understand. Bayes’ Theorem is one of these ideas.
The Basics of Bayes’ Theorem
Bayes’ Theorem might not be a household name, but where it is known, it is often misinterpreted or misapplied. Bayes’ Theorem is actually quite short and simple:
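For two events A and B with P(B) > 0, the standard statement of the theorem is given in the first line below. The second line specializes it to disease testing. The names prev (prevalence), sens (sensitivity, the true positive rate), and spec (specificity, the true negative rate) are our own shorthand for this sketch and do not come from the article.

```latex
P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}

P(\text{infected} \mid \text{positive})
  = \frac{\text{sens}\cdot\text{prev}}
         {\text{sens}\cdot\text{prev} + (1-\text{spec})\,(1-\text{prev})}
```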
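With testing for diseases like COVID-19, Bayes’ Theorem gives scientists the power to determine relationships between these four scenarios:
- A person is infected and tests positive (a true positive).
- A person is infected but tests negative (a false negative).
- A person is not infected but tests positive (a false positive).
- A person is not infected and tests negative (a true negative).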
Understanding the probability of each of these four scenarios is paramount for the prevention and treatment of disease from the individual level to the level of worldwide populations.
The Paradox of Bayes’ Theorem
What is subtle about this simple formula? In short, interpretation. Bayes’ Theorem can be counterintuitive. A classic example is described by mathematician Chris Wiggins:
A patient goes to see a doctor. The doctor performs a test with 99 percent reliability — that is, 99 percent of people who are sick test positive and 99 percent of the healthy people test negative. The doctor knows that only one percent of the people in the country are sick. Now the question is: if the patient tests positive, what are the chances the patient is sick?
The intuitive answer is 99 percent, but the correct answer is 50 percent….
Even with “good” tests, testing positive does not necessarily mean you “probably” have the disease! For this example, the surprising result is because the disease is very rare in the population. When it comes to testing for diseases, the punchline is:
There is a higher proportion of false positives relative to true positives when the prevalence of a disease is very low.
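The classic example above can be checked in a few lines. This is a minimal sketch: the function name and parameter names are ours, and "99 percent reliability" is read as 99% sensitivity and 99% specificity, as the quoted passage describes.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """P(sick | positive test), computed with Bayes' Theorem."""
    # Share of the whole population that is sick AND tests positive.
    true_positive_share = sensitivity * prevalence
    # Share of the whole population that is healthy AND tests positive.
    false_positive_share = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_share / (true_positive_share + false_positive_share)


# The quoted example: 1% of people are sick, and the test is 99% "reliable".
ppv = positive_predictive_value(prevalence=0.01, sensitivity=0.99, specificity=0.99)
print(f"P(sick | positive) = {ppv:.2f}")  # prints 0.50
```

With 1% prevalence, the sick people who test positive and the healthy people who test positive are equally numerous, which is where the 50 percent comes from.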
The Deadly Misunderstanding of Bayes’ Theorem
The False Positives of Azkaban (Or, The Prisoner of False Positives):
That last sentence is worth repeating: There is a higher proportion of false positives relative to true positives when the prevalence of a disease is very low.
False positives have “costs.” If people who test positive but are in reality not infected have to self-quarantine, they could experience a major disruption to their lives, including to their financial and mental health. Citing Bayes’ Theorem, many experts conclude that because COVID-19 is such a low-incidence disease, the cost of whole population testing outweighs the benefits. A logician might structure this argument as follows:
- The prevalence of COVID-19 is low, perhaps less than 1% in the United States.
- Since the prevalence of the disease is very low, the proportion of false positives to true positives will be high for any given test.
- The social and economic cost of false positives is high.
- Therefore, the cost of large scale testing swamps any possible benefit.
This is a dangerously incorrect mathematical interpretation.
Consider a 99% reliable test as in the classic example above. Suppose we test a town with a population of 10,000 people. If 0.1% of people in town (that’s 1 in 1000) are infected, then 10 people have COVID-19, and our test detects each of them 99 times out of 100 (on average). Meanwhile, the test incorrectly labels 1% of the 9990 remaining people (again, on average), which is about 100 people, as positive.
Will we get significantly fewer false positives if the infection rate is higher, as some experts suggest? Let’s instead suppose that 1% of the people are actually infected — that’s 100 infected people, 10 times more than we thought. Then the test correctly labels 99 of the 100 infected people, on average. Of the remaining 9900 COVID-19-free people, 99 people are incorrectly labeled as positive by the test. There are fewer false positives, yes, but fewer only by one single person.
The mistake is to think that the quality of testing depends on the prevalence of the disease. It doesn’t.
The quality of a test depends only on the capabilities of the test itself and could not possibly be influenced by the state of the population outside of the testing lab. On the other hand, our testing policies must be informed by the state of the population. A significantly lower prevalence does not result in significantly more false positives as some experts have claimed, and so we shouldn’t care at all that 50% versus 9% of the people who tested positive are actually infected — the “cost” we pay as a population for those false positives is virtually the same as it would be with a much higher rate of infection. The benefit is exponential: We catch infections much sooner, before they have a chance to take root.
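The town arithmetic can be reproduced with a short script. This is an illustrative sketch using the figures from the text (10,000 people, a test with 99% sensitivity and 99% specificity); the function name is our own.

```python
def expected_counts(population, prevalence, sensitivity=0.99, specificity=0.99):
    """Expected true and false positives when everyone is tested once."""
    infected = population * prevalence
    healthy = population - infected
    true_positives = sensitivity * infected
    false_positives = (1.0 - specificity) * healthy
    return true_positives, false_positives


for prevalence in (0.001, 0.01):
    tp, fp = expected_counts(10_000, prevalence)
    share_real = tp / (tp + fp)
    print(
        f"prevalence {prevalence:.1%}: "
        f"{tp:.1f} true positives, {fp:.1f} false positives, "
        f"{share_real:.0%} of positives are real"
    )
```

The number of false positives barely moves between the two scenarios (roughly 100 versus 99), while the share of positives that are real shifts from about 9% to 50%. What changes with prevalence is the ratio, not the cost.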
We want nobody in the population to be infected, and if that is the case, then the only way for a test to turn out positive is if the test is wrong — a false positive. In other words, we want every positive test result to be a false positive — and to keep it that way. Choosing not to test the population at large because the prevalence of disease is low is the equivalent of deciding that you don’t need to take your medication anymore because it is working. The reason you are healthy is because it is working!
Some public health specialists are advising schools and employers based on exactly this backwards reasoning.
NOTE: There are no reported false positives from any approved qPCR tests. For qPCR testing, this is a theoretical argument.
The Order of the False Negatives (or, The False Negatives of the Phoenix):
Do false negatives make whole-population testing infeasible? Imagine a test is only capable of identifying an infected sample 50% of the time. That means roughly that every time the population is tested, half of the infected individuals are identified and removed (via quarantine). If removing half of all infected individuals from a college campus every few days sounds like an obviously good idea to you, that is because it is. Simple arithmetic shows whole population testing — especially repeated whole population testing — reduces the number of infected individuals in the population. (Sophisticated models do as well: See Larremore et al.; Dawoud, RSAP.)
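The "simple arithmetic" can be sketched as follows. This is a deliberately crude model, not the sophisticated ones cited above: it assumes every positive is quarantined immediately, and `growth_per_round` is a hypothetical multiplier we introduce for new infections between testing rounds (1.0 means no spread).

```python
def remaining_infected(initial_infected, sensitivity, rounds, growth_per_round=1.0):
    """Expected number of undetected infected people after each testing round."""
    counts = [float(initial_infected)]
    for _ in range(rounds):
        caught = counts[-1] * sensitivity          # identified and quarantined
        still_circulating = counts[-1] - caught    # missed by the test
        counts.append(still_circulating * growth_per_round)
    return counts


# A test that catches only half of infections, applied four times.
print(remaining_infected(100, sensitivity=0.5, rounds=4))
# [100.0, 50.0, 25.0, 12.5, 6.25]
```

In this toy model the infected count keeps shrinking as long as the fraction missed per round, multiplied by the growth between rounds, stays below 1. A weak test applied often can still push the numbers down.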
Despite this, some health experts like Dr. Tom Jeanne, the deputy epidemiologist for Oregon, justify their recommendations not to employ whole population testing based on the rate of false negatives: “A negative result does not meaningfully increase confidence that a person is not infected,” Jeanne said. “And just as importantly, a negative result does not mean that a person has any period of protection when they are not or cannot be infected.” But our inability to certify an individual as disease free with a high degree of confidence is immaterial. Our goal, rather, is to reduce the rate of infection to zero or near zero and then keep it that way.
Dr. Jeanne concludes that because false negative rates are high, the results likely will be wrong, and thus it doesn’t make sense to use tests on asymptomatic people (Murakami, InsideHigherEd). This is not what a high false negative rate means. A test with a high false negative rate might also have an extremely low false positive rate, so that nearly every positive result it does return is correct. That is precisely the situation with all FDA approved qPCR tests for COVID-19, all of which have a false positive rate of zero. Indeed, if almost nobody has a disease, then a “test” that always gives a negative result no matter what would be correct the vast majority of the time, but that obviously doesn’t mean the always-negative test is a good one! How often a test’s results are correct simply is not the appropriate measure by which to judge a test in this context.
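To make the always-negative comparison concrete, here is a small sketch. The prevalence (0.1%) and population (10,000) are assumed values borrowed from the town example earlier in the article, and the second test (70% false negative rate, no false positives) is a hypothetical stand-in rather than a description of any particular product.

```python
def accuracy(prevalence, sensitivity, specificity):
    """Fraction of all test results that are correct."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


def infected_found(population, prevalence, sensitivity):
    """Expected number of infected people the test identifies."""
    return population * prevalence * sensitivity


PREVALENCE = 0.001   # assumed: 1 in 1000 people infected
POPULATION = 10_000  # assumed: town size from the earlier example

# name -> (sensitivity, specificity); both tests are hypothetical.
tests = {
    "always negative": (0.0, 1.0),
    "70% false negative rate": (0.3, 1.0),
}

for name, (sens, spec) in tests.items():
    acc = accuracy(PREVALENCE, sens, spec)
    found = infected_found(POPULATION, PREVALENCE, sens)
    print(f"{name}: {acc:.2%} of results correct, {found:.0f} infected people found")
```

Both tests are "right" more than 99.9% of the time, yet only one of them ever finds an infected person. Overall correctness cannot tell them apart, while the number of infections found can.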
What makes a good test?
When we say the false negative rate is high, we implicitly mean that it is high relative to some other thing and for a specific purpose. If an infected person is tested every 3 days, and the false negative rate is 70%, then the probability that the person will test positive before or on day nine is 76%: tests on days 0, 3, 6, and 9 give four chances to catch the infection, and, treating the results as independent, the chance that all four miss is 0.7 × 0.7 × 0.7 × 0.7 ≈ 0.24. Now, a false negative rate of 70% is very high relative to other tests for other diseases, and for the purpose of testing a person a single time to certify that person as free of disease, the test would be all but useless. But in the context of whole population testing, even a false negative rate as “bad” as 70% is actually pretty good, because the infected individual is identified and quarantined with high probability when they otherwise would not have been.
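The 76% figure can be verified directly. This sketch assumes the person is infected from day 0, is first tested on day 0, and that successive test results are independent; those assumptions are ours, chosen because they reproduce the number in the text.

```python
def detection_probability(false_negative_rate, num_tests):
    """Chance that at least one of num_tests independent tests comes back positive."""
    return 1.0 - false_negative_rate ** num_tests


# Testing every 3 days through day nine means tests on days 0, 3, 6, and 9.
test_days = list(range(0, 10, 3))
p = detection_probability(false_negative_rate=0.70, num_tests=len(test_days))
print(f"tests on days {test_days}: detected with probability {p:.0%}")  # 76%
```

Each additional round multiplies the chance of a miss by another 0.7, so the probability of catching the infection keeps climbing with every scheduled test.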
With mathematics, nobody needs to trust what I say is true on my authority as a mathematician. I have built an online tool that anyone can use to compute outcomes of interest based on estimates of a test’s performance characteristics that can be adjusted with a slider. The tool is live at this address:
A primary concern of public health officials regarding whole population testing is the availability and cost of tests and the capacity to process them. Whole population testing, on the other hand, clearly does have a significant benefit. Scientists from around the world are working to solve the testing problems. Soon you will be able to buy a test kit for less than a movie ticket and test yourself at home.
Resource usage may be a limiting factor in some regions, but sample pooling can potentially magnify the capacity of those resources by an order of magnitude (Mentus et al.). Performing and processing 2,500 tests a day might be out of reach for an institution, but 300 or fewer tests required by sample pooling might be feasible, allowing the institution to test everyone on a regular schedule. Over time, test processing capacity is likely to increase (MA High Technology Council, The War on COVID-19: Reducing Rt Deep Dives).
There are many factors to consider in making the determination to do whole community testing, including test kit availability, test processing capacity, and funding. Many of these needs had been anticipated by scientists around the USA in early March with the creation of the COVID-19 National Scientist Volunteer Database. Their intention to fill in the testing vacuum led to the creation of a robust network of experts in scientific testing, bioinformatics, data management, and key members willing to donate lab space and testing supplies. They continue to operate to facilitate testing and contact tracing, and access to their database and supplies can be requested here.
Returning to Oregon’s deputy epidemiologist Dr. Tom Jeanne, it is likely that Dr. Jeanne has carefully weighed all of these factors. If it is impossible to test every person, then it does not matter how effective testing every person would be. However, his claim that a high false negative rate means that test results will likely be incorrect or that they make wide-scale testing of little use is wrong as a matter of arithmetic.
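To give a feel for where savings of this size can come from, here is a sketch of one classic scheme, two-stage (Dorfman) pooling: test a pooled sample first, and retest each member individually only if the pool is positive. This is not necessarily the scheme Mentus et al. describe. It assumes a perfect test, independent infections, and a hypothetical cap on pool size (`max_pool_size`) standing in for sample dilution limits.

```python
def expected_tests_per_person(prevalence, pool_size):
    """Expected tests per person under two-stage (Dorfman) pooling."""
    # One pooled test shared by the whole pool...
    pooled_share = 1.0 / pool_size
    # ...plus one individual retest each whenever the pool contains an infection.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    return pooled_share + p_pool_positive


def best_pool_size(prevalence, max_pool_size=32):
    """Pool size (2..max_pool_size) that minimizes expected tests per person."""
    return min(
        range(2, max_pool_size + 1),
        key=lambda k: expected_tests_per_person(prevalence, k),
    )


population = 2_500  # the daily figure used in the text
for prevalence in (0.001, 0.01):
    k = best_pool_size(prevalence)
    tests = population * expected_tests_per_person(prevalence, k)
    print(f"prevalence {prevalence:.1%}: pools of {k}, about {tests:.0f} tests")
```

Under these assumptions the expected number of tests falls from 2,500 to a few hundred, and the saving grows as prevalence drops. Low prevalence, the very thing cited as a reason not to test, is what makes pooling efficient.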
What should Colleges and Communities do?
There is no one-size-fits-all solution. What does seem clear is that unless campuses are willing to take dramatic measures that will be hard for colleges and universities to swallow, the outcome is likely to be grim. We have no choice but to pay a steep price while we wait for the endgame of this pandemic. We can choose to pay the financial costs of investing in the infrastructure and resources for wide-scale testing, or we can save our money and pay instead with the human lives of those who will succumb to the disease we choose not to test for.
Public health professionals arguing against mass-scale testing.
Dr. Tom Jeanne, the deputy epidemiologist for Oregon: Concerned over False Negative Rate, Test limitations, running out of tests.
Centers for Disease Control and Prevention (CDC), Higher Ed Guidance: No mention of testing at all.
Michael T. Osterholm, Ph.D., MPH of the University of Minnesota, CIDRAP:
Testing in schools or other low-risk settings (not recommended). In most situations, school-based testing will be of limited value, unless there is a clear cluster of cases and public health officials have determined that testing would offer a public health benefit.
Widespread community-based testing (not recommended). Again, in low-prevalence settings, widespread community testing does not offer a public health benefit because of the varying positive and negative predictive value of the test results.
The overall prevalence of COVID in a healthy young adult population is likely to be very low, and probably less than 1%. At this prevalence level, the positive and negative predictive values of most screening tests would be unreliable unless the test used has both extremely high sensitivity and specificity. Many tests available today do not meet that standard.
Screening large numbers (thousands) of students will likely produce no substantial public health benefit, and at very high cost.
Response to our email expressing concern about their approach. “Thank you for reaching out and providing your thoughts. Our brief aligns with current CDC testing recommendations.”
Based on our research, as well as consultation with Keeling and Associates, we recommend that RISD focus on the use of diagnostic testing, initiated by symptomatology and/or contact tracing, supported by strict use of quarantine and isolation. Mass testing of asymptomatic people has been proposed by several schools, however we have found that such testing does not have high scientific value, because it provides only a quick snapshot of a current condition, which could change within hours. We also do not recommend surveillance testing, which samples the population to track the presence of the virus, because the rates of false negative/positive results for low-prevalence groups may approach the population disease incidence (~1%).