How Covid-19 Under-testing is Influencing Our Perception of Reality
An under-explored aspect of Covid-19 data is the impact of testing. As it turns out, testing may be influencing our perception of the spread of Covid, often in ways that differ from reality. The first chart we should look at is the population-adjusted (per 100K) number of Covid-19 cases by state.
See the full interactive versions of the visuals in this article at Aardvark Data.
From the map above, a cursory glance makes it clear that New York, New Jersey, and Louisiana are among the hardest-hit states. Many states, such as California, Texas, and Ohio, look relatively unscathed when compared to New York. New York has an astronomically high 390 cases per 100K, while California is at only 19 cases per 100K. Thus, California is doing much better than New York, right?
Well, maybe not.
The next map looks at the number of tests per 100K people in each state. Notice that New York leads the country on this, with 1,055 tests per 100K people. In other words, roughly 1.1% of the population in NY has been tested.
Contrast that with California where a mere 73 tests per 100K have been conducted, or roughly 0.07% of the state’s population. Thus, New York state has conducted about 14 times as many tests per capita as California. This opens the door to the possibility that some of the states with low case counts are simply under-testing.
Before we dive deeper into that, however, let’s look at one more topic: the rate of positives in the testing. This measure may provide a better understanding of the spread of the disease.
This map tells us a lot of interesting things. First off, while New York is doing a lot more testing than other states, it also has an extremely high rate of positives, with nearly 37% of tests coming back positive. If anything, this might suggest that while NY is doing a better job than the other states on testing, it’s still under-testing. This is not a comforting thought, but it is the reality.
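As a rough check, the positive rate can be approximated from the two per-100K figures already quoted for each state. The sketch below is a hedged illustration, not the method behind the map: the function name is hypothetical, and it assumes each confirmed case corresponds to one positive test and that the case and test counts come from the same snapshot.

```python
def positive_rate(cases_per_100k: float, tests_per_100k: float) -> float:
    """Fraction of tests that came back positive.

    Assumption: one confirmed case equals one positive test.
    """
    if tests_per_100k <= 0:
        raise ValueError("tests_per_100k must be positive")
    return cases_per_100k / tests_per_100k


# Per-100K figures quoted in the article: (cases, tests)
states = {
    "New York": (390, 1055),
    "California": (19, 73),
}

for name, (cases, tests) in states.items():
    print(f"{name}: {positive_rate(cases, tests):.1%} of tests positive")
```

With the quoted figures this gives about 37% for New York and about 26% for California.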
However, beyond New York, we do see some unexpected insights. With a 31% rate of positives, Oklahoma suddenly looks to be in much worse shape than the initial population-adjusted case map suggested. Thus, Oklahoma may be in nearly as bad shape as New York, and NY may only look worse because it’s testing 23 times as many people as Oklahoma.
Michigan also stands out like a sore thumb with a 39% positive rate, while conducting less than one-fifth as much testing as New York. Indeed, we can start to see evidence here that Michigan may be in worse shape than NY.
The golden boy, California, doesn’t look so golden either, with a 26% positive rate on a small fraction of NY’s testing. We can also see hotspots in Georgia, Mississippi, and South Carolina.
Comparing Apples-to-Oranges Data
So how do we compare this apples-and-oranges data? Unfortunately, there’s no definitive answer. However, we can create some metrics that examine the hypothetical impact of under-testing.
I built two different models that do just this. The first asks: if every state tested as much as New York (our current testing gold standard) and kept the same positive rate across all of that testing, how many cases would there be? To be sure, this hypothetical situation is unlikely, since the states that are under-testing are likely testing people with a higher probability of having Covid-19. Nevertheless, I include this hypothetical scenario in the interactive version of this chart at Aardvark Data (see “Adjusted Cases 100%”).
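A minimal Python sketch of this first scenario, assuming it reduces to multiplying each state's current positive rate by New York's testing level. The function and parameter names are hypothetical, and the positive rate is approximated as cases per 100K divided by tests per 100K.

```python
NY_TESTS_PER_100K = 1055  # reference testing level quoted in the article


def adjusted_cases_full_rate(cases_per_100k: float,
                             tests_per_100k: float,
                             reference_tests_per_100k: float = NY_TESTS_PER_100K) -> float:
    """Projected cases per 100K if the state tested at the reference level
    and its current positive rate held across all of that testing."""
    if tests_per_100k <= 0:
        raise ValueError("tests_per_100k must be positive")
    rate = cases_per_100k / tests_per_100k  # approximate positive rate
    return rate * reference_tests_per_100k


# California's quoted figures: 19 cases and 73 tests per 100K
print(f"California: {adjusted_cases_full_rate(19, 73):.0f} cases per 100K")
# New York is the reference, so its figure does not change
print(f"New York: {adjusted_cases_full_rate(390, 1055):.0f} cases per 100K")
```

Under these assumptions California lands at roughly 275 cases per 100K, while New York stays at 390.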
I came up with another metric as well. This second model also assumes every state tests at the same rate as New York. It starts with all the confirmed cases in each state, then adds the additional tests needed to match NY’s testing rate. However, it assumes those additional tests come back positive at only 75% of the state’s current positive rate.
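This second model can be sketched in Python as follows. It is an illustration under stated assumptions rather than the exact code behind the maps: the names are hypothetical, the positive rate is approximated as cases per 100K divided by tests per 100K, and a state already testing at or above New York's level is assumed to get no adjustment.

```python
NY_TESTS_PER_100K = 1055  # reference testing level quoted in the article
DISCOUNT = 0.75           # additional tests assumed positive at 75% of current rate


def adjusted_cases_discounted(cases_per_100k: float,
                              tests_per_100k: float,
                              reference_tests_per_100k: float = NY_TESTS_PER_100K,
                              discount: float = DISCOUNT) -> float:
    """Confirmed cases plus projected positives from the additional tests
    needed to reach the reference testing level."""
    if tests_per_100k <= 0:
        raise ValueError("tests_per_100k must be positive")
    rate = cases_per_100k / tests_per_100k  # approximate positive rate
    # Assumption: no adjustment if the state already meets the reference level
    extra_tests = max(0.0, reference_tests_per_100k - tests_per_100k)
    return cases_per_100k + extra_tests * rate * discount


ca_adjusted = adjusted_cases_discounted(19, 73)
print(f"California: {ca_adjusted:.0f} cases per 100K "
      f"({ca_adjusted / 19:.1f}x the confirmed figure)")
```

With California's quoted figures (19 cases and 73 tests per 100K), this yields roughly 211 cases per 100K, about 11 times the confirmed figure.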
Is this realistic? It’s tough to say and it could differ from state to state. Nevertheless, this is the most reasonable model IMO.
That said, every model will have its obvious flaws. We simply don’t know how many of the people who haven’t been tested would be positive cases. And even in New York, it’s likely that the state is still significantly under-testing and thus significantly under-counting cases.
Yet, this hypothetical model at least illustrates how under-testing could be skewing the data and creates something closer to an apples-to-apples comparison. The results are in the map below:
How does this change our perception? From this projection, we can see that Michigan is now in the same ballpark as New York in terms of population-adjusted case numbers. The same goes for New Jersey. Meanwhile, California’s pop-adjusted case count jumps roughly 11-fold, and while it’s still lower than New York’s, it no longer looks like a different world.
The problems in Oklahoma, Georgia, Mississippi, and Colorado become more evident as well. There’s also a theme: nearly every state with at least one large city has surging cases of Covid-19, while the states that haven’t been hit hard (e.g. New Mexico, West Virginia, North Dakota) are more rural or small-city than average.
None of my projections are meant to represent the “true reality”, which is unknown so long as we’re under-testing across the United States. But they do showcase how the under-testing phenomenon is skewing our perception of reality, leading us to believe New York is crisis-central (with some compelling evidence supporting that belief), while overlooking other hot-spots such as California and Georgia.
The narrative surrounding Louisiana also seems different. Instead of Louisiana being an outlier, we can see the same problems across most of the Southeastern US.
I’m happy we have some good data out there on Covid-19, but beware that significant adjustments need to be made to fully understand it.