The Fault in Our Metrics

Jun 20, 2020

‘The recovery rate improves to 54.1 percent’

‘ABC city is the worst hit with cases doubling in 11.4 days’

Sound familiar? Living as we are through this pandemic, we see headlines like these on the evening news every day. We have become accustomed to absorbing the emotion behind such statements and leaving the rest to the experts. But I must admit to feeling perplexed by the void that these numbers leave behind. Even if one clicks through such articles, the insights are mostly mundane, with comparisons limited to a few widely accepted statistics. The best inference on offer is perhaps the state-wise or district-wise breakup that accompanies them. To make matters worse, bureaucrats often thump their chests over arbitrary metrics, with little thought given to what those metrics mean. Needless to say, the numbers feel incomplete.

Now, to feed this curiosity, I sat down, researched, and brainstormed a bit on common COVID-19 metrics we hear about. This article is a small attempt to critique the three most prominent of them and to elaborate on the practices followed in their reporting.

The Numbers

Before you proceed, let me set the expectations straight. This section focuses only on these three metrics — doubling rate, recovery rate, and change in active cases. And throughout the analysis, I force myself to ask a couple of questions, and I’d highly recommend that you do the same:

What does the statistic tell us?
What should it tell us?

With that, let us jump in.

1. Doubling Rate

Financial Express explains doubling rate quite simply — it is the time taken, in days, for the number of diagnosed cases to double. The definition appears simple. But once we put emphasis on two words — ‘time taken’ and ‘diagnosed’ — the picture becomes murkier.

When the doubling rate, say, becomes 21.7, does that mean that cases have doubled in the past 21.7 days or are expected to double from the present number 21.7 days in the future? Interestingly, the metric is borrowed exactly from the world of finance and acts as an estimate of the future doubling time, and not as a reflection of the past. The trend is naturally captured using the five, seven, or ten-day moving averages.
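To make the forward-looking reading concrete, here is a minimal Python sketch of one way such an estimate could be computed. It assumes the growth pace is averaged geometrically over a seven-day window; the function name `doubling_time`, the `window` parameter, and the sample series are all hypothetical, and official agencies may well use a different smoothing method.

```python
import math

def doubling_time(cumulative_cases, window=7):
    """Estimate the days needed for cases to double at the recent growth pace.

    `cumulative_cases` is a list of positive daily cumulative totals, oldest first.
    `window` is the number of days over which the growth rate is averaged.
    """
    if len(cumulative_cases) < window + 1:
        raise ValueError("need at least window + 1 daily totals")
    recent = cumulative_cases[-(window + 1):]
    # Average daily growth factor over the window (a geometric mean).
    daily_growth = (recent[-1] / recent[0]) ** (1 / window)
    if daily_growth <= 1:
        return math.inf  # no growth, so cases never double at this pace
    return math.log(2) / math.log(daily_growth)

# Hypothetical series growing 5% a day.
cases = [1000 * 1.05 ** day for day in range(8)]
print(round(doubling_time(cases), 1))  # about 14.2 days
```

Note that the result is a projection: it says how long doubling would take if the recent pace held, not how long the last doubling actually took.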

Even without ever looking up the meaning, quite a few of us must have had the right idea of the term in our minds. The tough bit comes when we try to look into the number of ‘diagnosed’ cases. As is often the case with statistics, the true picture lies hidden in the broken-down parts. In this case, the total number of diagnosed cases would depend on:

The criteria for testing individuals
The type of testing — pool testing, individual testing or combined
The number of diagnostic tests performed
The definition of a ‘positive’ case

To make the causation clearer: if one day the testing criteria are broadened to include more asymptomatic people, testing is made more accessible to the common man, and the definition of a positive case is loosened, then reported cases will rise almost immediately on all three counts, shortening the doubling time. What doesn’t help the case for the metric is that all these shifts have actually been happening across states. The testing criteria have widened. The testing strategies have shifted. Multiple times. And the definitions have, likewise, been open to interpretation.

All of this calls into question the usefulness of the doubling rate as an indicator of slowing spread, and shows how thinly the metric is reported in print media. Since the underlying factors do not remain consistent, a good practice would be to publish any changes in the rules or definitions alongside it. Moreover, quoting a couple of additional numbers would make the doubling rate more informative:

% Growth in diagnostic tests: if the doubling rate is worsening, a potential reason could be more tests per day. Reporting this statistic should give people more insight into the doubling trends.
Infection or Positive Rate: on the same lines, the proportion of those tested who turn out positive would indicate whether we are testing enough, and would help us gauge the level of disease spread if the rate is increasing despite more testing.
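The two companion numbers above can be sketched in a few lines of Python. The weekly figures below are made up purely for illustration, and the helper names `pct_growth` and `positivity_rate` are my own.

```python
def pct_growth(previous, current):
    """Percentage change from `previous` to `current`."""
    return (current - previous) * 100 / previous

def positivity_rate(positives, tests):
    """Share of tests that came back positive, in percent."""
    return positives * 100 / tests

# Hypothetical weekly totals, not real data.
last_week = {"tests": 100_000, "positives": 5_000}
this_week = {"tests": 150_000, "positives": 6_000}

test_growth = pct_growth(last_week["tests"], this_week["tests"])
case_growth = pct_growth(last_week["positives"], this_week["positives"])
rate_before = positivity_rate(last_week["positives"], last_week["tests"])
rate_after = positivity_rate(this_week["positives"], this_week["tests"])

print(test_growth, case_growth)  # tests grew faster than new positives
print(rate_before, rate_after)   # so the positive rate actually fell
```

In this made-up case new positives rise by a fifth, which would shorten the doubling time, yet the positive rate falls because testing grew faster. A headline quoting only the doubling rate would miss that.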

2. Recovery Rate

Another oft-quoted stat is the recovery rate, i.e. the proportion of all COVID-19-positive people who have recovered from the illness. For example, if the recovery rate of a country is 20%, then the rate improves only if more than two people recover for every ten newly diagnosed. The definition is again intuitive. But much like before, the true story lies in the fine print.

The example illustrates how the trend in the statistic is largely dependent on two variables:

Those newly diagnosed with COVID-19, and
Those cured of COVID-19

The first of the two derives from the same diagnosed-case count that underlies the doubling rate, and is thus prisoner to all the limitations highlighted for that metric. If the criteria for diagnostic tests change such that fewer cases are registered, then, assuming the same recovery trend among already-positive patients, we will be left with an improving but misleading recovery rate. Not only that: if the definition of ‘recovered’ or ‘cured’ is relaxed, the problem is compounded further.
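A small Python sketch shows how the same recoveries can produce two different recovery rates depending on how many new cases get registered. All figures are hypothetical, and `recovery_rate` is an illustrative helper rather than any agency's official formula.

```python
def recovery_rate(total_recovered, total_confirmed):
    """Recovered patients as a percentage of all confirmed cases."""
    return total_recovered * 100 / total_confirmed

# Hypothetical starting point: 1,000 confirmed cases, 200 recovered.
confirmed, recovered = 1_000, 200
new_recoveries = 30

rate_before = recovery_rate(recovered, confirmed)

# Scenario A: broad testing registers 100 new cases.
rate_broad = recovery_rate(recovered + new_recoveries, confirmed + 100)

# Scenario B: narrower testing registers only 40 new cases.
rate_narrow = recovery_rate(recovered + new_recoveries, confirmed + 40)

print(round(rate_before, 1), round(rate_broad, 1), round(rate_narrow, 1))
```

The patients and their recoveries are identical in both scenarios. Only the number of registered cases differs, and the narrower testing regime reports the better-looking rate.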

And as it happens, we have seen measures on both factors. The testing strategy set by ICMR was, for most of the last three months, surprisingly tight, allowing tests only for people with symptoms. According to experts, this allowed ~80 percent of patients, the asymptomatic cases, to go undetected. The strategy was changed only lately to include specific asymptomatic testing. Worryingly, the guidelines have not been uniform across states. In fact, the central state broadened its testing criteria only ten days ago, much later than the Health Ministry advised. These inconsistent revisions in the strategies that drive the doubling rate and recovery rate highlight the incompleteness of country-wide reporting.

The looser testing criteria pull down the recovery rate. And almost as if to offset that, there have been changes in the opposite direction. The criteria for discharge were revised to a lower number of days, ten, and the previously mandated RT-PCR test was removed from the pre-conditions for discharge. This inevitably distorts the recovery rate and calls into question the legitimacy of its sudden rise, as shown in the graph below.

Credits: Business-Standard via Ministry of Health and Family Welfare (MoHFW) Database

To make reporting and understanding of recovery rate better then, it makes sense to track and report the change in three numbers as well:

The Number of tests/million population or Testing Rate: if the number of tests per unit population is on the rise in the state or the country, then that explains why the recovery rate may not be going up fast. The reverse holds as well.
% Growth in Recovered cases: the increase in the number of patients recovered would help us learn two things: whether more people are being treated in hospitals, and whether the illness is presenting in milder rather than severe forms. Either way, this would add value to the recovery rate.
The Average Recovery Period per discharged patient: this would be a tough statistic to collate, but if we can get good representative sample data from hospitals — then this can indicate the degree of illness and act as a leading indicator for the future recovery rate. For example, if the recovery period comes down from 20.7 to 18, then that implies that cases are milder in nature, or that our hospital procedures have improved. This would directly feed into a higher recovery rate.

3. Active Cases

Active cases, a derivative of the above two metrics, form the third key statistic. This is relatively simple to understand, with the number reflecting a simple subtraction: total positive cases minus total recoveries. But data agencies and columnists often use the incremental change in active cases, i.e. the day’s new COVID-19 cases minus the day’s new recoveries, to see the trend. For example, if the total active cases as of yesterday were 5000, and there is a net addition (new cases minus new recoveries) of 300, then the growth in active cases is 6%. The simplicity of the metric makes it quite useful.
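The arithmetic in this example can be written out as a short Python sketch. The split of the net addition of 300 into 800 new cases and 500 new recoveries is my own assumption for illustration, and the function names are hypothetical.

```python
def active_cases(total_confirmed, total_recovered):
    """Active cases as defined in this article: confirmed minus recovered.

    Official tallies typically subtract deaths as well; that term is
    left out here to match the definition used in the text.
    """
    return total_confirmed - total_recovered

def active_growth_pct(active_yesterday, new_cases, new_recoveries):
    """Daily percentage growth in active cases."""
    net_addition = new_cases - new_recoveries
    return net_addition * 100 / active_yesterday

# 5,000 active cases yesterday and a net addition of 300 today.
# The 800 / 500 split is an assumed breakdown of that net figure.
growth = active_growth_pct(5_000, new_cases=800, new_recoveries=500)
print(growth)  # 6.0
```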

But still wearing the hat of a hard critic, I am quite puzzled by how it is reported and interpreted. Let me highlight a couple of points.

For one, the state that tests more per unit of its population, everything else remaining constant, should see a higher number of active cases. And as it ramps up asymptomatic tests, the incremental growth will only be quicker. This, however, does not mean that the high-testing state is doing worse than low-testing states that report fewer active cases. The additional detections can instead reveal information about community transmission, and even help in contact tracing, something that low testing will not highlight. In fact, states that saw a high number of active cases initially have done better to contain the spread.

Truth be told, the idea of the ‘active cases’ statistic is to show the spread of disease in the region, and it also quietly indicates how well the health facilities are coping with the spread, through the recoveries. Now, instead of relying solely on active cases, a better practice could be to track the infection rate (the proportion of positives amongst those tested) and the case-fatality rate or CFR (the percentage of infected persons who have a fatal outcome).

The evidence also indicates that the countries that have focused on measuring infection rate based outcomes have been the most successful in the fight. The fact that the rate has risen from 4.6% to 7.8% should alarm many. It could mean that either we are not testing enough, or we are at the end of the curve with a rising infection rate. In the coming weeks, as we look at the active cases, a good idea would be to keep a close eye on the infection rate in particular.

Next, the case-fatality rate, which has gradually risen from ~2.9 percent to 3.4 percent, would throw more light on the adequacy of the health system in the states. Even with a rising infection rate, if we see a falling CFR, then the doctors and nurses on the front line are doing a tremendous job and we should rest just a little easier. The other way around should make our hair stand on end. That said, the lag between diagnosis and death, and the underreporting of illness, make a standalone CFR an incomplete statistic. So it becomes important to use these metrics in combination for the fullest understanding.
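To illustrate why the delay in deaths matters, here is a rough Python sketch that compares a naive CFR with a lag-adjusted one. The lag adjustment (dividing today's deaths by the case count from some days earlier) is a crude correction I am assuming for illustration, not a method the reporting agencies are known to use, and both the series and the lag value are hypothetical.

```python
def naive_cfr(cumulative_deaths, cumulative_confirmed):
    """Deaths to date as a percentage of cases confirmed to date."""
    return cumulative_deaths[-1] * 100 / cumulative_confirmed[-1]

def lagged_cfr(cumulative_deaths, cumulative_confirmed, lag_days):
    """Deaths to date as a percentage of cases confirmed `lag_days` ago.

    Assumes deaths trail diagnosis by roughly `lag_days`; the right
    lag is uncertain, so treat the result as a rough upper view.
    """
    if len(cumulative_confirmed) <= lag_days:
        raise ValueError("series is shorter than the chosen lag")
    return cumulative_deaths[-1] * 100 / cumulative_confirmed[-1 - lag_days]

# Hypothetical cumulative totals, oldest first, during fast growth.
confirmed = [1_000, 2_000, 4_000]
deaths = [20, 50, 100]

print(naive_cfr(deaths, confirmed))               # 2.5
print(lagged_cfr(deaths, confirmed, lag_days=2))  # 10.0
```

When cases are growing quickly, the naive figure is diluted by recent diagnoses whose outcomes are not yet known, which is one reason a standalone CFR can look more reassuring than it should.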

Final Thoughts

The article is not an attempt to downplay the relevance of these metrics altogether. They still add great value and stand as important as any other. Instead, it is meant to elaborate on what underlies these statistics and how their reporting in conventional media can offer more value to the common man. The joys and fears we feel with the headlines can often be irrational. To ensure that we accurately understand the situation in our state and the country, getting a more complete sense of what those headlines indicate and where the trends point assumes great importance. Even if we do not see the complete story in the news articles, we should always attempt to decipher their statements with the help of research. And in this data-crazy world, it should not be too much effort!

Hope the article was informative. If you have any thoughts to share, let me know in the responses or over LinkedIn. Stay home, stay safe, people!