A goggled, double-masked Nassim Taleb. Courtesy Nassim Taleb

Nassim Taleb’s Yuuuuuge Mistakes Interpreting the Danish Mask Study

Ted Petrou
Dunder Data

--

A large randomized study of the effectiveness of masks for controlling SARS-CoV-2 infection was recently published by a Danish research group. The study could not determine whether masks increased or decreased infection rates; its conclusion is copied below.

The recommendation to wear surgical masks to supplement other public health measures did not reduce the SARS-CoV-2 infection rate among wearers by more than 50% in a community with modest infection rates, some degree of social distancing, and uncommon general mask use. The data were compatible with lesser degrees of self-protection.

Recently, Nassim Taleb, a yuuuuuge proponent of masks, took time to show how incredibly significant the results of this study were for proving mask effectiveness. In this post, I will examine several flaws with his work and show that it’s not clear whether Nassim actually read the paper or just looked at the table of results.

The Yuuuuugest mistake — Other Respiratory Viruses

I want to cover the most obvious mistake (an omission) in Nassim’s analysis before diving into the finer details. The primary objective of the Danish mask study was to compare the SARS-CoV-2 infection rates between the treatment (masks) and control (no masks) groups.

But, there was a secondary objective, which was to measure infection rates for 11 other respiratory viruses. For these other respiratory viruses, 9 (0.5%) occurred in the mask group and 11 (0.6%) in the control, suggesting no advantage for either group. Even if Taleb showed that masks were effective at limiting SARS-CoV-2 infection, he would have to conclude that they made no difference for the other respiratory viruses.

Study Results

In this section, I will summarize the study results. The paper is fairly short and not filled with too much technical jargon, so it should be accessible to the majority of interested readers. As mentioned, the primary objective was to compare the SARS-CoV-2 infection rates between the groups.

Participants — 4862 out of 6024 people completed the study

Mask/Control — 2392/2470, about 3% more in the control group

Dates — Two (approximately equal) overlapping cohorts — April 14 to May 15, 2020, and April 24 to June 2, 2020.

Adherence — 46% wore the mask as recommended, 47% predominantly as recommended, and 7% not as recommended

SARS-CoV-2 Infection rate — 42 (1.8%) vs 53 (2.1%) for mask vs control — not statistically significant

SARS-CoV-2 Infection rate for those who wore masks “exactly as instructed” — 1100 participants wore masks exactly as recommended, with a slightly higher infection rate (2.0%) than the mask group as a whole

Other respiratory viruses — 11 other respiratory viruses were tested for. 9 (0.5%) vs 11 (0.6%) for mask vs control

Identifying Infections

Infections were identified with the following tests:

  • Antibody testing — blood test for IgM and IgG antibodies
  • PCR — the most commonly used test in most nations, done with a nasal swab sample
  • Hospital diagnoses

It’s not entirely clear to me, but it appears all participants conducted the antibody tests themselves without supervision. The antibody test result was indicated by the presence (or absence) of a single line for each antibody. Participants also performed the nasal swab themselves and sent it in for the PCR test. This self-testing and self-reporting is itself a possible source of error in my view, but that is a topic for another story.

Results by test — Taleb misreads the table

A total of 42 infections for the mask group and 53 for the control were reported. The tests identifying these infections are as follows:

  • Antibody (IgM/IgG) — 31/33 vs 37/32 for mask vs control
  • PCR — 0 vs 5 for mask vs control
  • Hospital diagnoses — 5 vs 10 for mask vs control

Notice how the total positive test results do not add up to 42 and 53. For instance, in the mask group, we have 31 + 33 + 0 + 5 = 69. This is because it’s possible to test positive for both antibodies. Removing the PCR (0) and hospital diagnoses (5), we get that 37 unique people tested positive for antibodies. Basic set arithmetic (inclusion-exclusion) shows that 27 people tested positive for both, 4 for IgM only, and 6 for IgG only.

In the control group, 38 unique individuals tested positive for antibodies, with 31 testing positive for both, 6 for IgM only, and 1 for IgG only. Taleb misreads the table, believing that the totals (42 and 53) all come from the antibody tests. We will see shortly how this error affects the significance of the results.
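The overlap arithmetic above is easy to verify with a few lines of Python (counts taken from the study’s results table; “unique” antibody positives are the totals minus the PCR and hospital cases):

```python
# Positive-test counts from the study's results table
groups = {
    "mask":    {"total": 42, "igm": 31, "igg": 33, "pcr": 0, "hospital": 5},
    "control": {"total": 53, "igm": 37, "igg": 32, "pcr": 5, "hospital": 10},
}

for name, g in groups.items():
    unique = g["total"] - g["pcr"] - g["hospital"]   # unique antibody positives
    both = g["igm"] + g["igg"] - unique              # inclusion-exclusion
    igm_only = g["igm"] - both
    igg_only = g["igg"] - both
    print(name, unique, both, igm_only, igg_only)
# mask:    37 unique, 27 both, 4 IgM only, 6 IgG only
# control: 38 unique, 31 both, 6 IgM only, 1 IgG only
```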

False Positive Rates

No test can guarantee that a positive result reflects a true infection. Test manufacturers usually provide sensitivity (the proportion of infected people who test positive) and specificity (the proportion of uninfected people who test negative).

The study authors provide two values for the specificity of the antibody tests: 99.2% from the manufacturer and 99.5% from their own internal validation. Subtracting these from 100% gives the false positive rate, the proportion of uninfected individuals who nonetheless test positive. Averaging the two gives a false positive rate of about 0.65%.

While this false positive rate seems low, it must be interpreted in the context of the actual underlying prevalence of the disease. For instance, if it is known that 1 person in a population of 1 million has a particular disease, then a test with a false positive rate of 0.65% would produce about 6,500 positives. Assuming the one infected individual is among them, the probability that you are infected, given that you tested positive, is extremely low (about 1 in 6,500).
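This base-rate effect is simple to check numerically (a sketch using the hypothetical 1-in-a-million prevalence and the 0.65% rate from above):

```python
population = 1_000_000
true_infected = 1       # hypothetical: one infected person in a million
fpr = 0.0065            # averaged false positive rate

false_positives = (population - true_infected) * fpr   # ~6,500
# Probability a positive result is a true infection,
# assuming the test catches the one real case
p_true = true_infected / (true_infected + false_positives)
print(round(false_positives), round(1 / p_true))
```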

Studying blood donors from the same time period of the study, the authors found that about 1.9% of the Danish population was infected. Using this number, we can estimate the probability of being infected given a positive result on the antibody test to be 1.9 / (1.9 + .65) or 75%.

We can apply this probability to each antibody test independently. For IgM, we have 31 positives in the mask group and 37 in the control. Multiplying by 0.75 brings the counts to 23 (1.0%) vs 28 (1.1%). For IgG, we get a similarly non-significant difference.
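Here is that adjustment as a short sketch. It uses the article’s simplified ratio for the probability of true infection, which assumes near-perfect sensitivity and ignores the (1 − prevalence) factor:

```python
prevalence = 0.019   # estimated from Danish blood donors during the study period
fpr = 0.0065         # averaged antibody false positive rate

# Probability a positive antibody result reflects a true infection (simplified)
ppv = prevalence / (prevalence + fpr)   # ~0.745

# IgM positives from the table, scaled by this probability
mask_igm = round(31 * ppv)      # of 2392 participants (~1.0%)
control_igm = round(37 * ppv)   # of 2470 participants (~1.1%)
print(round(ppv, 3), mask_igm, control_igm)
```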

Taleb’s Miscalculations with false positives

Taleb uses 1% as a “back of the envelope” percentage for the calculation of false positives. He then subtracts this estimate of false positives from the reported positives and adds back in the PCR and hospital-diagnosed positives. He also creates a variable for home infections, as masks were not instructed to be worn at home. From this, he produces updated proportions for the mask and control groups.

Let’s take care of the misreading of the table first. Taleb incorrectly uses 42 and 53 as antibody positives, when these are the totals across all tests. He should have used 37 and 38 (equivalently, subtract the 5 and 15 PCR/hospital cases from the totals). Using these numbers (and discarding home infections) yields 18 / 2392 (0.75%) vs 29 / 2470 (1.2%), which is not statistically significant using a chi-squared test. Using a smaller false positive rate (such as the 0.65% from above) would only bring the ratios closer together.
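The significance claim is easy to check yourself. A minimal pure-Python 2x2 chi-squared sketch (no continuity correction; the correction only shrinks the statistic, so non-significance would hold either way):

```python
def chi2_2x2(a, b, c, d):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Corrected counts: 18 infected of 2392 (mask) vs 29 of 2470 (control)
stat = chi2_2x2(18, 2392 - 18, 29, 2470 - 29)
print(round(stat, 2))   # ~2.26, below the 3.84 critical value (p = 0.05, df = 1)
```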

Taleb’s case for negative Infections

Haphazardly using 1% as the false positive rate is more reckless than it first appears. If recorded infections had been below about 24 in each group (1% of each group’s size), we would get negative infection counts. And if he had used 2.5% as the false positive rate, both groups would go negative.

A better approach is to scale the positive counts by the probability of true infection, derived from the estimated underlying prevalence and the false positive rate, as was done in the previous section (1.9% vs 0.65%, or about 75%). Subtracting expected false positives, as Taleb did, can produce negative infection counts.
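A quick sketch of why subtraction misbehaves while scaling does not, using the hypothetical 2.5% false positive rate from the example above:

```python
def adjust_by_subtraction(positives, n, fpr):
    # Taleb-style: subtract the expected number of false positives
    return positives - n * fpr

def adjust_by_scaling(positives, prevalence, fpr):
    # Scale by the probability a positive is real (simplified ratio, as earlier)
    return positives * prevalence / (prevalence + fpr)

fpr = 0.025  # hypothetical 2.5% false positive rate
print(adjust_by_subtraction(42, 2392, fpr))       # 42 - 59.8 = -17.8 (impossible)
print(adjust_by_scaling(42, 0.019, fpr))          # stays positive: ~18.1
```

Scaling can never produce a negative count, which is one reason it is the saner adjustment.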

Taleb uses his equation with 1% false positives to get a statistically significant result, but as we just saw, using correct counts nullifies his result.

No False Positives in PCR group

Let’s look at infections identified by PCR alone, which were very low altogether: 0 in the mask group and 5 in the control. Taleb attempts to make the case that this is the strongest result. Under his assumptions, this result is unlikely to be due to chance and is therefore significant.

But Taleb assumes 100% specificity for the PCR test, which isn’t the case. Using even a 0.3% false positive rate with Taleb’s method of subtracting false positives would lead to negative infections in the control group: 5 − 2470 × 0.003 ≈ −2.4.
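The arithmetic as a one-line check (the 0.3% PCR false positive rate is a hypothetical, not a figure from the study):

```python
pcr_positives = 5    # control group PCR positives
n_control = 2470
fpr = 0.003          # hypothetical 0.3% PCR false positive rate

adjusted = pcr_positives - n_control * fpr
print(round(adjusted, 2))   # -2.41: subtracting false positives gives a negative count
```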

Ignoring false positives for the PCR comparison while accounting for them in the total-case calculation is inconsistent.

15x more positives from antibody tests than from PCR

A total of 75 participants across both groups tested positive on the antibody tests, compared to just 5 by PCR. How is this 15x difference even possible?

It’s not clear to me whether every single participant was tested with both the antibody and PCR tests, though the language in the study suggests this was the case. If all participants were given both tests, then at least one of the tests must be flawed. Given that the researchers estimated that 1.9% of the population was infected at the time, much more weight should be given to the antibody tests, as 75 / 4862 is about 1.5% while 5 / 4862 is only about 0.1%.

PCR False Negatives

The minuscule number of positive PCR results is very suspicious to me and indicative of a large number of false negatives. If all participants were indeed PCR tested, I can’t think of a reason why the number would be so low. This is especially true considering how sensitive PCR tests are, even picking up residual viral RNA fragments when the individual is unlikely to be infectious to others.

Taleb never accounts for false negatives in his work. You would only need a false negative rate of about 0.2% (adding about 5 cases to each group) to nullify any previously calculated significance.

Other Respiratory Viruses

A secondary outcome of the study was to measure the infection rates of 11 other respiratory viruses between the two groups using PCR tests. This resulted in 9 positive cases in the mask group vs 11 in the control, a clearly non-significant result that even Taleb can’t make disappear.
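A 2x2 chi-squared statistic (a quick pure-Python sketch, no continuity correction) confirms just how far from significance 9 vs 11 is:

```python
def chi2_2x2(a, b, c, d):
    """Chi-squared statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# 9 of 2392 (mask) vs 11 of 2470 (control) positive for other respiratory viruses
stat = chi2_2x2(9, 2392 - 9, 11, 2470 - 11)
print(round(stat, 2))   # ~0.14, nowhere near the 3.84 critical value (p = 0.05, df = 1)
```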

Summary

From my reading of Taleb’s work, it is not clear whether he actually read the full study. His approach appears disingenuous, aimed solely at proving the study wrong and, conveniently, in his favor. Below, I summarize the flaws I found in his work:

  • He did not mention the other respiratory viruses (9 vs 11)
  • He miscounted the number of infections from the antibody tests
  • He subtracts false positives from the (miscounted) total to derive significance
  • He uses a high false positive rate
  • Negative infections are possible when subtracting false positives
  • He doesn’t calculate false positives for the PCR test
  • He doesn’t mention the suspicious 15x difference between antibody and PCR positives
  • He doesn’t calculate false negatives

--


Ted Petrou
Dunder Data

Author of Master Data Analysis with Python and Founder of Dunder Data