Coronavirus, Perfectionism, and When to Use the Binomial Approximation

Elena Polozova
Published in The Startup · May 25, 2020 · 12 min read

Almost a decade ago, I first pencilled a squiggly “≈” on my physics homework and felt a pang of guilt.

I had transgressed. I was studying math and mechanics in search of real, exact answers to real, exact questions. How could I stand by and let “ambiguity” and “close enough” contaminate my problem set? Weren’t we beyond that here?

Me in my safe haven in 2015. Image by author.

Except, confusingly — there was no other way to answer this question. The TA would only give full points for answers that acknowledged (1 + x)ⁿ ≈ 1+nx. Feeling unresolved, I stapled my homework papers together and handed them in anyway. I did this for years.

Software engineering, though, kicked me into a “Done Is Better Than Perfect” mindset real quick. My first post-grad job was sink-or-swim, and my mathematician lungs repeatedly filled with water until finally, at long last, I comprehended something my manager assumed I’d already known: it’s possible to get authentic, life-improving value out of deliverables that still leave a lot of room for improvement.

Last month I posted a coronavirus infection risk estimator that’s obviously imperfect, but I also think is valuable. It leans heavily on my old confusing frenemy, the binomial approximation.

Why did I use the approximation, (1 + x)ⁿ ≈ 1+nx? Two reasons.

  1. I had a math problem that I authentically cared about answering, and the exact answer looked unintelligible and unexplainable. I gritted my teeth and wondered how I would possibly translate the jumble of symbols to my housemates. But? The exact answer looked kind of like (1 + x)ⁿ.
  2. Years of watching unrelatable, much-older professors chalk approximations on blackboards — sometimes noting “this is hand-wavey, but…” — did train my intuition that this tool could work here. I didn’t like it, but precedent was on my side.
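For a concrete feel of what “kind of like (1 + x)ⁿ” means, here’s a quick numeric sanity check in plain Python (a sketch of mine, not from the original notebook):

```python
# How close is (1 + x)^n to 1 + n*x for small x?
for x, n in [(0.001, 10), (0.01, 10), (0.05, 10)]:
    exact = (1 + x) ** n
    approx = 1 + n * x
    print(f"x={x}, n={n}: exact={exact:.5f}, approx={approx:.5f}, gap={exact - approx:.5f}")
```

The gap grows with both x and n, which is exactly the question the rest of this article quantifies.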

So how inexact is the binomial approximation, really?

Precedent aside, two commenters on my risk estimator article shared my misgivings: they pointed out that binomial-approximating might introduce error into my results. That’s only two out of hundreds of responses, and I’d said multiple times that all my numbers were estimates, anyway — yet those comments hit my same pang of physics-student guilt from years ago.

But then I noticed another, lighter pang alongside it. This new one said “hey, I’m not a student anymore!” I’m an adult, with adult resources. I’m no longer trapped in perpetual homework-urgency-panic, and I don’t have to stuff down my inconveniently long questions for the sake of getting a good grade. I can just…answer them!

In the rest of this article, I’ll explore these long-dormant approximation-doubts. I’ll give them the breathing space they’ve wanted for years, and refine my risk estimator in the process.

Put on your gardening gloves, because we’re about to pull up some mathematical weeds. (Though, if you’re not interested in calculation nuances, please do skip to the end section “Emerging from the mathematical weeds” and I’ll share how it all emotionally turns out.)

Approximately this many weeds. Photo by David T on Unsplash.

Defining a thought experiment

Before we can run the weedy numbers, we first need an illustrative problem to calculate them on. What problem should we pick? Ultimately, I want to apply these learnings back to my So You’re Going Outside (SYGO) risk estimator, but it has nine parameters, and that’s beyond the scope of this article. Let’s simplify by cutting out seven of them.

That leaves us with the following mega-oversimplified contagion model:

You’re taking a walk outside. You pass N people on your walk, and each person you pass has a probability pᵢ of being infected. The second you pass someone who’s infected — for any definition of “pass” — you instantly become infected too.

There’s almost nothing in this setup! My SYGO estimator also looked at viral load, diffusion-based airflow, and the distinction between someone being “infected” and “contagious,” to name a few important variables. But, right now our main goal is intuition-building, and we’ll have to put the first scaffolding down somewhere. Let’s explore how the two-parameter thought experiment would unfold.

For a given N and pᵢ, what is your risk of infection?

Let’s reason it out. On this stroll of doom, you lose as soon as you pass a single contagious person. As per the scenario’s description, the moment you do that, you “become instantly infected.”

So, in order to return home safe and uninfected, you’d need every last one of the N people you encounter to be uninfected as well. The probability that any given person out of the N is uninfected is 1 - pᵢ. By compound probability of independent events, this means your probability of safety — let’s call it P_s — is all these 1 - pᵢ probabilities multiplied = (1 - pᵢ)ᴺ.

Here’s a handwritten image too, as a visual aid to the text. Image by author.

Cool, so now we have our probability of safety! But, I’m interested in the risk. Luckily we can get it quickly from what we have: risk = 1 - P_s = 1 - (1 - pᵢ)ᴺ.

Look familiar? By the binomial approximation, we can rewrite probability of safety = P_s = (1 - pᵢ)ᴺ ≈ 1 - Npᵢ.

Doing so allows us to additionally rewrite the risk as 1 - (1 - Npᵢ) = Npᵢ.
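In code, the two formulas are one-liners. A minimal sketch (`exact_risk` and `approx_risk` are my names, not from the SYGO notebook):

```python
def exact_risk(p_i, n):
    """Risk = 1 - (1 - p_i)^N: you get infected unless all N passersby are uninfected."""
    return 1 - (1 - p_i) ** n

def approx_risk(p_i, n):
    """Binomial-approximated risk ≈ N * p_i, capped at 1 since it's a probability."""
    return min(n * p_i, 1.0)

# Example: pass 50 people, each with a 0.1% chance of being infected.
print(exact_risk(0.001, 50))   # ≈ 0.0488
print(approx_risk(0.001, 50))  # ≈ 0.05
```

Note that `approx_risk` is never smaller than `exact_risk` (Bernoulli’s inequality: (1 - p)ᴺ ≥ 1 - Np), which is why the approximation only ever overpredicts.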

Why would we want to do this? Mostly because: when I’m standing on a street corner deciding if I should take the popular main road or the secret back alleyway, weighing outdoor riskiness as 1 - (1- pᵢ)ᴺ doesn’t feel intuitive or practical. I don’t have a ready sense of how it’ll change if I pass 50 people, but then the number of cases in my region doubles. Or, what if I suddenly feel a lot more social and pass 500 people?

Meanwhile, looking at risk ≈ Npᵢ feels simpler. I don’t have to call a math lecture in the kitchen to explain it to my housemates. Double the cases? Double the risk. Pass 10 times the number of people? Wow, I hope you’re prepared for your new 10x increase in risk.

Quantifying the limits of “close enough”

But, you know, there’s never really a free lunch, or else we’d already be eating it. The binomial approximation has some understandability advantages, but it’s still an approximation — it’s a squiggly equals sign, not a straight one. And, it breaks down eventually. Its closeness to the original equation depends on the values of N and pᵢ.

This is the part that kept younger physics-student me from resting easy at night. The ‘Binomial Approximation’ page on Wikipedia, along with most of my professors, only offered the underspecified guidance that the approximation “is valid when |pᵢ| < 1 and |Npᵢ| ≪ 1.” But what, exactly, is your numerical threshold for “≪ 1”? Is it 0.1? 0.5? 0.0001?

Annoyingly, as my software engineering work led me to realize — the only possible answer is “it depends.” Let’s take a look under the hood.

What does risk = 1 - (1 - pᵢ)ᴺ ≈ Npᵢ actually look like?

Here’s a 3D surface plot of risk (i.e. infection probability) as a function of pᵢ and N. The exact risk surface is above, and the binomial-approximated risk surface is below. For these pictures I let pᵢ range from 0 to 2% of the population infected, and N range from 0 to 100 people encountered.

Whee, gradients! Image by author.

Ok, this gives us some intuition! Loosely, the binomial approximation overpredicts risk relative to the exact answer, because it maxes out faster.

(A side note on “maxing out:” I artificially capped the approximation at 1 because our final output is a probability, but note that risk = Npᵢ is just a line and would keep increasing forever if I hadn’t intervened.)
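The surfaces themselves are cheap to regenerate. Here’s a NumPy sketch over the same ranges (pᵢ up to 2%, N up to 100); the rendering calls are standard matplotlib, left commented out:

```python
import numpy as np

# Grid over the plotted ranges: p_i in [0, 2%], N in [0, 100].
p = np.linspace(0, 0.02, 200)
N = np.arange(0, 101)
P, Nn = np.meshgrid(p, N)

exact = 1 - (1 - P) ** Nn          # exact risk surface
approx = np.minimum(Nn * P, 1.0)   # approximated surface, capped at 1

# To render (requires matplotlib):
# import matplotlib.pyplot as plt
# ax = plt.figure().add_subplot(projection="3d")
# ax.plot_surface(P, Nn, exact)
# ax.plot_surface(P, Nn, approx)
# plt.show()
```

The `np.minimum(..., 1.0)` is the artificial cap described in the side note above.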

But how much does it overpredict? It’s hard to squint at the 3D surfaces to compare their rate of divergence, so here they are superimposed, with the exact answer shaded differently for contrast. Also, to really spare us the squinting, below the superposition I’ll add line plots for some slices at different values of N.

The exact surface is rendered in greenish, and the binomial approximation is rendered in purpleish.

Which of these ski slopes is your favorite? Image by author.

In these plots, the error is the vertical distance between the green and the purple. We can see that the error increases as the approximated risk increases, until the approximated risk maxes out at one. Then the error starts decreasing again, as the exact answer catches up to the maxed-out binomial.

Neat! So this confirms — for the first few values of N and pᵢ, the binomial approximation closely tracks the exact risk, just like Wikipedia said it would. Seeing is believing.

This is a lead for sure, and my inner past-physics-student feels a little better. But, my burning question remains: for what values of N and pᵢ is the binomial approximation really, truly, valid?

To answer this, I need to define valid in number-speak. So, to start, let’s just choose a value. What if we’re willing to accept up to a 2% absolute overprediction? For what ranges of N and pᵢ can we freely binomial-approximate then?

Then, after we sort out the 2% error test case, can we generalize our result to all error values? What are the safe values of N and pᵢ as a function of e, the percent of error we’re willing to accept?

Quantifying the limits of “close enough,” round 2

When we constrain error ≤ 2% or error ≤ e%, what we’re graphically defining are isoclines on the “error surface” made by subtracting the greenish exact-risk surface from the purple approximated-risk surface above. Isoclines are the lines of equal value on a topographical map. So, the 2% isocline on our error surface is exactly the set of (N, pᵢ) pairs for which the binomial approximation overpredicts the true infection risk by 2%.

Let’s look at this error surface, and the isoclines!

3D-visualize ALL THE THINGS!! Image by author.
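A sketch of how this error surface and its isoclines can be computed — the `error` grid is just the purple surface minus the green one:

```python
import numpy as np

p = np.linspace(1e-4, 0.02, 400)
N = np.arange(1, 101)
P, Nn = np.meshgrid(p, N)

# Overprediction of the approximation: (capped N*p) minus the exact risk.
error = np.minimum(Nn * P, 1.0) - (1 - (1 - P) ** Nn)

# Isoclines = contour lines of constant error (requires matplotlib):
# import matplotlib.pyplot as plt
# cs = plt.contour(P, Nn, error, levels=[0.01, 0.02, 0.05, 0.10])
# plt.clabel(cs)
# plt.xlabel("p_i"); plt.ylabel("N"); plt.show()
```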

Hm… these isoclines look symmetric about the 45º line on the right chart, and they’re asymptotically approaching the N and pᵢ axes. From these two facts, I’ll take a guess that we can fit a basic inverse function to each isocline. Meaning, a function of the form N = cₑ/pᵢ, where cₑ is a constant. (Well, it’s a constant for a fixed value of e. If you change the value of e, you’ll have to get a new cₑ.)

Since computers make it easy to programmatically be super extra, let’s use scipy’s curve_fit function to test this hypothesis. Like, really test. Stress-test.

I’ll fit one inverse function to each percent-error isocline, from 1% to 30%. So, I’ll have the 1%-error isocline, the 2% error isocline…all the way up through the 30% error isocline. This is thirty isoclines, and therefore thirty inverse fits and thirty computed values of cₑ. Thank god for computers!

(A note on “fitting:” the inverse function N = cₑ/pᵢ has only one free parameter, which is cₑ. So when I say “fitting an inverse function to the isocline,” what I really mean is “using a computer to find the value of cₑ that makes the inverse line up with the e-isocline.”)
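Here’s a self-contained sketch of that fitting step for the 2% isocline. I trace the isocline with a root-finder rather than reading it off the contour plot, but the `curve_fit` call is the same idea:

```python
import numpy as np
from scipy.optimize import brentq, curve_fit

def overprediction(p, n):
    """How much the binomial approximation overshoots: N*p - (1 - (1-p)^N)."""
    return n * p - (1 - (1 - p) ** n)

# For several N, solve overprediction(p, N) = 2% for p: these (p, N) points
# trace the 2%-error isocline. (overprediction increases with p, so the
# bracket (0, 1/N] contains exactly one root.)
Ns = np.array([20.0, 40.0, 60.0, 80.0, 100.0])
ps = np.array([brentq(lambda p: overprediction(p, n) - 0.02, 1e-9, 1.0 / n)
               for n in Ns])

def inverse(p, c):
    """One-free-parameter model for an isocline: N = c / p."""
    return c / p

(c_e,), _ = curve_fit(inverse, ps, Ns, p0=[0.2])
print(c_e)  # close to 0.21, the 2%-tolerance lookup value quoted later on
```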

This chart came in too late to wear to Mardi Gras, but I can probably still use it as an abacus. Image by author.

Those fits look great!

Though, they’re pretty zoomed in, with N ≤ 100 and pᵢ ≤ 0.02. Let’s quickly zoom out to check that they still work for a broader range of values.

Expanded up to N=500 and pᵢ = 10%. Yep, the fits are still spot on! Image by author.

Ok, we can see in this chart that they’re still on target when we extend our axes by a factor of five. So, let’s say these fits are good fits. What can we learn from them?

A neat shortcut

Now that we have values for cₑ as a function of our tolerable error e, we can massage the “valid zone” formula N ≤ cₑ/pᵢ. This “valid zone” formula is what you get from staying below the percent-error isocline (N = cₑ/pᵢ). To make the valid zone formula more readable, we can multiply both sides by pᵢ. This gives us a rearranged formula: if you want to constrain error to less than e%, you’re in the clear whenever Npᵢ ≤ cₑ.

But, remember what Npᵢ is? It’s the binomial-approximated risk of getting infected from going outside!

So, this is a really easy-to-use result. In our two-parameter scenario, you can freely use the binomial approximation for estimating your risk so long as the final output risk is less than the cₑ corresponding to your tolerable error.

Ok, maybe it’s still a bit of a conceptual mouthful — but the process is simple! It’s a lookup table, and engineers use them all the time. Reading off of the legend from two figures ago, this means that an okayness with 1% error lets you binomial-approximate up to riskiness 15%; 2% error tolerance lets you binomial-approximate up to 21%, etc…up until a 30% error tolerance allows you to predict riskiness up to 89%.

A quick handwritten risk ↔ error lookup table. Here’s a link to a more extensive one! Image by author.
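As a sketch, the whole lookup-table workflow fits in a few lines. The cₑ values below are read from the fitted-isocline legend, and `can_approximate` is my hypothetical helper name:

```python
# Error tolerance -> max trustworthy output of the approximation (c_e),
# read off the fitted-isocline legend (1%, 2%, and 30% shown).
C_E = {0.01: 0.15, 0.02: 0.21, 0.30: 0.89}

def can_approximate(p_i, n, tolerance):
    """True if risk ≈ N * p_i overpredicts by at most `tolerance`."""
    return n * p_i <= C_E[tolerance]

print(can_approximate(0.001, 100, 0.02))  # 0.1 <= 0.21 -> True
print(can_approximate(0.005, 100, 0.02))  # 0.5 >  0.21 -> False
```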

This felt like such a sweet, practical result that I went and edited the binomial approximation’s Wikipedia page to include it. Talk about closure!

Unexpectedly, an editor took it down within hours. Turns out Wikipedia has a policy against citing original research. Oops! But, in a second-layer twist of fate, the editor was simultaneously kind enough to replace my lookup table with a direct error bound analysis via Taylor’s theorem.

This was super nice, but I confess — I panicked for a few seconds while reading his derivation because I didn’t immediately understand it. Had I Lost My Edge, Was I Over? A few more seconds later I think probably not. People can re-read their own PhD dissertation a few years later and not understand it either.

Ultimately, I prefer my derivation because I think it’s more understandable to mathematicians who’ve been out of the field for a few years, and also other people who’ve never been in the field at all. It’s more visual! But: I learned something new from his analysis, and the upgraded Wikipedia page can preempt frustration for future generations of wayward idealistic physicists.

The very last weed

Ok, we just did The Big Reveal, so we’re already past the climax of this article.

But personally, I can never look at sorted numbers in a table without wondering what shape they are on a graph. Sure they’re “increasing,” but what kind of increase? Linear, quadratic, log?

I have a Jupyter notebook for this article already, so I re-opened it and plotted the 30 cₑs versus their corresponding tolerable errors. Plus, I fit a curve to it. Results are in below!

Image by author.

The curve’s a recognizable shape, but the equation’s not that nice. It’s a fractional polynomial, kind of a square-root-looking thing.

Specifically, the max predictable risk cₑ fits the blue line: cₑ = 1.8*(1.7th root of e), i.e. 1.8*e^(1/1.7).

Clearly that formula’s not practical! I have no idea how to compute 1.7th roots in my head, and I definitely couldn’t do it in time to choose between walking a main road or a back alleyway. So, just out of curiosity about “there must be an easier way,” I manually fitted a fractional polynomial with simpler values, 1.5*sqrt(e), and found that it’s pretty close. One might even say, “close enough.”
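To see just how close, here’s a quick pure-Python comparison of the fitted curve and the simplified stand-in:

```python
import math

def fitted(e):
    """The notebook's fit: c_e = 1.8 * (1.7th root of e)."""
    return 1.8 * e ** (1 / 1.7)

def simple(e):
    """The easier-to-remember stand-in: c_e ≈ 1.5 * sqrt(e)."""
    return 1.5 * math.sqrt(e)

for e in (0.01, 0.02, 0.10, 0.30):
    print(f"e={e:.2f}: fitted={fitted(e):.3f}, simple={simple(e):.3f}")
```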

I don’t plan on using this formula myself; I just wanted to know the shape of the chart. But if you do use it, kudos to you!

Emerging from the mathematical weeds: conclusion

My key takeaways from this exercise are:

  • Perspective on the world: oi, exactness of knowledge comes at a huge time cost. There’s a reason my professors didn’t counsel me through my mathematical existential crises. And no, it’s not that they took joy in withholding secrets from me! But rather: just as I faced a stack of homework and finite time, they had a packed curriculum to present all in 60 minutes. People make tradeoffs under urgency.
  • Mathematical understanding: I have a genuine spatial intuition for the binomial approximation now, and I’ll use this new knowledge to refine my previous risk estimator.
  • A deliverable: I created an error lookup table for the binomial approximation! It maps “amount of error you’re ok with” to “maximum output of the approximation that you can trust.” This error bound on approximation is neat because it applies not just to simplistic models of infection, but really, to any probabilistic event that repeats multiple times. So, there’s probably some way to use it to count cards…

Also, I verified Cunningham’s Law firsthand: “The best way to get the right answer on the internet is not to ask a question; it’s to post the wrong answer.”

Now that I understand the tradeoffs, the squiggly equals sign feels liberating. It’s not a blight upon my homework, and it’s not mean-spiritedly hiding anything from me. It’s accelerating human progress by sparing us all a whole, whole lot of effort.

If you’re interested in these calculations — my highly unformatted Jupyter notebook is on GitHub here.

I’m considering writing additional error lookup tables for common approximations. Especially for my second-closest frenemy, the small-angle approximation. Tweet me @relic_radiation to +1 nudge me to do it!

Elena Polozova
Using math to make sense of life’s real, messy questions. ✧・゚: *✧・゚* Current: big tech. Prev: MIT Math, Physics, and Computer Science.