Covid 19 — What the Data Tells Us

Josh Ketter
Jun 3, 2020 · 11 min read

The Punchline: We got it wrong and now we’re largely tracking it wrong...

Did you know the “new cases” in daily reports are actually old infections that were only recently reported? Most of what is reported is old news and does not indicate what is happening now or how we are trending.

We must look at “Date of Onset” to do ACCURATE trend analysis, but the CDC stopped reporting onset data in April.

Date of Report vs. Date of Onset

Using the CDC’s own data, we can see that while 45,000 cases had been reported by March 23, in reality about 140,000 people who would later test positive already had symptoms by that date (counted by date of onset) but had yet to be reported.
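The difference between the two ways of counting can be sketched in a few lines of Python. The line list below is made up purely for illustration (it is not CDC data), and the function name is my own; the point is only that the same cases give different cumulative totals depending on which date you index by.

```python
from datetime import date

# Hypothetical line list (illustrative only, not CDC data):
# one (date_of_onset, date_of_report) pair per confirmed case.
cases = [
    (date(2020, 3, 10), date(2020, 3, 20)),
    (date(2020, 3, 12), date(2020, 3, 23)),
    (date(2020, 3, 12), date(2020, 3, 25)),
    (date(2020, 3, 15), date(2020, 3, 25)),
]

ONSET, REPORT = 0, 1  # positions of the two dates in each pair


def cumulative_by(case_list, which_date, cutoff):
    """Count cases whose chosen date falls on or before the cutoff."""
    return sum(1 for case in case_list if case[which_date] <= cutoff)


cutoff = date(2020, 3, 23)
by_onset = cumulative_by(cases, ONSET, cutoff)
by_report = cumulative_by(cases, REPORT, cutoff)

print(f"Cases with onset by {cutoff}: {by_onset}")
print(f"Cases reported by {cutoff}: {by_report}")
```

In this toy list every case had its onset by the cutoff, but only half had been reported, which is the same kind of lag the onset data shows.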

And now that multiple studies suggest testing captured only 5–10% of actual infections, it appears that at least several million people (yes, million) were already infected by March 23rd and never tested.
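As a rough sketch of that scaling, assuming the 140,000 onset-dated cases and the 5–10% capture range quoted above (the function name is mine, and the result is a back-of-envelope floor rather than a model):

```python
def estimated_infections(confirmed_cases: float, capture_rate: float) -> float:
    """Scale confirmed cases up by the share of infections that testing captured."""
    return confirmed_cases / capture_rate


onset_cases = 140_000  # cases with onset by March 23, per the figure above

low = estimated_infections(onset_cases, 0.10)   # testing caught 10% of infections
high = estimated_infections(onset_cases, 0.05)  # testing caught 5% of infections

print(f"Estimated infections: {low:,.0f} to {high:,.0f}")
```

This only counts people who already had symptoms by that date. Anyone infected but not yet symptomatic, or never symptomatic, would push the total higher.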

The CDC is also now combining PCR & Antibody tests, so the “new cases” could be some of the millions of cases from 4 months ago. NPR called them out on it.

Per Nate Silver, a world-renowned statistician, the average person might as well ignore the cases being reported; it’s apples and oranges.

We are suffering from data illiteracy in this country — from the Institutions capturing it (like the CDC), to the Politicians making decisions from it, to the Media reporting on it, all the way down to the Public consuming it.

Think of it this way: the CDC now believes (as do I) that ~10% of the U.S. has already been infected (i.e. ~32M). So if we tested everyone with antibody tests, we’d have ~30 million new cases added to the report. Clearly we wouldn’t have truly had 30,000,000 new infections in 1 day, but our main sources of news are reporting it that way. If your state / city is using this data, rather than date of onset or hospitalization data, then they don’t understand how to use data.

As someone who modeled Covid19 in March and accurately predicted it (so far), this is how I believe our reporting should look. *The CDC actually took a similar approach in reporting the 2009 H1N1 Pandemic*

It is irresponsible to report on “lab confirmed” cases without providing an estimate for “unreported” or “estimated infections.” The ~5% fatality rate we see in the news (deaths divided by lab-confirmed cases) is really closer to 0.3% once deaths are divided by everyone infected. And of course, far more people have been exposed, but nobody reports the lower “exposure to fatality” ratios.
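Here is a minimal sketch of how the two rates relate. The death and case counts are hypothetical round numbers chosen to give a 5% case fatality rate, and the 6% capture rate is an assumption picked from within the 5–10% range mentioned earlier.

```python
def case_fatality_rate(deaths: float, confirmed_cases: float) -> float:
    """Deaths divided by lab-confirmed cases (what most headlines quote)."""
    return deaths / confirmed_cases


def infection_fatality_rate(deaths: float, confirmed_cases: float,
                            capture_rate: float) -> float:
    """Deaths divided by estimated total infections.

    capture_rate is the assumed share of all infections that testing confirmed.
    """
    estimated_infections = confirmed_cases / capture_rate
    return deaths / estimated_infections


# Hypothetical round numbers, for illustration only.
deaths = 5_000
confirmed = 100_000
capture_rate = 0.06  # assumption: testing confirmed 6% of infections

cfr = case_fatality_rate(deaths, confirmed)
ifr = infection_fatality_rate(deaths, confirmed, capture_rate)

print(f"CFR: {cfr:.1%}")
print(f"IFR: {ifr:.1%}")
```

Put another way, the IFR is just the CFR multiplied by the capture rate, so the lower the share of infections that testing finds, the more the headline rate overstates the risk.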

Background: I’m a retired Data & Analytics Leader who managed modeling teams at the highest level for — I also have a passion for medicine, and have built FDA programs, and read studies for fun.

Here’s what I’m going to cover in this article (with plenty of sources/data):

  • The Forecast — Where are we at and where are we going, what to expect?
  • Risk & Fatality — What’s my true risk of dying from Covid19?
  • Immunity — If I had it am I safe? What % is required for herd immunity?
  • Transmission — Where am I at the greatest risk, how does it spread?
  • Data Accuracy — Are we over/under counting deaths?
  • Capacity — how close were/are we to overloading our healthcare?
  • The Costs — Have we saved more or cost more lives w/ our policies?
  • Sources — Where to get data/insights directly? i.e. sources?
  • Media Accuracy— How good is our reporting?

First, let me give you a SUPER high level summary:

  • Covid19 is serious, but it’s significantly less severe than we thought
  • Fatality: According to data from the best-studied countries and regions, the lethality of Covid19 is on average about 0.3%, which is about ten times lower than originally assumed by the WHO. As of 5/20/2020 the CDC is now estimating ~0.26% in the U.S.
  • Fatality in Context (i.e. Risk): For the general population of school and working age, the risk of death is typically in the range of a daily car ride to work. The risk was initially overestimated because many people with mild or no symptoms were not taken into account, i.e. we’re capturing most of the deaths but only 5–10% of the infections. Risk varies significantly by age. Kids (particularly toddlers) are ~20x more likely to die from the flu or pneumonia than from Covid. For those over age 70, the risk of fatality is ~2.5%, and it is much higher for those in poor health.
*Note* this was based on the original model out of the U.K. that speculated up to 2M deaths. From new studies, we now know each of these Covid data points is actually ~2–3x lower than shown here.
  • Risk & Symptoms: For every 100 people infected ~20-50 of them will show no symptoms (i.e. asymptomatic). Of the symptomatic, ~80% have mild symptoms. Even among 70–79 year olds, about 60% remain symptom-free. So when accounting for symptomatic + asymptomatic, ~90% have mild or no symptoms. While most won’t know it, for a few, it’s vicious and deadly. *I had it, and it was brutal*
  • Immunity: For the ~30M Americans who already had Covid19, we don’t yet know how effective the antibodies are or how long they will last. But based on other coronaviruses, it’s reasonable to assume they are protective for 6 to 34 months. We also don’t know what % of our population has a strong enough innate immune system to prevent infection in the first place. But up to 60% of the population may already have some level of cross-reactive protection at a cellular level due to contact with previous coronaviruses. So it is possible that up to 70% of the U.S. has varying levels of resistance.
Not to be confused with “infections”: for herd immunity, we need ~50–90% of the population to be sufficiently resistant.
  • Comorbidity: Very high. The median or average age of the deceased in most countries (including Italy) is over 80 years, and only about 1% of the deceased had no serious preconditions. In NYC almost everyone had underlying conditions.
  • Highest Risk: In many states, up to two thirds of all extra deaths occurred in nursing homes — we clearly failed our most vulnerable 😢
  • The Costs: (Lives vs. Lives) not Economy vs. Lives. We may never know this fully, but up to 50% of all additional deaths may have been caused not by Covid19, but by the effects of the response, i.e. policies & panic. For example, the treatment of heart attacks and strokes decreased by up to 60% because many patients stopped visiting hospitals.
  • The Costs: The number of people suffering from unemployment, psychological problems, suicides, delayed treatment, and domestic violence has skyrocketed. Several experts believe that these may claim more lives than the virus itself. According to the UN, millions of people around the world may fall into absolute poverty and famine, which causes more disease / death. One estimate from professors at Stanford & Duke has calculated we’ve now lost more years of life due to our response than the virus.
  • Data Accuracy: We have gaps everywhere, but “hospitalizations” are probably the cleanest and earliest indicator we have. Everyone wants to talk about Fatalities, and we have work to do there. We’re missing some, but we also have clear instances of over-counting, i.e. gunshot victims counted because they had tested positive. It’s not clear whether we have more instances of over-counting or undercounting — or whether they died from or simply with coronavirus. Official figures usually do not reflect this distinction. We’ll need a strong post-mortem study that looks at these.
  • Data Accuracy: For now the best method we have to cut through the noise is to look at mortality in total, “excess deaths.” And even this depends on what you believe. i.e. if we assume 50% of the excess deaths are from our measures, then we’ve over-counted by 70%. If we assume Covid19 is the only variable at play, then we’ve undercounted by 15%. In any case, our counts are likely “in the ballpark” — i.e. tens of thousands.
  • Baselines & Context: The normal overall mortality per day is about 8,000 people in the US. Influenza mortality per season is up to 80,000. Most of our leading forecasts suggest Covid19 will account for ~5% (150,000) of our typical annual deaths (~3,000,000).
  • Capacity: Many clinics in Europe and the US remained strongly underutilized or almost empty during the Covid19 peak and in some cases had to send staff home. In the U.S. we have lost 1.4M healthcare jobs so far, and numerous operations / therapies were cancelled, including some organ transplants and cancer screenings.
  • Capacity: At peak, New York City had around 1 in 6 hospital beds open and around 1 in 10 ICU beds open. Hospitals had capacity. Nationally, the CDC reports that “Covid Like Illness” at most represented ~7% of hospitalizations … it’s currently under 2%.
  • Forecasts: Most current forecasts assume we’ll go from ~100K fatalities to 130–200K over the coming months. And using the CDC IFR estimate, this means infections would go from ~30M to ~50M. There is debate about how much higher it will go and the size of the second wave. Some believe we’ll top out between 15–20% of the population infected, whereas others say 70–90%. I’d point out we have never had a pandemic infect more than 20–30% of the population.
  • Sources & Scientific Debate: I’ve included a long list of great resources at the bottom of this article (WorldoMeter, OurWorldinData and the CDC are my favorites). I’ve also found to be one of the better sources on “context” and a balanced discussion for Covid19. If you only follow MSM and Social Media, chances are you may be surprised to learn there is a fair number of seasoned & credentialed dissenters, including Nobel Prize winning scientists who completely disagree on the approach most of the World’s governments have taken. In fact, one of the Top 100 most cited scholars, a BioMedical Statistician and Sr Epidemiologist from Stanford, John Ioannidis, wrote a scathing paper, and then conducted one of the first seroprevalence studies. He was berated for it (and to be fair, it did have issues). Only now the CDC and almost every serology study is showing he was more right than wrong (on a rate basis), and more accurate than those who criticized him. The question has to be asked, are we listening to the right scientists?
  • Source Accuracy: Most of the predictions have been grossly wrong. For example, many models assumed a 20% hospitalization rate, whereas we’re seeing ~1%. The models that influenced UK & US policy predicted 90K deaths by now in Sweden for not locking down; the actual count is closer to 5K. At the center of many of our policies / strategies was Imperial College & Neil Ferguson. He is the same modeler who said 200M people could die from the bird flu (vs. 282 actual) and that H1N1 had a fatality rate of 0.4% (actual was 0.026%, or 15x lower). Plus, whoever Cuomo and de Blasio were listening to in NYC when they said they needed 30,000 more ventilators within a week or tens of thousands more would die should lose credibility. We need to start listening to those who are right and stop listening to those who are wrong.
  • Media Accuracy: Unfortunately, many media outlets have failed to report accurately and with context, and yes, in some cases have totally misrepresented the data and/or used images incorrectly. Some used emotional headlines like “cases surging” while case growth was actually decelerating from 30% daily growth rates down to 2–4%. They would also quote “more people dead than 9/11” without mentioning that more people die EVERY DAY from normal causes than died on 9/11. I make no conclusions as to whether it’s incompetence or ill intent. But it’s definitely wrong, misleading, and fear-mongering.
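For anyone who wants to check the baseline and forecast arithmetic in the summary, here is a small sketch using only the numbers quoted in the list above. The constant and function names are mine, and dividing deaths by the IFR is a simplification that ignores the lag between infection and death.

```python
def implied_infections(deaths: float, ifr: float) -> float:
    """Infections implied by a death count and an infection fatality rate."""
    return deaths / ifr


# Figures quoted in the summary list above.
DAILY_DEATHS_BASELINE = 8_000      # normal U.S. deaths per day
TYPICAL_ANNUAL_DEATHS = 3_000_000  # typical U.S. deaths per year
FORECAST_COVID_DEATHS = 150_000    # leading forecast for Covid19
CDC_IFR = 0.0026                   # CDC estimate of ~0.26%

annual_from_daily = DAILY_DEATHS_BASELINE * 365
covid_share = FORECAST_COVID_DEATHS / TYPICAL_ANNUAL_DEATHS

print(f"8,000 deaths/day over a year: {annual_from_daily:,}")
print(f"Covid19 share of typical annual deaths: {covid_share:.0%}")

for deaths in (100_000, 130_000, 200_000):
    millions = implied_infections(deaths, CDC_IFR) / 1e6
    print(f"{deaths:,} deaths at an IFR of {CDC_IFR:.2%}: ~{millions:.0f}M infections")
```

The implied infection counts move a lot with small changes in the IFR (0.26% vs. 0.3%, for example), so treat them as ballpark figures.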

If you only had 5 minutes that’s probably all you need to know and can stop here … but if you want to really dig into the data, I have a full write-up here: “Covid 19, What the Data Tells us in Detail” i.e. Part 2.

Appendix & Sources

1) Covid-19 infection fatality rates (IFR) based on antibody studies

Study Links: Global, Germany, Iran, USA1, Denmark, USA2, USA3 — I didn’t list Spain, but adj to U.S. Pop is 0.6%

2) Date of Onset Chart— we peaked back in mid-March

Using Date of Onset data, we can see that we peaked in Mid-March (going from 30% daily growth in infection down to 2–4%).
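To put those growth rates in more intuitive terms, a daily growth rate can be converted into a doubling time. This is a minimal sketch that assumes steady compound growth at the stated rate, which real epidemics only approximate over short windows.

```python
import math


def doubling_time_days(daily_growth_rate: float) -> float:
    """Days for cases to double under constant compound daily growth."""
    return math.log(2) / math.log(1 + daily_growth_rate)


for rate in (0.30, 0.04, 0.02):
    print(f"{rate:.0%} daily growth: doubles about every "
          f"{doubling_time_days(rate):.1f} days")
```

At 30% daily growth, infections double in under three days. At 2–4%, doubling takes roughly two and a half to five weeks.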

3) Fatality & Transmission Rates compared to other Viruses

*Modified for the CDC update on 5/20/2020. Originally taken from the Information is Beautiful Covid19 datapack*

4) Epidemic Curve — this largely fits our current data set

5) Reality vs. Tests — mistaking test capacity for real epidemic curve

@plaforscience — using Spain’s data & studies. This is Spain’s estimated curve vs. measured by PCR Tests. *The real epidemic curve can now be closely estimated because of antibody studies + hospitalizations* The same is true in the U.S. and elsewhere. Most media are confusing testing levels & dirty data for changes in infections.

6) Indirect fatalities and costs we need to track

7) Fatality in Context — odds of dying each year by age group vs. Covid Fatality Rate

*Even if infected, everyone has a higher chance of dying this year from something unrelated to Covid19. For anyone past their 50s, there is a 1% to 20% chance of dying in any given year. Even a 70-year-old who gets Covid19 is twice as likely to die from something else that year.


Raw Data Links:

  1. Worldometer
  2. CDC CovidView (inc. Hospitalizations for CovidLike Illness)
  3. OurWorldinData
  4. NYC Data
  5. Flu Hospitalizations
  6. Flu/Pneumonia/Covid Deaths — Excess via CDC
  7. Iceland Data
  8. Hospital & ICU Capacity — IHME
  9. Annual Risk of Death by Age— US Social Security Administration

Oxford Gov Response Tracker:



  1. Early Study from China (WHO Cohort Study) — Jama
  2. Fatality Rates with high testing levels — Diamond Princess Study — medRxiv
  3. Berkeley vs. Stanford — Fatality Rate Estimates
  4. USC / LA County — Fatality Rate Estimates

Articles explaining the data in context:

  1. The mathematics of predicting the course of coronavirus —
  2. The war between experience and credentials — NationalReview
  3. Hospital Data Gaps -ProPublica
  4. Risk of Death vs. Other Diseases by Age — Freopp
  5. General Fatality Rates vs. Covid Fatality Rates— BBC
  6. Sweden vs. the World — TheConversation
  7. Comorbidities — The Scientist
  8. Germany not alarmed by infection rate — BBC

Counting Issues:

  1. Gunshot victims among deaths — Western Journal
  2. Dr Lee on Understanding death rate reports — Spectator
  3. Experts & Chief Medical Examiners Debate Death Count Accuracy
  4. State by State Differences in Fatality Reporting — Nevada counts anyone who tested positive and later dies for any reason, Colorado doesn’t.
  5. Hospitals Getting Paid More to Include Covid on Reports
  6. Pennsylvania adjusts fatality counts downward due to errors
  7. Coroners Office vs. State — state changes cause of death to Covid for an Alcohol Overdose

Cost of the Shutdowns:

  1. ShutDown Will Cost Americans Millions of Years of Life — The Hill
  2. The impact of Covid on suicide rates — QJM
  3. Nurses Getting Laid Off (narrative form)
  4. 1.4M Healthcare Job Losses — WSJ

Good Information:

  1. Covid 19 Basics —
  2. Covid Symptoms — NY Times
  3. How your immune system fights Covid — EveryDayHealth
  4. Nearly All NYC Covid19 had Comorbidities -
  5. UK Epidemiologist radically lowers prediction — WaTimes
  6. Do Lockdowns Save Lives? In most places the data says no — WSJ
  7. Interviews with censored MD’s & Scientists — TonyRobbins
  8. Six Questions Neil Ferguson Should be Asked —
  9. WHO Swedish Flip Flop
  10. Recovery from Covid might take longer — WSJ
  11. Nate Silver on the Data & Reporting —
  12. Doctors Fret over Lower Attendance during Pandemic — WSJ
  13. NY sent recovering patients to nursing homes — it was fatal — WSJ
  14. Antibody tests throw into question timeline — SeattleTimes
  15. Scientists Warn CDC Testing Data could mislead public — NPR
  16. How Scientists Predict How Many People will get Covid — WebMD
  17. Patients on Ventilators strain hospitals — Bloomberg
  18. Iceland’s Test Everyone Approach — NYTimes
  19. Fatality Forecasts (US) — Five Thirty Eight
  20. Pandemics in History
  21. The Paper that changed the world — the Imperial College Model that influenced UK & US Response

Test Accuracy:

  1. Antibody Accuracy (Matrix by Manufacturer)
  2. USC Antibody Tests
  3. PCR Accuracy Study — medRxiv (False Positives 2.5% to 16%)


Perspective Based Analytics on Life and Science

Josh Ketter

Written by

Josh is a retired CFO. He’s been to all 7 Continents and is on a mission to visit every country. His writings focus on an analytical point of view.


Analytica is focused on providing thoughtful analysis, balanced perspective, and showing multiple viewpoints across a variety of public interests. We try to point out obvious abuses of science and help protect the public from those who “lie with statistics.”
