The COVID Tracking Project is shutting down in a week. What next?

Chris von Csefalvay CPH FRSPH MTOPRA
Published in Starschema Blog
Mar 1, 2021

The COVID Tracking Project has been one of the most successful citizen-driven data collection projects in history. Driven by The Atlantic and supported by an army of volunteers, it has collected nuggets of information about testing and case counts, often beating federal and state authorities to the punch. Yet sustaining such a project over the long run, especially when it is driven primarily by volunteer engagement, is quite difficult. And so, after a year, the COVID Tracking Project is shutting down on 7 March.

The good news is that if you have been accessing the COVID Tracking Project’s data via the Starschema COVID-19 Data Set, whether through Snowflake Data Marketplace or through the flat files shared via AWS S3, you should see replacement tables emerging that cover the same ground. Below is a brief overview of data sources that you can use to replace data from the COVID Tracking Project.

Case counts and bed utilization

We have integrated several data products from the US Department of Health and Human Services and the CDC to provide information about case counts and healthcare resource utilization. We provide the following data sets:

  • CDC_INPATIENT_BEDS_ALL: all-cause inpatient bed usage, forecast (including upper and lower bounds of the 95% confidence interval)
  • CDC_INPATIENT_BEDS_COVID_19: inpatient bed usage for COVID-19 cases, forecast (including upper and lower bounds of the 95% confidence interval)
  • CDC_INPATIENT_BEDS_ICU_ALL: all-cause intensive care bed usage, forecast (including upper and lower bounds of the 95% confidence interval)
  • CDC_REPORTED_PATIENT_IMPACT: reported (actual) patient impact data, including critical care availability, admissions for suspected and confirmed COVID-19 by paediatric vs adult cohort and a number of other indicators
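
One way to use the forecast tables alongside reported figures is to check whether an observed bed count falls inside the forecast's 95% interval. The following is a minimal sketch only: the class and field names (`BedForecast`, `point`, `lower`, `upper`) are hypothetical, the numbers are made up, and the actual column names in the tables are not given in this post. It also assumes the forecast and the reported value refer to the same geography and date.

```python
from dataclasses import dataclass


@dataclass
class BedForecast:
    """Hypothetical container for one forecast row (not the real schema)."""
    point: float  # point estimate of beds in use
    lower: float  # lower bound of the 95% interval
    upper: float  # upper bound of the 95% interval

    def contains(self, observed: float) -> bool:
        # True when the observed value lies within the interval, bounds included.
        return self.lower <= observed <= self.upper


# Illustrative numbers only, not real CDC or HHS figures.
forecast = BedForecast(point=500.0, lower=450.0, upper=560.0)
for observed in (520.0, 600.0):
    status = "within" if forecast.contains(observed) else "outside"
    print(f"{observed:.0f} beds: {status} the forecast interval")
```

An observed value outside the interval is not necessarily an error, but it is a useful prompt to look more closely at that location and date.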

Policy measures

A new table, CDC_POLICY_MEASURES, includes highly granular data on a range of policy measures for counties and states, including:

  • stay-at-home orders,
  • large gathering bans,
  • restrictions on specific industry sectors and venues, and
  • non-essential workers legislation.

This supplements our table KFF_US_STATE_MITIGATIONS, which only lists state-level measures. Both tables are updated ‘as needed’, but are constantly monitored for changes.

Testing and diagnostics

The CDC_TESTING table contains information about tests conducted, including positive, negative and pending/inconclusive tests. Even as the emphasis of public health interventions moves from testing to prophylaxis by way of vaccines, diagnostic testing remains an important bellwether for pathogenic dynamics. Like all CDC data, this is updated every day, although there may be a lag of up to 24 hours due to reporting times by county, state and territorial health authorities.
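
A common summary of such counts is the test positivity rate. The sketch below assumes one reasonable convention, namely that pending and inconclusive tests are excluded from the denominator; the function name, the dictionary keys and the figures are illustrative and are not the table's actual column names or data.

```python
from typing import Optional


def positivity_rate(positive: int, negative: int) -> Optional[float]:
    """Share of resolved tests that were positive.

    Pending/inconclusive tests are left out of the denominator (an
    assumption of this sketch). Returns None when no test has resolved.
    """
    resolved = positive + negative
    if resolved == 0:
        return None
    return positive / resolved


# Hypothetical daily counts, not real CDC figures.
day = {"positive": 120, "negative": 1080, "inconclusive": 30}
rate = positivity_rate(day["positive"], day["negative"])
if rate is not None:
    print(f"Positivity: {rate:.1%}")
```

Given the reporting lag of up to a day, it may be safer to compute this over a rolling window rather than on the most recent day alone.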


Vaccinations

With the availability of two vaccines on the US market and the recent positive endorsement of the Johnson & Johnson single-dose vaccine by ACIP, the spotlight is now on vaccine penetration: the fraction of the entire population that has been vaccinated.

It is crucial to track vaccination rates for a thorough assessment of a population’s at-risk status. The OWID_VACCINATIONS table provides detailed information on vaccine allocation, vaccines administered and, most importantly, the number of persons who have received two doses of the currently marketed two-dose vaccines (both Pfizer and Moderna), who can be deemed to be immune.
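
Vaccine penetration, as described above, is a simple ratio of vaccinated persons to total population. A minimal sketch follows; the function and parameter names are hypothetical, the population figure has to come from a separate source, and the numbers are invented for illustration.

```python
def vaccine_penetration(fully_vaccinated: int, population: int) -> float:
    """Fraction of the population that has completed the vaccine course."""
    if population <= 0:
        raise ValueError("population must be positive")
    if fully_vaccinated < 0:
        raise ValueError("fully_vaccinated must not be negative")
    return fully_vaccinated / population


# Hypothetical figures, not taken from OWID_VACCINATIONS.
share = vaccine_penetration(fully_vaccinated=250_000, population=1_000_000)
print(f"Fully vaccinated: {share:.1%}")
```

Using persons who have received both doses as the numerator, rather than doses administered, avoids double-counting people who are partway through a two-dose course.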

Table updates

We have also made some back-end updates to tables. In particular, if you have been using WHO data, this should come as good news: frequent format changes had made the WHO data less reliable from time to time. This has now been fixed, and you can look forward to a steady stream of the WHO’s regular reports.

Next steps

If you have been using the COVID Tracking Project data set to power your visualizations and analytics, you may have to make some changes. We have endeavoured to announce these changes in good time so that you can make the adjustments you need. As always, if we can assist you in any way to make the most out of the Starschema COVID-19 Data Set, please don’t hesitate to let us know.
